Thread Status: Locked
Total posts in this thread: 17
This topic has been viewed 1696 times and has 16 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Serious bug with BOINC/HPF

One of my two computers just crashed, and one of your Human Proteome Folding jobs appears to be responsible.

Its BOINC completed some job or another, started a Rosetta task, and Rosetta promptly crashed with some sort of "C++ runtime error". That machine's BOINC moved on to a second Rosetta job, and within a few minutes it crashed in the same manner. Then the following all crashed within a few minutes of starting: an FAAH job, another FAAH job, and another HPF job. Then OTHER applications started crashing with the same error message. Just about everything crashed, including Explorer, and Explorer restarted. The computer had become unusable, however -- running anything would cause it to immediately crash with the same error message. Not a normal "This program has encountered a problem..." but some sort of C++ runtime error message, with only slight variations.

I had a task manager running, and noticed that everything that crashed first leaked -- no, hemorrhaged memory, bloating to over 1GB process size(!) before crashing.

The computer was working fine right up until that first Rosetta job crash. It was working fine yesterday at this time, and the day before. I hadn't changed anything. As far as I can tell, the first Rosetta crash started some kind of chain reaction that corrupted the system in some way.

In other words, Rosetta has a severe bug, and work units that have some particular X-factor in them can trigger this bug, which will cause Rosetta to crash and corrupt the system.

The truly amazing thing is that this isn't some poky Windows ME box. It's running XP Pro, a fully 32-bit protected-mode operating system that is supposed to prevent application faults from corrupting the entire system. I've never seen one crash before other than due to kernel-mode failures, usually in a video driver, when the crash takes the form of a blue screen. This wasn't a BSOD, yet the system did become corrupted, and it started with a Rosetta binary running and then crashing.

Please look into this. Anything that can take down a protected-mode operating system without having to run in kernel mode is a serious problem indeed, and might even be abused. There's clearly also a bug in WinXP for this to even be possible. With any luck, MS will fix it soon, since it has security implications (obviously it enables an unprivileged user to launch a denial of service attack on an XP box, and it might enable one to gain privilege depending on the exact nature of the bug). But until then it behooves WCG to try not to crash members' computers; widespread occurrences of what just happened to me will discourage participation, which I doubt you would enjoy. Even if MS fixes *their* bug, Rosetta crashing sometimes will rob users of points, WCG of usable HPF results, and the whole community of CPU cycles that go down the drain.

Oh, and the "runtime error" thing steals focus, which is annoying.
[Jan 29, 2008 6:48:29 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Serious bug with BOINC/HPF

It gets worse.

Whatever your crashing HPF job did to that computer, it has *survived a reboot*.

The computer, in other words, is going to need a complete reinstall of Windows because of you.

I don't know what you did, but the effects are drastic.

The computer, after rebooting, was more usable than immediately before rebooting, but after running only a short time one of your HPF tasks crashed again, followed almost immediately by everything else in BOINC's queue on that machine, and other applications started failing to start up. I also saw the excessive memory leak behavior again -- Explorer bloated up suddenly to a 1.2GB process size in a matter of seconds.

I rebooted it again and shut down BOINC right after startup. Worked with the machine for no more than ten minutes before Explorer went tits-up with the same error message as before. This time it had suddenly bloated up to 1.3GB.

What in Christ's name has your crashing Rosetta job done to my computer?!
[Jan 29, 2008 7:04:09 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Serious bug with BOINC/HPF

Oh, and by the way, after the next reboot the affected machine wouldn't even start up completely. Desktop comes up and a blank taskbar but the tray and Start button are just blue rectangles. I can fire up Task Manager with the three-finger salute and see that nothing much is happening.

Congratulations. You seem to have killed one of my PCs. I hope for your sake that my other computer doesn't catch one of these bad HPF jobs. Perhaps I should preemptively shut down BOINC on it and quit WCG, just to be safe.

I may just do that. Unless, of course, you can convince me that this will not happen again.
[Jan 29, 2008 7:12:08 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Serious bug with BOINC/HPF

Sounds like a major borg event is my first impression. A full reformat and repartitioning seems in order. That said, don't you have a good restore point to go back to? My XP Pro has always done very well in that department.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Jan 29, 2008 7:22:38 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Serious bug with BOINC/HPF

Hello twisted0n3,
This is the first crash of this sort ever reported.
"Just about everything crashed, including Explorer, and Explorer restarted."

This sounds like something basic in the system went kablooie. I don't think that anyone will ever be able to definitively pin this on Rosetta. I cannot remember ever seeing a crash report like this over at Rosetta@home either.

Lawrence
[Jan 29, 2008 7:25:20 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Serious bug with BOINC/HPF

Do you happen to have the name of the WU that you suspect caused the initial crash? Maybe the techs could take a look at it for you.
[Jan 29, 2008 8:27:18 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Serious bug with BOINC/HPF

Hi twisted0n3.

Have you solved the existing problems with your computer?

We can't take your report seriously if it is just a continuation of your existing problems.

The last three times you reported a problem, you refused to take our advice. This makes us predisposed to ignore you entirely.
[Jan 29, 2008 8:31:32 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Serious bug with BOINC/HPF

I have been ignoring that **** since he posted...

**Edited for intolerance**

TKH
----------------------------------------
[Edit 1 times, last edit by TKH at Jan 30, 2008 1:57:44 PM]
[Jan 29, 2008 8:49:09 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Serious bug with BOINC/HPF

"Do you happen to have the name of the WU that you suspect caused the initial crash? Maybe the techs could take a look at it for you."

hi joneill003,

The standard user-side routine is to check the WU details on the Result Status page. If the other copies were properly returned and are sitting in Pending Validation or, better still with HPF2, the minimum quorum of 15 was reached and they are marked 'Valid', the probability that this is anything other than a host-specific problem is very small.

cheers
[Jan 29, 2008 8:57:08 PM]
retsof
Former Community Advisor
USA
Joined: Jul 31, 2005
Post Count: 6824
Status: Offline
Re: Serious bug with BOINC/HPF

Is twisted0n3 still playing broken games? He should look there first before blaming it on HPF.

Google Results 1 - 10 of about 4,680 for twisted0n3. (0.05 seconds)

----------------------------------------
SUPPORT ADVISOR
Work+GPU i7 8700 12threads
School i7 4770 8threads
Default+GPU Ryzen 7 3700X 16threads
Ryzen 7 3800X 16 threads
Ryzen 9 3900X 24threads
Home i7 3540M 4threads50%
[Jan 29, 2008 10:52:00 PM]