cjslman
Master Cruncher
Mexico
Joined: Nov 23, 2004
Post Count: 2082
2 large HFCC WUs errored out !!!

I just had 2 HFCC WUs error out. I would like to know whether this was caused by something my computer did, or whether these 2 are from a batch of funky WUs that got loose. One of the symptoms is that these two WUs started taking a long time (normal time on my CPU is 6-10 hrs) and each ended up running for over 20 hrs, so I have over 40 hrs of crunching down the drain. I don't mind losing a couple of hours, but more than 40 hrs is too much for me (I don't have that much CPU power to begin with). I tried to post the result log, but I get an SQL error when trying to post the message (I assume it's too big). I need an answer soon, because I have another 4 WUs running on another CPU that are starting to look like these two: they are taking more time to crunch than they normally would.

Thanks,
CJSL

EDIT: I just noticed I posted in the wrong thread... duh! Admins, please move to the HFCC thread.
----------------------------------------
I follow the Gimli philosophy: "Keep breathing. That's the key. Breathe."
Join The Cahuamos Team


----------------------------------------
[Edit 1 times, last edit by cjslman at Jun 5, 2012 11:20:07 PM]
[Jun 5, 2012 11:08:24 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: 2 large HFCC WUs errored out !!!

I just returned an HFCC WU and it was validated as Valid, so the problem is probably on your end.

As for the result log, maybe you should at least post the WU name or the stuff from the "Workunit Status" window?
[Jun 6, 2012 5:07:21 AM]
cjslman
Master Cruncher
Mexico
Joined: Nov 23, 2004
Post Count: 2082
Re: 2 large HFCC WUs errored out !!!

Here are the WU names:

HFCC_ target-9_ 00841388_ target-9_ 0001_ 2--
HFCC_ target-9_ 00842327_ target-9_ 0000_ 0-

Since the result log is too big to post, here is the error message from it:

Failed to get VersionInfo size: 1812

This message is repeated many, many times throughout the log. I'm not sure whether it has anything to do with the WUs ending in error.

Since I haven't seen any other posts about this problem (other than WUs taking longer to complete), I'll assume this was due to a problem on my CPU (bummer... 5 days away from ruby and I lose over 40 hours of crunching time).

Thanks,
CJSL
[Jun 6, 2012 1:14:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: 2 large HFCC WUs errored out !!!

The 1812 message normally appears only once, at the start of an AutoDock job, and is benign. See http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=15646

It smells like these jobs restarted from the beginning, over and over again. That would show up as event/message log entries in the client too. Check the stdoutdae.txt file to retrace what happened... if interested.
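
For anyone who wants to check a saved result log for this pattern, here is a minimal Python sketch. It assumes the log text has been copied into a string and that each "Failed to get VersionInfo size" line marks one start of the task, as described above. The function names and the sample text are made up for illustration and are not part of BOINC or the WCG tools.

```python
# Hypothetical helper: count how often a task appears to have (re)started,
# treating each "Failed to get VersionInfo size" line as one start.
START_MARKER = "Failed to get VersionInfo size"


def count_starts(log_text: str) -> int:
    """Return the number of log lines that contain the start marker."""
    return sum(1 for line in log_text.splitlines() if START_MARKER in line)


def restarted_repeatedly(log_text: str, expected_starts: int = 1) -> bool:
    """True if the marker shows up more often than a single clean start."""
    return count_starts(log_text) > expected_starts


# Made-up sample text, only to show the call.
sample = """Failed to get VersionInfo size: 1812
Finished Docking number 0
Failed to get VersionInfo size: 1812
Finished Docking number 0
"""

print(count_starts(sample), restarted_repeatedly(sample))
```

A count of 1 matches the benign case; anything higher suggests the task started more than once, but it does not say why.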

--//--
[Jun 6, 2012 1:27:06 PM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Re: 2 large HFCC WUs errored out !!!

CJSL,

Can you post the end of the result log? As SekeRob mentioned, the "Failed to get VersionInfo size" message is nothing to worry about in itself. However, it occurs only at the start of a run, so if you have many of them, your client is restarting the work unit over and over.

Thanks,
armstrdj
[Jun 6, 2012 1:40:55 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: 2 large HFCC WUs errored out !!!

I have a problem with a WU, too. It crunched as normal, but at the last restart of my PC, the WU restarted from the beginning. Now it's crunching again, but I lost more than 10 hrs of crunching time, and I'm not sure whether it will get to the end this time.
Here is the relevant part of stderr.txt:

Finished Docking number 159
Finished Docking number 160
Failed to get VersionInfo size: 2
INFO:[07:03:53] Start AutoGrid...

autogrid: autogrid4: Successful Completion.
INFO:[07:06:01] End AutoGrid...
Beginning AutoDock...
INFO: Setting num_generations: 27000
_maxGenSeenSoFar changed: 6750
About to enter main loop...(dockings already completed: 0)
Updating Best Energy for WU: 0.00
Finished Docking number 0
Finished Docking number 1
Finished Docking number 2

Should I stop the WU, or will it finish this time?

Thx, Julia
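
The excerpt above shows docking 160 finishing and then the run resuming with "dockings already completed: 0". A rough Python sketch for spotting that kind of lost progress in a stderr.txt excerpt follows. It assumes docking numbers are zero-based (the excerpt starts at "Finished Docking number 0") and that the two line formats are exactly as quoted. The function name is hypothetical.

```python
import re

# Line formats copied from the stderr.txt excerpt in this thread.
FINISHED = re.compile(r"Finished Docking number (\d+)")
RESUMED = re.compile(r"dockings already completed: (\d+)")


def lost_progress(log_text: str) -> list[tuple[int, int]]:
    """Return (last_finished_docking, completed_count_on_resume) pairs
    for every resume that picked up fewer dockings than had finished."""
    losses = []
    last_finished = None
    for line in log_text.splitlines():
        match = FINISHED.search(line)
        if match:
            last_finished = int(match.group(1))
            continue
        match = RESUMED.search(line)
        if match and last_finished is not None:
            completed = int(match.group(1))
            # Assumption: numbering is zero-based, so docking N finishing
            # means N + 1 dockings were done before the restart.
            if completed < last_finished + 1:
                losses.append((last_finished, completed))
            last_finished = None
    return losses


excerpt = """Finished Docking number 159
Finished Docking number 160
Failed to get VersionInfo size: 2
About to enter main loop...(dockings already completed: 0)
Finished Docking number 0
Finished Docking number 1
"""

print(lost_progress(excerpt))
```

For the excerpt, the single reported pair means the run had reached docking 160 and then resumed with zero dockings counted as complete.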
[Jun 13, 2012 3:13:38 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: 2 large HFCC WUs errored out !!!

Hi Julia,

Don't tell us you just had Windows do its monthly update. My suggestion is to abort these tasks. They're damaged goods, so it's probably better to let another cruncher start them clean.

Recommendation: stop BOINC once the monthly update has been downloaded [you can set Windows to download the updates, but only apply them when you're ready]. Then, before giving the MS update the OK to reboot, exit BOINC via the BOINC Manager and tell it to stop the service too, so it can shut down properly before the MS Update sequence takes control. That sequence is usually intensive and ends with the machine rebooting on its own.

--//--
[Jun 13, 2012 3:33:33 PM]