Thread Status: Active
Total posts in this thread: 6
p3nguin53
Advanced Cruncher
USA
Joined: Dec 8, 2008
Post Count: 95
HFCC WU keeps restarting [Resolved]

After spending about 40 hrs on WU HFCC_s2_00169487_s2_0001_0, I finally aborted it since it kept restarting at 0%. The symptoms are similar to those in another thread about a WU restarting. That thread was tagged as resolved, so I opened this new one.

I've been running 2 HFCC WUs at a time for weeks now and this is the only WU that had a problem. The last restart was the only one that showed a 'restarting' msg in the Message Tab. I don't know why it keeps restarting; nothing special was going on with the laptop when the restarts happened. It restarted at a different point each time.

Not sure if this WU has a problem or if it is something with my machine. Since I aborted it, I thought I should report the problem.

Here is the stderr info captured before the abort (condensed):
Failed to get VersionInfo size: 2
INFO:[09:05:32] Start AutoGrid...

autogrid: autogrid4: Successful Completion.
INFO:[09:05:53] End AutoGrid...
Beginning AutoDock...
INFO: Setting num_generations: 27000
_maxGenSeenSoFar changed: 6750
About to enter main loop...(dockings already completed: 0)
Updating Best Energy for WU: 0.00
Finished Docking number 0
Updating Best Energy for WU: -7.54
Finished Docking number 1
Updating Best Energy for WU: -7.93
Finished Docking number 2
.
.
Finished Docking number 18
Finished Docking number 19
Updating Best Energy for WU: -8.23
Finished Docking number 20
.
.
Finished Docking number 237
Finished Docking number 238
Finished Docking number 239
Failed to get VersionInfo size: 2
INFO:[03:04:42] Start AutoGrid...

autogrid: autogrid4: Successful Completion.
INFO:[03:05:20] End AutoGrid...
Beginning AutoDock...
INFO: Setting num_generations: 27000
_maxGenSeenSoFar changed: 6750
About to enter main loop...(dockings already completed: 0)
Updating Best Energy for WU: 0.00
Finished Docking number 0
Updating Best Energy for WU: -7.54
Finished Docking number 1
Updating Best Energy for WU: -7.93
Finished Docking number 2
.
.
Finished Docking number 18
Finished Docking number 19
Updating Best Energy for WU: -8.23
Finished Docking number 20
.
.
Finished Docking number 124
Finished Docking number 125
Finished Docking number 126
Failed to get VersionInfo size: 2
INFO:[11:49:21] Start AutoGrid...

autogrid: autogrid4: Successful Completion.
INFO:[11:49:55] End AutoGrid...
Beginning AutoDock...
INFO: Setting num_generations: 27000
_maxGenSeenSoFar changed: 6750
About to enter main loop...(dockings already completed: 0)
Updating Best Energy for WU: 0.00
Finished Docking number 0
Updating Best Energy for WU: -7.54
Finished Docking number 1
Updating Best Energy for WU: -7.93
Finished Docking number 2
.
.
Finished Docking number 18
Finished Docking number 19
Updating Best Energy for WU: -8.23
Finished Docking number 20
.
.
Finished Docking number 158
Finished Docking number 159
Finished Docking number 160
Failed to get VersionInfo size: 2
INFO:[00:48:09] Start AutoGrid...

autogrid: autogrid4: Successful Completion.
INFO:[00:48:45] End AutoGrid...
Beginning AutoDock...
INFO: Setting num_generations: 27000
_maxGenSeenSoFar changed: 6750
About to enter main loop...(dockings already completed: 0)
Updating Best Energy for WU: 0.00
Finished Docking number 0
Updating Best Energy for WU: -7.54
Finished Docking number 1
Updating Best Energy for WU: -7.93
Finished Docking number 2
Finished Docking number 3
Finished Docking number 4
Finished Docking number 5
Finished Docking number 6
Finished Docking number 7
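
A minimal sketch for summarizing a stderr capture like the one above: it splits the text into runs at each "Start AutoGrid" line and records, for each run, the "dockings already completed" count and the last docking that finished. The function name `summarize_runs` and the shortened sample are illustrative assumptions, not part of BOINC or the HFCC application. In the capture above every run reports "dockings already completed: 0", which suggests (but does not prove) that no checkpoint was picked up on restart.

```python
import re

# Patterns taken from the stderr lines quoted in this post.
START_RE = re.compile(r"INFO:\[(\d{2}:\d{2}:\d{2})\] Start AutoGrid")
RESUME_RE = re.compile(r"dockings already completed: (\d+)")
DOCKING_RE = re.compile(r"Finished Docking number (\d+)")


def summarize_runs(stderr_text):
    """Split stderr into runs at each 'Start AutoGrid' line and summarize each."""
    runs = []
    current = None
    for line in stderr_text.splitlines():
        match = START_RE.search(line)
        if match:
            # A new AutoGrid start marks the beginning of a new run.
            current = {"start": match.group(1), "resumed_from": None, "last_docking": None}
            runs.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first run
        match = RESUME_RE.search(line)
        if match:
            current["resumed_from"] = int(match.group(1))
            continue
        match = DOCKING_RE.search(line)
        if match:
            # Keep overwriting so the final value is the last docking seen.
            current["last_docking"] = int(match.group(1))
    return runs


# Shortened sample built from lines in the capture above (first two runs only).
sample = """\
INFO:[09:05:32] Start AutoGrid...
About to enter main loop...(dockings already completed: 0)
Finished Docking number 238
Finished Docking number 239
INFO:[03:04:42] Start AutoGrid...
About to enter main loop...(dockings already completed: 0)
Finished Docking number 125
Finished Docking number 126
"""

runs = summarize_runs(sample)
for run in runs:
    print(run["start"], "resumed from", run["resumed_from"], "last docking", run["last_docking"])
```

Run against the full capture, this would list four runs ending at dockings 239, 126, 160 and 7, which makes the "different point each time" pattern easy to see.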


Here is beginning of Message Tab info:
2/24/2010 12:41:33 AM||Starting BOINC client version 6.2.28 for windows_intelx86
2/24/2010 12:41:33 AM||log flags: task, file_xfer, sched_ops
2/24/2010 12:41:33 AM||Libraries: libcurl/7.19.0 OpenSSL/0.9.8i zlib/1.2.3
2/24/2010 12:41:33 AM||Data directory: C:\ProgramData\BOINC
2/24/2010 12:41:33 AM||Running under account Karen
2/24/2010 12:41:34 AM||Processor: 2 GenuineIntel Intel(R) Core(TM)2 Extreme CPU X7900 @ 2.80GHz [x86 Family 6 Model 15 Stepping 11]
2/24/2010 12:41:34 AM||Processor features: fpu tsc pae nx sse sse2 pni mmx
2/24/2010 12:41:34 AM||OS: Microsoft Windows Vista: Business x86 Editon, Service Pack 2, (06.00.6002.00)
2/24/2010 12:41:34 AM||Memory: 3.18 GB physical, 6.56 GB virtual
2/24/2010 12:41:34 AM||Disk: 220.30 GB total, 176.36 GB free
2/24/2010 12:41:34 AM||Local time is UTC -6 hours
2/24/2010 12:41:34 AM|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 780115; location: (none); project prefs: default
2/24/2010 12:41:34 AM||General prefs: from World Community Grid (last modified 10-Mar-2009 01:49:58)
2/24/2010 12:41:34 AM||Host location: none
2/24/2010 12:41:34 AM||General prefs: using your defaults
2/24/2010 12:41:34 AM||Reading preferences override file
2/24/2010 12:41:34 AM||Preferences limit memory usage when active to 2445.86MB
2/24/2010 12:41:34 AM||Preferences limit memory usage when idle to 2445.86MB
2/24/2010 12:41:34 AM||Preferences limit disk usage to 9.31GB
----------------------------------------
[Edit 1 times, last edit by p3nguin53 at Mar 1, 2010 4:11:27 PM]
[Feb 28, 2010 8:44:24 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: HFCC WU keeps restarting

Hi,

Yes, with a manual abort the error code that otherwise would have been recorded is lost.

If you look through the stdoutdae.txt log of BOINC are there any "zero status exits" recorded around that timeframe?
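
The stdoutdae.txt check asked about here could be sketched roughly as follows. This is only an illustration: the marker text "exited with zero status" is an assumption about how the client words that message, the helper name `find_task_events` is made up, and the sample lines are hypothetical rather than copied from a real log. Adjust the marker to whatever your own log shows.

```python
# Assumption: the client logs a line containing this phrase when a task
# process ends without reporting that it finished.
ZERO_STATUS_MARKER = "exited with zero status"


def find_task_events(lines, task_name):
    """Return (line, is_zero_status_exit) for every log line naming the task."""
    events = []
    for line in lines:
        if task_name in line:
            text = line.rstrip("\r\n")
            events.append((text, ZERO_STATUS_MARKER in text.lower()))
    return events


# Hypothetical sample lines for illustration only, not taken from a real log.
sample_log = [
    "01-Jan-2010 10:00:00 [World Community Grid] Starting task EXAMPLE_TASK_0 using hfcc version 610",
    "01-Jan-2010 12:00:00 [World Community Grid] Task EXAMPLE_TASK_0 exited with zero status but no 'finished' file",
    "01-Jan-2010 12:00:01 [World Community Grid] Restarting task EXAMPLE_TASK_0 using hfcc version 610",
    "01-Jan-2010 12:05:00 [World Community Grid] Starting task OTHER_TASK_1 using hfcc version 610",
]

events = find_task_events(sample_log, "EXAMPLE_TASK_0")
zero_exits = [text for text, flagged in events if flagged]

for text in zero_exits:
    print("zero status exit:", text)

# To scan a real log, pass an open file instead of the sample list, e.g.
# open(path, errors="replace"), where path points at stdoutdae.txt in the
# BOINC data directory shown in the Message Tab.
```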

Please monitor the work unit's quorum detail on the Result Status page to see how the repair job is doing. Was this originally an init 1 quorum 1 or an init 2 quorum 2? The sent time stamps will tell in what order the work was distributed.

Sorry for your 40 hours cycling through the task.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Feb 28, 2010 1:18:34 PM]
p3nguin53
Advanced Cruncher
USA
Joined: Dec 8, 2008
Post Count: 95
Re: HFCC WU keeps restarting

The only related msgs in the stdoutdae file were:
26-Feb-2010 09:05:32 [World Community Grid] Starting HFCC_s2_00169487_s2_0001_0
26-Feb-2010 09:05:32 [World Community Grid] Starting task HFCC_s2_00169487_s2_0001_0 using hfcc version 610
28-Feb-2010 00:48:09 [World Community Grid] Restarting task HFCC_s2_00169487_s2_0001_0 using hfcc version 610
28-Feb-2010 01:40:57 [World Community Grid] Computation for task HFCC_s2_00169487_s2_0001_0 finished

There were no msgs for the other 2 restarts.
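
As a quick cross-check of the "about 40 hrs" figure, here is a small sketch that parses the timestamps of the four stdoutdae lines quoted above and measures the time between the first start and the final "Computation ... finished" entry (which presumably follows the manual abort). The helper name `parse_stamp` is illustrative, and the `%b` month parsing assumes an English locale. Note also that the one logged restart, at 00:48:09, matches the start time of the last run in the stderr capture.

```python
from datetime import datetime

# The four log lines quoted in this post.
LOG_LINES = [
    "26-Feb-2010 09:05:32 [World Community Grid] Starting HFCC_s2_00169487_s2_0001_0",
    "26-Feb-2010 09:05:32 [World Community Grid] Starting task HFCC_s2_00169487_s2_0001_0 using hfcc version 610",
    "28-Feb-2010 00:48:09 [World Community Grid] Restarting task HFCC_s2_00169487_s2_0001_0 using hfcc version 610",
    "28-Feb-2010 01:40:57 [World Community Grid] Computation for task HFCC_s2_00169487_s2_0001_0 finished",
]


def parse_stamp(line):
    """Parse the leading 'DD-Mon-YYYY HH:MM:SS' timestamp of a log line."""
    date_part, time_part = line.split()[:2]
    # %b matches abbreviated month names in the current locale (English assumed).
    return datetime.strptime(f"{date_part} {time_part}", "%d-%b-%Y %H:%M:%S")


first_start = parse_stamp(LOG_LINES[0])
finished = parse_stamp(LOG_LINES[-1])
elapsed = finished - first_start
hours = elapsed.total_seconds() / 3600

print(f"{hours:.1f} hours between first start and final log entry")
```

This is wall-clock time between the two log entries, not CPU time, so it only roughly corresponds to the time the task actually spent computing.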

Original quorum was just 1. One Repair copy has been sent out. Will keep an eye on it.

Since the WU kept stopping/restarting at different points, I was hoping that it would eventually be successful. No such luck. :(
[Feb 28, 2010 5:12:17 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: HFCC WU keeps restarting

Make sure your AV software has an exclusion for the BOINC Data dir:

2/24/2010 12:41:33 AM||Data directory: C:\ProgramData\BOINC

AV scanning of that directory is a source of problems at times. Excluding it is safe, since BOINC checks file integrity too and is told by the projects your client is attached to what the task files are supposed to contain.
[Feb 28, 2010 5:27:16 PM]
p3nguin53
Advanced Cruncher
USA
Joined: Dec 8, 2008
Post Count: 95
Re: HFCC WU keeps restarting

I just checked my exclusions. I thought maybe I had lost the BOINC entry since I recently upgraded to a newer version of Norton, but it's still in there.

Also checked the AV history. There were no AV msgs at the time of the restarts.
[Feb 28, 2010 5:45:07 PM]
p3nguin53
Advanced Cruncher
USA
Joined: Dec 8, 2008
Post Count: 95
Re: HFCC WU keeps restarting

My wingman finished the repair WU successfully without any restarts, so I will tag this problem as resolved. It must have been a problem with my machine.
[Mar 1, 2010 4:10:49 PM]