| World Community Grid Forums
Thread Status: Active Total posts in this thread: 11
davidhobbs
Senior Cruncher England Joined: Dec 30, 2004 Post Count: 152 Status: Offline Project Badges:
|
I have taken the plunge and switched all my devices from UD to BOINC so I am just beginning to climb up a new learning curve! They all seem quite happy apart from one particular device, a 1GHz 256MB XP-Pro machine. It has now returned six results, three of which are valid and three in error. The error messages include the lines "Failed to get version info size 1812" and "Unhandled exception detected".
How should I go about understanding what has happened? This device performed quite happily for many months running the UD agent. Thanks, David. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
David --
Please clip the messages from BOINC Manager and post them here. Thanks,
----------------------------------------
[Edit 1 times, last edit by Former Member at Sep 16, 2007 10:40:13 AM] |
||
|
|
davidhobbs
Senior Cruncher England Joined: Dec 30, 2004 Post Count: 152 Status: Offline Project Badges:
|
Hello Dave Bell,
This is what BOINC manager currently shows:
15/09/2007 22:03:16||Starting BOINC client version 5.8.15 for windows_intelx86
15/09/2007 22:03:16||log flags: task, file_xfer, sched_ops
15/09/2007 22:03:16||Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
15/09/2007 22:03:16||Data directory: C:\Program Files\BOINC
15/09/2007 22:03:16||Processor: 1 AuthenticAMD AMD Duron(tm) Processor [x86 Family 6 Model 7 Stepping 1] [fpu tsc sse 3dnow mmx]
15/09/2007 22:03:16||Memory: 255.48 MB physical, 617.50 MB virtual
15/09/2007 22:03:16||Disk: 18.64 GB total, 15.09 GB free
15/09/2007 22:03:16|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 283134; location: (none); project prefs: default
15/09/2007 22:03:16||General prefs: from World Community Grid (last modified 2007-08-25 16:55:58)
15/09/2007 22:03:16||Host location: none
15/09/2007 22:03:16||General prefs: using your defaults
15/09/2007 22:03:18|World Community Grid|Restarting task lh249_00139_8 using hpf2 version 518
16/09/2007 06:03:15|World Community Grid|Sending scheduler request: To fetch work
16/09/2007 06:03:15|World Community Grid|Requesting 81 seconds of new work
16/09/2007 06:03:20|World Community Grid|Scheduler RPC succeeded [server version 509]
16/09/2007 06:03:20|World Community Grid|Deferring communication for 5 min 3 sec
16/09/2007 06:03:20|World Community Grid|Reason: requested by project
16/09/2007 06:03:23|World Community Grid|[file_xfer] Started download of file lh280-289_lh288.fasta.gz
16/09/2007 06:03:23|World Community Grid|[file_xfer] Started download of file lh280-289_lh288.psipred.gz
16/09/2007 06:03:24|World Community Grid|[file_xfer] Finished download of file lh280-289_lh288.fasta.gz
16/09/2007 06:03:24|World Community Grid|[file_xfer] Throughput 218 bytes/sec
16/09/2007 06:03:24|World Community Grid|[file_xfer] Finished download of file lh280-289_lh288.psipred.gz
16/09/2007 06:03:24|World Community Grid|[file_xfer] Throughput 1844 bytes/sec
16/09/2007 06:03:24|World Community Grid|[file_xfer] Started download of file lh280-289_lh288.psipred_ss2.gz
16/09/2007 06:03:24|World Community Grid|[file_xfer] Started download of file lh280-289_aalh28803_05.075_v1_3.gz
16/09/2007 06:03:25|World Community Grid|[file_xfer] Finished download of file lh280-289_lh288.psipred_ss2.gz
16/09/2007 06:03:25|World Community Grid|[file_xfer] Throughput 7772 bytes/sec
16/09/2007 06:03:25|World Community Grid|[file_xfer] Started download of file lh280-289_aalh28809_05.075_v1_3.gz
16/09/2007 06:03:31|World Community Grid|[file_xfer] Finished download of file lh280-289_aalh28803_05.075_v1_3.gz
16/09/2007 06:03:31|World Community Grid|[file_xfer] Throughput 75508 bytes/sec
16/09/2007 06:03:36|World Community Grid|[file_xfer] Finished download of file lh280-289_aalh28809_05.075_v1_3.gz
16/09/2007 06:03:36|World Community Grid|[file_xfer] Throughput 96664 bytes/sec
16/09/2007 10:48:42|World Community Grid|Task lh249_00139_8 exited with zero status but no 'finished' file
16/09/2007 10:48:42|World Community Grid|If this happens repeatedly you may need to reset the project.
16/09/2007 10:48:49|World Community Grid|Restarting task lh249_00139_8 using hpf2 version 518
... but the results in question were returned before the dates shown here. The error messages I referred to were the ones shown in My Grid, Results Status. I'm not clear if these would have been repeated in BOINC manager on the actual device? This device runs overnight and would have gone into hibernation at 06:30 today. I restarted it at about 10:48 to get the data you asked for.
Sorry for the confusion, but I find this BOINC stuff utterly confusing at the moment. When I was young I used to find change stimulating and refreshing... now it's just something else to be endured! David. |
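For anyone who wants to sift a long message list like the one above rather than read it by eye, here is a minimal Python sketch. It assumes each message sits on its own line in the `date time|project|message` layout shown in David's post, with dates as DD/MM/YYYY (this depends on the machine's regional settings, so other installs may differ). The names `LogEntry` and `parse_line` are made up for illustration and are not part of BOINC.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: datetime
    project: str   # empty string for client-wide messages
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one 'DD/MM/YYYY HH:MM:SS|project|message' line, or return None."""
    # Split on the first two pipes only, so a '|' inside the message survives.
    parts = line.rstrip("\n").split("|", 2)
    if len(parts) != 3:
        return None
    try:
        # Assumption: day-first dates, as in the log posted above.
        stamp = datetime.strptime(parts[0].strip(), "%d/%m/%Y %H:%M:%S")
    except ValueError:
        return None
    return LogEntry(stamp, parts[1].strip(), parts[2].strip())


sample = [
    "15/09/2007 22:03:16||Memory: 255.48 MB physical, 617.50 MB virtual",
    "16/09/2007 10:48:42|World Community Grid|Task lh249_00139_8 exited "
    "with zero status but no 'finished' file",
]
entries = [parse_line(s) for s in sample]
```

Once parsed, entries can be filtered by project or by a phrase in the message, which makes it easier to pick out only the lines worth posting.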
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
David --
I believe the "exited with zero status but no 'finished' file" message is one you will often see when exiting hibernation. I believe the "UNHANDLED EXCEPTION" message should be followed by a statement about the exception that was not handled. I am not sure, but the "Failed to get version info" message sounds like it may relate to a communications problem. Perhaps another of the CAs or another member might be able to shed more light on these.
----------------------------------------
[Edit 1 times, last edit by Former Member at Sep 16, 2007 12:09:54 PM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
The FAQ section maintained by the CAs on WCG, called 'Start Here', has a VFAQ item on '1812' (which always reminds me of a Napoleonic event): http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=15646
Shortly I'll be posting an FAQ listing the various existing error logs, which are all found in the BOINC program directory, the key one being stdoutdae.txt. On slower crunchers it holds months of entries. stderrdae.txt can also hold interesting material to post, so we can have a communal look without having to second-guess.
From the log above, I hope your virtual memory is set to auto-expand. If it is too tight, that could, in combination with other processes, lead to issues. Otherwise, three errors out of six results would call for a memtest86 run and verification that you have all the latest video drivers.
The [zero status] error is, as Dave indicates, a classic hibernation notification. I think you wrote the manual. Timing is important: I switched off automatic system time syncing, which got rid of most of these messages. You'll mostly see them at restart... just ignore them unless you see many appearing in a short time frame.
The unhandled exception I've seen in every DDD-T log. Not sure, but maybe it's looking for a checkpoint when, at start, there is no checkpoint.
Added: The link of course :O
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! [Edit 2 times, last edit by Sekerob at Sep 16, 2007 1:28:38 PM] |
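The advice to ignore the zero-status message "unless you see many appearing in a short time frame" can be checked mechanically. Below is a rough Python sketch under stated assumptions: the input is a list of message lines in the `date time|project|message` layout copied from BOINC Manager earlier in this thread, with day-first dates. The layout of stdoutdae.txt itself may differ, in which case the timestamp parsing would need adjusting. The helper names `marker_times` and `max_in_window` are hypothetical, and what counts as "many" or "short" is left to the reader.

```python
from datetime import datetime, timedelta

MARKER = "exited with zero status but no 'finished' file"


def marker_times(lines):
    """Return sorted timestamps of the lines that contain MARKER."""
    times = []
    for line in lines:
        if MARKER not in line:
            continue
        stamp = line.split("|", 1)[0].strip()
        try:
            # Assumption: DD/MM/YYYY HH:MM:SS, as in the posted log.
            times.append(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"))
        except ValueError:
            continue  # skip lines whose timestamp is in another format
    return sorted(times)


def max_in_window(times, window):
    """Largest number of (sorted) timestamps within any span of `window`."""
    best = 0
    start = 0
    for end in range(len(times)):
        # Slide the left edge forward until the span fits in the window.
        while times[end] - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


log_lines = [
    "16/09/2007 10:48:42|World Community Grid|Task lh249_00139_8 exited "
    "with zero status but no 'finished' file",
    "16/09/2007 10:48:49|World Community Grid|Restarting task "
    "lh249_00139_8 using hpf2 version 518",
]
count = max_in_window(marker_times(log_lines), timedelta(hours=1))
```

With the two lines quoted from the log in this thread, only one zero-status message falls in any one-hour span, which fits the "just ignore them" case.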
||
|
|
davidhobbs
Senior Cruncher England Joined: Dec 30, 2004 Post Count: 152 Status: Offline Project Badges:
|
Thanks Sek,
I can confirm my virtual memory is set to system-determined size, and I believe I have all the latest drivers installed. I will perform a memory check as you suggest, but this device has been running the UD agent happily at 100% throttle. David. |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Throttle, yeah: are you running the 'Minimum Impact' or 'Maximum Output' profile? The log suggests you've not attached to any device profile, suggesting the default, suggesting you're running at 60%. That has been a source of problems for some.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
davidhobbs
Senior Cruncher England Joined: Dec 30, 2004 Post Count: 152 Status: Offline Project Badges:
|
I'm running a custom profile, where I have set all devices to run all the time at 100% CPU and 100% memory usage.
David. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
David,
I have an older device that was also getting a lot of invalids. I noticed that the only project on which the device was not getting invalids was FAAH. Once I changed that device's profile to crunch just FAAH, I have not had a single invalid. I think the HPF2 and DDDT projects do not like older systems. Look at your valids and invalids and see if a pattern similar to mine emerges. It may be as simple as just changing your profile. Dan |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Older systems may have *not up to date* drivers and libraries. It's a bit contradictory, as FA@H and DDDT use the identical underlying science engine (AutoDock 4.0 / 2007 Update). Their memory footprints are also close to each other.
Any pertinent correlation would be most helpful. E.g. did the work-unit flunking occur for a WU where the graphics were viewed, even briefly?
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
|