World Community Grid Forums
Thread Status: Active Total posts in this thread: 22
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have noticed odd behavior on this project recently: it pauses without any warning, and 1 WU gave me a computation error. I don't think it is my PC; I'm a PC tech and I make sure it is clean, optimized and running smoothly. All other projects run fine.
I'm downloading the latest BOINC client to see if this solves it, and I'm suspending this project in the client until I have installed the new one. Any light on this would be appreciated.
---------------------------------------- [Edit 1 times, last edit by Former Member at Feb 17, 2010 2:52:57 PM] |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Where does it stop? In the BOINC Manager view or in the Task Manager?
If a science process runs and loses comms with the core client for longer than 30 seconds, it could crash, reset to the last checkpoint, or even go back to the beginning. If you visit My Grid > Result Status, click on the error link and post the content, we could maybe get an indication of which direction to look in. Security software is often the culprit, or an extremely busy computer. edit: inserted missing words
WCG
----------------------------------------Please help to make the Forums an enjoyable experience for All! [Edit 1 times, last edit by Sekerob at Jan 14, 2010 4:02:49 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Task Manager goes flat (only 0-4%, the normal background activity).
The Rice WU eventually continues and then pauses again. Rice WUs now take far longer to finish than the client estimates.

| Result Name | Device Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| R00503_dab94c04be28b5f2518a1ce21238d25d_02_000_13-- | saul | Error | 1/9/10 22:02:52 | 1/13/10 03:48:13 | 0.81 | 6.2 / 0.0 |
| R00507_c8cf69c9928d23cb4a8037f17aaf35b4_02_2-- | saul | Error | 1/6/10 22:03:28 | 1/9/10 01:23:37 | 0.69 | 4.8 / 0.0 |

My security software is not activated, since I'm mostly the only one who uses the PC and I know who uses it when I'm not there. Besides running WCG I only torrent, so I don't have a big load of processes, precisely to give WCG as much of my CPU as possible. |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
As per my request, please click on your error in the RS result list and post the log content. Then we could maybe take it forward.
Pausing and resuming all by itself... I'd like to see the client message log too from when that occurred (stored in the stdoutdae.txt file). BTW, 6.10.18 is really just a beta that got mislabeled [my private opinion] as good for public release. It's laden with bugs... so now I'm on 6.10.25 alpha, which from a stable science-running perspective is OK.
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"As per my request, please click on your error in the RS result list and post log content. Then we could maybe take it forward."

I already did, but you didn't acknowledge it, so here is the log info again:

| Result Name | Device Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| R00503_dab94c04be28b5f2518a1ce21238d25d_02_000_13-- | saul | Error | 1/9/10 22:02:52 | 1/13/10 03:48:13 | 0.81 | 6.2 / 0.0 |
| R00507_c8cf69c9928d23cb4a8037f17aaf35b4_02_2-- | saul | Error | 1/6/10 22:03:28 | 1/9/10 01:23:37 | 0.69 | 4.8 / 0.0 |

"Pausing and resuming all by itself... I'd like to see the client message log too from when that occurred (stored in stdoutdae.txt file)"

stdoutdae.7z (65 KB, 1.8 MB uncompressed).

"BTW 6.10.18 is really just a beta that got mislabeled [my private opinion] as good for public release. It's laden with bugs... so now I'm on 6.10.25 alpha, which from a science stable running perspective is ok."

Link?

I must let you know that I'm having problems with my PC. Some program is locking and using the HDDs, causing disk access errors in my torrent client and, in general, consuming all the resources of my PC, as a message box says when I try to open Task Manager. This only happens when I leave the PC unattended for several hours. I came back from work a few minutes ago and got a BSOD after that message. I don't have any software scheduled to run except the Avast! VRDB (Virus Recovery DataBase) indexing tool, but I installed Avast! several months ago and this would be the first time it has stalled my PC, so that makes no sense. My firewall blocks the important ports (0-1023) and only lets authorized apps access the Internet, so I don't think it is a hack. What has changed on my PC recently is that I swapped my monitor from a 19" CRT to a 16" LCD and added a 500GB Seagate SATA HDD. I doubt either of them has anything to do with it.
I also run Registry cleaning sweeps once in a while. Maybe this erased important BOINC/WCG entries, so I'll try to locate and install the 6.10.25 alpha client; maybe that will fix the problem. [Edit 2 times, last edit by Former Member at Jan 15, 2010 1:24:11 AM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"As per my request, please click on your error in the RS result list and post log content. Then we could maybe take it forward."

"I already did it, but you didn't acknowledge it, so here it is the log info again:"

| Result Name | Device Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| R00503_dab94c04be28b5f2518a1ce21238d25d_02_000_13-- | saul | **Error** | 1/9/10 22:02:52 | 1/13/10 03:48:13 | 0.81 | 6.2 / 0.0 |
| R00507_c8cf69c9928d23cb4a8037f17aaf35b4_02_2-- | saul | **Error** | 1/6/10 22:03:28 | 1/9/10 01:23:37 | 0.69 | 4.8 / 0.0 |

This is not what Sekerob means. Please click the links at the position of the two "Error"s I've bolded above, and when the popup window appears, paste the contents of that window here. |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
My mistake, should have highlighted the word in "Click on your Error" (hyperlink). That is the link in the status column on the Result Status page.
The log indicates multiple client exits and starts, and there appears to be a scheduled interruption of networking:

Suspending network activity - time of day
13-Jan-2010 00:09:40 [World Community Grid] Restarting task R00503_d895417bd83086b805ea99651a4eda04_03_002_2 using rice version 617
13-Jan-2010 00:09:40 [World Community Grid] Restarting task R00503_dab94c04be28b5f2518a1ce21238d25d_02_001_4 using rice version 617
13-Jan-2010 00:44:00 [---] Exit requested by user

What I don't understand is why that shows at multiple times of the day, as you can set it for only 1 time segment per day.

13-Jan-2010 01:59:02 [---] Suspending network activity - time of day
13-Jan-2010 02:08:40 [---] Suspending network activity - time of day
13-Jan-2010 11:31:17 [---] Suspending network activity - time of day
13-Jan-2010 11:40:57 [---] Suspending network activity - time of day
13-Jan-2010 11:58:56 [---] Suspending network activity - time of day
13-Jan-2010 13:13:33 [---] Suspending network activity - time of day
13-Jan-2010 13:32:59 [---] Suspending network activity - time of day
etc.

And most of them are followed by

13-Jan-2010 13:14:10 [---] Exit requested by user

then BOINC being restarted. Do you do that, or are some/all of them the result of BSODs?

That there is trouble with the PC is now obvious. No project likes to be started and stopped multiple times, and BOINC is programmed to kill jobs when they've done that 100 times in a certain time-frame. Possibly your torrent program is the source, so maybe you should consider a rebuild of the installation, OS included, as there could be an infection of some sort.

Certainly there is intermittent connecting/disconnecting between the science process and the core client, but the messages that go with that are different, so the result log is what may tell us a bit more, if it quit by itself. If you canceled it, some information is lost.

The problem is, I think, not with the client version, i.e. no need to try 6.10.25; just sticking to WCG's 6.2.28 is fine. edit: 3 for several clarifications.
WCG
----------------------------------------Please help to make the Forums an enjoyable experience for All! [Edit 3 times, last edit by Sekerob at Jan 15, 2010 8:13:16 AM] |
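For anyone who would rather tally these events than eyeball them, here is a minimal Python sketch. It assumes the message log lines look like the excerpts quoted in this thread (timestamp, bracketed source, message); the function name `count_events` and the counter keys are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Line shape assumed from the excerpts quoted in this thread, e.g.
#   13-Jan-2010 01:59:02 [---] Suspending network activity - time of day
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<source>[^\]]*)\] (?P<message>.*)$"
)

def count_events(lines):
    """Count suspend, exit and restart messages in message-log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed shape
        message = match.group("message")
        if message.startswith("Suspending network activity"):
            counts["network_suspended"] += 1
        elif message.startswith("Exit requested by user"):
            counts["exit_requested"] += 1
        elif message.startswith("Restarting task"):
            counts["task_restarted"] += 1
    return counts

# A few lines copied from the log excerpt in this thread.
sample = [
    "13-Jan-2010 00:09:40 [World Community Grid] Restarting task "
    "R00503_d895417bd83086b805ea99651a4eda04_03_002_2 using rice version 617",
    "13-Jan-2010 00:44:00 [---] Exit requested by user",
    "13-Jan-2010 01:59:02 [---] Suspending network activity - time of day",
    "13-Jan-2010 02:08:40 [---] Suspending network activity - time of day",
]
print(count_events(sample))
```

To run it on a real log, pass the open stdoutdae.txt file object to `count_events` instead of the sample list. A high exit count relative to the number of days covered would support the "started and stopped too often" reading.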
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Actually it was my mistake; I thought those were the logs. Here they are, both of them (thx for the directions):

Result Name: R00503_dab94c04be28b5f2518a1ce21238d25d_02_000_13--
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1263831292.000000
Skipping: /computation_deadline
wcg_seed 649860724
running time: 357.187500
wcg_seed 929333433
running time: 727.390625
wcg_seed 251287303
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C938BEB read attempt to address 0x382D200A
Engaging BOINC Windows Runtime Debugger...
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1263831292.000000
Skipping: /computation_deadline
wcg_seed 1050628979
running time: 1037.578175
wcg_seed 969633792
running time: 1265.375050
wcg_seed 391833656
running time: 1515.156300
wcg_seed 67704598
running time: 1753.406300
wcg_seed 400777005
running time: 1988.984425
wcg_seed 441802039
running time: 2223.437550
wcg_seed 601431410
running time: 2458.875050
wcg_seed 937707896
running time: 2690.765675
wcg_seed 698952487
</stderr_txt>

Result Name: R00507_c8cf69c9928d23cb4a8037f17aaf35b4_02_2--
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1263572128.000000
Skipping: /computation_deadline
wcg_seed 628333672
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C938BEB read attempt to address 0x000900F9
Engaging BOINC Windows Runtime Debugger...
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1263572128.000000
Skipping: /computation_deadline
wcg_seed 660220897
running time: 361.000000
wcg_seed 973791960
running time: 712.265625
wcg_seed 656715217
running time: 1065.218750
wcg_seed 494929177
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C920ED4 read attempt to address 0x38313429
Engaging BOINC Windows Runtime Debugger...
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1263572128.000000
Skipping: /computation_deadline
wcg_seed 582312961
running time: 1419.609125
wcg_seed 363120889
running time: 1762.187250
wcg_seed 300690482
running time: 2110.359125
wcg_seed 773206155
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C920ED4 read attempt to address 0x3935342A
Engaging BOINC Windows Runtime Debugger...
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1263572128.000000
Skipping: /computation_deadline
wcg_seed 944801083
running time: 2466.093250
wcg_seed 183893932
</stderr_txt>

Occasionally I start a second copy of the client by accident, but I usually notice almost immediately, exit both and start only 1 copy. Sometimes I 'snooze' the client and then realize I have to exit it; other times I just 'Resume' it (system tray icon). I have scheduled the WCG client's network activity so that I can stop my torrent and let the client communicate without competing for the connection. That is of little use, since the WCG client uploads results as soon as it finishes them, but I think it is the reason you see network activity being suspended. I have a 3-day queue. I have only had 1 BSOD so far, but my PC has restarted unexpectedly. So yes, my Windows may be damaged in some way I can't diagnose. Everything seems to work just fine, but of course I have infected and disinfected my PC many times because of my PC tech job (clients' flash drives). I think the best thing to do is, yes, a clean WinXP installation and a newer version of the client.
BTW, I recognized an error code in the logs: 0xc0000005, which is a memory access violation. That is sometimes caused by malware trying to access or infect the WCG Rice process in memory, some kind of injection, with the malware so deep in the Windows core system that Task Manager won't show it and security software won't detect it. So I bet it is malware: when the hardware and software are working OK, you have had infections before, and you have never had a problem with any other program, which is my case, it is a bet I would probably win if I knew more about digital forensics. I'll let you know if the problem persists. Thanks for the assist, cheers. |
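The stderr logs above can be summarized mechanically. The following is only a sketch, assuming the "Access Violation" and "running time" lines keep the wording shown in the logs posted here; `summarize_stderr` and the dictionary keys are hypothetical names, not anything BOINC provides.

```python
import re

# Patterns assumed from the stderr excerpts posted in this thread.
FAULT_RE = re.compile(
    r"Access Violation\s+\((0x[0-9A-Fa-f]+)\)\s+at address\s+(0x[0-9A-Fa-f]+)"
    r"\s+read attempt to address\s+(0x[0-9A-Fa-f]+)"
)
RUNTIME_RE = re.compile(r"running time:\s+(\d+\.\d+)")

def summarize_stderr(text):
    """Return (list of crash records, last reported running time in seconds)."""
    crashes = [
        {"code": code, "fault_address": fault, "read_address": read}
        for code, fault, read in FAULT_RE.findall(text)
    ]
    times = [float(value) for value in RUNTIME_RE.findall(text)]
    return crashes, (times[-1] if times else None)

# Abbreviated excerpt of the R00507 log from this thread.
sample = """
wcg_seed 628333672
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x7C938BEB read attempt to address 0x000900F9
Engaging BOINC Windows Runtime Debugger...
wcg_seed 660220897
running time: 361.000000
wcg_seed 973791960
running time: 712.265625
"""

crashes, last_time = summarize_stderr(sample)
for crash in crashes:
    print(crash["code"], "at", crash["fault_address"], "reading", crash["read_address"])
print("last running time:", last_time)
```

Listing the crashes side by side makes it easier to see whether the faulting address repeats across restarts, as it does in the two results posted here.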
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Your log reflects the BSOD / system issues in this line

- exit code -1073741819 (0xc0000005)

which is why it went south. There's an FAQ for this error, item 4AK in the FAQ index, which boils down to: restore general system stability.
WCG
Please help to make the Forums an enjoyable experience for All! |
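The two forms of the exit code in that line are the same 32-bit value: -1073741819 is 0xc0000005 read as a signed integer. A small Python sketch of the conversion, where `to_unsigned_hex` is just an illustrative helper name:

```python
def to_unsigned_hex(exit_code):
    """Reinterpret a signed 32-bit exit code as unsigned hexadecimal."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"

print(to_unsigned_hex(-1073741819))  # 0xC0000005
```

Masking with 0xFFFFFFFF keeps the low 32 bits, which is how a negative exit code maps back to the hexadecimal status code shown in brackets in the result log.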
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Which may be a damaged Windows, bad misuse of it (I don't think that is my case), or a malware infection of some kind that tries to infect everything it finds in memory. I think it is the last one, because I have seen it before on my clients' PCs at work and mine has all the symptoms.
|
||
|