World Community Grid Forums
Category: Completed Research Forum: Computing for Sustainable Water Forum Thread: Computing for Sustainable Water Problems Thread |
Thread Status: Active Total posts in this thread: 254
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Has anyone else encountered this problem? If you run BOINCTasks and check it once a day or so, the stalled work units are pretty apparent. It seems to happen more often on Intel processors than on AMD for me. On the Linux machines I just run
# /sbin/service boinc-client restart
On Windows machines I use File->Exit in the manager with the 'stop running' box checked in the confirmation dialog, then restart it (which also restarts the service/client on Windows). Suspend usually doesn't fix it for me. Maybe because I have 'suspend to memory' selected in Preferences?

This hanging was fixed in the last betas and in the 4x faster release, which was then ported to a 64-bit version for all supported platforms.
--//--
edit: Don't know why I keep mixing SN2S with CFSW... too many S's in there
[Edit 2 times, last edit by Former Member at Jun 22, 2012 6:23:12 PM]
||
|
joeperry39@gmail.com
Advanced Cruncher USA Joined: Nov 22, 2006 Post Count: 140 Status: Offline Project Badges: |
I'm now getting mostly CFSW* units, although an occasional unit from another project will sneak in. I generally abort those, unless it's another project I'm interested in.
I find that both my slower dual-core and my faster quad-core are completing a WU in about 1.25 to 1.5 hours. The only variance is when I'm using the dual-core to burn CDs or DVDs or convert video files. I'm very pleased with the way this project is running so far.
Living in the Chesapeake Bay watershed (Maryland), I'm very interested in this project. The Bay is a real American treasure that needs to be protected and restored to much better health. Hopefully this research will help with other bodies of water, not only in the US but worldwide.
*edit: corrected project name acronym
----------------------------------------
"Everything in moderation, including moderation" -- Mark Twain
[Edit 1 times, last edit by osugrad at Jun 23, 2012 12:03:53 PM]
||
|
astroWX
Advanced Cruncher USA Joined: Sep 1, 2007 Post Count: 56 Status: Offline Project Badges: |
Several tasks were irretrievably lost to hardware failure. They'll have to time out before the server reacts. Apologies to all waiting for me to uphold my end of the bargain. --> The only good news in that is my additional cache is 0.01, so it could have been worse.
(Meanwhile, what was a Q9550 is now an i5-3550 running stock, with a new 10,000 RPM HDD.) |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
This hanging was fixed per last betas and also 4x faster geared release, then further ported to a 64 bit version for all supported platforms

Can't prove it by what I'm seeing... can't get this one
WCG 6.11 CfSW cfsw_5840_05840773_0 C2Q6660 02:02:49 (00:16:54) 13.7 25.556 03:30:11 2012-07-02 22:52:49 Running
to move on even after restarting the service, so I guess I'm going to have to abort that work unit.
edit 1: (elapsed was over 9 hours when I restarted the service; it's been 2 hours since then, with CPU time still at 16:54.)
edit 2: If it helps, here were the Properties just before I aborted it
edit 3: https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=478202878 is the result status page for it, though I'm not certain that's a workable link for everyone. (?)
edit 4: changed image hosting location
[Edit 4 times, last edit by Former Member at Aug 11, 2012 1:57:41 PM]
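For anyone wanting to sanity-check a listing like the one above, here is a minimal Python sketch. It assumes the two times in the row are elapsed time and CPU time in HH:MM:SS, and that the 13.7 after them is the CPU percentage; that reading of the columns is a guess, not something confirmed in this thread. The function names are made up for illustration.

```python
def hms_to_seconds(text: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(elapsed: str, cpu: str) -> float:
    """Return CPU time as a fraction of elapsed wall-clock time."""
    elapsed_s = hms_to_seconds(elapsed)
    if elapsed_s == 0:
        return 0.0
    return hms_to_seconds(cpu) / elapsed_s


def looks_stalled(cpu_before: str, cpu_after: str) -> bool:
    """Hypothetical check: CPU time did not advance between two readings."""
    return hms_to_seconds(cpu_after) <= hms_to_seconds(cpu_before)


# Values taken from the row quoted in the post.
ratio = cpu_efficiency("02:02:49", "00:16:54")
print(f"CPU efficiency: {ratio:.1%}")

# CPU time was still 16:54 two hours after the restart.
print("stalled:", looks_stalled("00:16:54", "00:16:54"))
```

The ratio works out to just under 14%, close to the 13.7 in the listing. A healthy task would normally sit much nearer 100%, so a low ratio plus a CPU time that no longer moves is a reasonable sign the unit is hung.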
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
edit: Don't know why I keep mixing SN2S with CFSW... too many S's in there

<comment>Sorry about that, Sek. My bad. That "serverStatus" is the culprit. The term has an "s" at the beginning and at the end... and even an "S" in the middle... SN2S with CFSW = serverStatus</comment>
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Can't prove it by what I'm seeing... can't get this one
WCG 6.11 CfSW cfsw_5840_05840773_0 C2Q6660 02:02:49 (00:16:54) 13.7 25.556 03:30:11 2012-07-02 22:52:49 Running
to move on even after restarting the service, so I guess I'm going to have to abort that work unit.
edit 1: (elapsed was over 9 hours when I restarted the service; it's been 2 hours since then, with CPU time still at 16:54.)
edit 2: If it helps, here were the Properties just before I aborted it http://my.core.com/~zoso/BOINCTasks/Properties-cfsw_5840_05840773_0.png
edit 3: https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=478202878 is the result status page for it, though I'm not certain that's a workable link for everyone. (?)

Well, that's a 6.11... the new-old application, sped up 4x in 32-bit. (The newest, even faster, 64-bit one is 6.12.) If it's not making progress after a client restart, aborting is the better action, so another cruncher can take a shot [don't know if this copy had a wingman, to see if it finished proper].
edit: Links to the result detail page only give us the header, not the quorum, unfortunately [what's the secret, je ne sais pas]. The link will also break about 24 hours after validation, for anyone but the techs.
--//--
[Edit 1 times, last edit by Former Member at Jun 24, 2012 6:47:56 AM]
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
[don't know if this copy had a wingman, to see if it finished proper].

Result Name: cfsw_5840_05840773_1--
<core_client_version>6.12.33</core_client_version>

It was a single quorum, but the replacement/wingman finished it in 3249 seconds, with a much newer client. I'm using the 6.10.45 client/manager available from EPEL, if I correctly recall which repo it's in (I'm sitting at a Windows machine at the moment).
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline Project Badges: |
12 errors after driver update restarts.
Result Log
Result Name: cfsw_6095_06095959_0--
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
[15:04:00] INFO:Beginning simulation: 1990:240:1303278345
[15:06:46] INFO: Finished tick number 4
[15:08:06] INFO: Finished tick number 9
[15:09:16] INFO: Finished tick number 14
[15:10:36] INFO: Finished tick number 19
[15:11:50] INFO: Finished tick number 24
[15:13:14] INFO: Finished tick number 29
[15:19:15] DEBUG: Restarting from checkpoint.
[15:19:15]PctComplete = 0.0666667
[15:19:15]ticks:currentTick:modules:currentModule:restart:seed
240:16:6:0:0:12290
</stderr_txt>
]]>
First time I've seen this. XP Pro SP3
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
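The stderr output in the post above can be checked with a few lines of Python. This is only a sketch: the regular expression is written against the "Finished tick number" lines exactly as they appear in this one log, and reading the final `ticks:currentTick:...` line as labels followed by the values `240:16:...` is an assumption, not documented behaviour of the CFSW application.

```python
import re
from datetime import datetime

# Log lines copied from the post; other logs may be formatted differently.
LOG = """\
[15:04:00] INFO:Beginning simulation: 1990:240:1303278345
[15:06:46] INFO: Finished tick number 4
[15:08:06] INFO: Finished tick number 9
[15:09:16] INFO: Finished tick number 14
[15:10:36] INFO: Finished tick number 19
[15:11:50] INFO: Finished tick number 24
[15:13:14] INFO: Finished tick number 29
[15:19:15] DEBUG: Restarting from checkpoint.
"""

TICK_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] INFO: Finished tick number (\d+)$")


def parse_ticks(log: str):
    """Return (timestamp, tick number) pairs for every 'Finished tick' line."""
    ticks = []
    for line in log.splitlines():
        match = TICK_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            ticks.append((stamp, int(match.group(2))))
    return ticks


def seconds_per_tick(ticks) -> float:
    """Average seconds per tick between the first and last reported tick.

    Assumes the log does not cross midnight.
    """
    (first_time, first_tick), (last_time, last_tick) = ticks[0], ticks[-1]
    return (last_time - first_time).total_seconds() / (last_tick - first_tick)


ticks = parse_ticks(LOG)
rate = seconds_per_tick(ticks)
print(f"{len(ticks)} tick reports, about {rate:.1f} s per tick")

# Assumed reading of the last log line: ticks=240, currentTick=16.
total_ticks, current_tick = 240, 16
print(f"currentTick / ticks = {current_tick / total_ticks:.7f}")
```

Under that reading, 16 out of 240 ticks matches the logged PctComplete of 0.0666667, which suggests the task resumed from a checkpoint at tick 16 before it hit the path error.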
|
||
|
Thargor
Veteran Cruncher UK Joined: Feb 3, 2012 Post Count: 1291 Status: Offline Project Badges: |
I've noticed that a few CFSW units recently have got to 100% and then just sat there for ages (this latest one hit 100% over an hour ago), taking up a slot but not uploading.
Had one yesterday that did it; it eventually uploaded and was credited, but I'm not sure how long it took to actually get to the "Uploading" stage, as I'd gone to sleep...
Current unit: cfsw_6713_06713933_1 (Computing for Sustainable Water 6.12)
Edit: it finally went to Uploading, about half an hour after a full restart of the client (don't know whether it was a coincidence or whether that finally prompted it to finish).
[Edit 2 times, last edit by Thargor at Jul 3, 2012 11:01:24 PM]
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I've got 2 nearly identical laptops. About the only differences are these: one has a Core Duo T2500 (2.0GHz) running XP Pro 32-bit with 2GB of DDR2.
The other is a Core2 Duo T7200 (2.0GHz), which has 2x the L2 cache of the T2500, running Win7 HP x64 with 4GB of DDR2.
So, it's essentially 6.11 versus 6.12.
The 32-bit app is crunching them about 16 minutes slower, but claiming/getting ~10x as many points. Can anyone explain the points discrepancy? Thanks!
edit: changed hosting location
[Edit 1 times, last edit by Former Member at Aug 11, 2012 1:46:52 PM]
||
|
|