World Community Grid Forums
Category: Completed Research Forum: Help Cure Muscular Dystrophy - Phase 2 Forum Thread: Monster WU on the loose... |
Thread Status: Active Total posts in this thread: 98
|
Author |
|
Van Fanel
Cruncher Joined: Dec 27, 2006 Post Count: 42 Status: Offline Project Badges: |
One of my behemoths finished successfully:
CMD2_0001-1HCI_A.clustersOccur-1YDI_A.clustersOccur_6953_3-- 613 Valid 03/06/09 09:40:30 05/06/09 09:51:25 41.72 860.8 / 839.9
Unfortunately, I still have three more to go. Their current status is:
- 51.25 hours @ 25.15% with deadline 2:27:16 09-06-2009
- 26.5 hours @ 3.278% with deadline 3:55:16 10-06-2009
- 0.9 hours @ 1.229% with deadline 19:20:08 11-06-2009
I really don't mind wrestling these behemoths, and I WILL carry on crunching them, but I'll need the powers that be to give me the time these WUs require to complete their calculations. |
||
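The status figures in the post above can be turned into a rough projection of total run time. The following is only a sketch: it assumes progress is linear in CPU time, which these work units may well not follow, and `projected_hours` is a hypothetical helper name, not part of BOINC.

```python
def projected_hours(elapsed_hours: float, percent_done: float) -> tuple[float, float]:
    """Return (projected total hours, projected remaining hours).

    Assumes progress is linear in CPU time (an assumption, not a BOINC guarantee).
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total = elapsed_hours * 100.0 / percent_done
    return total, total - elapsed_hours


# (elapsed hours, percent complete) as reported in the post above
tasks = [(51.25, 25.15), (26.5, 3.278), (0.9, 1.229)]

for elapsed, pct in tasks:
    total, remaining = projected_hours(elapsed, pct)
    print(f"{elapsed:6.2f} h at {pct:6.3f}% -> about {total:6.1f} h total, "
          f"{remaining:6.1f} h to go")
```

Comparing the projected remaining hours with the time left until each deadline gives a quick indication of whether a task can finish in time under that linear assumption.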
|
Van Fanel
Cruncher Joined: Dec 27, 2006 Post Count: 42 Status: Offline Project Badges: |
One behemoth has busted:
CMD2_0001-1HCI_A.clustersOccur-2O72_A.clustersOccur_4584_5-- tenkai-linux Error 03/06/09 11:03:16 06/06/09 11:45:05 65.84 1,360.6 / 0.0
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Maximum CPU time exceeded
</message>
<stderr_txt>
INFO: Initializing Platform.
INFO: No state to restore. Start from the beginning.
INFO: Initializing Platform.
INFO: Initializing Platform.
INFO: Initializing Platform.
</stderr_txt>
]]>
Is it possible for me to manually adjust the maximum CPU time?
[Edit 1 times, last edit by Van Fanel at Jun 6, 2009 3:16:34 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I only have 2 more to finish. One is at 77 hrs, 44%, with a 59 hr ETA. The other is at 53 hrs, 25%, with a 78 hr ETA.
The last that finished only took 76hrs. It is listed as inconclusive, hope it comes out valid. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Well one of the buggers just got aborted for computational errors and the other is due in several hours less than the time remaining.
What a waste of time and electricity. |
||
|
mclaver
Veteran Cruncher Joined: Dec 19, 2005 Post Count: 566 Status: Offline Project Badges: |
I must have been luckier. The longest one I have processed so far took 41.26 hours on an I7 920 under Ubuntu 9.04. I just have to hope my wingman eventually finishes it, because the claimed credit is 565. :)
CMD2_0002-1R46_B.clustersOccur-2DAG_A.clustersOccur_22_1-- MSI-I7-920 Pending Validation 5/31/09 16:31:47 6/2/09 22:01:12 41.26 565.0 / 0.0 |
||
|
mclaver
Veteran Cruncher Joined: Dec 19, 2005 Post Count: 566 Status: Offline Project Badges: |
I was doing more research on long WUs and noticed a Pending Validation WU that I returned May 16th, that took me 4.56 hours with a claimed credit of 62.1.
Four other wingmen have attempted this WU. One aborted, one ended in error after 86.82 hours with a claimed credit of 1182, and another ended in error after 68.06 hours with a claimed credit of 1380. The other is still in progress. Mine was on an I7 920, which is a pretty fast machine, but I can't believe it is that much faster. If I completed it in 4.56 hours, why would someone else error out after 86.82 hours, and why has no one else been able to complete this WU yet? What happens if no one else can finish it? I have a couple of others in Pending Validation, over 20 days old, with similar symptoms: errors by wingmen, with significantly higher CPU time than mine.
Workunit Status
Project Name: Help Cure Muscular Dystrophy - Phase 2
Created: 5/15/09
Name: CMD2_0001-1HCI_A.clustersOccur-2O72_A.clustersOccur_949
Minimum Quorum: 2
Replication: 2
Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
CMD2_0001-1HCI_A.clustersOccur-2O72_A.clustersOccur_949_4-- | - | In Progress | 6/5/09 18:50:08 | 6/11/09 09:14:08 | 0.00 | 0.0 / 0.0
CMD2_0001-1HCI_A.clustersOccur-2O72_A.clustersOccur_949_3-- | 613 | Error | 6/2/09 09:46:39 | 6/5/09 18:43:01 | 68.06 | 1,380.1 / 0.0
CMD2_0001-1HCI_A.clustersOccur-2O72_A.clustersOccur_949_2-- | 613 | Error | 5/29/09 16:01:49 | 6/2/09 09:42:23 | 86.82 | 1,182.2 / 0.0
CMD2_0001-1HCI_A.clustersOccur-2O72_A.clustersOccur_949_1-- | 611 | Pending Validation | 5/15/09 14:17:09 | 5/16/09 07:54:22 | 4.56 | 62.1 / 0.0
CMD2_0001-1HCI_A.clustersOccur-2O72_A.clustersOccur_949_0-- | 611 | Aborted | 5/15/09 14:17:07 | 6/1/09 14:39:35 | 0.00 | 0.0 / 0.0 |
||
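One way to look at the "is my machine really that much faster?" question is to compare claimed credit per CPU hour for each returned result of workunit 949. The sketch below uses only the figures from the post above. It assumes that claimed credit on a given host grows roughly in proportion to CPU time, which is an assumption about how the client computes claims and is not confirmed anywhere in this thread; `credit_per_hour` is a hypothetical helper name.

```python
# (result suffix, app version, status, CPU hours, claimed credit),
# copied from the workunit status listing in the post above
results = [
    ("949_3", "613", "Error", 68.06, 1380.1),
    ("949_2", "613", "Error", 86.82, 1182.2),
    ("949_1", "611", "Pending Validation", 4.56, 62.1),
]


def credit_per_hour(cpu_hours: float, claimed_credit: float) -> float:
    """Claimed credit divided by CPU hours; a crude per-host speed indicator."""
    if cpu_hours <= 0:
        raise ValueError("cpu_hours must be positive")
    return claimed_credit / cpu_hours


for suffix, app, status, hours, claimed in results:
    rate = credit_per_hour(hours, claimed)
    print(f"{suffix} (app {app}, {status}): {hours:6.2f} h, "
          f"{claimed:7.1f} claimed, {rate:5.2f} credit/h")
```

With these numbers, the 4.56 hour result and the 86.82 hour result both come out at about 13.6 claimed credits per hour. Under the stated assumption, that would suggest the two hosts are of comparable speed, so the gap in CPU time is more likely down to how much work the task ended up doing than to the hardware.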
|
Van Fanel
Cruncher Joined: Dec 27, 2006 Post Count: 42 Status: Offline Project Badges: |
The second of my behemoths just busted out:
CMD2_0001-GPDAA.clustersOccur-IMB1A.clustersOccur_121_5-- tenkai-linux Error 04/06/09 12:31:16 07/06/09 07:07:04 60.91 1,266.9 / 0.0
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Maximum CPU time exceeded
</message>
<stderr_txt>
INFO: Initializing Platform.
INFO: No state to restore. Start from the beginning.
</stderr_txt>
]]>
And in relation to the post just above this one: yes, I've noticed too that what used to be a fast WU has given rise to these behemoths. The only difference that I can see, just by looking at all of my monster WUs, is that the 'good' WU ran on app. version 6.11, while the monster WU is running on 6.13. All of my monsters share this trend, and the same can be seen in the previous post.
My monsters are:
CMD2_0001-1HCI_A.clustersOccur-2DDF_A.clustersOccur_651_4--
CMD2_0002-IF4A2A.clustersOccur-SKP1A.clustersOccur_125_3--
CMD2_0001-1HCI_A.clustersOccur-2O72_A.clustersOccur_4584_5--
CMD2_0001-GPDAA.clustersOccur-IMB1A.clustersOccur_121_5--
My personal opinion: more testing/development is required for app. version 6.13 for Linux machines. |
||
|
mclaver
Veteran Cruncher Joined: Dec 19, 2005 Post Count: 566 Status: Offline Project Badges: |
Quoting Van Fanel:
"And in relation to the post just above this one: yes, I've noticed too that what used to be a fast WU has given rise to these behemoths. The only difference that I can see, just by looking at all of my monster WUs, is that the 'good' WU ran on app. version 6.11, while the monster WU is running on 6.13. All of my monsters share this trend, and the same can be seen in the previous post. My personal opinion: more testing/development is required for app. version 6.13 for Linux machines."
An interesting observation. Of all my Pending Validation results where my wingmen had multiple errors with long CPU times, they were on 613 and I was on 611. But the WU above, which is Pending and took 41.26 hours, my longest WU on this machine, completed under 613. All of these examples are on an I7 920 running Ubuntu 9.04 and BOINC version 6.6.20. I am not sure when I was upgraded to 613, but I have had multiple successes on this version.
Workunit Status
Project Name: Help Cure Muscular Dystrophy - Phase 2
Created: 5/28/09
Name: CMD2_0002-1R46_B.clustersOccur-2DAG_A.clustersOccur_22
Minimum Quorum: 2
Replication: 2
Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
CMD2_0002-1R46_B.clustersOccur-2DAG_A.clustersOccur_22_0-- | - | In Progress | 5/31/09 16:33:09 | 6/14/09 16:33:09 | 0.00 | 0.0 / 0.0
CMD2_0002-1R46_B.clustersOccur-2DAG_A.clustersOccur_22_1-- | 613 | Pending Validation | 5/31/09 16:31:47 | 6/2/09 22:01:12 | 41.26 | 565.0 / 0.0 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The one I have left seems to be progressing, but the ETA does not seem to be going down. The ETA is now 75:30 at 36% after running 74:13. Due 6-09 at about 8AM. Ain't no way.
CMD2_0001-1HCI_A.clustersOccur-1YDI_A.clustersOccur_6739_3-- is the bugger. There appears to be 1 other working it, but its due date was the 3rd and it is marked no reply. This is the only thing I have on the box. Obviously I don't need anything else (although I have 3 other cpus). I do hope it ends, one way or another, someday. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
OK, may as well use this as a learning experience.
You guys keep talking about WU app versions. How do you tell? My current WU, according to the /var/lib/boinc-client/stdoutdae.txt file, is:
CMD2_0001-1HCI_A.clustersOccur-1YDI_A.clustersOccur_6739_3
Could someone parse that? |
||
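As a rough answer to the parsing question above, the sketch below splits a result name into its visible parts. Only one thing is grounded in this thread: the workunit status listings show a workunit name ending in `_949` with results named `_949_0` to `_949_4`, so the trailing number identifies the copy of the workunit. The other field names (`prefix`, `batch`, `first`, `second`, `number`) are guesses at the structure, and `parse_result_name` is a hypothetical helper. The app version (611 or 613) does not appear in the name at all; in the listings it is a separate "App Version Number" column.

```python
import re

# Pattern inferred from the names posted in this thread (an assumption,
# not an official World Community Grid naming specification).
NAME_RE = re.compile(
    r"^(?P<prefix>[A-Za-z0-9]+)_(?P<batch>\d+)"
    r"-(?P<first>.+?)\.clustersOccur"
    r"-(?P<second>.+?)\.clustersOccur"
    r"_(?P<number>\d+)_(?P<copy>\d+)$"
)


def parse_result_name(name: str) -> dict:
    """Split a CMD2 result name into its visible fields."""
    # Drop the stray spaces and trailing dashes seen in pasted status listings.
    cleaned = name.replace(" ", "").rstrip("-")
    match = NAME_RE.match(cleaned)
    if match is None:
        raise ValueError(f"unrecognised result name: {name!r}")
    fields = match.groupdict()
    # The workunit name is the result name without the trailing copy index.
    fields["workunit"] = cleaned.rsplit("_", 1)[0]
    return fields


example = "CMD2_0001-1HCI_A.clustersOccur-1YDI_A.clustersOccur_6739_3"
for key, value in parse_result_name(example).items():
    print(f"{key:>8}: {value}")
```

The same helper also accepts names pasted from the results page with spaces and a trailing `--`, since it strips both before matching.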
|
|