World Community Grid Forums
Category: Completed Research Forum: Help Cure Muscular Dystrophy - Phase 2 Forum Thread: Monster WU on the loose... |
Thread Status: Active Total posts in this thread: 98
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Kremmen, if that can help you for your estimates, difficult positions on my Q6600 under Ubuntu 64 took between 4 and 5 hours each and the "worst" WU had 4 such difficult positions to pass.
I think I'll give it up. 11.9% at 140 hours and it's showing no signs of improvement. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Kremmen, if that can help you for your estimates, difficult positions on my Q6600 under Ubuntu 64 took between 4 and 5 hours each and the "worst" WU had 4 such difficult positions to pass. I think I'll give it up. 11.9% at 140 hours and it's showing no signs of improvement.
I am very new to this and very happy to see this thread, and particularly happy to see someone else running Ubuntu 64 with a Q6600. I am about to drop this whole project. Just about every one of the WUs seems to work this way. I have one that stalled at 0.441%. I have suspended it and everything else except 2, to see if that will help the other one from this project, which stalled at 44.313% an hour ago and will probably stay there forever. I aborted one that ran for 2 days, and the only thing I had to show for it was rather warm CPUs (it never passed 19%). |
||
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3715 Status: Offline Project Badges: |
slade52,
It seems that there have been more of those huge WUs distributed to Linux than to Windows machines, but if you have a Q6600 too you should let them go and they should finally complete. The longest one I have seen has needed 31.26 hours of runtime with its last tough position needing about 6 hours! The good news is that we should no longer receive such WUs. See details in this thread: The fixed 4 hour HCMD2 seed jobs are out the pipeline!. There will still be positions needing hours to compute, but once you have computed one in a WU the limiting feature should make this WU end. Cheers. Jean. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
slade52, It seems that there have been more of those huge WUs distributed to Linux than to Windows machines, but if you have a Q6600 too you should let them go and they should finally complete. The longest one I have seen has needed 31.26 hours of runtime with its last tough position needing about 6 hours! The good news is that we should no longer receive such WUs. See details in this thread: The fixed 4 hour HCMD2 seed jobs are out the pipeline!. There will still be positions needing hours to compute, but once you have computed one in a WU the limiting feature should make this WU end. Cheers. Jean.
I don't mind the long time, if they work, but I was a little put out by the one that stayed in the same spot for 48 hours. I was afraid that the problem was not that it took a long time to do the calculation but that the packages from that project were just bad. I realize that our purpose here is to assist in large computations. Seems like a good idea to me or I wouldn't be doing it. I do think that the packages need to be designed to work. That was my concern, and you have been reassuring, so I will let them run. I have seen that the 0006 numbered WUs work very well indeed. By the way, if I were sending these WUs out, Linux boxes would get the bigger ones. I have used Win products for a long time. I wouldn't trust them to hold up without crashing and corrupting the data. Thank you very much. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm not sure but...
I have the feeling that I get all the monster WUs that everybody else is aborting. I got around 10 to 15 of them, and at least one has already been running for almost 140 hours and is only at 6.61%. The thing still has 137 hours left and that number is increasing... The deadline is the 11th of June, so it still has time. I'll leave it running; hopefully it finishes on time and is validated. |
||
|
Van Fanel
Cruncher Joined: Dec 27, 2006 Post Count: 42 Status: Offline Project Badges: |
I'm not sure but... I have the feeling that I get all the monster WUs that everybody else is aborting.
I share your pain. I also have 15 or so of these leftover WUs. They are all landing on my Linux machine, while my Windows one is already crunching WUs from the 0006 series. The leftovers I have running now are from the 0001 and 0002 series with app version 6.13. The two most serious ones are at 69% after 25.5 hours and 11.9% after 21.8 hours. I don't mind crunching behemoths, and I'm not complaining either; I'm just signalling the existence of these beasts. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
It seems that there have been more of those huge WUs distributed to Linux than to Windows machines, but if you have a Q6600 too you should let them go and they should finally complete. The longest one I have seen has needed 31.26 hours of runtime with its last tough position needing about 6 hours!
I have one going right now that claimed it would take less than 2 hours and is now at 46.306% at 36:56 CPU time + 20:59 ETA. Another is at 12.407% at 26:53 CPU time + 26:21 ETA. I suspect that these are going over 32 hours. In the last 12 hours I have completed 1 of 4 WUs that were running. I still have one from this outfit that is not running yet. These really are silly. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Well, around 20 WUs have already errored out because of... maximum CPU time exceeded! 168 hours gone...
This is a lot of wasted runtime, because a WU errors out at about 150 hours on average, so 20 * 150 hours = 3000 hours = 125 days of work gone! The log of one:
<core_client_version>6.2.14</core_client_version>
<![CDATA[
<message>
Maximum CPU time exceeded
</message>
<stderr_txt>
INFO: Initializing Platform.
INFO: No state to restore. Start from the beginning.
INFO: Initializing Platform.
</stderr_txt>
]]> |
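For anyone who wants to redo this kind of estimate with their own numbers, here is a minimal sketch in Python. It only multiplies the two rough figures quoted in the post (about 20 errored WUs at about 150 hours each); the function name `lost_runtime` is made up for illustration and is not part of BOINC or the WCG client.

```python
# Rough estimate of runtime lost to WUs that hit the CPU-time limit.
# Both inputs are the approximate figures quoted in the post, not measured data.
ERRORED_WUS = 20        # "around 20" WUs errored out
AVG_HOURS_PER_WU = 150  # average runtime before each one errored out


def lost_runtime(wus, hours_each):
    """Return (total hours, total days) spent on WUs that errored out."""
    total_hours = wus * hours_each
    return total_hours, total_hours / 24


total_hours, total_days = lost_runtime(ERRORED_WUS, AVG_HOURS_PER_WU)
print(f"{total_hours} hours = {total_days:g} days")  # 3000 hours = 125 days
```

Note that this counts CPU time on a single core; on a quad core such as the Q6600 the wall-clock time lost would be shorter if the WUs ran in parallel.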
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Strictly personal opinion: I think the monsters have a Linux-specific cause and not much to do with truly monstrous positions. Remember, first there were the failures due to a different optimization setting in the compiler, used to get them to run as fast as on Windows. There was a comment by knreed yesterday that further investigation is ongoing. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hey, finished one last night in 39:59:20, and it is valid. Only 4 more of these buggers left to go. They are due on the 8th and 9th; not sure that 3 of them will make it.
|
||