| World Community Grid Forums
Thread Status: Active Total posts in this thread: 35
D_S_Spence
Advanced Cruncher Canada Joined: Jan 5, 2017 Post Count: 118 Status: Offline
|
I currently have the following work unit: https://www.worldcommunitygrid.org/contribution/workunit/706401640
BETA_BETA30_9800030_0656_0 Microsoft Windows 11 Error 2025-04-26 00:33:37 UTC 2025-04-26 20:06:29 UTC
The errors were all "- Unhandled Exception Record - Reason: Out Of Memory (C++ Exception)..."
Someone's machine got through it, and currently two of us are working on it. I will leave it because I'm curious. Unfortunately, the machine of mine that currently has this work unit is scheduled to work only 8 hours per day. :/ |
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7844 Status: Offline
|
Just had another one finish: BETA_BETA30_9800029_0410. It has been the longest one yet at 74 hours, and it is valid.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Recently Active
|
Just saw this one, it had been running for five days and 5 hours and 'twas still at 0.500%:
Deadline---------------- CPUtime LastChkpnt Remaining Status WorkingSet AVN Name----------------
So the projected remaining runtime was more than one thousand days. Managed to abort it and be done with it (Replication: 0). As it stands, another poor soul is still crunching away on it:
Adri |
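As a rough check on the "more than one thousand days" figure above, here is a minimal sketch of a plain linear extrapolation from elapsed time and percent done. The function name is hypothetical, and this is an assumption about how such a projection could be made, not necessarily how the BOINC client computes its own estimate.

```python
def projected_remaining_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation: assumes progress continues at the average rate so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    projected_total = elapsed_hours / fraction_done
    return projected_total - elapsed_hours


# Figures from the post: five days and 5 hours elapsed, 0.500% done.
elapsed = 5 * 24 + 5          # 125 hours
remaining_days = projected_remaining_hours(elapsed, 0.005) / 24
print(f"{remaining_days:.0f} days")  # about 1036 days
```

For a task that is truly stuck at 0.5%, the projection keeps growing with every hour of elapsed time, so the number is only a lower bound on how long it would take.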
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7844 Status: Offline
|
adriverhoef wrote: Just saw this one, it had been running for five days and 5 hours and 'twas still at 0.500%:
It appears that those units which get to 0.5% and do not progress past that point are doomed to failure. I had some in the 098000010 series and had to abort them because they stuck at that point. Hopefully the powers that be have a way of noticing the pattern in the various batches and can have the system abort them. What I wonder a bit about is why work units with this kind of problem got through alpha testing. Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
|
Boca Raton Community HS
Senior Cruncher Joined: Aug 27, 2021 Post Count: 209 Status: Offline
|
adriverhoef wrote: Just saw this one, it had been running for five days and 5 hours and 'twas still at 0.500%:
Sgt.Joe wrote: It appears that those units which get to 0.5% and do not progress past that point are doomed to failure. I had some in the 098000010 series and I had to abort those because they stuck at that point. Hopefully the powers that be have a way of noticing the pattern in the various batches and have the system abort them. What I wonder a bit about is why did work units with this kind of a problem get through alpha testing ? Cheers
I share the same thought: some of them didn't work at all. Even some of the higher generation numbers (second wave) were INCREDIBLY slow versus the 098000008 series (maybe fewer calculations were being done in the 098000008 generation?). The 098000010 work units are... terrible. We have about 15 that are still in progress, with a prediction of about 10 days left. |
||
|
|
PowerFactor
Ace Cruncher Joined: Dec 9, 2016 Post Count: 4033 Status: Offline
|
Boca Raton Community HS wrote: I share the same thought- some of them didn't work at all. Even some of the higher generation numbers (second wave) were INCREDIBLY slow versus the 098000008 series (maybe less calculations were being done in the 098000008 generation?). The 098000010 work units are.... terrible. We have about 15 that are still in progress with a prediction of like 10 days left.
No kidding! I got a valid result in the 9800030 series (BETA_BETA30_9800030_0434). It took my Ubuntu Linux computer 31.14 hours to complete, and my wingman's computer took 77.86 hours to complete! |
||
|
|
D_S_Spence
Advanced Cruncher Canada Joined: Jan 5, 2017 Post Count: 118 Status: Offline
|
My WU BETA_BETA30_9800030_0656_7 (see above) has passed its deadline and is currently listed as a "No Reply", but it continues to run.
I have given that machine a few more than 8 hours for the past two days. Currently the elapsed time is 1d03:47 and it is almost 64% done. The machine will be working only 8h per day over the weekend (23h-7h). I hope the job does not get aborted. The progress does seem to go down a bit when the processing is restarted after a stop. It was stopped at 26% yesterday, and I changed the scheduling on the BOINC manager and after the task started it went down to 23%. That's if I remember correctly. |
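To put the figures in this post into scheduling terms, here is a minimal sketch that converts the elapsed time and percent done into the number of 8-hour sessions still needed. It assumes that progress is linear, that the elapsed time counts only time the task was actually running, and that "almost 64%" can be treated as 0.64. Both helper names are hypothetical. Because a restart resumes from the last checkpoint and can lose some progress, the result is an optimistic estimate.

```python
import math


def parse_elapsed(text: str) -> float:
    """Convert an elapsed-time string of the form 'NdHH:MM' (e.g. '1d03:47') to hours."""
    days, _, clock = text.partition("d")
    hours, minutes = clock.split(":")
    return int(days) * 24 + int(hours) + int(minutes) / 60


def sessions_left(elapsed_hours: float, fraction_done: float, hours_per_session: float) -> int:
    """Whole sessions still needed, assuming the average rate so far continues."""
    remaining_hours = elapsed_hours / fraction_done - elapsed_hours
    return math.ceil(remaining_hours / hours_per_session)


elapsed = parse_elapsed("1d03:47")       # roughly 27.8 hours of run time
print(sessions_left(elapsed, 0.64, 8))   # 2 sessions of 8 hours each
```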
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Offline
|
D_S_Spence wrote: My WU BETA_BETA30_9800030_0656_7 (see above) has passed its deadline and is currently listed as a "No Reply", but it continues to run.
Most WUs, when interrupted, will restart at the last successful checkpoint. The percentage between checkpoints varies, at least between projects; for example, with ARP1 it is something like 12.5% (7.5%?). I just had one of those restart due to a local power failure and lost several hours, as it was just minutes from the next checkpoint.
D_S_Spence wrote: I have given that machine a few more than 8 hours for the past two days. Currently the elapsed time is 1d03:47 and it is almost 64% done. The machine will be working only 8h per day over the weekend (23h-7h). I hope the job does not get aborted. The progress does seem to go down a bit when the processing is restarted after a stop. It was stopped at 26% yesterday, and I changed the scheduling on the BOINC manager and after the task started it went down to 23%. That's if I remember correctly.
Such is life... |
||
|
|
Paul Schlaffer
Senior Cruncher USA Joined: Jun 12, 2005 Post Count: 278 Status: Offline
|
D_S_Spence wrote: My WU BETA_BETA30_9800030_0656_7 (see above) has passed its deadline and is currently listed as a "No Reply", but it continues to run.
TPCBF wrote: Most WUs, when interrupted, will restart at the last successful checkpoint. The percentage between checkpoints varies between projects at least, for example with ARP1 it is something like 12.5% (7.5%?). I just had one of those restart due to local power failure and lost pretty much several hours as it was just minutes from the next checkpoint.
D_S_Spence wrote: I have given that machine a few more than 8 hours for the past two days. Currently the elapsed time is 1d03:47 and it is almost 64% done. The machine will be working only 8h per day over the weekend (23h-7h). I hope the job does not get aborted. The progress does seem to go down a bit when the processing is restarted after a stop. It was stopped at 26% yesterday, and I changed the scheduling on the BOINC manager and after the task started it went down to 23%. That's if I remember correctly.
TPCBF wrote: Such is life...
Mine were likewise still running well after the deadline, and initiating an update didn't change that. I had to abort those WU.
“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
----------------------------------------[Edit 1 times, last edit by Paul Schlaffer at May 2, 2025 5:04:14 PM] |
||
|
|
D_S_Spence
Advanced Cruncher Canada Joined: Jan 5, 2017 Post Count: 118 Status: Offline
|
Paul Schlaffer wrote: Mine were likewise still running well after the deadline, and initiating an update didn't change that. I had to abort those WU.
So now my question is: is there any utility in letting my "No Reply" WU complete, or should I abort it? Will the completed WU provide any useful information for the beta test? |
||
|
|
|