World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: Clean Energy Project - Phase 2 Beta May 23, 2016 [ Issues Thread ] |
Thread Status: Active Total posts in this thread: 70
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I am over 14 hours in on a 3.4 GHz machine and the first checkpoint is yet to be reached.
I'll raise you to BETA_ E236438_ 886_ S.400.C56F1H25N2S1.MVPFUQOYKWSINW-UHFFFAOYSA-N.19_ s1_ 14a_ 0-- Doh. [Edit 2 times, last edit by Former Member at May 25, 2016 11:01:57 AM] |
||
|
hiimebm
Senior Cruncher United States Joined: Oct 19, 2014 Post Count: 305 Status: Offline Project Badges: |
I was forced to abort one because it refused to checkpoint. I turned on my computer this morning, and it was back at 0 again! As soon as I aborted it, the manager said Computation Error, but the website says User Aborted, which is what I actually did. Weird...
---------------------------------------- |
||
|
hiimebm
Senior Cruncher United States Joined: Oct 19, 2014 Post Count: 305 Status: Offline Project Badges: |
I would have gone over on some other tasks, based on how slow it was going. Next time I'll take the advice not to bail on CEP2 by shutting down.
---------------------------------------- |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Maybe the time has come [again], at this striding phase of project progress, which is zero, to 1) review the cut-off time so that 24/7 machines reach at least one checkpoint, and 2) recommend the project for 24/7 devices only, to stop the bad feeling, if not animosity, over resetting to zero on device resume.
2 Eurocents |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'll raise you to BETA_ E236438_ 886_ S.400.C56F1H25N2S1.MVPFUQOYKWSINW-UHFFFAOYSA-N.19_ s1_ 14a_ 0-- Doh.
Interesting - it went PVer at 18h; I was expecting Error. Maybe if/when a repair job succeeds?
BETA_ E236438_ 886_ S.400.C56F1H25N2S1.MVPFUQOYKWSINW-UHFFFAOYSA-N.19_ s1_ 14a_ 1-- Microsoft Windows 8.1 Professional x64 Edition, (06.03.9600.00) - In Progress 25/05/16 11:00:35 29/05/16 11:00:35 0.00 0.0 / 0.0
BETA_ E236438_ 886_ S.400.C56F1H25N2S1.MVPFUQOYKWSINW-UHFFFAOYSA-N.19_ s1_ 14a_ 0-- Microsoft Windows 10 Core x64 Edition, (10.00.10586.00) 700 Pending Verification 24/05/16 10:20:16 25/05/16 11:00:30 18.00 213.0 / 0.0 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
We still have an RC = 0x1 exit in Job #0 not validating with an RC = 0x1 exit in Job #3; instead both go to PVer and a repair unit gets issued. The question is whether one of the original pair still ends up Invalid ...
Ahh, the luck of the wingman draw again; mine went Invalid, despite being the more productive RC = 0x1 exit in Job #3.
BETA_ E236439_ 314_ S.422.C44H18N4O2S6.PLTGJJHXMUKIKO-UHFFFAOYSA-N.12_ s1_ 14a_ 2-- Microsoft Windows 8.1 Professional x64 Edition, (06.03.9600.00) 700 Valid 25/05/16 03:55:21 25/05/16 10:34:01 1.34 37.5 / 50.4
BETA_ E236439_ 314_ S.422.C44H18N4O2S6.PLTGJJHXMUKIKO-UHFFFAOYSA-N.12_ s1_ 14a_ 1-- Microsoft Windows 10 Core x64 Edition, (10.00.10586.00) 700 Invalid 24/05/16 10:40:21 25/05/16 03:55:13 8.06 293.7 / 201.6
BETA_ E236439_ 314_ S.422.C44H18N4O2S6.PLTGJJHXMUKIKO-UHFFFAOYSA-N.12_ s1_ 14a_ 0-- Microsoft x64 Edition, (10.00.10586.00) 700 Valid 24/05/16 10:40:01 24/05/16 12:44:09 2.02 63.3 / 50.4
Maybe the validator needs an additional rule to always set the wingman canonical candidate to the one with the highest number of jobs completed. ...
I agree, Rob. |
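The rule suggested in the post above (prefer the result that completed the most jobs as the canonical candidate) could be sketched roughly as follows. This is only a hypothetical illustration: the `Result` type, its fields, and `pick_canonical` are assumptions made for the example and are not part of the actual WCG or BOINC validator. It also assumes that an exit in Job #3 means three jobs were completed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    """Hypothetical record for one returned copy of a work unit."""
    name: str            # illustrative label, not a real result name
    exit_code: int       # e.g. 0x1 for the early exits discussed above
    jobs_completed: int  # assumed: number of jobs finished before the exit


def pick_canonical(results: List[Result]) -> Optional[Result]:
    """Return the result with the most completed jobs, or None if empty.

    On a tie, the earliest result in the list is kept.
    """
    if not results:
        return None
    return max(results, key=lambda r: r.jobs_completed)


# Two copies that both exited with RC = 0x1, one in Job #0 and one in Job #3.
early = Result(name="copy_0", exit_code=0x1, jobs_completed=0)
late = Result(name="copy_1", exit_code=0x1, jobs_completed=3)
chosen = pick_canonical([early, late])
```

Under these assumptions the copy that got as far as Job #3 would be chosen as the canonical candidate, rather than leaving the outcome to the wingman draw.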
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
These are definitely not small. One of my machines got one because the previous machine was hit with error status for hitting 18 hours. Then mine was hit with error status at 18 hours. Now another job has gone out. (The first machine scraped it in in 15.12 hours.)
----------------------------------------*edited to appropriate forum content - ErikaT [Edit 1 times, last edit by ErikaT at May 26, 2016 1:19:09 PM] |
||
|
retsof
Former Community Advisor USA Joined: Jul 31, 2005 Post Count: 6824 Status: Offline Project Badges: |
Betas crunching fine and due to finish in about 18 hours EXCEPT I have just had a power cut (only a few minutes but enough to shut down my UPS). When back up I find that all of my Betas have reset to zero, losing me 64 hours (8 hours by 8 cores) of crunch time. Why oh why does this Beta not do a CPU checkpoint? Please please ensure that future Betas include a checkpoint so that folks like me (subject to random power cuts) can preserve most of the work already completed.
Checkpointing still seems to be an issue. I thought one of the tests for this new beta was to remedy that problem but I'm still seeing tasks run past 11 hours before the first checkpoint.
Remedy that problem? I notice that the research says that CEP2 is 99% complete and ends May 2016. Why run any betas at all?
I got one, 45 hour estimate and one hour into it. Yes, I asked for a few Zikas and it sent them into panic mode. Other things in the queue are waiting as usual.
SUPPORT ADVISOR
----------------------------------------Work+GPU i7 8700 12threads School i7 4770 8threads Default+GPU Ryzen 7 3700X 16threads Ryzen 7 3800X 16 threads Ryzen 9 3900X 24threads Home i7 3540M 4threads50% [Edit 1 times, last edit by retsof at May 25, 2016 1:42:24 PM] |
||
|
Seoulpowergrid
Veteran Cruncher Joined: Apr 12, 2013 Post Count: 815 Status: Offline Project Badges: |
Remedy that problem? I notice that the research says that CEP2 is 99% complete and ends May 2016. Why run any betas at all?
It was somewhere in the 60s or 70s and then went back to 37% when the project officially was paused. As they are running new Betas it means they still have more they want to run. The 99% thing is probably because they wiped the "bad" unrun WUs from their servers. With those gone it looks like the project is completed despite having more they will run. |
||
|
pvh513
Senior Cruncher Joined: Feb 26, 2011 Post Count: 260 Status: Offline Project Badges: |
I am now at 14 WUs from this beta run. 6 are still running. 4 are valid, 1 is PVal, 2 are PVer and the PVer WU I reported earlier in this thread has gone invalid. One of the new PVer WUs looks like a carbon copy of the one that went invalid. Likely this one will also go invalid. This is BETA_ E236437_ 872_ S.372.C52H28S2.AZCCIDDKWPEAKB-UHFFFAOYSA-N.8_ s1_ 14a. The other PVer WU crashed into the 18h CPU limit for my wingman; mine looks OK as far as I can tell. This is BETA_ E236437_ 322_ S.372.C52H28S2.WEULLIFNMVXNAH-UHFFFAOYSA-N.3_ s1_ 14a. My rigs normally are extremely stable, so I strongly doubt that the invalids are due to hardware problems. From where I am standing it looks like CEP2 still hasn't solved their problems...
|
||
|
|