World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: Clean Energy Project - Phase 2 Beta Oct 6, 2016 [ Issues Thread ] |
Thread Status: Active Total posts in this thread: 175
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
Foxfire,
From what you are showing here, these workunits did not attempt to return anything. Did they run on your machine? The only ones I've granted credit for were ones that came back as Too Late. No Reply is a different status, which I think turns into Too Late when you report the results. I disabled the assimilator earlier today, which will keep results in the database longer. If you can, please provide me with the output logs showing that you ran these results. I may have to rethink and grant more credit if a result was run on your machine but reported as No Reply.
Thanks
-Uplinger |
||
|
BKraayev
Cruncher Joined: Mar 23, 2005 Post Count: 45 Status: Offline Project Badges: |
Only got one BETA - checkpointed after 44 mins, but no further checkpoint after
----------------------------------------[Edit 1 times, last edit by BKraayev at Oct 12, 2016 2:10:46 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
This makes no sense to me... Valid already - now two of us have received the same WU already validated to process again...
@TonyEllis, I can imagine the scenario that causes that situation. _0 and _1 completed processing but couldn't upload by the 4d deadline (becoming No Reply at that stage), so _2 and _3 were sent out. Shortly afterwards (15m to 1h24m), _0 and _1 were able to upload, report and became Valid. _2 and _3 are either processing and will be allowed to complete, or will be Server Aborted if they haven't started processing and a client communication occurs.
All that's the normal WCG/BOINC protocol, just adversely affected by the upload problem. |
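For anyone who wants to follow that sequence step by step, here is a rough sketch of it in Python. It is only a toy model of the scenario described above, not the actual WCG/BOINC server code: the `Result` class, the three `on_...` functions and `simulate()` are hypothetical names made up for illustration, and the only rules it encodes are the ones stated in the post (4d deadline, No Reply triggers a resend, late uploads can still become Valid, unstarted resends get Server Aborted on client contact).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

DEADLINE = timedelta(days=4)  # the 4d deadline mentioned in the post


@dataclass
class Result:
    name: str
    sent: datetime
    started: bool = False
    status: str = "In Progress"


def on_deadline_passed(result, now):
    """Mark an unreturned result No Reply; True means a resend is needed."""
    if result.status == "In Progress" and now > result.sent + DEADLINE:
        result.status = "No Reply"
        return True
    return False


def on_late_report(result):
    """Assumption: a late upload that validates goes from No Reply to Valid."""
    if result.status == "No Reply":
        result.status = "Valid"


def on_client_contact(result, quorum_met):
    """Unstarted copies are Server Aborted once the quorum is already met."""
    if quorum_met and result.status == "In Progress" and not result.started:
        result.status = "Server Aborted"


def simulate():
    t0 = datetime(2016, 10, 7, 20, 0)  # illustrative send time
    originals = [Result("_0", t0), Result("_1", t0)]
    resends = []

    # Just past the deadline: both originals are No Reply, so _2 and _3 go out.
    now = t0 + DEADLINE + timedelta(minutes=1)
    for r in originals:
        if on_deadline_passed(r, now):
            resends.append(Result(f"_{len(originals) + len(resends)}", now))

    # _2 starts crunching straight away; _3 is still waiting in the queue.
    resends[0].started = True

    # The upload problem clears and the originals report late.
    for r in originals:
        on_late_report(r)

    # Minimum quorum of 2 is met by the two late results.
    quorum_met = sum(r.status == "Valid" for r in originals) >= 2
    for r in resends:
        on_client_contact(r, quorum_met)

    return {r.name: r.status for r in originals + resends}


print(simulate())
```

In this model the started resend (_2) is left to finish while the unstarted one (_3) is aborted, which matches the two outcomes described for _2 and _3.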
||
|
No.15
Advanced Cruncher Joined: Dec 25, 2015 Post Count: 50 Status: Offline Project Badges: |
While I share everyone's frustration with the issues we had this is a beta. I mean that is the purpose of a beta right? To find issues with the processes?
Thanks Uplinger and everyone else who got this corrected. |
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1932 Status: Offline Project Badges: |
While I share everyone's frustration with the issues we had this is a beta. I mean that is the purpose of a beta right? To find issues with the processes?
Well, yes. And no. The main issue in this case was bad planning on the side of the research team, not issues with the WUs that were sent out. And I don't think that the purpose of a batch of WUs should be to test the project management...
Ralf |
||
|
No.15
Advanced Cruncher Joined: Dec 25, 2015 Post Count: 50 Status: Offline Project Badges: |
While I share everyone's frustration with the issues we had this is a beta. I mean that is the purpose of a beta right? To find issues with the processes?
Well, yes. And no. The main issue in this case was bad planning on the side of the research team, not issues with the WUs that were sent out. And I don't think that the purpose of a batch of WUs should be to test the project management... Ralf
I see your point. Still, I think betas are not only about the work but also the process. Hopefully lessons are learned about capacity planning and resource management, as well as whether the programming works. JMO.
BTW, just to be clear, I was just as upset about the issue and the length of time it took to correct. I think that means I am arguing both sides LOL |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Looking at my beta unit results, I'm pleased to see that they all ended up Valid apart from a couple that completed but were marked Too Late. There were also a few that were server aborted (I assume - they were removed without any processing), because I couldn't start them in time. An oddity with those server aborted ones is that they're actually marked Error in the Results Status; however, if I filter on Error, they don't show; if I filter on Aborted, they do show.
There are still the unfortunate issues of hitting the 18h limit and the lack of checkpoints, but otherwise this first batch processed OK for me. |
||
|
TonyEllis
Senior Cruncher Australia Joined: Jul 9, 2008 Post Count: 259 Status: Recently Active Project Badges: |
Thanks tonyh205 - that makes sense - I like to understand the 'why' if I can...
Mine completed and valid (second down) - the other (top) detached...

Workunit Status
Project Name: Beta - The Clean Energy Project - Phase 2
Created: 10/07/2016 20:20:58
Name: BETA_E299901_645_S.326.C30H20N2O6S4.HPLWNMBGPVVOAQ-UHFFFAOYNA-N.1_s1_14
Minimum Quorum: 2
Replication: 2

Result Name | OS type | OS version | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit
BETA_E299901_645_S.326.C30H20N2O6S4.HPLWNMBGPVVOAQ-UHFFFAOYNA-N.1_s1_14_3-- | Linux | 3.10.0-327.13.1.el7.x86_64 | - | Detached | 10/11/16 20:23:32 | 10/12/16 13:18:29 | 0.00 | 0.0 / 0.0
BETA_E299901_645_S.326.C30H20N2O6S4.HPLWNMBGPVVOAQ-UHFFFAOYNA-N.1_s1_14_2-- | Linux | 2.6.32-642.1.1.v6.i686 | 704 | Valid | 10/11/16 20:22:19 | 10/12/16 15:07:14 | 18.00 | 61.1 / 60.1
BETA_E299901_645_S.326.C30H20N2O6S4.HPLWNMBGPVVOAQ-UHFFFAOYNA-N.1_s1_14_1-- | Linux | 3.13.0-96-generic | 704 | Valid | 10/7/16 20:23:22 | 10/11/16 20:38:44 | 18.00 | 368.0 / 481.2
BETA_E299901_645_S.326.C30H20N2O6S4.HPLWNMBGPVVOAQ-UHFFFAOYNA-N.1_s1_14_0-- | Linux | 4.4.0-38-generic | 704 | Valid | 10/7/16 20:22:18 | 10/11/16 21:46:27 | 18.00 | 594.3 / 481.2

The other two similar WUs are now also valid 3 times... but all three hit the 18 hour limit.
----------------------------------------
Run Time Stats https://grassmere-productions.no-ip.biz/
|
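As a quick sanity check on the "15m to 1h24m" figure quoted earlier in the thread, the lateness of the _0 and _1 results can be worked out from the Sent Time and Return Time columns above. This is just a back-of-the-envelope sketch: it assumes the due time is exactly the sent time plus the 4d deadline mentioned in the thread, and the `lateness` helper and `results` dictionary are names invented here for illustration.

```python
from datetime import datetime, timedelta

FMT = "%m/%d/%y %H:%M:%S"     # timestamp layout used in the results table
DEADLINE = timedelta(days=4)  # assumption: due time = sent time + 4 days

# (sent time, return time) for the two original copies, taken from the table
results = {
    "_0": ("10/7/16 20:22:18", "10/11/16 21:46:27"),
    "_1": ("10/7/16 20:23:22", "10/11/16 20:38:44"),
}


def lateness(sent, returned):
    """How long after the assumed due time the result was returned."""
    due = datetime.strptime(sent, FMT) + DEADLINE
    return datetime.strptime(returned, FMT) - due


for name, (sent, returned) in results.items():
    print(name, lateness(sent, returned))
```

Under that assumption, _1 comes back roughly a quarter of an hour late and _0 a bit under an hour and a half late, and the _2 and _3 resends in the table were sent within seconds of those due times.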
||
|
CandymanWCG
Senior Cruncher Romania Joined: Dec 20, 2010 Post Count: 421 Status: Offline Project Badges: |
Managed to snatch one of the resends. Completed 7 hours faster than the wingmen and was validated right away. The "granted" credit was way off, but it's just a Beta, so I can live with that. The main thing is that we seem to be one step closer to getting this project back on track. I really want that Sapphire batch.
Cheers!
----------------------------------------
Knowledge is limited. Imagination encircles the world! - Albert Einstein
[Edit 1 times, last edit by CandymanWCG at Oct 13, 2016 9:12:34 AM] |
||
|
foxfire
Advanced Cruncher United States Joined: Sep 1, 2007 Post Count: 121 Status: Offline Project Badges: |
Foxfire, From what you are showing here, these workunits did not attempt to return anything. Did they run on your machine? The only ones I've granted credit for were ones that came back as Too Late. No Reply is a different status, which I think turns into Too Late when you report the results. I disabled the assimilator earlier today, which will keep results in the database longer. If you can, please provide me with the output logs showing that you ran these results. I may have to rethink and grant more credit if a result was run on your machine but reported as No Reply. Thanks -Uplinger
Didn't see this until a few minutes ago, and the WUs have dropped off the Results Status page, sorry. Just for grins I'll go back and see what I can pick out of the device logs. Thanks for your assistance. |
||
|
|