World Community Grid Forums
Category: Beta Testing
Forum: Beta Test Support Forum
Thread: Uncovering Genome Mysteries Beta Test - Oct 30 2014 [ Issues Thread ]
Thread Status: Active Total posts in this thread: 21
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1317 Status: Offline
The fix for restoring backwards from checkpoints looks hopeful.
I suspended twice between 2 checkpoints on 2 different tasks and this is what I see:

4000 query sequences compared.
Checkpoint restored: 4168
Checkpoint restored: 4168
4500 query sequences compared.

5000 query sequences compared.
Checkpoint restored: 5283
Checkpoint restored: 5283
5500 query sequences compared.

[Edit 1 times, last edit by Crystal Pellet at Oct 30, 2014 10:49:46 PM]
||
|
pramo
Veteran Cruncher USA Joined: Dec 14, 2005 Post Count: 703 Status: Offline
BETA_ugm1_ugm1_00640_0266_1
BETA_ugm1_ugm1_00640_0447_0
BETA_ugm1_ugm1_01126_0366_1
BETA_ugm1_ugm1_01126_0563_1
BETA_ugm1_ugm1_00478_0416_1
BETA_ugm1_ugm1_00640_0020_0
BETA_ugm1_ugm1_00640_0041_0
BETA_ugm1_ugm1_00964_0174_0
BETA_ugm1_ugm1_00964_0161_1
BETA_ugm1_ugm1_01126_0058_1
BETA_ugm1_ugm1_01126_0631_1
BETA_ugm1_ugm1_01424_0066_1
BETA_ugm1_ugm1_00640_0678_1

The above were each restarted twice after different checkpoints (LAIM off). It looks like they went back to the checkpoint and carried on.
Sorry for the extra lines in the c&p. Fixed the extra lines in the cut and paste :)
[Edit 2 times, last edit by pramo at Oct 31, 2014 10:59:46 AM]
||
|
ccandido
Senior Cruncher Joined: Jun 22, 2011 Post Count: 182 Status: Offline
I'm not getting work. No more available?
||
|
BobCat13
Senior Cruncher Joined: Oct 29, 2005 Post Count: 295 Status: Offline
Looks like checkpoint restarting is working better with multiple restarts after a checkpoint and prior to another checkpoint.

Result Name: BETA_ ugm1_ ugm1_ 00640_ 0728_ 1--
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
Unable to open checkpoint file starting from 0
500 query sequences compared.
1000 query sequences compared.
1500 query sequences compared.
2000 query sequences compared.
2500 query sequences compared.
3000 query sequences compared.
3500 query sequences compared.
4000 query sequences compared.
4500 query sequences compared.
Checkpoint restored: 4025
4500 query sequences compared.
5000 query sequences compared.
5500 query sequences compared.
6000 query sequences compared.
6500 query sequences compared.
7000 query sequences compared.
7500 query sequences compared.
8000 query sequences compared.
8500 query sequences compared.
Checkpoint restored: 8004
8500 query sequences compared.
9000 query sequences compared.
9500 query sequences compared.
10000 query sequences compared.
Checkpoint restored: 8004
8500 query sequences compared.
9000 query sequences compared.
9500 query sequences compared.
10000 query sequences compared.
........

Edit: This task has validated with a wingman that had no restarts from checkpoint.
[Edit 1 times, last edit by BobCat13 at Oct 31, 2014 2:38:57 PM]
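For anyone who wants to check their own result logs the same way, here is a minimal Python sketch that reads the stderr text and reports, for each restore, roughly how much work was repeated. It assumes only the two message formats visible in the logs in this thread; the function name `repeated_work` is made up for illustration and is not part of BOINC or the UGM application. Because progress is only printed every 500 sequences, the figure is a lower bound, not an exact count.

```python
import re

# Matches either of the two message formats seen in the pasted stderr logs.
TOKEN = re.compile(
    r"(\d+) query sequences compared\.|Checkpoint restored: (\d+)"
)


def repeated_work(stderr_text):
    """Return one (last_progress, restored_at, repeated) tuple per restore.

    `repeated` is a lower bound on the sequences computed again, because
    progress is only reported every 500 sequences.
    """
    events = []
    last_progress = 0
    for match in TOKEN.finditer(stderr_text):
        progress, restored = match.groups()
        if progress is not None:
            last_progress = int(progress)
        else:
            restored_at = int(restored)
            repeated = max(0, last_progress - restored_at)
            events.append((last_progress, restored_at, repeated))
            # After a restore, the task continues from the checkpoint.
            last_progress = restored_at
    return events
```

On the three restores in the log above this gives at least 475, 496 and 1996 repeated sequences, which is what you would expect from a task that goes back to its last checkpoint and carries on.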
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Pleased to see that one of mine (batch 00313) has turned Valid despite the multiple closely spaced checkpoint-restores. All others are still PVal.
|
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1317 Status: Offline
I returned all 60 Betas from this run: 25 Valid and 35 Pending Validation.
I monitored the last 4 tasks more closely. CPU efficiency was 98% with write to disk set to every 120 seconds. The upload files were 263 kB to 304 kB, compressed from result.tmp files whose maximum sizes ranged from 1,143,867 to 1,358,965 bytes. Writes to disk for each of those tmp files were done every few seconds, averaging 766 bytes/second. The checkpoint file is just a copy of the state of that result.tmp.
Isn't it possible to keep the result.tmp in memory and only write to disk at checkpoint time?
[Edit 1 times, last edit by Crystal Pellet at Oct 31, 2014 12:43:15 PM]
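To make the suggestion concrete, here is a rough Python sketch of the idea: keep new result records in memory and append them to the file only when a checkpoint is taken. This is not how the UGM application is written (its source is not shown in this thread); the class `BufferedResult` and its methods are hypothetical names used purely for illustration.

```python
import os


class BufferedResult:
    """Hold result records in memory and touch the disk only at checkpoints.

    Hypothetical illustration, not the science application's actual code.
    """

    def __init__(self, path):
        self.path = path
        self.pending = []  # records produced since the last checkpoint

    def add(self, record):
        """Queue one result record without writing anything to disk."""
        self.pending.append(record)

    def checkpoint(self):
        """Append all pending records to the result file and sync to disk."""
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.writelines(record + "\n" for record in self.pending)
            handle.flush()
            os.fsync(handle.fileno())
        written = len(self.pending)
        self.pending.clear()
        return written
```

The trade-off, under these assumptions, is that records added since the last checkpoint are lost if the task is stopped. That should be harmless if the task restarts from that checkpoint anyway and produces the same records again.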
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1317 Status: Offline
BETA_ ugm1_ ugm1_ 01126_ 0530_ 2-- - In Progress 10/31/14 16:37:49 11/4/14 16:37:49 0.00 0.0 / 0.0 <-- mine
BETA_ ugm1_ ugm1_ 01126_ 0530_ 1-- 723 Pending Validation 10/30/14 22:31:00 10/31/14 05:50:05 6.26 157.3 / 0.0
BETA_ ugm1_ ugm1_ 01126_ 0530_ 0-- 723 Error 10/30/14 22:30:57 10/31/14 16:37:45 3.01 106.0 / 0.0

The end of the PVal result:
36500 query sequences compared.
Run complete, CPU time: 22522.014632
05:47:45 (17237): called boinc_finish
</stderr_txt>

The end of the Error result:
36500 query sequences compared.
Run complete, CPU time: 15851.737103
09:15:53 (32551): called boinc_finish
Checkpoint restored: 36665
Run complete, CPU time: 10819.032671
16:36:34 (1445): called boinc_finish
</stderr_txt>
<message>
finish file present too long
</message>

Edit: mine will run almost 8 hours. Bedtime far overdue then, so will not watch closely.
[Edit 1 times, last edit by Crystal Pellet at Oct 31, 2014 5:25:24 PM]
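The pattern in the Error result, a checkpoint restore appearing after the task had already called boinc_finish, can be spotted with a few lines of Python. This is only a sketch based on the log text pasted above; `restores_after_finish` is a made-up helper name, and it says nothing about why the client started the task again.

```python
import re

FINISH = re.compile(r"called boinc_finish")
RESTORE = re.compile(r"Checkpoint restored: (\d+)")


def restores_after_finish(stderr_text):
    """Return the checkpoint positions restored after the first finish call."""
    first_finish = FINISH.search(stderr_text)
    if first_finish is None:
        return []
    # Only look at the part of the log that follows the first finish call.
    later = RESTORE.finditer(stderr_text, first_finish.end())
    return [int(match.group(1)) for match in later]
```

An empty list means the task finished once and was not restarted afterwards, as in the PVal result above.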
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
One strange one:
BETA_ ugm1_ ugm1_ 01729_ 0035_ 2-- 723 Valid 31/10/14 13:26:21 31/10/14 16:02:03 2.56 89.5 / 91.4 << mine
BETA_ ugm1_ ugm1_ 01729_ 0035_ 0-- 723 Invalid 30/10/14 22:43:13 31/10/14 13:26:16 6.09 155.8 / 91.4
BETA_ ugm1_ ugm1_ 01729_ 0035_ 1-- 723 Valid 30/10/14 22:43:12 31/10/14 05:39:50 5.42 93.3 / 91.4

The Result Log for _0:
<core_client_version>7.0.59</core_client_version>
<![CDATA[
<stderr_txt>
Unable to open checkpoint file starting from 0
500 query sequences compared.
1000 query sequences compared.
1500 query sequences compared.
2000 query sequences compared.
... etc
27000 query sequences compared.
27500 query sequences compared.
28000 query sequences compared.
Checkpoint restored: 0
500 query sequences compared.
1000 query sequences compared.
1500 query sequences compared.
... etc
57000 query sequences compared.
57500 query sequences compared.
58000 query sequences compared.
Run complete, CPU time: 21919.484625
13:21:45 (4824): called boinc_finish

Both _1 and _2 completed with 58000 query sequences compared and no checkpoint-restores.
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Just waiting for the outcome of 3 Beta workunits in PVal; all others have turned Valid, all of mine with multiple closely spaced checkpoint-restores. Looking good.
|
||
|
vepaul
Senior Cruncher Belgium Joined: Nov 17, 2004 Post Count: 261 Status: Offline
Hello,
Result Name, App Version Number, Status, Sent Time, Time Due / Return Time, CPU Time (hours), Claimed/Granted BOINC Credit
BETA_ ugm1_ ugm1_ 00478_ 0582_ 0-- 723 Pending Validation 30/10/14 22:00:29 31/10/14 05:20:54 4.85 145.0 / 0.0
BETA_ ugm1_ ugm1_ 00478_ 0582_ 1-- - In Progress 30/10/14 22:00:19 3/11/14 22:00:19 0.00 0.0 / 0.0
Most of mine look fine,
vep
||
|
|