| World Community Grid Forums
Thread Status: Active Total posts in this thread: 253
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
All my batch 37 results are showing zero CPU time in the Results Status list, e.g.
BETA_OET1_0000037_xZAGP_0134_1-- Valid 29/11/14 05:24:50 29/11/14 10:29:01 0.00 / 1.69 56.1 / 46.7
No other apparent problems, though. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Got 4 more, which arrived around 5:20 UTC, and just started them. This time they're running OK. What was different?
I started them separately and not both at the same time, on each machine. Each machine was not running a browser or mail client. Otherwise everything was the same as earlier -- neither machine had been rebooted since the previous batch. Was it something the techs did? It would be nice to know! |
||
|
|
OldChap
Veteran Cruncher UK Joined: Jun 5, 2009 Post Count: 978 Status: Offline Project Badges:
|
Latest one is running long, it seems ... estimates are off a fair bit.
Almost a day long, first checkpoint 2+ hours. |
||
|
|
deltavee
Ace Cruncher Texas Hill Country Joined: Nov 17, 2004 Post Count: 4894 Status: Offline Project Badges:
|
"Latest one is running long it seems ...estimates are off a fair bit"
Here are two that haven't checkpointed after six hours and are showing less than 5% complete:
BETA_OET1_0000301_xMBGP-L_rig_0017_1
BETA_OET1_0000301_xMBGP-L_rig_0061_1 |
||
|
|
Eric_Kaiser
Veteran Cruncher Germany (Hessen) Joined: May 7, 2013 Post Count: 1047 Status: Offline Project Badges:
|
I made the same observation with a rig-wu. Long running and long time between checkpoints.
|
||
|
|
genhos
Veteran Cruncher UK Joined: Apr 26, 2009 Post Count: 1108 Status: Offline Project Badges:
|
BETA_OET1_0000300_xMBGP-FA_rig_0017 is being an absolute beast of a unit: 15% done after 15 hrs, with a checkpoint approximately every 4 hours. Two others have errored on this unit after 15 and 17 hrs of crunching. Looking at the stderr.txt for this unit, I'm on task 3 out of 22. Going on some very simple approximate maths, that makes this a 90 hr unit if it completes. The increase in progress is not obvious; I assume it only updates the % complete at each checkpoint.
Are these huge units correct? Should it be left to run? (I will be leaving it to run unless told to stop it.) |
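The "very simple approximate maths" in the post above can be sketched in two ways. Both are rough illustrations using only the figures quoted in the post; the function names are made up for this example, and the second estimate assumes that one checkpoint corresponds to one task, which the post does not state.

```python
def total_from_fraction(elapsed_hours: float, fraction_done: float) -> float:
    """Project total runtime, assuming progress is linear in time."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


def total_from_tasks(task_count: int, hours_per_task: float) -> float:
    """Project total runtime, assuming every task takes about the same time."""
    if task_count <= 0 or hours_per_task <= 0:
        raise ValueError("task_count and hours_per_task must be positive")
    return task_count * hours_per_task


# Figures quoted in the post: 15% done after 15 hrs, 22 tasks in stderr.txt,
# and a checkpoint roughly every 4 hours.
# ASSUMPTION (not stated in the post): one checkpoint corresponds to one task.
by_fraction = total_from_fraction(15.0, 0.15)
by_tasks = total_from_tasks(22, 4.0)

print(f"from % complete: about {by_fraction:.0f} hrs")
print(f"from task count: about {by_tasks:.0f} hrs")
```

Either way the projection lands in the same region as the 90 hr figure in the post. Neither number is reliable if the % complete only moves at checkpoints, as the post suspects.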
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Similar, but not identical, to what tonyh205 said ...
The first pair on my lappie finished and validated, but both of us show zero CPU / elapsed time on one:
Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit
BETA_OET1_0000037_xZAGP_0823_1-- | 705 | Valid | 29/11/14 05:20:07 | 29/11/14 15:59:37 | 0.00 | 83.6 / 77.7
BETA_OET1_0000037_xZAGP_0823_0-- | 705 | Valid | 29/11/14 05:19:59 | 29/11/14 13:52:44 | 0.00 | 71.9 / 77.7
On the other, the wingman shows a surprisingly short CPU (?) time:
Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit
BETA_OET1_0000037_xZAGP_0821_1-- | 705 | Valid | 29/11/14 05:20:07 | 29/11/14 15:10:50 | 0.30 | 7.4 / 41.1
BETA_OET1_0000037_xZAGP_0821_0-- | 705 | Valid | 29/11/14 05:19:59 | 29/11/14 14:03:12 | 0.00 | 74.7 / 41.1
and the BoincTasks log on my machine shows a silly CPU time for both of them, too:
World Community Grid 7.05 Beta Test BETA_OET1_0000037_xZAGP_0821_0 03:18:12 (00:00:05) 29/11/2014 14:00:07 29/11/2014 14:05:08 0.04 Reported: OK [machine-name] 43.90 MB 44.95 MB
World Community Grid 7.05 Beta Test BETA_OET1_0000037_xZAGP_0823_0 03:10:52 (00:00:04) 29/11/2014 13:52:07 29/11/2014 13:55:07 0.03 Reported: OK * [machine-name] 43.96 MB 45.02 MB
So it seems that the science code isn't quite working the way it should. The deskside is still busy ... |
||
|
|
yoro42
Ace Cruncher United States Joined: Feb 19, 2011 Post Count: 8979 Status: Offline Project Badges:
|
Still RUNNING:
BETA_OET1_0000300_xMBGP-FA_rig_0025_4-- Simone In Progress 11/28/14 19:45:29 12/2/14 07:45:29 0.00 / 0.00 0.0 / 0.0
Computer: Simone
Project: World Community Grid
Name: BETA_OET1_0000300_xMBGP-FA_rig_0025_4
Application: beta20 7.04
Workunit name: BETA_OET1_0000300_xMBGP-FA_rig_0025
State: Running
Received: 11/28/2014 12:45:14 PM
Report deadline: 12/2/2014 12:45:29 AM
Estimated app speed: 2.04 GFLOPs/sec
Estimated task size: 8,251 GFLOPs
CPU time at last checkpoint: 16:37:12
CPU time: 20:11:46
Elapsed time: 21:03:48
Estimated time remaining: 02:49:04
Fraction done: 15.909%
Virtual memory size: 35.45 MB
Working set size: 36.78 MB
Directory: slots/4
Process ID: 16280 |
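As a rough cross-check of the task properties above, the elapsed time and fraction done can be turned into a linear projection and set against the client's own "Estimated time remaining". This is only a sketch: the helper names are invented for the example, the input values are copied from the post, and it assumes progress is linear in elapsed time, which may not hold for these work units.

```python
def hms_to_hours(hms: str) -> float:
    """Convert an 'HH:MM:SS' string, as shown in the task properties, to hours."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours + minutes / 60 + seconds / 3600


def linear_remaining_hours(elapsed_hms: str, fraction_done_percent: float) -> float:
    """Hours still to run, ASSUMING progress is linear in elapsed time."""
    fraction = fraction_done_percent / 100
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction_done_percent must be in (0, 100]")
    elapsed = hms_to_hours(elapsed_hms)
    return elapsed / fraction - elapsed


# Values copied from the task properties in the post.
remaining = linear_remaining_hours("21:03:48", 15.909)
client_estimate = hms_to_hours("02:49:04")

print(f"linear projection: about {remaining:.0f} hrs still to run")
print(f"client estimate:   about {client_estimate:.1f} hrs still to run")
```

On these figures the linear projection comes out at well over a hundred hours remaining, far above the client's estimate of under three hours, which fits the comments in this thread that the estimates are off.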
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The two on the deskside have finished. Both of these also show 0 CPU time on the results page, but they showed normal figures for both elapsed and CPU time while they were running. One has validated and this time the wingman shows a sensible CPU time.
|
||
|
|
genhos
Veteran Cruncher UK Joined: Apr 26, 2009 Post Count: 1108 Status: Offline Project Badges:
|
BETA_OET1_0000300_xMBGP-FA_rig_0017 (the unit from my earlier post) has also recently failed with a computation error. |
||
|
|
|