| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 107
|
|
| Author |
|
|
Randzo
Senior Cruncher Slovakia Joined: Jan 10, 2008 Post Count: 339 Status: Offline Project Badges:
|
Anhhai, excellent description.
Thumbs up ;-) |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
[tongue in cheek]... But the last bit of the analogy is not accurate and real... the 44 year old will think it's critical to get the 20 of age single female's phone number ;P
Just posted in another thread. Had 2 that ran long long long and then were server aborted and I thought policy was that jobs were to be left running once started.

E201101_002_A.28.C22H15N3S2Si.428.2.set1d06_3-- 637 Valid 1/28/11 08:06:16 1/28/11 23:11:45 11.94 129.1 / 176.7
E201101_002_A.28.C22H15N3S2Si.428.2.set1d06_2-- 637 Error 1/27/11 22:13:37 1/28/11 07:33:04 0.00 0.0 / 0.0
E201101_002_A.28.C22H15N3S2Si.428.2.set1d06_0-- 637 Valid 1/27/11 18:00:20 1/28/11 07:18:16 12.00 224.4 / 176.7
E201101_002_A.28.C22H15N3S2Si.428.2.set1d06_1-- 637 Server Aborted 1/27/11 17:54:01 1/27/11 21:56:05 0.13 3.8 / 0.0 < moi

Actually after 18 hours only 463 seconds were logged. The rest was restarted each time from the beginning of job 16.

Quit requested: Exiting
[22:55:35] Number of jobs = 16
[22:55:35] Starting job 2,CPU time has been restored to 463.630000.
[22:55:35] Starting new Job
[22:55:35] Qink name = fldman
[22:55:36] Qink name = gesman
[22:55:36] Qink name = scfman
Abort requested: Exiting
</stderr_txt>
]]>

It's been suggested before. If it's okay to use the part of the slow device to validate only the beginning part of the fast, fully completing device, then matching a slow with fast would be optimal (someone wrote to have understood it was, but my own CEP2 quorums don't indicate that at all... just chance is at work). Some samples:

E201105_557_A.29.C23H13N3OS2.100.3.set1d06_0-- 637 Valid 1/28/11 08:55:06 1/29/11 06:25:54 4.76 89.4 / 107.9
E201105_557_A.29.C23H13N3OS2.100.3.set1d06_1-- 637 Valid 1/28/11 08:29:18 1/28/11 16:21:31 6.69 126.4 / 107.9
E201102_910_A.28.C22H15N3S2Si.432.3.set1d06_0-- 637 Valid 1/27/11 21:58:18 1/29/11 04:42:39 10.80 276.3 / 285.5
E201102_910_A.28.C22H15N3S2Si.432.3.set1d06_1-- 637 Valid 1/27/11 21:55:45 1/28/11 08:55:05 10.30 294.7 / 285.5

It's a strategy to consider given that the present batch is 2.7 million results (before quorum 2????), and us delivering 15k validations daily. That's 360 days to complete this set. |
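For anyone wanting to check the 360-day figure, here is a minimal sketch of the arithmetic in Python. It assumes the 2.7 million figure is counted before the quorum of 2 is applied (the post itself is unsure about this) and that the 15k daily validations stay constant. The function name `days_to_complete` is made up for illustration.

```python
def days_to_complete(batch_size, quorum, validations_per_day):
    """Days needed to validate a batch at a constant daily rate.

    Assumption: every item in the batch needs `quorum` validated results.
    """
    total_results = batch_size * quorum
    return total_results / validations_per_day


# 2.7 million before quorum 2, at 15k validations per day
print(days_to_complete(2_700_000, 2, 15_000))  # 360.0

# If the 2.7 million already included the quorum, it would be half that
print(days_to_complete(2_700_000, 1, 15_000))  # 180.0
```

So the 360 days only holds if the quorum still has to be applied on top of the 2.7 million.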
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I must be missing something. I have not seen any WUs for CEP2 with CPU time greater than 12 hours.
E201018_509_A.27.C21H15NS3Si2.256.0.set1d06_0-- DKT-0529E4878D Valid 1/16/11 18:07:08 1/19/11 17:56:48 12.00 131.0 / 127.0

This WU was sent at 18:07 on 1/16 and finished at 17:56 on 1/19. This means the time was over 71 hours from when the WU was sent. And this is not even close to the actual elapsed time for the WU. The elapsed time is closer to 12 hours. The WU started at 16:29 (day n-1) and finished at 09:43 (day n) which means the actual elapsed time is a little over 15 hours.

The 23 hour job in your list appears to be the end time when the WU finished. For the 4 WUs you have listed, none of them exceed 12 hours of CPU time. The last one, which was Server Aborted, had CPU time of 0.13 or much less than an hour. The 21:56:05 is the end time, which means there was a little more than 4 hours elapsed time.

Edits are for corrections.
----------------------------------------
[Edit 2 times, last edit by Former Member at Jan 30, 2011 8:19:36 PM] |
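The gap between two timestamps on the result line can be worked out with a few lines of Python. This is only a sketch: it assumes the timestamps are in month/day/2-digit-year order as they appear in this thread, and the helper name `hours_between` is made up. It measures the span between the two timestamps only, which is a different thing from the CPU time column (12.00 and 0.13 here).

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "1/16/11 18:07:08"
FMT = "%m/%d/%y %H:%M:%S"


def hours_between(first, second):
    """Hours between two timestamps given in the assumed format."""
    delta = datetime.strptime(second, FMT) - datetime.strptime(first, FMT)
    return delta.total_seconds() / 3600


# The WU quoted in this post: over 71 hours between the two timestamps
print(round(hours_between("1/16/11 18:07:08", "1/19/11 17:56:48"), 2))  # 71.83

# The Server Aborted result: a little more than 4 hours
print(round(hours_between("1/27/11 17:54:01", "1/27/11 21:56:05"), 2))  # 4.03
```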
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"I must be missing something."

Very probably 2 ships in the night passing each other ;>) The times you see in the message log are indeed as my system is set, i.e. 22:55:36 is when the job finished. The job actually started in the night and, as said, only managed to lock in 463 CPU seconds.

Send and Return time from the Result Status page have nothing at all to do with the CPU or Elapsed time to complete a task. Completed tasks can even sit 24 hours on the client before the reporting cycle is finished, that is if the client is on-line and does not ask for new work in that period. Whilst you read a little over 4 hours in this, note the date ;P

1/27/11 17:54:01 1/27/11 21:56:05 0.13 3.8 / 0.0 < moi

where 21:56 was the time the server acknowledged reporting... not the time the result completed.

That is exactly why I listed the 4: nothing in them indicates that a slow/fast match is in any way routinely made. It is not, but the techs may correct me. --//-- |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
This seems a bit hokey. Should I assign CEP2 tasks to a "slow" machine I have (C2D instead of i5/i7), knowing that 12 hours on the C2D will be "sufficient", while 12 hours on the i5 or i7 would produce extra details which are "nice to know" but in high likelihood may not be strictly necessary to the research? Per anhhai's example, I am not exactly interested in computing the candidate's circumcision status...
-j |
||
|
|
anhhai
Veteran Cruncher Joined: Mar 22, 2005 Post Count: 839 Status: Offline Project Badges:
|
"[tongue in cheek]... But the last bit of the analogy is not accurate and real... the 44 year old will think it's critical to get the 20 of age single female's phone number ;P . . . . It's a strategy to consider given that the present batch is 2.7 million results (before quorum 2????), and us delivering 15k validations daily. That's 360 days to complete this set."

Sekerob, I don't know whose cheek your tongue is in, but I do know that the Harvard people are working on something to reduce the waste associated with the current validation process.

Written on 1-17-11:
"Hi SekeRob et al., these special cases are just not easy to implement in BOINC/WCG. But there is some good news: This thread has triggered a new idea on how to do the validation which would basically eliminate the waste of the current setup. It may take a month or two to implement, but it is high up on the agenda. We'll keep you posted on how this is going. Best wishes from Your Harvard CEP team" |
||
|
|
kskjold
Senior Cruncher Norway Joined: May 20, 2008 Post Count: 469 Status: Offline Project Badges:
|
E201178_128_A.27.C24H16S2Se.68.0.set1d06
This WU ran for 13.5 hours, but the result page shows it with only 12 hours. This is the second time this has happened with that client. |
||
|
|
anhhai
Veteran Cruncher Joined: Mar 22, 2005 Post Count: 839 Status: Offline Project Badges:
|
kskjold, the 12 hr limit is for CPU time. I am guessing here, but the 13.5 hrs is the elapsed time. If you are running BOINC 6.xxx, it will display the elapsed time instead of the CPU time.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
13.5 hours is the time on the wallclock that you allowed BOINC to run. 12 hours is the time that the CPU was effectively used. If I'm doing heavy stuff and run the also heavy CEP2 simultaneously, the gap on my quad easily exceeds that 1.5 hours of "unrecorded/unrecognized" time.
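The wallclock versus CPU time difference is easy to see in a small Python sketch. This is only an illustration of the two kinds of clock, not how BOINC itself does its accounting. The sleep stands in for any period where the task is waiting instead of computing, and the names `measure` and `busy_then_idle` are made up.

```python
import time


def measure(work):
    """Return (wall-clock seconds, CPU seconds) spent running work()."""
    wall_start = time.perf_counter()   # wall clock
    cpu_start = time.process_time()    # CPU time used by this process
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def busy_then_idle():
    sum(i * i for i in range(200_000))  # actually uses the CPU
    time.sleep(0.5)                     # waits without using the CPU


wall, cpu = measure(busy_then_idle)
print(f"wall clock: {wall:.2f} s, CPU: {cpu:.2f} s")
```

The wall clock figure includes the half second of waiting, the CPU figure does not, which is the same kind of gap as 13.5 versus 12 hours.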
|
||
|
|
kskjold
Senior Cruncher Norway Joined: May 20, 2008 Post Count: 469 Status: Offline Project Badges:
|
Thanx
|
||
|
|
|