World Community Grid Forums
Thread Status: Active Total posts in this thread: 179
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1330 Status: Offline |
It's very well possible the Elapsed time is dubious [did it go back by the same amount on restart as the CPU time?]. There are some fixes to the time-keeping in the very latest clients.
I used a BOINC patch version 7.7.0 and got higher CPU times than elapsed times. I had over 100% efficiency on all tasks. The result page shows the same times for elapsed and CPU. BoincTasks showed for the last four tasks:
Elapsed- / CPU-time 19:48:45 (20:07:39) Result page stored 20.13 / 20.13
Elapsed- / CPU-time 19:50:19 (20:21:43) Result page stored 20.36 / 20.36
Elapsed- / CPU-time 21:38:11 (22:09:27) Result page stored 22.16 / 22.16
Elapsed- / CPU-time 19:38:00 (20:10:05) Result page stored 20.17 / 20.17
I'll install the recommended version 7.6.9 to see how the times are with that version. Maybe there's a fix in it that wasn't in the standalone patch. |
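For anyone who wants to check the figures above, here is a rough Python sketch. It converts the four BoincTasks time pairs to seconds, computes CPU efficiency as CPU time divided by elapsed time, and expresses the CPU time in decimal hours. The helper names are made up for illustration, and rounding half up to two decimals is an assumption about how the result page stores the value. Under that assumption the stored figures line up with the CPU time, not the elapsed time.

```python
# Time pairs as posted above: (elapsed, CPU time) from BoincTasks.
TASKS = [
    ("19:48:45", "20:07:39"),
    ("19:50:19", "20:21:43"),
    ("21:38:11", "22:09:27"),
    ("19:38:00", "20:10:05"),
]


def to_seconds(hms):
    """Convert an HH:MM:SS string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def hours_hundredths(seconds):
    """Return hundredths of an hour, rounded half up with integer arithmetic.

    Assumption: the result page rounds to two decimals this way.
    """
    return (seconds * 100 + 1800) // 3600


def efficiency(elapsed, cpu):
    """CPU efficiency as CPU time divided by elapsed time."""
    return to_seconds(cpu) / to_seconds(elapsed)


if __name__ == "__main__":
    for elapsed, cpu in TASKS:
        stored = hours_hundredths(to_seconds(cpu)) / 100
        print(f"{elapsed} ({cpu}): efficiency {efficiency(elapsed, cpu):.1%}, "
              f"CPU time {stored:.2f} h")
```

All four tasks come out above 100% efficiency, which is the oddity reported above.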
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
I've computed 7 Beta WUs so far. I've noticed the following on WCG-only hosts with a CPU efficiency over 99%, no restart:
I have several remarks regarding the crazy credit/hour ratio as well as the duration.
Cheers, Yves
I will be reviewing the points given on this; it is on my plate of things to investigate and improve upon.
Thanks, -Uplinger |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
Thanks for the clarification. I didn't catch that many beta WUs in the past and wasn't aware that this is a known problem in the beta test. As soon as I'm home I can look for a log and post the content. But the trickle messages shouldn't be the problem, as I could observe the elapsed time for the WU growing. In my understanding this can only mean that the trickle messages were received and validated by the server.
Rarusu, what Sek has posted is correct. I'm currently working through two bugs, in the validator and the transitioner, which are both backend systems. What you are seeing is that your machine was not given a "hard stop" message before the deadline. Had it received one, you would have been granted credit for the work done so far, and the next-generation work unit would have been created from the point you had reached. I would suspect that if your machine worked 24/7 on it, you got a pretty good chunk completed. I am hopeful I'll have that part fixed first. Then I will move on to the transitioner bug, which is less critical for lost work.
Thanks, -Uplinger
I have recently put into place the fix for the hard stop/soft stop script that runs on the backend. Members may start seeing more of these messages as they get closer to deadlines.
Thanks, -Uplinger |
||
|
Rarusu
Advanced Cruncher Germany Joined: Feb 7, 2006 Post Count: 64 Status: Offline |
I have recently put into place the fix for the hard stop/soft stop script that runs on the backend. Members may start seeing more of these messages as they get closer to deadlines. Thanks, -Uplinger
Thanks for the update, uplinger. I will keep an eye on this as soon as I receive a new beta WU.
Cheers,
Rarusu |
||
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1680 Status: Offline |
@Uplinger
Thank you in advance for your investigation.
Cheers, Yves |
||
|
pvh513
Senior Cruncher Joined: Feb 26, 2011 Post Count: 260 Status: Offline |
I recently received a beta WU and decided to test it by suspending it with LAIM disabled. Before the suspend, checkpoints were done every ~35 minutes:
[18:08:19] INFO: Checkpointed. Progress 1000 of 100000 steps complete CPU time 2091.835000
After the resume that increased to every ~67 minutes:
[22:48:01] INFO: Checkpointed. Progress 8000 of 100000 steps complete CPU time 18051.571000
So it appears that the suspend/resume cycle pretty much doubled the CPU time per checkpoint step! This client runs under openSUSE 13.2 on an Opteron 6168. WU name: BETA_avx101118-096_r11_1_wcgfahb00300000_0. As a result it will almost certainly not make the deadline, but I will let it continue running. |
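If anyone wants to pull the same numbers out of their own result log, a minimal Python sketch along these lines might help. The regular expression is based only on the two checkpoint lines quoted above, so other log formats may not match, and the function names are hypothetical. It reports the average CPU seconds per 1000 steps between two checkpoints.

```python
import re

# Pattern derived from the checkpoint lines quoted above (an assumption
# about the general log format; other builds may print it differently).
CHECKPOINT_RE = re.compile(
    r"\[(\d{2}:\d{2}:\d{2})\] INFO: Checkpointed\. "
    r"Progress (\d+) of (\d+) steps complete CPU time ([\d.]+)"
)


def parse_checkpoint(line):
    """Return the fields of one checkpoint line, or None if it does not match."""
    match = CHECKPOINT_RE.search(line)
    if match is None:
        return None
    wall_clock, done, total, cpu = match.groups()
    return {
        "wall_clock": wall_clock,
        "steps_done": int(done),
        "steps_total": int(total),
        "cpu_seconds": float(cpu),
    }


def cpu_seconds_per_1000_steps(first, second):
    """Average CPU seconds spent per 1000 steps between two checkpoints."""
    steps = second["steps_done"] - first["steps_done"]
    if steps <= 0:
        raise ValueError("second checkpoint must be later than the first")
    return (second["cpu_seconds"] - first["cpu_seconds"]) / steps * 1000


if __name__ == "__main__":
    before = parse_checkpoint(
        "[18:08:19] INFO: Checkpointed. Progress 1000 of 100000 steps "
        "complete CPU time 2091.835000"
    )
    after = parse_checkpoint(
        "[22:48:01] INFO: Checkpointed. Progress 8000 of 100000 steps "
        "complete CPU time 18051.571000"
    )
    print(f"{cpu_seconds_per_1000_steps(before, after):.1f} CPU s per 1000 steps")
```

Note that the two quoted lines span the suspend, so the average mixes the intervals before and after the resume. Comparing consecutive checkpoints from a full log would show the change more clearly.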
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Same behaviour under Windows 7 SP1 running on an i7 2600K.
Checkpoints were done every ~700 seconds, and then every ~1400 seconds after the restart:
[19:43:59] INFO: Checkpointed. Progress 10000 of 100000 steps complete CPU time 6999.936471
WU name: BETA_avx101118-060_r4_1_wcgfahb00300000_0 |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
The researchers have identified the problem with CPU time increasing. They have supplied us with a fix that we will be testing on alpha soon.
Thanks, -Uplinger |
||
|
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1311 Status: Offline |
Great to hear, Uplinger. Thanks for the news. |
||
|
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1311 Status: Offline |
For the super micro-managers, though I don't think this overrides the "don't need, cache full". Hitting update while selecting WCG will 'request' work from WCG even though it's really not its turn if you have more than one active project attached to the client:
<fetch_on_update></fetch_on_update>
There were some bugged point releases that actually would fetch 1 unit at a time, again and again and again, but that's for the silly who want to over-commit their client(s).
When updating a project, request work even if not highest priority project. +New in 7.0.54
Anyway if this works, please keep it a [public] secret.
Since this is a public secret, can somebody remind me roughly where the fetch_on_update line goes, please? |
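As far as I know, client options like this one go inside the <options> section of cc_config.xml in the BOINC data directory. Treat that placement as an assumption and check it against the BOINC client configuration documentation. The Python sketch below only builds the fragment as a string so the nesting is visible. The function name is made up, and the value 1 (meaning enabled) is likewise an assumption.

```python
import xml.etree.ElementTree as ET


def build_cc_config(fetch_on_update=True):
    """Build a minimal cc_config.xml fragment as a string.

    Assumption: fetch_on_update is a 0/1 flag nested under <options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    flag = ET.SubElement(options, "fetch_on_update")
    flag.text = "1" if fetch_on_update else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config())
```

If you already have a cc_config.xml, add the one line to your existing <options> block instead of overwriting the file, then have the client re-read its config files or restart it.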
||