World Community Grid Forums
Thread Status: Active Total posts in this thread: 110
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
Please post any issues or comments from this beta test here:
https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,38848
Thanks, -Uplinger |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Same version as presently in production, 7.0?
BTW, after all these years I'm still getting the same 67 MB file twice on Linux under different names... aux or something. Time to fix this and introduce symlinking to improve space use and efficiency.
[Edit 2 times, last edit by SekeRob* at Feb 19, 2016 8:38:49 PM] |
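The symlinking idea can be illustrated with a minimal sketch. It assumes the two copies are byte-identical and sit in the same directory; the function names are hypothetical and nothing here is part of BOINC or the CEP2 application. Whether the BOINC client tolerates a project file being swapped for a symlink is not something this thread establishes, so treat it as an illustration of the technique rather than a recommended fix.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(directory):
    """Group regular (non-symlink) files in one directory by content.

    Returns a list of groups; each group is a list of paths whose
    contents are identical. Files without a duplicate are omitted.
    """
    by_digest = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and not os.path.islink(path):
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


def replace_with_symlinks(duplicate_group):
    """Keep the first file of a group and turn the rest into symlinks.

    Assumption: all paths are in the same directory, so a relative
    link to the kept file's base name is enough.
    """
    keep, *extras = duplicate_group
    for extra in extras:
        os.remove(extra)
        os.symlink(os.path.basename(keep), extra)
```

Calling `find_duplicates()` on a directory and passing each returned group to `replace_with_symlinks()` would leave one real copy per group, with the other names pointing at it.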
||
|
UBT - JohnR
Cruncher Joined: Apr 30, 2006 Post Count: 35 Status: Offline |
I picked up one, running on BOINC 7.6.22 with Windows 10 Pro x64 10.00.10586.
After about 16% and one hour I suspended the work, leaving it in memory, and that went fine. When I closed BOINC and restarted, the job restarted from 0%, so no checkpoint. |
||
|
Thyme Lawn
Cruncher Joined: Dec 9, 2008 Post Count: 46 Status: Offline |
Quote (UBT - JohnR): I picked up one, running on Boinc 7.6.22 with Windows 10 Pro x64 10.00.10586 After about 16% and one hour I suspended the work leaving in memory fine. When I closed Boinc and restarted the job restarted from 0% so no checkpoint.
CEP2 has fixed checkpoints at the end of each of its jobs. I can't speak for the beta tasks, but looking at the last 25 production tasks on my hyperthreaded i7-6700K @ 4.00GHz Windows 10 system, only 2 made their first checkpoint within an hour and 2 others took slightly more than 5 hours.
If you have BOINC Manager in advanced mode you can see when the last checkpoint was made by selecting the task you're interested in on the Tasks tab and clicking the Properties button. If there's been no checkpoint, "CPU time at last checkpoint" will be shown as "---".
Alternatively, if you enable checkpoint debug in the client configuration dialog, the event log will log every checkpoint made by all of your tasks. For the most recent of my longer tasks the event log has:
19/02/2016 02:54:47 | World Community Grid | Starting task E236140_439_S.288.C35H25N1S2Si1.CBKFYIQCJWAPMG-UHFFFAOYSA-N.2_s1_14_2 using cep2 version 700
Edit: Note that the checkpoint times match the "Finished Job" lines in that task's error log.
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
----------------------------------------[Edit 2 times, last edit by Thyme Lawn at Feb 20, 2016 2:13:24 AM] |
||
|
foxfire
Advanced Cruncher United States Joined: Sep 1, 2007 Post Count: 121 Status: Offline |
These betas are not enforcing the 1 task per core rule. I've picked up 195 of them on 6 hosts (48 cores). Were they supposed to work that way?
The profile has "Number of workunits per host for The Clean Energy Project - Phase 2?" set to unlimited.
[Edit 1 times, last edit by foxfire at Feb 20, 2016 2:23:34 AM] |
||
|
deltavee
Ace Cruncher Texas Hill Country Joined: Nov 17, 2004 Post Count: 4852 Status: Offline |
This Beta seems to behave exactly as if its tasks were CEP2 workunits. It observes the setting "Number of workunits per host for The Clean Energy Project - Phase 2?" in device profiles, and it observes the CEP2 settings in app_config.xml.
If you have "Number of workunits per host for The Clean Energy Project - Phase 2?" set to unlimited, then that seems to be what you are going to get with this Beta if the WUs are available.
4720 Yrs
----------------------------------------[Edit 1 times, last edit by deltavee at Feb 20, 2016 4:28:06 AM] |
||
|
JayPi
Cruncher Joined: Feb 21, 2008 Post Count: 4 Status: Offline |
I have 6 beta workunits running at 26%. I suspended them, and after stopping and restarting BOINC, 2 of them restarted from a checkpoint while the other 4 started from the beginning.
The CPU is an i7-4770K (not overclocked). |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Task progress indication for CEP2 is unrelated to real job progress... it's an expression of hours run as a fraction of the 18-hour maximum, at which point tasks are cut off. So, 26% means the tasks ran approximately 4.68 hours. I was running 4 concurrently as of last night: one had not checkpointed after 12 hours, 1 had only 1 checkpoint [it did 2 more in the next 30 minutes], and 2 had 6 checkpoints [the BOINCTasks tool shows the count]. Checkpoint recovery testing is useless unless a task has actually written/logged its first checkpoint; otherwise you're guaranteed to lose many an hour on the longest of the 8 jobs in a task, the first job #0.
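The conversion described above is simple arithmetic, sketched here in Python. The 18-hour cut-off comes from the post; the function names and the cap at 100% are assumptions made for illustration, not taken from the CEP2 application.

```python
MAX_RUNTIME_HOURS = 18.0  # cut-off stated in the post


def hours_run(progress_percent, max_hours=MAX_RUNTIME_HOURS):
    """Convert a displayed progress percentage into elapsed hours."""
    return progress_percent / 100.0 * max_hours


def displayed_progress(hours, max_hours=MAX_RUNTIME_HOURS):
    """Convert elapsed hours into a progress percentage.

    Assumption: the display is capped at 100% once the cut-off is reached.
    """
    return min(hours / max_hours, 1.0) * 100.0


if __name__ == "__main__":
    # 26% of the 18-hour maximum, as in the post
    print(round(hours_run(26), 2))  # 4.68
```

So a task showing 26% says only that it has been running for about 4.68 hours, not that 26% of its jobs are done.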
|
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2104 Status: Offline |
Pardon me for asking, but I don't understand the purpose of this Beta test.
uplinger said: "This test involved new work units"; so a question would be: is there any important difference from the 'old' WUs?
uplinger also mentioned: "Please test checkpointing by stopping and starting results during computation."; so a question would be: what is it that has changed?
Also, uplinger said: "we have sent out only 50 work units, but plan to send out a total of 5000 work units for testing." So, the 5000 WUs haven't been sent yet, have they? In this thread foxfire said: "I've picked up 195 of them on 6 hosts", so the 5000 seem to have been sent.
It sounds to me as if some info went missing; did I really miss something?
[Edit 4 times, last edit by adriverhoef at Feb 20, 2016 1:06:41 PM] |
||
|
UBT - JohnR
Cruncher Joined: Apr 30, 2006 Post Count: 35 Status: Offline |
The one WU I have is now at 70% after 13Hrs 11 minutes. It has only checkpointed once at 11Hrs 15 minutes.
|
||