| World Community Grid Forums
Thread Status: Active Total posts in this thread: 22
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges:
|
Ingleside is correct: many of our projects (including the Autodock ones) have a short startup phase during which the progress figure does not advance. This is because the startup is a relatively short sequence of events compared with the entire run of the application. Glad everything is working well now; let us know if you have any other issues.
-Uplinger |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
I think this was a case from the world of 'instant gratification', which is not guaranteed. On the subject of progress that does not show immediately and of when progress is saved, my personal preference is this particular cc_config.xml log_flags line,
<cc_config> <log_flags> <checkpoint_debug>1</checkpoint_debug> </log_flags> </cc_config>
which I think should be a standard log entry. That way all volunteers also know when a progress state was last saved to disk. Depending on the speed of the device, some checkpoints can be quite far apart, such as for the HCC project, particularly towards the last quarter of the task. (See the Start Here FAQ on checkpoints for more: http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=11332)
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
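For anyone who would rather generate the file than type it, here is a minimal sketch that builds the cc_config.xml content quoted in the post above with Python's standard library. The helper name `build_cc_config` is made up for illustration, and the sketch only produces the text: where the file has to be saved depends on the installation (it belongs in the BOINC data directory).

```python
import xml.etree.ElementTree as ET


def build_cc_config(checkpoint_debug: bool = True) -> str:
    """Return cc_config.xml text with the checkpoint_debug log flag set.

    build_cc_config is a hypothetical helper name; the element names
    (cc_config, log_flags, checkpoint_debug) are the ones quoted in the post.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    flag = ET.SubElement(log_flags, "checkpoint_debug")
    flag.text = "1" if checkpoint_debug else "0"
    # encoding="unicode" makes tostring() return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML; saving it into the BOINC data directory is left to the reader.
    print(build_cc_config())
```

Passing `False` writes the flag as 0, which is a convenient way to switch the extra logging off again without deleting the file.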
||
|
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline Project Badges:
|
"which I think should be a standard log entry. That way all volunteers also know when a progress state was last saved to disk."

This would be a bad idea, since you'll get useless logs like this:

24.10.2009 14:06:17 World Community Grid [checkpoint_debug] result R00387_791ef0d3b51d2ae77e08c463284c46c8_00_15 checkpointed

This is just 10 minutes of logging, though admittedly with a shorter checkpoint interval than the default, giving 9.1 checkpoints/minute. With the default checkpoint interval of 1 minute, it would decrease somewhat, to N checkpoints/minute, where N = number of cores (real or HT). For me this would be 8 checkpoints/minute, meaning any real problems would very likely drown in the huge amount of unnecessary checkpoint logging.

"Depending on the speed of the device, some checkpoints can be quite far apart, such as for the HCC project, particularly towards the last quarter of the task."

Changing the applications so there isn't such a huge amount of time between checkpoints would be a good idea, but it's not always easy to find a good place to checkpoint, and depending on the application, writing checkpoints can carry a large penalty, especially if there is a lot of info to write.

While running with debug logging on can be a good idea in some instances, for most users it would give too much useless info with little or no benefit. In v6.6.xx and later clients, users can check when a task last checkpointed by selecting the individual running task and hitting "properties", so they can use this method instead to check whether it's a good time to shut down or not.

"(See the Start Here FAQ on checkpoints for more: http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=11332)"

"Checkpoints are saved at each 'position' and can take from just seconds..." => if you're 'unlucky', all tasks will checkpoint once/minute, and give you a ton of useless logging...

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
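To put a number on the log volume discussed above, here is a rough sketch of the arithmetic. It assumes the worst case described in the post: every running task checkpoints as often as the interval allows, and each checkpoint writes exactly one log line. The function name `checkpoint_log_lines` is made up for illustration.

```python
def checkpoint_log_lines(tasks: int, interval_seconds: float, duration_seconds: float) -> int:
    """Upper bound on checkpoint_debug log lines written in a time window.

    Assumes the worst case from the post: every running task checkpoints as
    often as the interval allows, and each checkpoint writes one log line.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Whole checkpoints that fit into the window for a single task.
    per_task = int(duration_seconds // interval_seconds)
    return tasks * per_task


if __name__ == "__main__":
    print(checkpoint_log_lines(8, 60, 60))         # 8 tasks, one minute
    print(checkpoint_log_lines(8, 60, 24 * 3600))  # 8 tasks, one day
```

With 8 tasks and the default 60-second interval this gives 8 lines per minute, or 11,520 lines per day, under those assumptions. Tasks that checkpoint less often than the interval allows will of course log less.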
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
You know full well that the write frequency can be tuned, Ingleside; depending on the client version, the target interval is set per core or per system.

With the write frequency at its DEFAULT of 60 seconds... I thought we discussed this before: the modernistic views of Dr. Anderson made it a minimum 4-minute interval per science on a regular quad, at least with 6.6 and up... but I think that was reversed in the latest 6.10. (You know the changes by heart, but it's truly not important until we reach a level that WCG can call "recommended" ;-)

And as for the usual "fixing the application" comment... have you read up on why that is in the particular HCC case, Ingleside? Do you know what the application does? Would it increase processing efficiency to raise the checkpoint save rate so that saves also happen mid-step? So all in all, "no good idea" is just another opinion.

PS: The log function has progressed and is not just of interest to debuggers... those running the simple view do look in too!
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline Project Badges:
|
"You know full well that the write frequency can be tuned, Ingleside; depending on the client version, the target interval is set per core or per system. With the write frequency at its DEFAULT of 60 seconds... I thought we discussed this before: the modernistic views of Dr. Anderson made it a minimum 4-minute interval per science on a regular quad, at least with 6.6 and up... but I think that was reversed in the latest 6.10. (You know the changes by heart, but it's truly not important until we reach a level that WCG can call 'recommended' ;-)"

Most users seem to run with the default of 60 seconds; that's the reason the quick-and-dirty change in v6.6.xx was added, to greatly reduce the wasted time for the users most affected by this problem. And it was removed again in v6.10.xx, since checkpoint frequency isn't a bottleneck any longer. Also, it's not correct that the checkpoint interval can always be controlled by the user.

"And as for the usual 'fixing the application' comment... have you read up on why that is in the particular HCC case, Ingleside? Do you know what the application does? Would it increase processing efficiency to raise the checkpoint save rate so that saves also happen mid-step?"

Hmm, "fixing" isn't the best word in this instance; "changing" would likely be better. But who mentioned anything about HCC? No, I've not looked specifically at why HCC can't checkpoint more frequently. I'm aware there are various reasons for not checkpointing more frequently in different applications, and these reasons normally mean no change will be made. From a user's point of view, on the other hand, it would still be an advantage if all applications could checkpoint "now" when it's been 1h since the last checkpoint, there's 1h until the next one, and the user wants to shut down or reboot or whatever immediately and not in 1 hour's time. Losing 1 hour of crunching isn't really a good option either...

"So all in all, 'no good idea' is just another opinion."

If all tasks used 1h+ between checkpoints, including checkpoint logging would be a good idea. But with many tasks checkpointing once per minute by default, the checkpoint logging would be useless, since it makes it more difficult to find any "real" problems among the logs, and there's no reason to wait until the next checkpoint before shutting down or whatever, since so very little time would be lost anyway.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
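One possible middle ground between the two positions is to keep checkpoint logging on but separate it from the rest of the log when reading. The sketch below is only an illustration: it assumes log lines shaped like the checkpoint_debug example quoted earlier in the thread (date and time as the first two tokens, the result name right after the word "result"), and `split_checkpoint_lines` is a made-up helper name, not part of BOINC.

```python
from typing import Dict, Iterable, List, Tuple

CHECKPOINT_TAG = "[checkpoint_debug]"


def split_checkpoint_lines(lines: Iterable[str]) -> Tuple[List[str], Dict[str, str]]:
    """Separate checkpoint_debug lines from the rest of a client log.

    Returns (other_lines, last_checkpoint), where last_checkpoint maps each
    result name to the timestamp text of its most recent checkpoint line.
    """
    other: List[str] = []
    last_checkpoint: Dict[str, str] = {}
    for line in lines:
        if CHECKPOINT_TAG not in line:
            other.append(line)
            continue
        parts = line.split()
        # Assumption: the timestamp is the first two tokens (date and time).
        stamp = " ".join(parts[:2])
        # Assumption: the result name follows the literal word "result".
        if "result" in parts:
            idx = parts.index("result")
            if idx + 1 < len(parts):
                last_checkpoint[parts[idx + 1]] = stamp
    return other, last_checkpoint


if __name__ == "__main__":
    sample = [
        "24.10.2009 14:06:17 World Community Grid [checkpoint_debug] result "
        "R00387_791ef0d3b51d2ae77e08c463284c46c8_00_15 checkpointed",
    ]
    remaining, latest = split_checkpoint_lines(sample)
    print(remaining)
    print(latest)
```

The first return value is the log with the checkpoint lines removed, so "real" problems stay visible; the second answers the "when was progress last saved" question per task. The timestamp is kept as text on purpose, since its format follows the local settings of the machine.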
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
I mentioned HCC explicitly in the original post about having the checkpoint-writing flag on by default, and then reiterated it in the second post, as you seemed to have missed its importance.

WCG tries to strike a balance between easy, near-lossless saving of progress, so contributors can shut down almost any time they like, and the size of those saves and the opportunities for making them (very few on RICE and HCMD2). Since more is better, and since ALL WCG sciences, without exception, ask for permission before writing, the rate can be controlled to personal preference via the write-to-disk setting.

"If all tasks used 1h+ between checkpoints, including checkpoint logging would be a good idea."

Setting it at 1 hour+ is truly nonsense and completely unfriendly to part-time crunchers. Once the science only generates one checkpoint per hour, there is no way for these volunteers to shorten the interval.

We support WCG here and discuss WCG science features; that's what we do here! I'm truly not interested in projects that do not honor the checkpoint write request and knock checkpoints out every few seconds, ignoring the 60-second minimum or whatever minimum the user sets, and with sizable files at that, as I discovered. Moving on.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Anthony Owen
Cruncher United States Joined: Nov 18, 2004 Post Count: 3 Status: Offline Project Badges:
|
My problem is connecting to the server after a clean install on Win 7. I set the firewall to allow WCG access, and Vista offered no problems. It's the "Proxy server" error (I don't use a proxy server).
This problem was resolved by downloading the latest software (unlike the previous install, this required a reboot).
----------------------------------------
[Edit 1 times, last edit by Anthony Owen at Oct 28, 2009 5:31:23 PM] |
||
|
|
nasher
Veteran Cruncher USA Joined: Dec 2, 2005 Post Count: 1423 Status: Offline Project Badges:
|
I might be hijacking the thread a little, but I was wondering about your opinions on Windows 7: which computers should we upgrade, and which should we keep on their current operating system?
It sounds like there are a few growing pains with Windows 7 and BOINC and WCG too. Also, should I upgrade to Win 7 now or wait a few months? |
||
|
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline Project Badges:
|
"I might be hijacking the thread a little, but I was wondering about your opinions on Windows 7: which computers should we upgrade, and which should we keep on their current operating system? It sounds like there are a few growing pains with Windows 7 and BOINC and WCG too. Also, should I upgrade to Win 7 now or wait a few months?"

Whether and when you should upgrade is your decision; it depends on which OS you're currently running and what you're going to use the computer for.

If you're currently running a 32-bit OS but have a 64-bit CPU and at least 2 GB RAM, it's an advantage to switch to 64-bit, just to speed up crunching somewhat. If you're already running 64-bit, or don't meet the requirements for 64-bit, there's little reason to change, at least from a crunching-only point of view.

As far as BOINC goes, I'm still running Win7 RC, but my findings are:
1: WCG sub-projects: Proteome Folding can crap out; there are many reports of this under Vista, and Win7 seems to be the same...
2: Apart from #1, no problems running a cocktail of BOINC projects, and all of WCG's sub-projects except Proteome Folding.
3: I'm not aware of any Win7-specific problems with current BOINC clients; if there are any problems, they also affect XP and Vista users.
4: Don't know (and don't care) whether the customized WCG v6.2.28 client has any specific Win7 problems.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello, I'm having problems after switching to Windows 7 x64.
All tasks except Rice finish instantly and show an error when I look at the results page on the website. As for Rice, the progress stays at 0% until it completes 7 hours later. I don't have a clue what to do. Any help is appreciated.

All result logs start like this:

<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
(0xe6) - exit code 230 (0xe6)
</message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: hostid
Skipping: 1093571
Skipping: /hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 0.000000
Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1258044755.000000
Skipping: /computation_deadline
Can't set up shared mem: -1
Will run in standalone mode.
</stderr_txt>
]]>

The startup log is like this:

2009/11/03 12:36:01 Starting BOINC client version 6.10.17 for windows_x86_64
2009/11/03 12:36:01 log flags: file_xfer, sched_ops, task
2009/11/03 12:36:01 Libraries: libcurl/7.19.4 OpenSSL/0.9.8k zlib/1.2.3
2009/11/03 12:36:01 Data directory: C:\ProgramData\BOINC
2009/11/03 12:36:01 Running under account home
2009/11/03 12:36:01 Processor: 4 AuthenticAMD AMD Processor model unknown [AMD64 Family 16 Model 4 Stepping 2]
2009/11/03 12:36:01 Processor: 512.00 KB cache
2009/11/03 12:36:01 Processor features: fpu tsc pae nx sse sse2 pni
2009/11/03 12:36:01 OS: Microsoft Windows 7: x64 Edition, (06.01.7600.00)
2009/11/03 12:36:01 Memory: 3.50 GB physical, 7.00 GB virtual
2009/11/03 12:36:01 Disk: 781.25 GB total, 744.18 GB free
2009/11/03 12:36:01 Local time is UTC +9 hours
2009/11/03 12:36:01 NVIDIA GPU 0: GeForce 9800 GT (driver version 19107, CUDA version 2030, compute capability 1.1, 512MB, 308 GFLOPS peak)
2009/11/03 12:36:01 Not using a proxy
2009/11/03 12:36:01 World Community Grid URL http://www.worldcommunitygrid.org/; Computer ID 1093571; resource share 100
2009/11/03 12:36:01 World Community Grid General prefs: from World Community Grid (last modified 27-Dec-2007 11:35:55)
2009/11/03 12:36:01 World Community Grid Computer location: home
2009/11/03 12:36:01 General prefs: using separate prefs for home
2009/11/03 12:36:01 Reading preferences override file
2009/11/03 12:36:01 Preferences limit memory usage when active to 3583.25MB
2009/11/03 12:36:01 Preferences limit memory usage when idle to 3583.25MB
2009/11/03 12:36:01 Preferences limit disk usage to 4.00GB
2009/11/03 12:36:01 World Community Grid Restarting task R00407_d28460efd4970246cc169e390a610adc_02_003_12 using rice version 617
2009/11/03 12:36:01 World Community Grid Restarting task R00407_d28460efd4970246cc169e390a610adc_02_004_1 using rice version 617
2009/11/03 12:36:01 World Community Grid Restarting task R00407_d28460efd4970246cc169e390a610adc_03_002_12 using rice version 617
2009/11/03 12:36:01 World Community Grid Restarting task R00407_d28460efd4970246cc169e390a610adc_03_003_13 using rice version 617 |
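When many results fail the same way, it can help to pull the key facts out of each result log automatically before comparing them. The following is a minimal sketch that extracts the exit code and the names of the "Unrecognized XML" tags from text shaped like the result log quoted above. The function name `summarize_result_log` and the regular expressions are assumptions based only on that one sample, not on any documented BOINC format.

```python
import re
from typing import List, Optional, Tuple

# Patterns modelled on the single result log posted above (an assumption).
EXIT_CODE_RE = re.compile(r"exit code (\d+) \((0x[0-9a-fA-F]+)\)")
UNRECOGNIZED_RE = re.compile(r"Unrecognized XML in parse_init_data_file: (\S+)")

SAMPLE = """<message> (0xe6) - exit code 230 (0xe6) </message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline
Can't set up shared mem: -1
Will run in standalone mode.
</stderr_txt>"""


def summarize_result_log(text: str) -> Tuple[Optional[int], List[str]]:
    """Return (exit_code, unrecognized_tags) found in a result log."""
    exit_code: Optional[int] = None
    match = EXIT_CODE_RE.search(text)
    if match:
        exit_code = int(match.group(1))
        # Sanity check: the decimal and hexadecimal forms should agree.
        if exit_code != int(match.group(2), 16):
            raise ValueError("decimal and hex exit codes disagree")
    tags = UNRECOGNIZED_RE.findall(text)
    return exit_code, tags


if __name__ == "__main__":
    print(summarize_result_log(SAMPLE))
```

The sketch only summarizes what the log says; it does not interpret the exit code or say whether the "Unrecognized XML" lines are related to the failure.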
||
|
|
|