World Community Grid Forums
Thread: CEP2 beta for windows - Version 6.25
Thread Status: Active Total posts in this thread: 311
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline Project Badges: |
Got home last night to find my lone 32-bit machine had crashed. It's a hex core and received 6 betas. I restarted it, and 10 seconds after BOINC started it had a memory dump crash. I ran mem-test and chkdsk on the drive and both were error free. I restarted it again and suspended all but 1 beta. It ran about 10 minutes and crashed again. I restarted again but suspended all the betas and let it run the C4CW units in cache. Ran them fine.
Win XP Pro SP3 with 4GB DDR2 PC5300 RAM (yes, I know about 32-bit limits). 200 GB HDD. I even doubled the size of the virtual memory after the first crash. Anyone have a suggestion, or is this machine just choking on the betas? I'm at a loss on this one.
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
|
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
A hex core with 4GB is tight for running this with 6 concurrent, but I think XP-32 could not even address more, and probably can't even use more than 3.1GB.
Set the swapfile to zero, reboot, and try again with just one or two. Then do a disk defrag and afterwards create a new VM (virtual memory) file. Set that at 6GB (1.5x RAM) as the fixed minimum, free to expand beyond that. With XP, you can still make use of the PageDefrag util by MS/Sysinternals. It does not work with W7 anymore. Whether it cures the issue... unlikely, but at least the system speed should improve. Let us know.
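For anyone wanting to check the numbers in that advice, here is a minimal arithmetic sketch in Python. It assumes nothing more than a plain 32-bit address space and the 1.5x RAM rule of thumb from the post. The variable names are made up for illustration, and how much of the 4 GiB XP actually leaves usable depends on the hardware.

```python
# Rough arithmetic only; all names here are illustrative, not from any tool.
GIB = 2**30

# A 32-bit pointer can distinguish 2**32 byte addresses.
addressable_gib = 2**32 / GIB

# Rule of thumb from the post: fixed minimum page file of 1.5x installed RAM.
ram_gb = 4
pagefile_min_gb = 1.5 * ram_gb
pagefile_min_mb = int(pagefile_min_gb * 1024)  # size expressed in MB

print(addressable_gib, pagefile_min_gb, pagefile_min_mb)
```

The 4 GiB ceiling is why adding RAM beyond 4GB would not help on this box, and the 6GB figure in the post is simply 1.5 times the installed 4GB.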
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All! |
||
|
mwgiii
Advanced Cruncher United States Joined: Aug 17, 2006 Post Count: 131 Status: Offline Project Badges: |
This has been the first problematic Beta for me.
When I exit BOINC (stop work is checked), the Beta units do not stop running. This happens on both of my computers which currently have Beta units. All other non-beta BOINC work stops like it is supposed to.
Of the 17 work units downloaded, Norton nuked 6 before I caught it. The problem here is I already have my ProgramData/Boinc directory excluded.
My quad was running into this error:
BETA_ exited with zero status but no 'finished' file
15.09.2010 19:10:55 World Community Grid If this happens repeatedly you may need to reset the project.
CEP2 doesn't seem to play nice on my quad when I have VMWare/Ubuntu running. When I checked this morning, one Beta WU had an elapsed time of over 10 hours, but To Completion had expanded to over 23 hours. When I look at the properties, it has only had 2 hours of CPU time. I shut down VMWare and the To Completion is dropping rapidly.
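The gap between elapsed time and CPU time reported here can be put as a rough efficiency figure. A minimal sketch, taking the post's "over 10 hours" and "2 hours" as round numbers; the function name is hypothetical and not part of BOINC.

```python
def cpu_efficiency(cpu_hours, elapsed_hours):
    """Fraction of wall-clock time the task actually spent on the CPU."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return cpu_hours / elapsed_hours


# Round figures from the post: about 2 h of CPU time in about 10 h elapsed.
efficiency = cpu_efficiency(cpu_hours=2, elapsed_hours=10)
print(f"{efficiency:.0%}")
```

An efficiency of roughly one fifth would fit the observation that the task was being starved of CPU while VMWare was running, though the post does not establish the cause.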
||
|
Somervillejudson@netscape.net
Veteran Cruncher USA Joined: May 16, 2008 Post Count: 1065 Status: Offline Project Badges: |
Curious about the betas I received, as in the past they superseded all other WUs but this time went to the end of the queue. It appears I will finish them in the allotted time, but I'm tempted to suspend all other WUs and let them run first.
|
||
|
evilkats
Senior Cruncher USA Joined: May 4, 2007 Post Count: 162 Status: Offline Project Badges: |
Where were my brains when 10K tasks were issued for Linux.
[Edit 1 times, last edit by evilkats at Sep 16, 2010 1:53:53 PM] |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
evilkats wrote: "Where were my brains when 10K tasks were issued for Linux."
Off or on the record? Well, there were actually 2 Linux-only passes for CEP2. The second one had 13,760 coming out including repairs, the first one 12,658 starting May 28. For whatever reason I had done a dual-boot install of the brand new Lucid Lynx LTS just to get familiar with the platform, and now, seeing the crawl just for Clean Water running 60% longer on W7, I've got to go back to (happy feet) penguin territory.
||
|
martin64
Senior Cruncher Germany Joined: May 11, 2009 Post Count: 445 Status: Offline Project Badges: |
Somervillejudson@netscape.net wrote: "Curious about the betas I received, as in the past they superseded all other WUs but this time went to the end of the queue. It appears I will finish them in the allotted time, but I'm tempted to suspend all other WUs and let them run first."
Mine started immediately. It looks like my pile of WUs was big enough to make the system think it would not finish the (5-day deadline) betas in time, so it made them high priority. So I would *guess* that the betas automatically start when downloaded if you set the buffer to 5+ days.
Regards, Martin |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Yes, it's reverting then to EDF state (Earliest Deadline First). There's another way to force them to the front... set the project switch time higher than the deadline... usually 4 days for Beta and repair jobs. Only advisable for WCG exclusive crunchers! It has all sorts of nasty effects if the client also computes for other grids that use a standard short deadline... you'd be crunching them till debt max stops all fetching for those, temporarily.
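To illustrate the idea behind EDF, here is a small Python sketch. It is not the BOINC client's actual scheduler, just a toy single-core model: tasks normally run in queue order, and if that order would make a task miss its deadline, the queue is re-sorted by deadline. The task names, hours, and helper functions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    remaining_hours: float  # estimated CPU hours still needed
    deadline_hours: float   # hours from now until the report deadline


def late_tasks(queue):
    """Names of tasks that finish after their deadline if run in this order."""
    clock = 0.0
    late = []
    for task in queue:
        clock += task.remaining_hours  # one core, tasks run back to back
        if clock > task.deadline_hours:
            late.append(task.name)
    return late


def edf_order(queue):
    """Earliest Deadline First: run the task with the nearest deadline first."""
    return sorted(queue, key=lambda t: t.deadline_hours)


# Hypothetical cache: two long regular tasks ahead of a short-deadline beta.
queue = [
    Task("C4CW_1", remaining_hours=60, deadline_hours=240),
    Task("C4CW_2", remaining_hours=60, deadline_hours=240),
    Task("BETA_1", remaining_hours=10, deadline_hours=96),
]

if late_tasks(queue):            # the beta would be late in queue order
    queue = edf_order(queue)     # so switch to deadline order

print([t.name for t in queue], late_tasks(queue))
```

In this toy model the beta only jumps the queue when the cache is deep enough to threaten its deadline, which matches the observation that a large buffer made the betas start immediately.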
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
No, don't think my duo is going to run this in production. Heartbeat issues when I'm doing heavy charting. Maybe I should experiment with switching on that new 6.10 option "whilst processor use is less than e.g. 90%" so BOINC backs off, but then that requires switching the Run based on prefs on. Well, let's see... the last beta just started.
btw, the task did validate properly, and but for the heartbeat issue and resuming from the last checkpoint, the logs are identical to the wingman's... actually having computing time on all 16 job steps. It was paused for longer after running 2 concurrently proved way too heavy to allow using the machine at the same time. Even with one and a C4CW, it seems the water job incurs the same efficiency penalty.
Result Name: BETA_ E200360_ 771_ A.24.C18H12N4S2.30.2.set1d06_ 1--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[09:53:46] Number of jobs = 16
[09:53:46] Starting job 0,CPU time has been restored to 0.000000.
[09:57:39] Finished Job #0
[09:57:39] Starting job 1,CPU time has been restored to 195.515625.
[10:07:25] Finished Job #1
[10:07:25] Starting job 2,CPU time has been restored to 653.687500.
[13:03:46] Finished Job #2
[13:03:46] Starting job 3,CPU time has been restored to 9764.843750.
[13:14:09] Finished Job #3
[13:14:09] Starting job 4,CPU time has been restored to 10284.328125.
[13:23:00] Finished Job #4
[13:23:00] Starting job 5,CPU time has been restored to 10730.984375.
[13:30:50] Finished Job #5
[13:30:50] Starting job 6,CPU time has been restored to 11143.359375.
[13:38:42] Finished Job #6
[13:38:42] Starting job 7,CPU time has been restored to 11542.218750.
[13:48:29] Finished Job #7
[13:48:29] Starting job 8,CPU time has been restored to 12061.687500.
[13:55:47] Finished Job #8
[13:55:47] Starting job 9,CPU time has been restored to 12435.828125.
[14:03:19] Finished Job #9
[14:03:19] Starting job 10,CPU time has been restored to 12841.921875.
[14:21:42] Finished Job #10
[14:21:42] Starting job 11,CPU time has been restored to 13717.859375.
[14:34:04] Finished Job #11
[14:34:04] Starting job 12,CPU time has been restored to 14235.625000.
[15:34:48] Finished Job #12
[15:34:48] Starting job 13,CPU time has been restored to 17231.281250.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
No heartbeat from core client for 30 sec - exiting
[22:44:31] Number of jobs = 16
[22:44:31] Starting job 13,CPU time has been restored to 17231.281250.
[00:42:19] Finished Job #13
[00:42:19] Starting job 14,CPU time has been restored to 23305.296875.
[02:19:28] Finished Job #14
[02:19:28] Starting job 15,CPU time has been restored to 28842.328125.
[04:10:06] Finished Job #15
called boinc_finish
</stderr_txt>
]]>
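A log like this can be turned into per-job CPU times by differencing the "CPU time has been restored to" values of consecutive jobs. The Python sketch below is one way to do that, assuming the line format stays exactly as shown in the log; the function name and the shortened sample are made up for illustration. A job that is restarted from a checkpoint (like job 13) simply reports the same restored value twice, so keying by job number handles it.

```python
import re

# Matches e.g. "Starting job 1,CPU time has been restored to 195.515625."
START = re.compile(r"Starting job (\d+),CPU time has been restored to (\d+\.\d+)")


def cpu_seconds_per_job(stderr_text):
    """Map job number -> CPU seconds, from cumulative 'restored' values."""
    restored = {int(job): float(cpu) for job, cpu in START.findall(stderr_text)}
    return {
        job: restored[job + 1] - restored[job]
        for job in sorted(restored)
        if job + 1 in restored  # the last job seen has no successor to diff with
    }


# First few lines of the log in the post, shortened for the example.
sample = """\
[09:53:46] Starting job 0,CPU time has been restored to 0.000000.
[09:57:39] Finished Job #0
[09:57:39] Starting job 1,CPU time has been restored to 195.515625.
[10:07:25] Finished Job #1
[10:07:25] Starting job 2,CPU time has been restored to 653.687500.
[13:03:46] Finished Job #2
[13:03:46] Starting job 3,CPU time has been restored to 9764.843750.
"""

for job, seconds in cpu_seconds_per_job(sample).items():
    print(f"job {job}: {seconds:.1f} CPU seconds")
```

On the sample, job 2 accounts for about 9111 CPU seconds, against a wall-clock span of roughly 2 hours 56 minutes between its start and finish timestamps.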
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Alright, took the plunge... now running at a limit of 90%:
16/09/2010 17:31:57 suspend work if non-BOINC CPU load exceeds 90 %
LAIM (leave applications in memory) is on of course, as otherwise progress could turn ugly.