| World Community Grid Forums
Thread Status: Active Total posts in this thread: 44
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Twice in the last day or so, I've noticed that my twin-processor machine has only been working on one WU. At some time it's tried to contact the WCG server, failed, and fallen back to a default config with a max of 1 processor as a result. This doesn't seem like a good situation, but I don't know whether you'd call it a failure in the client or the server.
e.g.
2008-03-12 23:18:24 [World Community Grid] Requesting 23285 seconds of new work
2008-03-12 23:18:30 [World Community Grid] Scheduler RPC succeeded [server version 601]
2008-03-12 23:18:30 [World Community Grid] Message from server: Server can't open database
2008-03-12 23:18:30 [World Community Grid] New host venue:
2008-03-12 23:18:30 [---] General prefs: from World Community Grid (last modified 2008-03-12 03:00:03)
2008-03-12 23:18:30 [---] Host location: none
2008-03-12 23:18:30 [---] General prefs: using your defaults
2008-03-12 23:18:30 [---] Number of usable CPUs has changed from 2 to 1. Running benchmarks.
2008-03-12 23:18:30 [World Community Grid] Deferring communication for 1 hr 0 min 0 sec
2008-03-12 23:18:30 [World Community Grid] Reason: requested by project
2008-03-12 23:18:30 [World Community Grid] Deferring communication for 3 hr 24 min 32 sec
2008-03-12 23:18:30 [World Community Grid] Reason: no work from project
2008-03-12 23:18:30 [---] Running CPU benchmarks
2008-03-12 23:18:30 [---] Suspending computation - running CPU benchmarks
2008-03-12 23:19:32 [---] Benchmark results:
2008-03-12 23:19:32 [---] Number of CPUs: 1
2008-03-12 23:19:32 [---] 736 floating point MIPS (Whetstone) per CPU
2008-03-12 23:19:32 [---] 965 integer MIPS (Dhrystone) per CPU
2008-03-12 23:19:33 [---] Resuming computation |
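For anyone who wants to check their own message log for the same pattern, here is a minimal sketch, assuming the message wording shown in the log above. The helper name `find_blank_venue_resets` is made up for illustration and is not part of BOINC. It looks for a "New host venue:" line with nothing after the colon and reports the next "Number of usable CPUs has changed" line that follows it.

```python
import re

# Matches the wording quoted in the log above, e.g.
# "... Number of usable CPUs has changed from 2 to 1. Running benchmarks."
CPU_CHANGE = re.compile(r"Number of usable CPUs has changed from (\d+) to (\d+)")
# Any venue line, and the special case where nothing follows the colon.
ANY_VENUE = re.compile(r"New host venue:")
EMPTY_VENUE = re.compile(r"New host venue:\s*$")


def find_blank_venue_resets(lines):
    """Return (old, new) CPU counts for changes that follow a blank venue.

    Hypothetical helper for illustration only; not part of BOINC.
    """
    hits = []
    blank_venue_seen = False
    for line in lines:
        line = line.rstrip("\n")
        if ANY_VENUE.search(line):
            # Remember whether the most recent venue line was blank.
            blank_venue_seen = bool(EMPTY_VENUE.search(line))
            continue
        match = CPU_CHANGE.search(line)
        if match and blank_venue_seen:
            hits.append((int(match.group(1)), int(match.group(2))))
            blank_venue_seen = False
    return hits


# Three lines taken from the log above.
sample = [
    "2008-03-12 23:18:30 [World Community Grid] Message from server: Server can't open database",
    "2008-03-12 23:18:30 [World Community Grid] New host venue: ",
    "2008-03-12 23:18:30 [---] Number of usable CPUs has changed from 2 to 1. Running benchmarks.",
]
print(find_blank_venue_resets(sample))  # prints [(2, 1)]
```

A CPU change that follows a non-blank venue line (such as "New host venue: home") is treated as a normal preference update and is not reported.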
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hmmm, can you check if there is a global_prefs_override.xml? The line "General prefs: using your defaults" suggests there is one. What BOINC version is this?
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
I'll dig into this - I'm 90% certain this will be due to the server code.
Quick question - on next successful communication with the server does it revert back to your proper settings? |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
Sekerob,
The key thing I saw was this:
2008-03-12 23:18:30 [World Community Grid] Message from server: Server can't open database
2008-03-12 23:18:30 [World Community Grid] New host venue:
2008-03-12 23:18:30 [---] General prefs: from World Community Grid (last modified 2008-03-12 03:00:03)
2008-03-12 23:18:30 [---] Host location: none
2008-03-12 23:18:30 [---] General prefs: using your defaults
2008-03-12 23:18:30 [---] Number of usable CPUs has changed from 2 to 1. Running benchmarks.
That is, the 'New host venue' is empty. It looks like the server sends an empty value for the host venue because it cannot open a connection to the db and fetch the actual value. So the server is probably not handling a failure to connect to the db correctly. It should just return a back-off message and the 'Server can't open database' message with nothing else.
Kremman,
Any chance you still have the reply message from the server from that communication attempt? Please post it after removing any personal information. |
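To make the suggested behaviour concrete, here is a rough sketch of it in Python. It is not the actual BOINC server code. `SchedulerReply`, `build_reply`, `fetch_venue` and `DatabaseUnavailable` are all hypothetical names invented for this illustration, and the one-hour default back-off is only taken from the "Deferring communication for 1 hr 0 min 0 sec" line in the log. The point is just that when the database lookup fails, the reply carries the error message and a back-off and omits the venue entirely, rather than sending a blank one.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


class DatabaseUnavailable(Exception):
    """Raised by the hypothetical lookup when the database cannot be opened."""


@dataclass
class SchedulerReply:
    # Hypothetical reply structure; the field names are illustrative only.
    messages: List[str] = field(default_factory=list)
    request_delay_secs: int = 0
    # None means "send no venue at all", which is different from sending "".
    host_venue: Optional[str] = None


def build_reply(fetch_venue: Callable[[], str], backoff_secs: int = 3600) -> SchedulerReply:
    """Build a reply; on a database failure send only the error and a back-off."""
    reply = SchedulerReply()
    try:
        reply.host_venue = fetch_venue()
    except DatabaseUnavailable:
        reply.messages.append("Server can't open database")
        reply.request_delay_secs = backoff_secs
        reply.host_venue = None  # never fall through with an empty venue
    return reply
```

With a reply shaped like this, a client that sees no venue field has no reason to drop back to its default preferences.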
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
OK - after seeing JmBoullier's post I'm sure this is what it is.
Time to dig into the server code. Thanks |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I just had an episode with one of my machines that sounds like about the opposite of Kremmen's problem:
3/12/2008 1:57:43 PM|World Community Grid|Sending scheduler request: To fetch work. Requesting 3 seconds of work, reporting 0 completed tasks
3/12/2008 1:57:48 PM|World Community Grid|Scheduler request succeeded: got 0 new tasks
3/12/2008 1:57:48 PM|World Community Grid|Message from server: Server can't open database
3/12/2008 1:57:48 PM|World Community Grid|New host venue:
3/12/2008 1:57:48 PM||General prefs: from World Community Grid (last modified 12-Mar-2008 12:31:54)
3/12/2008 1:57:48 PM||Host location: none
3/12/2008 1:57:48 PM||General prefs: using your defaults
3/12/2008 1:57:48 PM||Preferences limit memory usage when active to 1944.33MB
3/12/2008 1:57:48 PM||Preferences limit memory usage when idle to 2046.66MB
3/12/2008 1:57:48 PM||Preferences limit disk usage to 5.20GB
3/12/2008 1:57:48 PM||Number of usable CPUs has changed from 1 to 2. Running benchmarks.
3/12/2008 1:57:48 PM||Running CPU benchmarks
3/12/2008 1:57:48 PM||Suspending computation - running CPU benchmarks
3/12/2008 1:58:20 PM||Benchmark results:
3/12/2008 1:58:20 PM|| Number of CPUs: 2
3/12/2008 1:58:20 PM|| 1409 floating point MIPS (Whetstone) per CPU
3/12/2008 1:58:20 PM|| 2718 integer MIPS (Dhrystone) per CPU
3/12/2008 1:58:21 PM||Resuming computation
3/12/2008 1:58:21 PM|World Community Grid|Starting faah3252_ZINC01720198_xMut_md00500_01_0
3/12/2008 1:58:21 PM|World Community Grid|Starting task faah3252_ZINC01720198_xMut_md00500_01_0 using faah version 542
3/12/2008 2:00:40 PM|World Community Grid|Sending scheduler request: Requested by user. Requesting 138634 seconds of work, reporting 0 completed tasks
3/12/2008 2:00:45 PM|World Community Grid|Scheduler request succeeded: got 0 new tasks
3/12/2008 2:00:45 PM|World Community Grid|Message from server: Server can't open database
My profile on this machine is set to use 1 processor, not 2, since it is a pretend dual core, or HT. Are there still issues with the server code?
On the flip side, my C2D laptop will only pick up 2 jobs at a time and wait until they are almost done before collecting more, even though my profile is set to cache 1.5 days of work. I had to re-install Windows XP, though, so that one might just be BOINC settling down after a new install. I'm not too worried about the laptop; no bad results are coming from there. [Edit 1 times, last edit by Former Member at Mar 12, 2008 6:12:52 PM] |
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
In my case I have had no problem with the number of CPUs, but the number I want to use is coded in the global_prefs file of each machine and all my website profiles are set to 1 to be suitable for any machine including the Pentium HT. In fact I am using website profiles only for selecting projects.
Cheers. Jean. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Can you modify the global_prefs file with BOINC still open, or do you have to close BOINC, modify the file, and then launch BOINC again?
|
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Jean probably means the global_prefs_override.xml. Yes, you can edit the latter directly and, from version 5.8 on, use the advanced menu to read those settings.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"Any chance you still have the reply message from the server from that communication attempt? Please post it after removing any personal information"
The above is what was logged. I'm not sure what reply message you're after.
To answer the other questions asked here:
No, I don't have a global_prefs_override.xml, just a global_prefs.xml.
I'm using version 5.8.15.
Yes, when I've noticed this, I've forced communication to the server again, and once it's able to get a non-blank venue value, it fixes itself up:
2008-03-13 02:53:48 [World Community Grid] New host venue: home
2008-03-13 02:53:48 [---] General prefs: from World Community Grid (last modified 2008-03-12 03:00:03)
2008-03-13 02:53:48 [---] Host location: home
2008-03-13 02:53:48 [---] General prefs: using separate prefs for home
2008-03-13 02:53:48 [---] Number of usable CPUs has changed from 1 to 2. Running benchmarks.
2008-03-13 02:53:48 [World Community Grid] Deferring communication for 1 min 1 sec
2008-03-13 02:53:48 [World Community Grid] Reason: requested by project
2008-03-13 02:53:48 [World Community Grid] Resuming task ach1_24_10_1 using acah version 514
2008-03-13 02:53:48 [---] Running CPU benchmarks |
||
|
|
|