| World Community Grid Forums
Thread Status: Active Total posts in this thread: 10
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
About half the time, BOINC starts up normally after a system reboot.
The other half of the time, it says something like "failed to connect to client" and all BOINC Manager panes are empty. Going to Advanced -> Select computer..., entering "localhost", and letting it fill in a password seems to fix it. The annoying thing is having to do something manually, on roughly every second boot, to make BOINC start running.

The really weird thing is that the last time this happened, I'd noticed in Task Manager a fat FAAH job chewing up RAM shortly before the error message appeared. So BOINC was apparently able to launch the science app successfully. It just required two tries instead of one to communicate with it, and would not make the second try automatically.

The firewall is configured to let BOINC Manager and the BOINC client access localhost, the LAN, and the Internet. More importantly, the firewall configuration is the same on the occasions when it works and on the occasions when it requires manual intervention to get going properly. So is all other configuration. Given the intermittency, I seriously doubt that the firewall, or any other security software, is the cause. Overzealous security software would make it fail consistently, and manual attempts using Advanced -> Select computer... would keep failing too, until the security software was reconfigured.

Maybe it depends on the specific work unit it has: some of them might make the science app slow to respond, so that BOINC Manager experiences intermittent timeouts, or something like that. I'd be interested to know whether the same inability to talk to the freshly launched science app would cause UD to assume it had crashed. That would explain certain false positives that used to occur when I was running UD rather than BOINC.

The modem, if it matters, is in bridge mode, and even if it were in NAT mode it shouldn't affect loopback. Strange stuff. |
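One way to check the "needs two tries" symptom independently of BOINC Manager is to retry a plain TCP connection to the client over loopback and note which attempt succeeds. This is only a minimal sketch: it assumes the client listens on BOINC's default GUI RPC port, 31416 (nothing in this thread confirms the port in use), and the function name `wait_for_port` and its parameters are hypothetical.

```python
import socket
import time

# Assumption: 31416 is the default BOINC GUI RPC port; change it if the
# client on this machine is configured differently.
GUI_RPC_PORT = 31416


def wait_for_port(host="127.0.0.1", port=GUI_RPC_PORT,
                  attempts=5, delay=1.0, timeout=2.0):
    """Try to open a TCP connection up to `attempts` times.

    Returns the 1-based number of the attempt that succeeded,
    or 0 if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=timeout):
                return attempt
        except OSError:
            if attempt < attempts:
                time.sleep(delay)
    return 0
```

Calling `wait_for_port()` right after login would show whether the client is simply not yet accepting connections at the moment the Manager makes its single attempt (a result greater than 1) or is never reachable at all (a result of 0).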
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
How is BOINC installed (single/shared/service)?
What is the command in the Launch shortcut in the AutoStart folder, if any? Is there a duplicate shortcut (one for the user and one for all users)?
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
[Edit 1 times, last edit by Sekerob at Jun 26, 2007 7:50:07 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
How is BOINC installed (single/shared/service)?
I can't recall, other than "not service". Is there a quick way to check?

What is the command in the Launch short-cut in the AutoStart folder if any?
I haven't changed anything to do with how it starts up from what the installer generated. If you mean the Startup folder: "C:\Program Files\BOINC\boincmgr.exe" /s

Is there a duplicate short-cut (the user and the all-user)?
I don't know. The user start menu folder definitely has one. The all users start menu folder hangs Explorer when I attempt to double-click it, preventing me from looking inside. |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
I think that if you look in Task Manager you can see who the owner of the BOINC.exe process is.
The folder to check for the shortcuts is something like this (translated from Italian): C:\Documents and Settings\Twisted0n3\Menu Start\Programs\Startup
If looking at the all-users folder causes your Explorer to hang, you've got other issues... you probably need to do a clean OS reinstall.
[Edit 2 times, last edit by Sekerob at Jun 26, 2007 8:23:07 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Task Manager shows a blank user name for everything except System Idle Process, which is owned by SYSTEM.
I was able to get a listing for that directory. No shortcut to BOINC under all users. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
*bump*
|
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
Can you look at the following files:
stderrdae.txt
stdoutdae.txt
and let us see what is in them? It might be useful to stop BOINC, move the existing files to another location, and then reboot your computer. That way we know that the contents of the files come from the current activity. |
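The "move the existing files to another location" step could be scripted roughly as below. This is a sketch under assumptions, not an official BOINC procedure: the helper name `archive_logs` and the backup folder naming are hypothetical, the data directory has to be supplied by the user, and BOINC should already be stopped before running it.

```python
import shutil
import time
from pathlib import Path


def archive_logs(data_dir, names=("stderrdae.txt", "stdoutdae.txt")):
    """Move the named log files from data_dir into a timestamped subfolder.

    Returns the list of new file locations. Files that do not exist are
    skipped, and the subfolder is only created if something is moved.
    """
    data_dir = Path(data_dir)
    # Hypothetical naming scheme for the backup folder.
    backup = data_dir / ("old_logs_" + time.strftime("%Y%m%d_%H%M%S"))
    moved = []
    for name in names:
        src = data_dir / name
        if src.exists():
            backup.mkdir(exist_ok=True)
            shutil.move(str(src), str(backup / name))
            moved.append(backup / name)
    return moved
```

After the next reboot, the freshly created stderrdae.txt and stdoutdae.txt then contain only output from that startup.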
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'll do this next reboot. If the problem happens on the subsequent startup I'll post the contents of the recreated files here. If it doesn't, it will have to wait until a later startup -- whichever is the next such occasion when the problem happens again. The first opportunity is probably in a few hours, as there's bad weather moving this way and I'll probably shut down for a while.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Well, it didn't happen this boot. I guess we're going to have to wait a while...
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
It just happened again. After a reboot, BOINC Manager failed to connect to the BOINC client.
Task Manager shows an HPF Rosetta task chewing up one core, so it actually started the client successfully but didn't connect. Manually selecting Advanced, Select computer..., localhost and letting it fill in the password field for me worked as usual. Why does it sometimes require two tries to connect to localhost?

stdoutdae yields:

[08/01/07 22:37:20] TRACE [3436]: Event: CTRL-LOGOFF Event
2007-08-01 22:37:21 [---] Exit requested by user
To pause/resume tasks hit CTRL-C, to exit hit CTRL-BREAK
2007-08-01 22:44:26 [---] Starting BOINC client version 5.8.16 for windows_intelx86
2007-08-01 22:44:26 [---] log flags: task, file_xfer, sched_ops
2007-08-01 22:44:26 [---] Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
2007-08-01 22:44:26 [---] Data directory: C:\Program Files\BOINC
2007-08-01 22:44:27 [---] Processor: 2 AuthenticAMD AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ [x86 Family 15 Model 35 Stepping 2] [fpu tsc pae nx sse sse2 3dnow mmx]
2007-08-01 22:44:27 [---] Memory: 1022.48 MB physical, 2.31 GB virtual
2007-08-01 22:44:27 [---] Disk: 224.87 GB total, 145.19 GB free
2007-08-01 22:44:28 [World Community Grid] URL: http://www.worldcommunitygrid.org/; Computer ID: 223584; location: (none); project prefs: default
2007-08-01 22:44:28 [---] General prefs: from World Community Grid (last modified 1969-12-31 19:00:01)
2007-08-01 22:44:28 [---] Host location: none
To pause/resume tasks hit CTRL-C, to exit hit CTRL-BREAK
2007-08-01 22:44:28 [---] General prefs: using your defaults
2007-08-01 22:44:48 [World Community Grid] Restarting task lf058_00007_1 using hpf2 version 518

This covers the period from the reboot until I had it working normally again. Nothing looks especially screwy here, except that the timestamp on the general prefs looks like the start of the epoch or something like that, as though it were internally reset to zero, and "Host location: none". However, these also appear on a reboot where BOINC behaves normally!
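The "start of the epoch" guess about the general prefs timestamp can be checked with a little arithmetic. The sketch below assumes the log prints local time on a machine whose standard-time offset is UTC-5 (the thread never states the time zone) and that the stored value is a Unix timestamp; the helper name `format_epoch` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def format_epoch(seconds, utc_offset_hours):
    """Render a Unix timestamp as local time at a fixed UTC offset."""
    utc_time = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=seconds)
    local = timezone(timedelta(hours=utc_offset_hours))
    return utc_time.astimezone(local).strftime("%Y-%m-%d %H:%M:%S")


# Assumption: UTC-5 is the machine's offset for a date in December.
print(format_epoch(1, -5))  # 1969-12-31 19:00:01
```

Under those assumptions, a stored value of 1 second reproduces the logged 1969-12-31 19:00:01 exactly, which fits the idea that the prefs modification time is effectively unset rather than corrupted.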
Indeed, the entire stdoutdae chunk from a normal reboot and the chunk from the just-occurred reboot where it failed to connect look identical up to "Restarting task ...", where the specific task details differ in the expected way.

stderrdae has only one item since the last time this occurred: "Another instance of BOINC is running". That was false: Task Manager showed, and still shows, only one each of boinc, boincmgr, and wcg_foobar; only one BOINC item is in Startup, a shortcut to BOINC Manager; and I did a normal (not "run as a service") install and didn't do anything weird to the install that could reasonably have broken anything.

System specs are mostly in the log chunk above; the only obviously salient thing missing there is that it's an XP SP2 machine. The inability to connect via loopback is very odd, since the machine's TCP stack is definitely functioning normally (as evidenced by my posting this successfully). |
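Comparing a "good" and a "bad" startup log by eye is error-prone because every line carries a different timestamp. A minimal sketch of a comparison that ignores the leading timestamps follows; the function names are hypothetical, and the regular expression assumes only the `YYYY-MM-DD HH:MM:SS` prefix seen in the log above.

```python
import difflib
import re

# Matches the leading "2007-08-01 22:44:26 " style prefix seen in stdoutdae.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")


def strip_timestamps(lines):
    """Drop the leading timestamp and trailing newline from each log line."""
    return [TIMESTAMP.sub("", line.rstrip("\n")) for line in lines]


def log_differences(good_lines, bad_lines):
    """Return (good_part, bad_part) pairs for every region that differs."""
    a = strip_timestamps(good_lines)
    b = strip_timestamps(bad_lines)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [(a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]
```

Feeding it the two stdoutdae chunks would be expected to return only the "Restarting task ..." lines if the startups really are otherwise identical; any other pair it returns would be a place to look more closely.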
||
|
|
|