| World Community Grid Forums
Thread Status: Active Total posts in this thread: 10
|
|
| Author |
|
|
aegidius
Cruncher Joined: Aug 29, 2006 Post Count: 25 Status: Offline Project Badges:
|
Boinc agent freezes intermittently. There's no update of elapsed/remaining time for tasks, but currently running science tasks remain running. New tasks do not start when old ones finish, so if left, the system goes quiet. Restarting boincmgr fixes it. The system is responsive throughout, and RAM usage is normal (there's plenty of RAM)
Any idea what this could be? It has started happening only on one machine that worked fine for a long time, so I suspect hardware, but swaps of RAM don't seem to affect it. Windows, i5-2500 (4 cores/4 threads) 8GB RAM. WCG 7.14.3. |
||
|
|
Bill F
Advanced Cruncher USA Joined: Jan 16, 2008 Post Count: 53 Status: Offline Project Badges:
|
You are running the current version of WCG... are all of your fans clean and running OK?
Bill F |
||
|
|
aegidius
Cruncher Joined: Aug 29, 2006 Post Count: 25 Status: Offline Project Badges:
|
Yes all clean... it's probably not thermal (CPU runs at 50C when doing 4 threads) and it also happens with the case open. Suspect a flaky mobo though, everything else has been swapped with no change. Not sure what else I can pursue.
|
||
|
|
BobbyB
Veteran Cruncher Canada Joined: Apr 25, 2020 Post Count: 638 Status: Offline Project Badges:
|
The only thing I can think of is to let the tasks run out then uninstall Boinc, rename the boinc data folder in ProgramData (or wherever it is), then reinstall boinc from scratch from the WCG site.
Well, I can think of another thing: maybe it's the communication between the Boinc client (service) and BoincMgr. So...
Open a command prompt and navigate to the Boinc program folder.
For help:
boinccmd --help
To show GUI info:
boinccmd --get_simple_gui_info
It should show stuff and then the running tasks. The "fraction done" field displays a number for each task. Running the command again a little later should show a different fraction done. This demonstrates that communication with the Boinc client works.
I'm presuming you know how to do the command prompt thingy.
[Edit 3 times, last edit by BobbyB at Feb 15, 2021 5:34:49 PM] |
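The manual check above (run the command twice and compare "fraction done") could be scripted roughly as follows. This is only a sketch: it assumes boinccmd is on the PATH (or that you run it from the Boinc program folder), that each task is reported on a line of the form "fraction done: <number>", and the helper names and the 60-second wait are made up for illustration.

```python
import re
import subprocess
import time

# Assumed line format in the boinccmd output: "   fraction done: 0.123456"
FRACTION_RE = re.compile(r"^\s*fraction done:\s*([0-9]*\.?[0-9]+)\s*$", re.MULTILINE)


def parse_fractions(output: str) -> list[float]:
    """Extract every 'fraction done' value, in the order the tasks are listed."""
    return [float(value) for value in FRACTION_RE.findall(output)]


def progress_advanced(before: list[float], after: list[float]) -> bool:
    """True if at least one task reports more progress in the second sample."""
    return any(later > earlier for earlier, later in zip(before, after))


def read_fractions(boinccmd: str = "boinccmd") -> list[float]:
    """Run boinccmd once and return the reported fractions (hypothetical helper)."""
    result = subprocess.run(
        [boinccmd, "--get_simple_gui_info"],
        capture_output=True, text=True, check=True,
    )
    return parse_fractions(result.stdout)


if __name__ == "__main__":
    first = read_fractions()
    time.sleep(60)  # arbitrary wait between the two samples
    second = read_fractions()
    if progress_advanced(first, second):
        print("Client is reporting progress.")
    else:
        print("No progress reported; the client may be frozen.")
```

If the second sample shows no change while the science tasks are clearly still using CPU, that points at the client rather than at BoincMgr's display.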
||
|
|
aegidius
Cruncher Joined: Aug 29, 2006 Post Count: 25 Status: Offline Project Badges:
|
Just tried this. The fractions advance as expected when the agent display is updating normally. But after it freezes, the boinccmd displays the same fractions (they do not advance). Restarting boincmgr usually (though not 100%) brings the display back to something reasonable, until it freezes again.
Will try a fresh restart after cleaning out the appdata next. |
||
|
|
sam6861
Advanced Cruncher Joined: Mar 31, 2020 Post Count: 107 Status: Offline Project Badges:
|
Check that the date and time isn't frozen and isn't going backwards. Make sure time zone is set correctly.
When I set the computer clock back 1 hour, the BOINC tasks froze. The computer had an incorrect daylight saving time adjustment in Windows 10. One hour later, BOINC unfroze, the existing tasks jumped forward by a huge amount of percent progress, and everything started working again. |
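One way to catch a clock that stalls or goes backwards is to compare the wall clock against a monotonic clock, which is not affected by system clock changes. The sketch below is an illustration only, not anything BOINC itself does; the function names, the 5-second polling interval and the 2-second tolerance are assumptions.

```python
import time


def clock_jump(wall_start: float, mono_start: float,
               wall_now: float, mono_now: float) -> float:
    """Seconds the wall clock moved relative to the monotonic clock.

    Roughly 0 means the wall clock is ticking normally; a large negative
    value means it was set back, a large positive value means it was
    set forward.
    """
    return (wall_now - wall_start) - (mono_now - mono_start)


def watch(interval: float = 5.0, tolerance: float = 2.0) -> None:
    """Poll forever and report any jump larger than the tolerance."""
    wall_prev, mono_prev = time.time(), time.monotonic()
    while True:
        time.sleep(interval)
        wall_now, mono_now = time.time(), time.monotonic()
        jump = clock_jump(wall_prev, mono_prev, wall_now, mono_now)
        if abs(jump) > tolerance:
            print(f"Wall clock jumped by {jump:+.1f} s")
        wall_prev, mono_prev = wall_now, mono_now


if __name__ == "__main__":
    watch()
```

Leaving this running next to Boinc would show whether a freeze lines up with a clock jump.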
||
|
|
aegidius
Cruncher Joined: Aug 29, 2006 Post Count: 25 Status: Offline Project Badges:
|
Now we are on to something :-) I noticed that the time zone had got lost and the time had jumped back 12-15 hours. Not sure if that correlated with a freeze, but it was just after booting, so I suspected the mobo battery might be dead.
I replaced the battery, allowed the system to sync time, and resumed Boinc. It has been running well for 7 hours now with no freezes. I'll declare it cured if it goes overnight. Thanks for your suggestion sam6861. |
||
|
|
BobbyB
Veteran Cruncher Canada Joined: Apr 25, 2020 Post Count: 638 Status: Offline Project Badges:
|
The clock thing makes sense a bit.
The battery is there to keep the clock up to date when the power is off. When you reboot, the OS should sync with a time server and get the correct time. Since the system is up 24/7, the clock keeps ticking correctly.
I'll test this theory by setting the clock back a year or so on a test PC to see if it syncs on boot.
I have now tested this on a Win10 machine by setting the clock to 2001-02-17 00:00:32, and upon boot it had the right time. I could test with an Ubuntu machine, but I expect it to do the same.
[Edit 1 times, last edit by BobbyB at Feb 17, 2021 4:38:35 PM] |
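To check whether a machine's clock has actually synced, the local time can be compared against a time server with a minimal SNTP query. This is a rough sketch under assumptions: pool.ntp.org is used as an example server, UDP port 123 must be reachable, and the helper names are made up. It only reads the server's transmit timestamp and ignores network delay, so it is good to about a second, not more.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800


def transmit_unix_time(packet: bytes) -> int:
    """Read the whole-seconds transmit timestamp (bytes 40-43) as Unix time."""
    (ntp_seconds,) = struct.unpack("!I", packet[40:44])
    return ntp_seconds - NTP_UNIX_OFFSET


def query_server_time(host: str = "pool.ntp.org", timeout: float = 5.0) -> int:
    """Send one SNTP client request and return the server time as Unix seconds."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, version=3, mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (host, 123))
        packet, _ = sock.recvfrom(48)
    return transmit_unix_time(packet)


if __name__ == "__main__":
    offset = time.time() - query_server_time()
    print(f"Local clock is {offset:+.0f} s relative to the server")
```

An offset of many hours right after boot, shrinking to about zero later, would fit a clock that starts wrong and is corrected by a later sync.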
||
|
|
aegidius
Cruncher Joined: Aug 29, 2006 Post Count: 25 Status: Offline Project Badges:
|
What is strange is that the freezing never happened at boot, but at some random time later (half an hour or so). It must have lost time while powered up, which doesn't sound like the battery... but replacing the battery definitely fixed it.
EDIT: latest theory: the system booted up with the wrong time, Boinc started tasks, then a time sync happened some time later and confused Boinc.
[Edit 1 times, last edit by aegidius at Feb 18, 2021 12:42:57 AM] |
||
|
|
BobbyB
Veteran Cruncher Canada Joined: Apr 25, 2020 Post Count: 638 Status: Offline Project Badges:
|
Latest theory sounds plausible. I had been thinking about some kind of thing like that but could not put a finger on it.
I could test on a live WCG Win10 machine, but since the battery did it, I will leave it and attribute it to some conspiracy theory about an alien plot to invade Earth and eat our children for breakfast along with their Cheerios.
[Edit 1 times, last edit by BobbyB at Feb 18, 2021 4:20:46 PM] |