World Community Grid Forums
Category: Active Research Forum: Africa Rainfall Project Thread: ARP is not able to save real worktime if i restart my PC |
Thread Status: Active Total posts in this thread: 18
|
SaSiVa
Cruncher Joined: Aug 24, 2018 Post Count: 2 Status: Offline |
Hello, I have been running ARP for many hours, but every time I restart my PC the work unit loses much of the completed work: many hours of run time, and about 5-6% of the progress the work unit had made. Because of this, my machine can hardly finish the work; on most days, about half of the work done is not recognized by BOINC. I have set the checkpoint save interval to every 60 seconds, but that has no effect on the problem. I use Ubuntu MATE 20.04 LTS with the latest software and drivers, BOINC Manager 7.16.6, and wxWidgets version 3.0.4.
Thanks in advance |
||
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3295 Status: Offline Project Badges: |
ARP checkpoints once every 12.5% of progress.
To save your work, you should hibernate or suspend your computer. If that's not possible or desirable, you should de-select ARP and choose other projects.
----------------------------------------
AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
AMD Ryzen 7 7730U 8C/16T 3.0 GHz |
||
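Editor's note: the 12.5% checkpoint interval mentioned above explains why a restart can cost hours of work. The following is a minimal Python sketch of that arithmetic, not anything taken from BOINC or ARP itself. The function names are hypothetical, and the time estimate assumes progress is roughly linear in run time, which may not hold for every task.

```python
import math

# From the thread: ARP writes a checkpoint once every 12.5% of progress.
CHECKPOINT_INTERVAL = 0.125


def last_checkpoint(progress, interval=CHECKPOINT_INTERVAL):
    """Return the progress (0.0-1.0) of the latest checkpoint at or below `progress`."""
    # The small epsilon guards against float error when progress sits on a checkpoint.
    return math.floor(progress / interval + 1e-9) * interval


def progress_lost_on_restart(progress, interval=CHECKPOINT_INTERVAL):
    """Fraction of the whole task that has to be recomputed after a restart."""
    return max(0.0, progress - last_checkpoint(progress, interval))


def hours_lost_on_restart(progress, elapsed_hours, interval=CHECKPOINT_INTERVAL):
    """Rough run time lost, assuming progress grows linearly with elapsed time."""
    if progress <= 0.0:
        return 0.0
    return elapsed_hours * progress_lost_on_restart(progress, interval) / progress


if __name__ == "__main__":
    # Hypothetical task: 36% done after 20 hours, restarted without hibernating.
    print(last_checkpoint(0.36))
    print(hours_lost_on_restart(0.36, 20.0))
```

Under these assumptions, a task restarted shortly before a checkpoint loses close to a full 12.5% of its progress, while one restarted just after a checkpoint loses almost nothing.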
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12146 Status: Offline Project Badges: |
Depending on your machine, 'Sleep' might be the option.
Please also see my post under 'Work avaiable' earlier today.
Mike
----------------------------------------
[Edit 1 times, last edit by Mike.Gibson at Jul 21, 2021 3:14:34 PM] |
||
|
SaSiVa
Cruncher Joined: Aug 24, 2018 Post Count: 2 Status: Offline |
Thanks for the answer.
|
||
|
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 294 Status: Offline Project Badges: |
I have a similar problem, except the computer locks up and a power cycle is required. This occurs erratically and without advance warning.
The computer has been evaluated several times without resolution or comment beyond "get a newer operating system". A screensaver is active to provide a visible means of knowing whether it HAS locked up. Lack of network presence will also indicate it is locked up. I am more concerned with loss of work done (data processed) than I am with any loss of work unit credit. The machine runs WinXP Pro SP3 and is dedicated to WCG's ARP. Thanks.
Turns out the cause of the lockup was a defective video card. Once that was removed, there have been no further issues. Thanks for everyone's feedback.
----------------------------------------
[Edit 2 times, last edit by bfmorse at Aug 18, 2021 2:03:13 PM] |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12146 Status: Offline Project Badges: |
I see you have Windows XP. No problem with that, but presumably it is an 'old' machine? How many cores/threads do you have? You say it is dedicated to ARP, so presumably you are using all cores/threads on ARP.
There are many, many postings on this forum warning not to run more than half your machine on ARP because of its requirements. That could be why it is locking up. Cut its usage to half and run something else if there are spare cores/threads. You might then have a machine which doesn't lock up. No guarantee, but it is one possible answer to your main problem. Mike |
||
|
sam6861
Advanced Cruncher Joined: Mar 31, 2020 Post Count: 107 Status: Offline Project Badges: |
For me, the limited checkpointing at 12.5% intervals is not much of a problem, since my computer is able to stay up for 66 days without a restart.
"computer locks up and a power cycle is required"
This freezing is often a hardware problem: often an overheating CPU, failing RAM, or a failing HDD. Test your RAM with memtest86 or something similar.
My computer (Asus M5A97 R2.0, AMD FX-4100, 64-bit Debian Linux) had random freezes and ARP1 invalids, caused by old non-ECC DDR3. I got new DDR3 UDIMM ECC, 32GB at 1600 MT/s, and this fixed all my freezing and invalid task problems. The computer did log 1 corrected memory error after 66 days of uptime, and continues to work fine. Note: CPU, motherboard, and RAM must all support ECC to use it. |
||
|
nyanthiss
Cruncher Joined: Nov 23, 2012 Post Count: 15 Status: Offline Project Badges: |
So I'm guessing we should not expect the checkpoint interval to change anytime soon, if at all?
My problem is that I need to reboot into Windows every few days, and setting up hibernation on Ubuntu (where I run BOINC) would require a new separate swap partition, which means repartitioning the drive, which is not really an option.
----------------------------------------
Intel Xeon E3-1231 v3
AMD A10 7800
AMD Ryzen 5 3500U
AMD Ryzen 1700X
AMD Ryzen 5900X
2x RaspberryPi, 1x Odroid |
||
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3295 Status: Offline Project Badges: |
"So I'm guessing we should not expect the checkpoint interval to change anytime soon, if at all? My problem is that I need to reboot into Windows every few days, and setting up hibernation on Ubuntu (where I run BOINC) would require a new separate swap partition, which means repartitioning the drive, which is not really an option."
Because of the simulations ARP does, it's not possible to change the checkpoint interval.
----------------------------------------
AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
AMD Ryzen 7 7730U 8C/16T 3.0 GHz |
||
|
MJH333
Senior Cruncher England Joined: Apr 3, 2021 Post Count: 240 Status: Offline Project Badges: |
Hi nyanthiss,
From the discussions I have seen on the forum I think the answer to your first question is that there is no prospect of the checkpoint interval changing.
I don't reboot my Linux machines very often, but when I need to (e.g. following a Kernel update) I suspend each ARP task just after a checkpoint (e.g. just after 12.5%, 25% etc). That means that you lose only a tiny bit of processing time. Obviously this can take a bit of time, especially if you are running many threads and they are at varying stages of progression.
Whether that is practicable will depend on your own circumstances. You may find that, if you wait a short time, you can suspend some of the tasks just after a checkpoint but that there are others that have much longer to go. So if you have a limited time-window in which you can suspend tasks (or limited time to mess around with Boinc!), you might be able to save some of the progress, but not all.
If suspending tasks after checkpoints is not practicable in your circumstances, the options would seem to be either:
(1) shutdown as soon as you need to and live with the loss of processing time, or
(2) run projects that don't have this checkpoint limitation.
Cheers,
Mark |
||
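Editor's note: the "suspend each task just after a checkpoint" approach described above could be partly automated. The following is a rough Python sketch under stated assumptions, not a tested tool. The task names, the 0.5% window, and the helper names are hypothetical. The sketch assumes the first checkpoint is written at 12.5%, and that each task's fraction done and project URL are read by hand from BOINC Manager or from `boinccmd --get_tasks`. The only BOINC call relied on is the `boinccmd --task <project URL> <task name> suspend` form, and the sketch only builds that command rather than running it.

```python
# From the thread: ARP writes a checkpoint once every 12.5% of progress.
CHECKPOINT_INTERVAL = 0.125


def just_after_checkpoint(fraction_done, window=0.005, interval=CHECKPOINT_INTERVAL):
    """True if the task is at most `window` of progress past a checkpoint."""
    if fraction_done < interval:
        return False  # assumption: nothing has been checkpointed before 12.5%
    return (fraction_done % interval) <= window


def tasks_to_suspend(tasks, window=0.005):
    """Pick task names from a {name: fraction_done} dict that are safe to suspend now."""
    return [name for name, done in tasks.items() if just_after_checkpoint(done, window)]


def suspend_command(project_url, task_name):
    """Build (but do not run) the boinccmd call that suspends one task."""
    return ["boinccmd", "--task", project_url, task_name, "suspend"]


if __name__ == "__main__":
    # Hypothetical snapshot of three running ARP tasks and their fraction done.
    snapshot = {"arp_task_a": 0.252, "arp_task_b": 0.60, "arp_task_c": 0.03}
    # Placeholder URL: use the project URL that your own client reports.
    project_url = "https://example.invalid/"
    for name in tasks_to_suspend(snapshot):
        print(" ".join(suspend_command(project_url, name)))
```

Tasks that are not yet close to a checkpoint are left running, which matches the trade-off described in the post: with a limited time window, only some of the progress can be saved.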
|
|