World Community Grid Forums
Thread Status: Active. Total posts in this thread: 15.
Former Member
Cruncher. Joined: May 22, 2018. Post Count: 0.
Several weeks ago I reinstalled one of my crunchers (I installed Ubuntu Server 20.04; 18.04 was installed before) and have had a problem since. The client always holds a buffer of 1000 WUs (which I believe is the maximum possible in any case), and there is no way to prevent this except not allowing new tasks altogether.

I first thought this might come from the reinstall and that the client needed some data on how long WUs run before it could properly limit the work to the buffer I set, but after weeks it is still the same. I set "Connect to network about every X days" in the device settings to 1.0 and "Cache extra days of work" to 0.1. There are four machines attached to this profile; three run fine and limit the work, only one always downloads 1000 WUs. Assigning the machine to other profiles and changing the work unit cache settings multiple times had no effect. Setting the values locally by changing the global_prefs_override.xml (<global_preferences> ...) had no effect either.

I don't know what to try next. Does anyone have an idea? Is this a bug, or am I missing something?
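For reference, a minimal global_prefs_override.xml with those limits would look something like the sketch below. It uses the standard BOINC work-buffer tags (work_buf_min_days maps to "Connect to network about every X days", work_buf_additional_days to "Cache extra days of work"); it is an illustration, not necessarily the exact file from the machine in question:

    <global_preferences>
       <work_buf_min_days>1.0</work_buf_min_days>
       <work_buf_additional_days>0.1</work_buf_additional_days>
    </global_preferences>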
Former Member
Cruncher. Joined: May 22, 2018. Post Count: 0.
Maybe the server thinks it's being contacted by a 256-thread machine ;o). The buffer setting is on a per-thread basis, and everything then gets multiplied by the number of allowed threads on the client, which you can fake to be anything you like (accidentally, of course). Check the local settings in the cc_config.xml, global_prefs and global_prefs_override files.
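For example, a leftover CPU override in cc_config.xml would look something like this sketch; <ncpus> forces the client to report that many CPUs (if I remember right, the default of -1 means use the detected count), so a stray large value here multiplies the work buffer accordingly:

    <cc_config>
       <options>
          <ncpus>256</ncpus>   <!-- makes the client behave as a 256-thread machine -->
       </options>
    </cc_config>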
----------------------------------------[Edit 1 times, last edit by Former Member at Aug 30, 2020 12:59:46 PM]
Former Member
Cruncher. Joined: May 22, 2018. Post Count: 0.
It is possible you did this already, but you mention setting global settings locally? The one that is often forgotten is local preferences. Clear those settings just to be sure the client is getting the global preferences and applying them.
Former Member
Cruncher. Joined: May 22, 2018. Post Count: 0.
Thanks for the input!
I checked whether I accidentally told the machine there are more threads than there really are, but I found nothing in cc_config, global_prefs or global_prefs_override. I believe the <ncpus> option in the cc_config would be the one that could be set incorrectly? I don't have this option in my cc_config. Here is an excerpt from client_state.xml, which seems to confirm BOINC got the number of threads correctly:

    <p_ncpus>16</p_ncpus>

About the local preferences: yes, I also set the intended value in the local preferences (global_prefs_override.xml), but it had no effect. I also reloaded it with boinccmd --read_global_prefs_override. Removing the file, which should cause the website preferences to be used, had no effect either. So no luck so far :(
Former Member
Cruncher. Joined: May 22, 2018. Post Count: 0.
Guess the next step is a little radical... a project reset, but before that, a benchmark rerun (e.g. via the Manager's "Run CPU benchmarks" or boinccmd --run_benchmarks). In client_state.xml you should find the tags below:

    <p_fpops>2547132783.235671</p_fpops>
    <p_iops>9195472079.934233</p_iops>

These are stored as operations per second, so divided by 10^6 they give the familiar benchmark figures: on mine this translates to a Whetstone (float) of 2547 and a Dhrystone (int) of 9195.
Former Member
Cruncher. Joined: May 22, 2018. Post Count: 0.
Ok, I did a benchmark rerun:

    4834 floating point MIPS   <-- about 20% more than my other similar machines
    53484 integer MIPS         <-- about 40% less than my other similar machines

This is a bit odd, since those machines are more or less identical to this one. Several more reruns gave very similar results. Not sure if that has something to do with the issue.

When the current WUs are finished, I will try a project reset. If that doesn't help, I may reinstall the OS. Somehow I suspect the OS is the reason, as Ubuntu 20.04 vs. 18.04 is the only real difference from all my other machines, and the issue appeared after installing 20.04.
PMH_UK
Veteran Cruncher, UK. Joined: Apr 26, 2007. Post Count: 786.
Could the MIPS difference be due to Meltdown and/or Spectre patch differences between systems?
----------------------------------------
Paul.
Former Member
Cruncher. Joined: May 22, 2018. Post Count: 0.
As soon as I set OPN to unlimited, I downloaded about 9200 work units. All 10 machines now sit at ~1000 work units each, and all are set to a 1-day buffer. It's definitely a client problem. I ran into this before while using the gianfranco repo, and as soon as I switched back to the standard Ubuntu repo for BOINC it stopped. It seems to have now migrated to other platforms as well: I'm seeing this on Ubuntu and CentOS Stream 8, with the 7.16.6 client on all machines. One machine has 256 hours of work (more than 10 days). Going to be lots of resends on the 10th.
----------------------------------------[Edit 2 times, last edit by Former Member at Sep 3, 2020 6:31:08 PM]
Falconet
Master Cruncher, Portugal. Joined: Mar 9, 2009. Post Count: 3315.
The difference in benchmarks is due to Ubuntu: very different values between 18.04 and 20.04.
----------------------------------------
- AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
- AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
- AMD Ryzen 7 7730U 8C/16T 3.0 GHz
Former Member
Cruncher. Joined: May 22, 2018. Post Count: 0.
Running work fetch debug:
Thu 03 Sep 2020 02:10:19 PM CDT |  | [work_fetch] ------- start work fetch state -------
Thu 03 Sep 2020 02:10:19 PM CDT |  | [work_fetch] target work buffer: 86400.00 + 0.00 sec   <=== one day work buffer set
Thu 03 Sep 2020 02:10:19 PM CDT |  | [work_fetch] --- project states ---
Thu 03 Sep 2020 02:10:19 PM CDT | MLC@Home | [work_fetch] REC 303.079 prio -0.000 can't request work: "no new tasks" requested via Manager
Thu 03 Sep 2020 02:10:19 PM CDT | PrimeGrid | [work_fetch] REC 521.750 prio -0.000 can't request work: "no new tasks" requested via Manager
Thu 03 Sep 2020 02:10:19 PM CDT | QuChemPedIA@home | [work_fetch] REC 734.543 prio -0.000 can't request work: "no new tasks" requested via Manager
Thu 03 Sep 2020 02:10:19 PM CDT | ralph@home | [work_fetch] REC 0.009 prio -1000.000 can't request work: "no new tasks" requested via Manager
Thu 03 Sep 2020 02:10:19 PM CDT | Rosetta@home | [work_fetch] REC 0.713 prio -1000.000 can't request work: "no new tasks" requested via Manager
Thu 03 Sep 2020 02:10:19 PM CDT | TN-Grid Platform | [work_fetch] REC 669.372 prio -0.000 can't request work: "no new tasks" requested via Manager
Thu 03 Sep 2020 02:10:19 PM CDT | World Community Grid | [work_fetch] REC 28257.385 prio -3.696 can't request work: too many runnable tasks
Thu 03 Sep 2020 02:10:19 PM CDT |  | [work_fetch] --- state for CPU ---
Thu 03 Sep 2020 02:10:19 PM CDT |  | [work_fetch] shortfall 1665735.91 nidle 9.00 saturated 0.00 busy 0.00   <=== don't know how it calculated this
Thu 03 Sep 2020 02:10:19 PM CDT | MLC@Home | [work_fetch] share 0.000
Thu 03 Sep 2020 02:10:19 PM CDT | PrimeGrid | [work_fetch] share 0.000
Thu 03 Sep 2020 02:10:19 PM CDT | QuChemPedIA@home | [work_fetch] share 0.000
Thu 03 Sep 2020 02:10:19 PM CDT | ralph@home | [work_fetch] share 0.000
Thu 03 Sep 2020 02:10:19 PM CDT | Rosetta@home | [work_fetch] share 0.000
Thu 03 Sep 2020 02:10:19 PM CDT | TN-Grid Platform | [work_fetch] share 0.000
Thu 03 Sep 2020 02:10:19 PM CDT | World Community Grid | [work_fetch] share 0.000
Thu 03 Sep 2020 02:10:19 PM CDT |  | [work_fetch] ------- end work fetch state -------
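In case anyone wants to pull the same trace on their own client: the output above comes from the work_fetch_debug log flag, which can be enabled in cc_config.xml roughly like this sketch (reload with boinccmd --read_cc_config or restart the client):

    <cc_config>
       <log_flags>
          <work_fetch_debug>1</work_fetch_debug>   <!-- logs the work fetch state dumps shown above -->
       </log_flags>
    </cc_config>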