Thread Status: Active
Total posts in this thread: 18
This topic has been viewed 3466 times and has 17 replies
Luke081515
Cruncher
Joined: Apr 16, 2017
Post Count: 8
Status: Offline
Re: BOINC has weird behaviour concerning returning results

Normally I would not worry about the expected runtime, but if I look at the results, the difference is clear. Before the crash, I had these statistics (just for this instance):
https://www.dropbox.com/s/ta46rhxccac3kjo/before.PNG?dl=0

After the crash, and still now, I get this: https://www.dropbox.com/s/xl6m89uy1101kkz/after.PNG?dl=0

There is no heavy load on the host: if I turn BOINC off, the load is less than 4 on xenial (the instance has 16 cores).
[May 26, 2017 9:18:46 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7846
Status: Offline

Please post a screenshot of task manager from the "processes" tab. That may tell us something.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[May 27, 2017 12:45:53 AM]
Luke081515
Cruncher
Joined: Apr 16, 2017
Post Count: 8
Status: Offline

I tried another approach too: apt-remove both, apt-purge, then install only the client and run it via boinccmd. I did this two days ago. It causes a lot of load, but the whole instance has returned no results since 02.06.17. Boinccmd only works fine on my second server, which has 4 cores and a constant load of 3.00.

I took two screenshots:

This is with boinc running (high load, since it is via boinccmd only; normally the load would be between 12 and 14):
https://www.dropbox.com/s/nti91q63mj3cjll/Boinc-running.PNG?dl=0

This screenshot was taken 15 minutes after suspending boinc: https://www.dropbox.com/s/4e53h76hk0m4qad/Boinc-paused.PNG?dl=0
[Jun 7, 2017 9:18:30 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

I see several anomalies in the first screenshot.

1. I count 17 WUs started on what looks like a 16-core machine.
2. Processor 10 is dead for some reason, which means only 15 can run.
3. /usr/bin/boinc is using too much CPU time for some reason, which will impact the other work units.

Second screenshot: there are about 5 tasks using 100% CPU, including /usr/bin/boinc (that task shouldn't use that much). That means when you resume, the BOINC tasks will be impacted by those other tasks, which extends the runtime.
----------------------------------------
[Edit 1 times, last edit by Doneske at Jun 8, 2017 2:04:35 AM]
[Jun 8, 2017 2:03:48 AM]
Luke081515
Cruncher
Joined: Apr 16, 2017
Post Count: 8
Status: Offline

Concerning:
1) When I ran boinc with the GUI, I configured it to run only 12 tasks at once; this returned no results either.
2) It's not really dead. If you look at htop, most of the time one CPU is idle, but not always the same one. There are also moments when all CPUs are at 100%.
3) Any idea how I can fix that? I have already tried reinstalling it several times.

Concerning screenshot two: That can happen for a second, but mostly the instance has nearly no load. Currently the load average is 0.03 0.07 0.02.
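
For comparing load figures like these against the core count, a minimal sketch follows. It assumes a Linux-style load average triple (1, 5 and 15 minutes); the helper name `load_per_core` is made up for illustration, and the 16 cores are taken from the first post.

```python
import os


def load_per_core(loadavg, cores):
    """Divide each load average value by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return tuple(load / cores for load in loadavg)


if __name__ == "__main__":
    # Figures quoted in this thread: a nearly idle instance with 16 cores.
    print(load_per_core((0.03, 0.07, 0.02), 16))
    # Live figures for the current host (os.getloadavg is Unix-only).
    print(load_per_core(os.getloadavg(), os.cpu_count() or 1))
```

A value near 1.0 per core means the cores are fully occupied. On that scale, a load of 12 to 14 on 16 cores is roughly 0.75 to 0.88 per core.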
[Jun 8, 2017 6:41:48 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

I'm still not sure there isn't something wrong with the machine. Your htop screenshots don't look like mine. I ran htop for 15 minutes and it never showed htop using 100% of the CPU. Do you know why the machine crashed? I had one issue with Linux where the 4.8 kernel didn't work well with one model of Xeon processor. It wouldn't dispatch work on the CPU correctly, but this was only noticeable because WCG tasks normally use 100% of the processors and they were only using about 30%. I just continued to run on the 4.4 kernel for a while. Finally, a firmware update came out along with the 4.8-52 version of the kernel. After that it was back to normal again. I suspect that somewhere in the earlier 4.8 kernel builds some firmware was dropped and then finally put back in the later builds. This was also about the time Intel was submitting updates for Kaby Lake. Since this problem has cropped up after a crash, I suspect there may be some residual effects somewhere.
[Jun 8, 2017 9:09:51 PM]
Luke081515
Cruncher
Joined: Apr 16, 2017
Post Count: 8
Status: Offline

Hm. I've reinstalled the server. Now boinc (via boinccmd) and the other processes produce a load of about 14, and it is probably returning the same number of results as before (I can't say for sure, since boinc has not been running for more than 24 hours yet). What strange behaviour...
[Jun 16, 2017 12:29:20 PM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline

Maybe SELinux or whatever security software is excessively quizzing the BOINC processes. BOINC needs localhost IP 127.0.1.1 and port 31416 (and an assorted number of other IP ports) to run its RPC (yes, local too). This includes the traffic between the core client/daemon and the BOINC Manager GUI. At any rate, I've had no BOINC issues on Ubuntu 14.04 and 16.04 LTS with the 4.11 kernel crowbarred in, but I'm not using any host firewalling (the configurable router firewall protects everything on the WLAN).

PS: On a well-configured system, the core client (boinc) takes but a few minutes a day to do its business.
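
To see whether anything is blocking the local RPC port mentioned above, a rough sketch of a plain TCP connection test could look like this. The port 31416 is the one named in this post; the loopback address 127.0.0.1 and the helper name `port_is_open` are assumptions for illustration (pass 127.0.1.1 instead if that is the address in use). It only checks that the port accepts connections and does not speak the BOINC RPC protocol.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or filtered: treat all of these as "not open".
        return False


if __name__ == "__main__":
    # Assumed address; the port is the one quoted in the post.
    print(port_is_open("127.0.0.1", 31416))
```

If this prints False while the client is running, a local firewall or security policy would be one thing to look at.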
----------------------------------------
[Edit 2 times, last edit by SekeRob* at Jun 16, 2017 2:24:58 PM]
[Jun 16, 2017 1:59:48 PM]