Thread Status: Active
Total posts in this thread: 8
This topic has been viewed 973 times and has 7 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Problem after Kernel upgrade. [RESOLVED]

Hi.

As the heading says, I let Ubuntu update the kernel this morning and now HCC will not run; it stays at zero run time. It was fine yesterday. Other projects seem to be running fine, namely:

DOCKING
POEM
ROSETTA

I will have to abort the tasks I have now as they're not going to run. I will switch to FAAH, HCMD & HFCC to see if they run; here's hoping.

Sun 22 Aug 2010 07:50:48 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:50:48 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:50:48 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:51:29 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:51:29 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:51:29 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:52:11 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:52:11 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:52:11 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:52:51 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:52:51 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:52:51 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:53:33 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:53:33 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:53:33 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:54:15 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:54:15 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:54:15 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:54:57 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:54:57 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:54:57 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:55:38 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:55:38 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:55:38 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:55:40 EST|rosetta@home|Restarting task Rossmann2x3_abinitio_SAVE_ALL_OUT_design_f104_001_21727_1645_1 using minirosetta version 214
Sun 22 Aug 2010 07:56:20 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:56:20 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
----------------------------------------
[Edit 2 times, last edit by Former Member at Aug 23, 2010 7:43:04 AM]
[Aug 21, 2010 10:08:53 PM]
codes
Advanced Cruncher
Joined: Oct 20, 2009
Post Count: 142
Status: Offline
Re: Problem after Kernel upgrade.

I've processed at least one HCCC WU since the kernel upgrade and it went okay.

Before I installed the kernel upgrade, I suspended BOINC, installed the upgrade, restarted Ubuntu, then resumed BOINC. Is this what you did?
[Aug 22, 2010 2:54:46 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem after Kernel upgrade.

I've processed at least one HCCC WU since the kernel upgrade and it went okay.

Before I installed the kernel upgrade, I suspended BOINC, installed the upgrade, restarted Ubuntu, then resumed BOINC. Is this what you did?

Hi codes.

Yes, I ran the update after a cold start-up in the morning, before BOINC had been started, which I always do anyway.

I have a FAAH task & a CMD2 task running now and they both seem OK so far. I might try re-ticking HCC later to see how it goes.
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 22, 2010 6:55:52 AM]
[Aug 22, 2010 3:04:07 AM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3715
Status: Offline
Re: Problem after Kernel upgrade.

I ran this upgrade my usual way, i.e. without stopping anything non-critical, then rebooting when convenient.**
And I have had no particular problem either before or after rebooting.

** "when convenient" for this particular one was related to checkpoints of two CEP2 WUs which were running at the time.
[Aug 22, 2010 5:55:35 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Problem after Kernel upgrade.

Hi P.P.L.

I've done multiple automated kernel updates, and afterwards the rebuilding of the GRUB file took very long, sometimes a whole day [when BOINC is running too]. Your log indicates your system was extremely busy with something else, and BOINC does not like that [it's actually weak in that department]. So when you observe such restarting of tasks, it is best to stop BOINC in a terminal window with (for Lucid Lynx with the client provided as a Synaptic package):

sudo /etc/init.d/boinc-client stop

Then run the top command to find what is eating the CPU time, let it finish, and do a normal system restart to clean up.

Client 6.10.33 and up has a field

whilst processor usage is less than X %

With it, even if you've set BOINC to crunch 24/7 at 100%, you can tell it to pause when NON-BOINC processes cause a system load of X percent. The default is 25%, but tests have shown that is way too low and 75% is OK. You have to set the Activity menu to "Run based on preferences" for the processor prefs to take effect.
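
As a rough illustration of why the threshold matters, here is a minimal sketch of that rule. It is not the BOINC client's code; the function name and the sample load figure are made up for the example, and it simply assumes computing is allowed only while non-BOINC usage stays below the limit.

```python
def may_compute(non_boinc_cpu_percent: float, limit_percent: float) -> bool:
    """Hypothetical helper: computing is allowed only while the CPU usage
    of non-BOINC processes stays below the configured limit."""
    return non_boinc_cpu_percent < limit_percent


# Assumed example load: some other process is using 30% of the CPU.
other_load = 30.0
for limit in (25.0, 75.0):
    state = "keeps crunching" if may_compute(other_load, limit) else "pauses"
    print(f"limit {limit:.0f}%: BOINC {state}")
```

With the 25% default, a fairly modest background load is already enough to pause crunching, which fits the observation that 25% is too low for a machine that is also used for other things.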

BOINC, by the way, runs as a daemon/service before you sign in. You can configure it to delay the start of computing; I've set the delay to 45 seconds with a line in cc_config.xml (see the FAQ in the Start Here forum).
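
For anyone wanting to try the 45-second delay mentioned above, here is a small sketch that builds such a cc_config.xml. The post does not name the option, so treating it as `<start_delay>` inside `<options>` is an assumption to check against the FAQ; `build_cc_config` is just an illustrative helper, and where the file has to be saved depends on your install.

```python
import xml.etree.ElementTree as ET


def build_cc_config(start_delay_seconds: int) -> str:
    """Return cc_config.xml content that delays computing after client start.

    Assumption: the delay option is <start_delay> under <options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "start_delay").text = str(start_delay_seconds)
    return ET.tostring(root, encoding="unicode")


# 45 seconds is the value quoted in the post.
print(build_cc_config(45))
```

If you already have a cc_config.xml, add the line to the existing options section rather than overwriting the file, then restart the client so it is read.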

The moniker for Help Conquer Cancer is HCC. We also have HFCC, so using HCCC will lead to confusion.

PS: From the log it looks like HCC was the one running on the busy core. If that restart happens 100 times within a short period [I think an hour] without reaching a checkpoint, BOINC considers the job to be bad and aborts it with "too many exits". Typically these restarts happen about every 30 seconds, which is the interval at which the science app and the client confirm the science is still alive. If that communication is interfered with, you see that message.

Let us know.

edit: corrected as per Ingleside's post, below
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Aug 22, 2010 12:47:07 PM]
[Aug 22, 2010 6:18:53 AM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: Problem after Kernel upgrade.

PS: From the log it looks like HCC was the one running on the busy core. If that restart happens 100 times within a short period [I think an hour], BOINC considers the job to be bad and aborts it with "too many exits".

There's no time-period on the restarts, but instead it's 100 restarts without reaching next checkpoint.
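
To see how close a task is getting to that limit, one rough approach is to count the exit messages in the client log. The sketch below assumes the pipe-separated log format shown in the first post; the function names are made up for the example, and because the log excerpt does not show checkpoints, the count is only an upper bound on the restarts since the last one.

```python
from collections import Counter

EXIT_MARKER = "exited with zero status but no 'finished' file"
RESTART_LIMIT = 100  # the "too many exits" count quoted in this thread


def count_unfinished_exits(log_text: str) -> Counter:
    """Count 'no finished file' exits per task in a BOINC message log."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split("|", 2)  # timestamp | project | message
        if len(parts) != 3:
            continue
        message = parts[2].strip()
        if message.startswith("Task ") and message.endswith(EXIT_MARKER):
            task_name = message[len("Task "):-len(EXIT_MARKER)].strip()
            counts[task_name] += 1
    return counts


def restarts_left(exit_count: int) -> int:
    """Exits remaining before the limit, assuming no checkpoint in between."""
    return max(RESTART_LIMIT - exit_count, 0)


# Four lines copied from the log in the first post.
SAMPLE = """\
Sun 22 Aug 2010 07:50:48 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
Sun 22 Aug 2010 07:50:48 EST|World Community Grid|If this happens repeatedly you may need to reset the project.
Sun 22 Aug 2010 07:50:48 EST|World Community Grid|Restarting task X0000035030699200406290850_0 using hcc1 version 608
Sun 22 Aug 2010 07:51:29 EST|World Community Grid|Task X0000035030699200406290850_0 exited with zero status but no 'finished' file
"""

for task, exits in count_unfinished_exits(SAMPLE).items():
    print(task, exits, "exits,", restarts_left(exits), "left before the limit")
```

Run against a full log, this gives a quick idea of whether it is time to stop BOINC and look for the process hogging the CPU.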
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Aug 22, 2010 12:24:16 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Problem after Kernel upgrade.

Of course, I keep bringing up that hour... it's just the count as the "too many exits" limited text implies.

Let's see if we can convince Fred of BOINCTasks to add the counts to some column, given it's tallied somewhere (already shown for transfer attempts). That would tell the nervous cruncher if it's time to take serious measures before it's too late... not something you'd want to have on a job that's already been running into the 11th hour.
[Aug 22, 2010 12:45:25 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem after Kernel upgrade.

Hi.

I've run a couple of FAAH, HCMD2 & HCC tasks on this rig with no more problems & all validated.

Just one of those things, I guess.

Thanks.
[Aug 23, 2010 7:47:32 AM]