Thread Status: Active
Total posts in this thread: 8
This topic has been viewed 9479 times and has 7 replies
jgis
Cruncher
Joined: Dec 31, 2006
Post Count: 32
Status: Offline
WU exceeding cpu time limit

On the 23rd I received an enormous WU for HPF2, which was estimated to take approx. 45 hours to complete on my computer, which has an Intel dual-core E8650 at 3 GHz under the hood.

23-Mar-2009 17:50:11 [World Community Grid] Starting mh743_00060_9

Then the strange thing happened: the WU was aborted due to a CPU time limit.

25-Mar-2009 16:43:47 [World Community Grid] Aborting task mh743_00060_9: exceeded CPU time limit 138399.108939
25-Mar-2009 16:43:52 [World Community Grid] Computation for task mh743_00060_9 finished
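For anyone who wants to pull the task name and the limit out of a message like the one above, here is a minimal sketch. It assumes the message format is exactly as quoted in this post and that the number is in seconds; the helper name `parse_abort` is made up for illustration and is not part of BOINC.

```python
import re

# Pattern for the "exceeded CPU time limit" message as quoted in this post.
# Assumption: the trailing number is the limit in seconds.
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded CPU time limit (?P<limit>[\d.]+)"
)

def parse_abort(line):
    """Return (task_name, limit_seconds), or None if the line is not an abort message."""
    match = ABORT_RE.search(line)
    if match is None:
        return None
    return match.group("task"), float(match.group("limit"))

line = ("25-Mar-2009 16:43:47 [World Community Grid] Aborting task "
        "mh743_00060_9: exceeded CPU time limit 138399.108939")
task, limit_s = parse_abort(line)
print(task, round(limit_s / 3600, 2), "hours")
```

Under that assumption the limit works out to a bit over 38 hours of CPU time.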


The result log:

<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
Maximum CPU time exceeded
</message>
]]>


I get no credit for all the CPU time used to reach this maximum value. What is this limit?
[Mar 25, 2009 7:08:50 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: WU exceeding cpu time limit

Hi,

I fear you had one of those endlessly looping HPF2 tasks, i.e. one that keeps eating CPU time but from some point on makes no progress (%). Had this been spotted, a simple BOINC restart would almost certainly have let it complete. Since it was not, the job ran through to its maximum allowed cycles, which is 8 times the task's internally set expected flops (other WCG sciences are set by default to time out after 10x the expected flops).
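To put rough numbers on that, here is a small sketch of the arithmetic. It assumes the limit is simply the time-out factor multiplied by the run time the task was expected to need on this host, and it uses the corrected HPF2 factor of 8. The function names are made up for illustration; the only measured value is the limit from the abort message earlier in the thread.

```python
def cpu_time_limit(expected_seconds, factor):
    """CPU time limit, assuming limit = factor * expected run time."""
    return expected_seconds * factor

def implied_expected(limit_seconds, factor):
    """Invert the relation above: expected run time implied by a known limit."""
    return limit_seconds / factor

LIMIT_S = 138399.108939   # from the "exceeded CPU time limit" message
HPF2_FACTOR = 8           # HPF2 time-out factor (assumed unchanged)
DEFAULT_FACTOR = 10       # factor for the other WCG sciences

expected_s = implied_expected(LIMIT_S, HPF2_FACTOR)
print("implied expected run time: %.2f hours" % (expected_s / 3600))
print("limit with factor 10 would be: %.2f hours"
      % (cpu_time_limit(expected_s, DEFAULT_FACTOR) / 3600))
```

Under these assumptions the task was expected to need roughly 4.8 hours, so a run that is still going after 38 hours is well outside anything normal.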

Anyway, look at the Result Log on the Result Status page and also at the quorum detail. The latter will tell you if others had a problem with your task, but I doubt it.

As for the estimated run time, was that before it started? BOINC occasionally gets completely confused and gives bogus run times many times greater than they should be.

As for credit, I've got to disappoint you. Error results are granted credit only by great exception.

edit: correct HPF2 time out factor
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Mar 26, 2009 11:23:50 AM]
[Mar 25, 2009 8:04:28 PM]
jgis
Cruncher
Joined: Dec 31, 2006
Post Count: 32
Status: Offline
Re: WU exceeding cpu time limit

The 45 hour estimated runtime was my own estimate, made after 20 hours of run time with 25 hours left to completion. At that point the progress bar was at approx. 26%.
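As a cross-check on those figures, here is a simple straight-line extrapolation from the progress bar. This is only a sketch: it assumes progress is linear in run time, which a looping task clearly breaks, and the function name is made up for illustration.

```python
def linear_total_hours(elapsed_hours, progress_fraction):
    """Total run time implied by linear progress (an assumption, not BOINC's estimator)."""
    if not 0 < progress_fraction <= 1:
        raise ValueError("progress_fraction must be in (0, 1]")
    return elapsed_hours / progress_fraction

elapsed_h = 20.0                   # run time so far, from the post
progress = 0.26                    # progress bar reading, from the post
limit_h = 138399.108939 / 3600     # CPU time limit from the abort message

total_h = linear_total_hours(elapsed_h, progress)
print("linear estimate of total run time: %.1f hours" % total_h)
print("remaining on that estimate: %.1f hours" % (total_h - elapsed_h))
print("exceeds the CPU time limit:", total_h > limit_h)
```

On a straight-line reading the task would have needed far longer than 45 hours, and more than the CPU time limit allows, so it would have been cut off either way.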
Result page:


Result Name: mh743_ 00060_ 9--
Device Name: Notandi-PC
Status: Error
Sent Time: 22.3.2009 11:41:24
Time Due / Return Time: 25.3.2009 16:45:08
CPU Time (hours): 0.00
Claimed / Granted BOINC Credit: 0.0 / 0.0

[Mar 25, 2009 10:34:00 PM]
GIBA
Ace Cruncher
Joined: Apr 25, 2005
Post Count: 5374
Status: Offline
Re: WU exceeding cpu time limit

jgis,

I hope you are luckier than I was.

So far I have picked up 3 of these bad WUs. In my case, after trying everything I could to get them progressing again, I had to abort them all, and was left with the real feeling of having wasted a lot of hours on them.

Take a look at your WUs after restarting BOINC, but also take a detailed look at the other replicas of these WUs, since that can tell you what happened to other colleagues crunching the same WUs as you.

Anyway, good luck to you.
----------------------------------------
Cheers! GIB@
Join BRASIL - BRAZIL@GRID team and be very happy !
http://www.worldcommunitygrid.org/team/viewTeamInfo.do?teamId=DF99KT5DN1

[Mar 25, 2009 11:57:58 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: WU exceeding cpu time limit

jgis, the problem is explained here: http://wcg.wikia.com/wiki/Maximum_CPU_time_exceeded
[Mar 26, 2009 6:11:57 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: WU exceeding cpu time limit

The line missing in your wiki, Didactylos, is that any amending of the X factor should be done under guidance of the WCG techs. At any rate, this should be reported, so corrective action at the server side can be taken.

A backup of client_state.xml should be taken first. If that file gets corrupted, all is lost.
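A minimal sketch of taking such a backup follows. It assumes client_state.xml sits directly in the BOINC data directory and that the client is not writing the file at that moment. The directory is passed in because its location varies by install, and the function name is made up for illustration.

```python
import shutil
import time
from pathlib import Path

def backup_client_state(data_dir):
    """Copy client_state.xml to a timestamped .bak file in the same directory.

    data_dir is the BOINC data directory (its location is install-specific).
    Returns the path of the backup copy.
    """
    src = Path(data_dir) / "client_state.xml"
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name("client_state.xml.%s.bak" % stamp)
    shutil.copy2(src, dst)  # copy2 also preserves the file's timestamps
    return dst
```

Calling `backup_client_state(...)` with your own data directory before touching the file gives you something to restore from if an edit goes wrong.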

Also, as I said, the factor for HPF2 is 8x, unless this was changed. All others are 10, afaik.

For the looping of HPF2, no changing is required, just a client restart. There is an FAQ on how to do this without needing to exit the client.

Edit: HPF2 has a time out factor of 8.
[Edit 1 times, last edit by Sekerob at Mar 26, 2009 11:23:00 AM]
[Mar 26, 2009 7:32:51 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: WU exceeding cpu time limit

We are both wrong.
[Mar 26, 2009 7:59:47 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: WU exceeding cpu time limit

I've had 3 WUs running for around 30h before erroring on a computer I only check like once a month, one result is still visible:

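Result Name: mq059_ 00013_ 7--
Device Name: mycomputername
Status: Error
Sent Time: 29.06.09 13:58:43
Time Due / Return Time: 01.07.09 09:35:35
CPU Time (hours): 30.04
Claimed / Granted BOINC Credit: 654.5 / 0.0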
[Jul 4, 2009 6:41:42 PM]