Thread Status: Active
Total posts in this thread: 8
This topic has been viewed 1950 times and has 7 replies
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7846
Status: Offline
Strange error - Maximum run time exceeded

I believe this is the first error I have encountered on this project. It tells me the maximum time allowed was exceeded, yet the other two completed units ran even longer.

Result Name                              | App Version Number | Status               | Sent Time        | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit
OET1_ 0000331_ xEBGP-OM_ rig_ 0773_ 3--  | -                  | In Progress          | 2/15/15 12:02:30 | 2/19/15 00:02:30       | 0.00                            | 0.0 / 0.0
OET1_ 0000331_ xEBGP-OM_ rig_ 0773_ 2--  | 719                | Error                | 2/15/15 03:08:30 | 2/15/15 12:02:25       | 4.41                            | 51.6 / 0.0  <= Mine
OET1_ 0000331_ xEBGP-OM_ rig_ 0773_ 1--  | 719                | Pending Verification | 2/11/15 13:08:55 | 2/15/15 03:08:22       | 16.88                           | 199.1 / 0.0
OET1_ 0000331_ xEBGP-OM_ rig_ 0773_ 0--  | 719                | Pending Verification | 2/11/15 13:08:49 | 2/12/15 19:36:25       | 7.90                            | 199.6 / 0.0

Here is the result file:


Result Log

Result Name: OET1_ 0000331_ xEBGP-OM_ rig_ 0773_ 2--
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[01:23:05] Number of tasks = 1
[01:23:05] Running task 0,CPU time at start of task 0 was 0.000000
[01:23:05] ./ZINC00057319_1.pdbqt size = 28 10 ../../projects/www.worldcommunitygrid.org/oet1.xEBGP-OM_rig.pdbqt size = 2391 0

</stderr_txt>

Result Log

Result Name: OET1_ 0000331_ xEBGP-OM_ rig_ 0773_ 1--
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[03:30:43] Number of tasks = 1
[03:30:43] Running task 0,CPU time at start of task 0 was 0.000000
[03:30:43] ./ZINC00057319_1.pdbqt size = 28 10 ../../projects/www.worldcommunitygrid.org/oet1.xEBGP-OM_rig.pdbqt size = 2391 0
[21:09:22] Finished task #0 cpu time used 60750.332658
21:09:22 (21897): called boinc_finish

</stderr_txt>


Result Log

Result Name: OET1_ 0000331_ xEBGP-OM_ rig_ 0773_ 0--
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[10:46:56] Number of tasks = 1
[10:46:56] Running task 0,CPU time at start of task 0 was 0.000000
[10:46:56] ./ZINC00057319_1.pdbqt size = 28 10 ../../projects/www.worldcommunitygrid.org/oet1.xEBGP-OM_rig.pdbqt size = 2391 0
[18:46:31] Finished task #0 cpu time used 28452.830862
18:46:31 (21589): called boinc_finish

</stderr_txt>

The CPU time/elapsed time in my results status is listed as 4.41/4.58.

System is a Core2Duo at 2.66 GHz running Linux Mint.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 1 times, last edit by Sgt.Joe at Feb 15, 2015 1:29:37 PM]
[Feb 15, 2015 1:27:15 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Strange error - Maximum run time exceeded

If the device benchmark is optimistic compared to how the actual task is processed [integer/float/both], you can end up being allowed less time than your wingman.

Linux? I've always had a question mark over the Dhrystone integer benchmark coming out 2, 3 or 4 times higher than on Windows, whereas the Whetstone float results are mostly closely aligned.
[Feb 15, 2015 2:04:22 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Strange error - Maximum run time exceeded

The current restriction on a 462 task being queued is shown below; the bound value is 40 times the current average estimate [all platforms].

<rsc_fpops_est>20063117540850.000000</rsc_fpops_est>
<rsc_fpops_bound>802524701634000.000000</rsc_fpops_bound>

The higher the benchmark, the shorter the maximum allowed run time becomes.
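
A rough sketch of that relationship, using the two header values quoted above. It assumes, as a simplification, that the time limit is the bound divided by the host's rated speed in FLOP/s; the function name and the two benchmark ratings are hypothetical and were not measured on any host in this thread.

```python
# Values quoted above from the task header.
RSC_FPOPS_EST = 20063117540850.0
RSC_FPOPS_BOUND = 802524701634000.0


def hours_allowed(fpops, host_flops):
    """Hours a host rated at host_flops (FLOP/s) would need for fpops operations."""
    return fpops / host_flops / 3600.0


print("bound / estimate =", RSC_FPOPS_BOUND / RSC_FPOPS_EST)

# Hypothetical benchmark ratings, chosen only to show the trend.
for host_flops in (2.5e9, 5.0e9):
    estimate = hours_allowed(RSC_FPOPS_EST, host_flops)
    limit = hours_allowed(RSC_FPOPS_BOUND, host_flops)
    print(f"{host_flops / 1e9:.1f} GFLOPS: estimate {estimate:.2f} h, limit {limit:.2f} h")
```

Doubling the rating halves both the estimate and the limit, which is why an optimistic benchmark leaves less room before the task is aborted.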
[Feb 15, 2015 2:13:02 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Strange error - Maximum run time exceeded

An old knreed post on what can be done if you fear the time allowed is not enough:
http://www.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=178898

Of course, re-benchmarking might bring an issue to light, such as the test having been done at, e.g., 3.6 GHz while the device is actually running at 2.4 GHz. That cuts the allowed time by 33%.
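
The 33% figure can be checked with a couple of lines. This sketch assumes the benchmark score scales linearly with clock speed, which is only an approximation; the function name is made up for the example.

```python
def allowed_time_fraction(benchmark_ghz, actual_ghz):
    """Fraction of the fair time limit that remains when the benchmark
    was taken at a higher clock than the host actually runs at.

    Assumes the benchmark score scales linearly with clock speed.
    """
    return actual_ghz / benchmark_ghz


fraction = allowed_time_fraction(benchmark_ghz=3.6, actual_ghz=2.4)
cut_percent = (1.0 - fraction) * 100.0
print(f"allowed time is cut by about {cut_percent:.0f}%")
```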

One note: some output files also grow over time, and there's an upload file size restriction as well, just in case.
[Feb 15, 2015 2:25:44 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7846
Status: Offline
Re: Strange error - Maximum run time exceeded

Current restriction on a 462 task being queued is, where the bound value is 40 times the current average estimate [all platforms].


OK, thanks. This would make sense, since I had a whole slug (more than 150) of real shorties before this one. The current average estimate must have been under the 10-minute range; if my math is right, it would have been about 7 minutes. Unfortunate, but the third wingman should take care of the validation process.
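
The arithmetic behind the 7-minute figure, as a quick check. It assumes the task was stopped right at the limit, so that the 4.58 hours elapsed equals 40 times the estimate; the variable names are only illustrative.

```python
BOUND_FACTOR = 40      # bound is 40 times the estimate, per the post quoted above
elapsed_hours = 4.58   # elapsed time reported for the errored task

# If the limit was 40x the estimate, the estimate is the elapsed time / 40.
implied_estimate_minutes = elapsed_hours * 60.0 / BOUND_FACTOR
print(f"implied estimate: {implied_estimate_minutes:.2f} minutes")
```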

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Feb 15, 2015 4:18:48 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Strange error - Maximum run time exceeded

What I'd like to emphasize, and I've tried this before, is that client v7 does not do any runtime adaptation; it does not matter whether it has run 150 tasks or 10. The projection entirely follows the server average stored in <rsc_fpops_est> and the fairly static benchmark values, with whatever lag there is between generation, distribution and return.

After a wave of short work, any longer new work quickly gets an adapted FPOPS value in its header. Conversely, after a wave of long work, the adaptation in the header is slow to respond, so short work will at first arrive with long-work estimates. Another main snag in this story is how much work is outstanding among the total pool of clients, and how deep the cache is in each host's buffer [which in turn affects the credit story at the individual host level at different moments in the evolution].

As uplinger noted, he's upped the per-core IP to 35 from 25, because when a period of shorts flips to long, unlimited caching could lead to enormous overbuffering: the HCMD2 experience revisited. Also, as DCF is locked to 1.000000, the cache responds less aggressively to variation and simply follows the server runtime means. With a functioning DCF in client v6 and earlier, a runtime reduction was slow to cause a response in the DCF, whereas with a runtime increase the DCF would react very quickly [overbuffer protection by design].

If you want the active DCF, downgrade to v6, but then it is even more strongly advisable not to keep a multi-day buffer. That does not change the risk of exceeding the maximum time. If the benchmark is too optimistic, the host is more at risk when runtimes vary. With an active DCF that would have been countered quickly, which is much less the case with v7.
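
The asymmetry described above for the older DCF (quick to rise, slow to fall) can be illustrated with a toy model. This is not the BOINC client's code: the function name `update_dcf`, the jump-up rule and the 0.1 smoothing weight are all assumptions made for the sake of the example.

```python
def update_dcf(dcf, actual_seconds, estimated_seconds, slow_weight=0.1):
    """Return an updated duration correction factor (DCF).

    Illustrative only: the asymmetric rule and slow_weight are assumptions,
    not the actual client implementation.
    """
    ratio = actual_seconds / estimated_seconds
    if ratio > dcf:
        # Task ran longer than the corrected estimate: jump up at once.
        return ratio
    # Task ran shorter than expected: drift down slowly.
    return (1.0 - slow_weight) * dcf + slow_weight * ratio


dcf = 1.0

# One task that takes four times its uncorrected estimate.
dcf = update_dcf(dcf, actual_seconds=4 * 3600, estimated_seconds=3600)
after_long = dcf

# Five tasks that match their uncorrected estimate exactly.
for _ in range(5):
    dcf = update_dcf(dcf, actual_seconds=3600, estimated_seconds=3600)
after_shorts = dcf

print(f"after one long task: {after_long:.2f}")
print(f"after five on-estimate tasks: {after_shorts:.2f}")
```

In this model a single long task raises the factor immediately, while several on-estimate tasks only pull it part of the way back, which is the overbuffer protection the post refers to. With the factor locked to 1.000000, neither movement happens.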

Sorry if this is causing confusion; no, it is not simple at all.
[Feb 16, 2015 5:13:44 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7846
Status: Offline
Re: Strange error - Maximum run time exceeded

It is an interesting discussion. There is no perfect feedback mechanism due to randomness introduced into the system by an unknown number of variables. If a person was well versed in chaos theory they might be able to buffer the feedback loops sufficiently to mitigate the swings. That would be a way deeper understanding than I possess. Suffice it to say the situation I encountered is most probably quite uncommon.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Feb 16, 2015 6:05:16 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Strange error - Maximum run time exceeded

The undersigned is actually quite good at chaos theory: the practice of it, that is. [laughing]
[Feb 16, 2015 6:15:07 PM]