Total posts in this thread: 36
This topic has been viewed 7140 times and has 35 replies
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: WU at 111,111% and still running

Why, Philip-in-hongkong, when you can read above that these jobs can run up to 12 hours? The percent progress you can happily ignore, since we know it can go bonkers on HCMD2 (seen as high as 900%) and still finish properly, showing 100% in the last seconds.

What client version is this, and was the job interrupted, paused, or pre-empted in between?
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at May 12, 2010 1:26:00 PM]
[May 12, 2010 1:20:29 PM]
philip-in-hongkong
Cruncher
Joined: Jun 22, 2006
Post Count: 3
Status: Offline
Re: WU at 111,111% and still running

My BOINC client is 6.10.18. I run multiple projects and the PC does not run 24/7, hence the job has been paused and restarted. In view of your advice, I have resumed it and will see whether it finishes properly.

Cheers,
Philip
[May 12, 2010 2:13:00 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: WU at 111,111% and still running

OK, please let us know the outcome, so we can build some confidence.

As an aside, since client version 6.10.xx, even the late alpha client 6.10.48 can lose count of elapsed time and thus report a wrong percentage on resume, but so far I have found that the CPU time is unwaveringly correct. That bit is controlled by the science application at WCG.
[May 12, 2010 2:21:47 PM]
philip-in-hongkong
Cruncher
Joined: Jun 22, 2006
Post Count: 3
Status: Offline
Re: WU at 111,111% and still running

Regrettably, it errored out - max runtime exceeded.


<core_client_version>6.10.18</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
called boinc_finish
Finishing early because max runtime has been exceeded.43246.984375
called boinc_finish

</stderr_txt>
]]>
[May 14, 2010 12:57:52 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: WU at 111,111% and still running

philip,

That's not an indication of an error. That is 12:01 hours, the hard-coded end when not all positions in a task have been completed. What does the status field say on the page where you clicked to get this log report: Pending Validation, Inconclusive, Invalid, or Error?
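
As a quick check of that figure, the number at the end of the "max runtime has been exceeded" line can be converted to hours. This is only a minimal sketch: reading the value as elapsed seconds is an assumption that follows the interpretation above, and `seconds_to_hms` is an illustrative helper name, not anything from BOINC.

```python
def seconds_to_hms(seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS (fractional seconds dropped)."""
    whole = int(seconds)
    hours, rem = divmod(whole, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Value copied from the stderr log; assumed to be seconds.
runtime = 43246.984375

print(seconds_to_hms(runtime))   # 12:00:46
print(round(runtime / 3600, 2))  # 12.01
```

So the task stopped about 47 seconds past the 12-hour mark, which is 12.01 hours in decimal terms.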
[May 14, 2010 1:21:44 PM]
bieberj
Senior Cruncher
United States
Joined: Dec 2, 2004
Post Count: 406
Status: Offline
Re: WU at 111,111% and still running

Saw this happen yesterday - ran over 100% and worked its way beyond 250%. It finished but resulted in Error. :( I did check the messages but didn't see anything that indicated that something was wrong with the work unit. I am using client 6.2.28.



Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_2-- | - | In Progress | 6/18/10 03:19:41 | 6/22/10 03:19:41 | 0.00 | 0.0 / 0.0
CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_0-- | 614 | Inconclusive | 6/16/10 15:49:06 | 6/17/10 09:55:49 | 4.72 | 43.1 / 0.0
CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1-- | 614 | Error | 6/16/10 15:44:24 | 6/18/10 03:16:56 | 10.89 | 180.5 / 0.0
----------------------------------------
[Edit 2 times, last edit by bieberj at Jun 18, 2010 11:48:46 AM]
[Jun 18, 2010 11:46:09 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: WU at 111,111% and still running

Four places to check whenever a result turns to Error:

1. The BOINC Manager active message log.
2. If the messages are older, the stdoutdae.txt file, and if older still, the stdoutdae.old file, for the time the result was reported (bearing in mind that the website shows UTC time!)
3. The stderrdae.txt file (might have additional clues about the client)
4. The error link on the Result Status page, which provides the actual task log.
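
For the log files in items 2 and 3, a small script can pull out every line that mentions a given task. This is only a rough sketch: the three file names come from the list above, but the location of the BOINC data directory differs per machine and must be supplied by you, and `find_task_lines` is a made-up helper name.

```python
from pathlib import Path

# File names as listed above; the directory holding them is machine-specific.
LOG_NAMES = ("stdoutdae.txt", "stdoutdae.old", "stderrdae.txt")


def find_task_lines(data_dir, task_name):
    """Return (file name, line) pairs from the client logs that mention task_name."""
    hits = []
    for name in LOG_NAMES:
        path = Path(data_dir) / name
        if not path.is_file():
            continue  # a given log may not exist on this machine
        with path.open(encoding="utf-8", errors="replace") as fh:
            for line in fh:
                if task_name in line:
                    hits.append((name, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_task.py <BOINC data directory> <task name>
    if len(sys.argv) == 3:
        for name, line in find_task_lines(sys.argv[1], sys.argv[2]):
            print(f"{name}: {line}")
```

Remember that the timestamps in these files are local time, while the website shows UTC.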
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jun 18, 2010 12:28:59 PM]
[Jun 18, 2010 12:28:05 PM]
bieberj
Senior Cruncher
United States
Joined: Dec 2, 2004
Post Count: 406
Status: Offline
Re: WU at 111,111% and still running

Sek,

My computer got rebooted yesterday afternoon so the message log details have been lost. I was unable to find the stdoutdae.txt or stderrdae.txt when using the search tool.

And the error link shows the following (nothing useful as far as I can tell):


Result Log

Result Name: CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1--
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
called boinc_finish
called boinc_finish

</stderr_txt>
]]>
[Jun 18, 2010 6:34:11 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: WU at 111,111% and still running

The BOINC data directory (printed in the message log when the client starts) can be hidden when browsing in File Explorer or searching, but you can still type the path to get there. Two "called boinc_finish" lines in the log likely indicate that the result file got corrupted and no good resume was possible, so the stdxxxdae.txt files might tell more about what was going on.
[Jun 18, 2010 7:42:32 PM]
bieberj
Senior Cruncher
United States
Joined: Dec 2, 2004
Post Count: 406
Status: Offline
Re: WU at 111,111% and still running

Here's the snippet that Slickedit was able to find:

16-Jun-2010 23:44:38 [World Community Grid] Starting CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1
16-Jun-2010 23:44:38 [World Community Grid] Starting task CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1 using hcmd2 version 614
17-Jun-2010 03:45:51 [World Community Grid] Restarting task CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1 using hcmd2 version 614
17-Jun-2010 13:41:51 [World Community Grid] Computation for task CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1 finished
17-Jun-2010 13:41:53 [World Community Grid] Started upload of CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1_0
17-Jun-2010 13:42:09 [World Community Grid] Finished upload of CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1_0
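
To compare the wall-clock span of this run with the reported CPU time, the leading timestamps of the "Starting task" and "finished" lines can be subtracted. This is only a sketch: it assumes every line starts with a 20-character `DD-Mon-YYYY HH:MM:SS` stamp as in the snippet above and that the month abbreviation matches the running locale (English here); `parse_stamp` is an illustrative name.

```python
from datetime import datetime


def parse_stamp(line: str) -> datetime:
    """Parse the leading 'DD-Mon-YYYY HH:MM:SS' timestamp of a client log line."""
    return datetime.strptime(line[:20], "%d-%b-%Y %H:%M:%S")


start_line = (
    "16-Jun-2010 23:44:38 [World Community Grid] Starting task "
    "CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1 "
    "using hcmd2 version 614"
)
finish_line = (
    "17-Jun-2010 13:41:51 [World Community Grid] Computation for task "
    "CMD2_0538-2NRU_A.clustersOccur-3CMQ_A.clustersOccur_23_154107_155545_1 finished"
)

elapsed = parse_stamp(finish_line) - parse_stamp(start_line)
print(elapsed)                                    # 13:57:13
print(round(elapsed.total_seconds() / 3600, 2))   # 13.95
```

That is roughly 13.95 hours of wall-clock time from start to finish, against the 10.89 CPU hours shown on the Result Status page. The wall-clock figure includes the restart at 03:45:51 and any time the task was not actually running.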
[Jun 18, 2010 10:46:06 PM]