Thread Status: Active
Total posts in this thread: 119
This topic has been viewed 13096 times and has 118 replies
gb077492
Advanced Cruncher
Joined: Dec 24, 2004
Post Count: 96
Status: Offline
Re: DDDT2 Wu Failures of

Hi Sek,

I checked the logs and the PV wingman shows:

Result Name: ts02_ b483_ sqb000_ 3--
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
called boinc_finish

</stderr_txt>
]]>

but one of the erroring wingmen shows:

Result Name: ts02_ b483_ sqb000_ 2--
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
Maximum CPU time exceeded
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
Copying wcgrestart.rst
Copying wcgrestart.rst
Copying wcgrestart.rst
Copying wcgrestart.rst
Copying wcgrestart.rst
Copying wcgrestart.rst


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7C90120E

The other just shows

Result Name: ts02_ b483_ sqb000_ 0--
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
Maximum CPU time exceeded
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x75AD22A1

A strange mix, but the "Copying wcgrestart.rst" suggests you're on the right track.

Thanks for your comments. I do have LAIM on. I'll bear your restart trick in mind if I do see this again (and my memory is good enough).

Mike
[Feb 1, 2011 12:33:08 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: DDDT2 Wu Failures of

Hi gb077492

Caught one more "pr" in the act yesterday. At 4.5 hours, progress was only 10.2%, where normally this rig finishes them in just over 5 hours. After applying "the old trick" described above, the job resumed at 9.5% and stood at 39% at 5.5 hours, so I decided it was okay and went for shuteye. This morning the result showed as valid:

ts02_ b481_ pr23a1_ 1-- 617 Valid 1/29/11 06:43:47 1/29/11 19:23:26 4.26 89.0 / 163.0
ts02_ b481_ pr23a1_ 0-- 617 Valid 1/29/11 06:43:41 2/2/11 02:58:33 8.99 163.0 / 163.0 < moi

The log shows the (re)start i.e. 2 were logged.

Result Name: ts02_ b481_ pr23a1_ 0--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
Copying wcgrestart.rst
called boinc_finish

</stderr_txt>
]]>

Not saying it works always, but it does for me :D
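
For anyone wanting to check their own result logs for the same restart pattern, here is a rough Python sketch that counts the marker lines seen in the logs quoted in this thread. The function name `summarize_stderr` and the dictionary keys are made up for illustration; the marker strings are copied from the logs above, and other client or application versions may print different lines (the 6.2.28 logs earlier in the thread show no `Calling gridPlatform.init()` line at all), so treat a count of zero with caution.

```python
# Marker strings copied verbatim from the stderr logs quoted in this thread.
# The keys on the left are hypothetical labels, not anything BOINC defines.
MARKERS = {
    "init": "Calling gridPlatform.init()",
    "fresh_start": "INFO: No state to restore. Start from the beginning.",
    "checkpoint_copy": "Copying wcgrestart.rst",
    "finished": "called boinc_finish",
    "exception": "Unhandled Exception Detected...",
}


def summarize_stderr(text):
    """Count how often each known marker line appears in a stderr_txt dump."""
    counts = {name: 0 for name in MARKERS}
    for line in text.splitlines():
        stripped = line.strip()
        for name, marker in MARKERS.items():
            if stripped == marker:
                counts[name] += 1
    return counts


# The stderr text of ts02_ b481_ pr23a1_ 0-- as posted above.
SAMPLE_LOG = """Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
Copying wcgrestart.rst
called boinc_finish
"""

if __name__ == "__main__":
    for name, count in summarize_stderr(SAMPLE_LOG).items():
        print(f"{name}: {count}")
```

On the log above this counts three init calls, two fresh starts, one checkpoint copy and one finish, which matches the two extra (re)starts and the final resume from `wcgrestart.rst`.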

Happy crunching.
[Feb 2, 2011 8:55:55 AM]
rilian
Veteran Cruncher
Ukraine - we rule!
Joined: Jun 17, 2007
Post Count: 1452
Status: Offline
Re: DDDT2 Wu Failures of

Got the same "grid" error on a Linux box after 3 hours of crunching:

Result Name: ts02_ b195_ sr67a0_ 0--
<core_client_version>6.10.56</core_client_version>
<![CDATA[
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
called boinc_finish

</stderr_txt>
]]>

----------------------
And this one on Mac 10.6.6 after 20 minutes:

Result Name: ts02_ a360_ pr23a1_ 2--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
process exited with code 29 (0x1d, -227)
</message>
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
CHARGE OUTSIDE INNER GSBP REGION
Encountered error. Exiting.

</stderr_txt>
]]>
----------------------------------------
[Feb 2, 2011 10:04:26 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: DDDT2 Wu Failures of

rilian, nope, that's not the same as the symptom discussed in the last 5 posts. Your first log is from an uninterrupted run, and I'm surprised it ended in error without any sign of one; the second is a known, older failure, error 29. These tasks are crashing prematurely. I had one last night that went home in 0.3 hours.

edit: Logged per BOINCTasks:

6.17 dddt2 ts02_c259_pdb004_0 00:18:20 (00:18:08) 02-02-2011 04:17 02-02-2011 04:19 Reported: Computation error (29,)
----------------------------------------
[Edit 1 times, last edit by Former Member at Feb 2, 2011 10:21:16 AM]
[Feb 2, 2011 10:19:57 AM]
gb077492
Advanced Cruncher
Joined: Dec 24, 2004
Post Count: 96
Status: Offline
Re: DDDT2 Wu Failures of

There's definitely something wacky going on in DDDT2-land. I just checked my result status, and a fast, remote machine is showing 2 different units (ts02_ c223_ sr02b0_ 0-- and ts02_ c223_ sr78a0_ 1--) that were killed with "Maximum elapsed time exceeded" after nearly 12 hours of CPU time. This type of unit normally takes just one hour on this box. All wingmen are still in progress.
[Feb 2, 2011 11:37:20 AM]
rilian
Veteran Cruncher
Ukraine - we rule!
Joined: Jun 17, 2007
Post Count: 1452
Status: Offline
Re: DDDT2 Wu Failures of

SekeRob, thank you for the investigation! :)
----------------------------------------
[Feb 2, 2011 1:39:58 PM]
kskjold
Senior Cruncher
Norway
Joined: May 20, 2008
Post Count: 469
Status: Offline
Re: DDDT2 Wu Failures of

I have a repair unit that has ended with one error so far and one in progress.

ts02_c283_sqa002

The repair unit is estimated to run over 14 hours on an i7 850, and that's a long time.

So I wonder, what to do?
----------------------------------------
[Feb 2, 2011 4:04:29 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: DDDT2 Wu Failures of

I've been getting these failures too, on both an XP system and a Windows 7 64-bit box. HPF2 still messes up on Windows 7 compared to XP. I hope this project gets stabilized and thrown into the mainstream soon. It would be nice to see a steady stream of work flowing in and out full time.
[Feb 2, 2011 8:19:03 PM]
kskjold
Senior Cruncher
Norway
Joined: May 20, 2008
Post Count: 469
Status: Offline
Re: DDDT2 Wu Failures of

I have a repair unit that has ended with one error so far and one in progress.

ts02_c283_sqa002

The repair unit is estimated to run over 14 hours on an i7 850, and that's a long time.

So I wonder, what to do?


I aborted this one. It had been running for over 5 hours by then, and the estimated time had risen to over 15 hours.
----------------------------------------
[Feb 2, 2011 8:37:31 PM]
seippel
Former World Community Grid Tech
Joined: Apr 16, 2009
Post Count: 392
Status: Offline
Re: DDDT2 Wu Failures of

I'm in the process of testing several of the work units mentioned in this thread (and the other thread) to attempt to recreate the problem of very long work units that some users have reported.

Seippel
[Feb 2, 2011 8:49:00 PM]