| World Community Grid Forums
Thread Status: Active Total posts in this thread: 28
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
ARP1_0034171_091_5 has just suffered its 5th error
|
||
|
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline Project Badges:
|
https://www.worldcommunitygrid.org/forums/wcg...ead,43480_offset,0#659482
[or, in other words, that's insufficient information, Mike]
[Edit 2 times, last edit by adriverhoef at Nov 21, 2021 12:23:21 PM] |
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
Error with ARP1_0031032_090 after more than 8 hours of computing.
<core_client_version>7.6.31</core_client_version>
Cheers, Yves |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
adriverhoef
That was an alert for the techs rather than a request for help here. Shortly afterwards, they server-aborted the outstanding copy.
Mike |
||
|
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1403 Status: Offline Project Badges:
|
ARP1_0018506_129 has 2 different error tasks now. 2 resends are in progress.
https://www.worldcommunitygrid.org/contribution/workunit/150012803
My error task, after 4 hours of runtime, is unusual; errors are very rare on that machine. It was an access violation, which could have several causes. For the code readers: Results log
[Edit 1 times, last edit by Crystal Pellet at Feb 12, 2022 6:48:07 PM] |
||
|
|
sam6861
Advanced Cruncher Joined: Mar 31, 2020 Post Count: 107 Status: Offline Project Badges:
|
The stack trace of this 64-bit ARP1 task on Linux looks nearly the same as the one in the other topic. It is probably a floating-point overflow that ends in NaN (Not a Number). More details are on page 3 of that topic:
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,43681
Note: the 32-bit ARP1 app can handle NaN and complete without error, but this appears to reduce the useful data to little or nothing. The 32-bit and 64-bit apps may calculate slightly different results because the 32-bit app uses slower 80-bit floats, which can avoid a NaN by chance, but NaN can still occur in the 32-bit app. |
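As a general illustration of the mechanism described above (not the ARP1 code itself, which is not shown in this thread), here is a minimal Python sketch of how a 64-bit float overflow can turn into NaN and then spread through later arithmetic. The function names `overflow_then_nan` and `running_total` are hypothetical and exist only for this example.

```python
import math


def overflow_then_nan(x: float, factor: float) -> float:
    """Multiply past the float64 range, then subtract the result from itself."""
    big = x * factor      # exceeds the largest float64, so it becomes inf
    return big - big      # inf - inf is undefined and yields NaN


def running_total(values) -> float:
    """Sum values; a single NaN makes the whole total NaN."""
    total = 0.0
    for v in values:
        total += v
    return total


if __name__ == "__main__":
    print(overflow_then_nan(1e308, 10.0))
    print(running_total([1.0, float("nan"), 2.0]))
```

A format with a wider exponent range, such as the 80-bit floats mentioned above, could keep the intermediate product finite, which would be consistent with the 32-bit app sometimes avoiding the NaN.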
||
|
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1403 Status: Offline Project Badges:
|
4 hours later, I had the same error again, this time after 8 hours of runtime: https://www.worldcommunitygrid.org/contribution/workunit/150012104
I think running ARP tasks alongside 4-core VMs for LHC-ATLAS sometimes causes conflicts. The amount of memory is not the problem. |
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
Two machines experienced errors with 7 ARP1 WUs:
<core_client_version>7.9.3</core_client_version>
I don't know whether BOINC killed the WUs because the deadline was too short or whether the WUs themselves are faulty. I noticed that some wingmen experienced the same failure, while other wingmen computed the WUs successfully.
Cheers, Yves |
||
|
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline Project Badges:
|
Yves, if all you see is a single line specifying the core_client_version, then the task was not started before the deadline.
Either your task will be aborted by the server (Server Aborted) or your client will abort the task. Here are two examples:
This is a case of Server Aborted:
workunit 150448089
ARP1_0030745_125_0 CentOS Linux Valid 2022-02-12T05:52:44 2022-02-15T12:12:11 42.98/44.56 631.5/740.1
And here is a case where the BOINC client intercepted a task (ARP1_0003843_124_1) before it was started:
workunit 150464865
ARP1_0003843_124_0 Linux Ubuntu Valid 2022-02-12T11:42:04 2022-02-14T00:48:52 13.69/13.69 741.0/537.3
Details:
ARP1_0003843_124_0 Linux Ubuntu Valid 2022-02-12T11:42:04 2022-02-14T00:48:52 13.69/13.69 741.0/537.3
[Edit 1 times, last edit by adriverhoef at Feb 17, 2022 11:33:18 AM] |
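The rule of thumb above (a result log containing only the core_client_version line means the task never started) can be sketched as a small check. This is only a heuristic based on this post, not a BOINC API; the helper name `never_started` and the sample logs are assumptions made for illustration.

```python
import re

# Matches a line that holds nothing but the client version element.
CLIENT_VERSION_LINE = re.compile(
    r"^<core_client_version>[^<]+</core_client_version>$"
)


def never_started(result_log: str) -> bool:
    """Return True if the log consists of one core_client_version line only."""
    lines = [ln.strip() for ln in result_log.splitlines() if ln.strip()]
    return len(lines) == 1 and CLIENT_VERSION_LINE.match(lines[0]) is not None


if __name__ == "__main__":
    print(never_started("<core_client_version>7.9.3</core_client_version>\n"))
```

A log that carries any further output after the version line would be treated as belonging to a task that did start.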
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
Hi Adri,
Thank you for the confirmation. It was what I guessed: the work could not be completed (or even started) before the deadline.
Cheers, Yves |
||
|
|
|