Thread Status: Active
Total posts in this thread: 24
This topic has been viewed 5902 times and has 23 replies
asdavid
Veteran Cruncher
FRANCE
Joined: Nov 18, 2004
Post Count: 521
2 HST tasks reported as error today

This is the first time I have gotten an error for HST:
HST1_006698_000085_KC0009_T000_F00081_S00008_0-- IBM543-R90K5GMF Error 8/9/16 15:33:20 8/12/16 16:21:06 5.23 / 5.47 109.2 / 0.0
HST1_006698_000084_KC0009_T000_F00079_S00008_1-- IBM543-R90K5GMF Error 8/9/16 15:33:20 8/11/16 20:49:30 2.75 / 2.83 93.5 / 0.0

I checked the error window on the Result Status page and do not understand why the tasks failed.
Here are the last lines (the same for the second task):
[17:24:36] INFO: Run complete, CPU time: 18812.831394
17:24:36 (7328): called boinc_finish(0)

Where could I find more information to understand what occurred?
Thanks for your help
----------------------------------------
Anne-Sophie

[Aug 12, 2016 4:45:18 PM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Re: 2 HST tasks reported as error today

An 'Error' usually originates on the client. Clicking the Error link on the Result Status pages, or checking the client Event Log around the reporting timestamp (allowing for your UTC offset), most often reveals what happened; in your case, probably around 17:24:36 your time.
----------------------------------------
[Edit 1 times, last edit by SekeRob* at Aug 12, 2016 4:55:00 PM]
[Aug 12, 2016 4:49:46 PM]
SekeRob
Re: 2 HST tasks reported as error today

BTW, for knreed: he might want the IBM machines set to suppress network info in cc_config, so that the RS pages show the 7-digit ID instead of the network-searchable host name (if that is what Anne posted).
----------------------------------------
[Edit 1 times, last edit by SekeRob* at Aug 12, 2016 4:55:29 PM]
[Aug 12, 2016 4:53:44 PM]
asdavid
Re: 2 HST tasks reported as error today

Here are the event log messages matching the name of the second task around the error time:
12/08/2016 17:24:39 | World Community Grid | Computation for task HST1_006698_000085_KC0009_T000_F00081_S00008_0 finished
12/08/2016 17:24:41 | World Community Grid | Started upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_0
12/08/2016 17:24:41 | World Community Grid | Started upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_1
12/08/2016 17:24:44 | World Community Grid | Finished upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_0
12/08/2016 17:24:44 | World Community Grid | Started upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_2
12/08/2016 17:24:46 | World Community Grid | Finished upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_2
12/08/2016 17:24:46 | World Community Grid | Started upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_3
12/08/2016 17:24:48 | World Community Grid | Finished upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_1
12/08/2016 17:24:48 | World Community Grid | Started upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_4
12/08/2016 17:24:49 | World Community Grid | Finished upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_4
12/08/2016 17:24:49 | World Community Grid | Started upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_5
12/08/2016 17:24:50 | World Community Grid | Finished upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_5
12/08/2016 17:24:50 | World Community Grid | Started upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_6
12/08/2016 17:24:51 | World Community Grid | Finished upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_6
12/08/2016 17:25:00 | World Community Grid | Finished upload of HST1_006698_000085_KC0009_T000_F00081_S00008_0_r640156328_3

No match for the word error
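
For anyone who wants to repeat this check without reading the log by eye, here is a minimal Python sketch. It assumes the Event Log has been saved as plain text in the `date time | project | message` layout shown above; the function name `scan_event_log` and the keyword list are my own choices for illustration, not anything provided by the BOINC client.

```python
import re

# Assumed line layout, taken from the log excerpt above:
#   "DD/MM/YYYY HH:MM:SS | project | message"
LINE_RE = re.compile(r"^(\S+ \S+) \| ([^|]+?) \| (.*)$")


def scan_event_log(lines, task_name):
    """Summarise the Event Log messages that mention one task.

    Returns uploads that were started but never finished, plus any
    message containing an error-like keyword (hypothetical keyword list).
    """
    started, finished, suspicious = set(), set(), []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        message = match.group(3)
        if task_name not in message:
            continue
        if message.startswith("Started upload of"):
            started.add(message.split()[-1])
        elif message.startswith("Finished upload of"):
            finished.add(message.split()[-1])
        if re.search(r"error|fail|abort", message, re.IGNORECASE):
            suspicious.append(message)
    return {
        "pending_uploads": sorted(started - finished),
        "suspicious": suspicious,
    }


task = "HST1_006698_000085_KC0009_T000_F00081_S00008_0"
sample = [
    "12/08/2016 17:24:39 | World Community Grid | Computation for task " + task + " finished",
    "12/08/2016 17:24:41 | World Community Grid | Started upload of " + task + "_r640156328_0",
    "12/08/2016 17:24:41 | World Community Grid | Started upload of " + task + "_r640156328_1",
    "12/08/2016 17:24:44 | World Community Grid | Finished upload of " + task + "_r640156328_0",
    "12/08/2016 17:24:48 | World Community Grid | Finished upload of " + task + "_r640156328_1",
]
report = scan_event_log(sample, task)
print(report)
```

An empty `pending_uploads` list and an empty `suspicious` list mean the client side looks clean for that task, which matches what the log above shows.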

Edit to correct spelling errors
----------------------------------------
Anne-Sophie

----------------------------------------
[Edit 1 times, last edit by asdavid at Aug 12, 2016 5:26:13 PM]
[Aug 12, 2016 5:25:19 PM]
SekeRob
Re: 2 HST tasks reported as error today

Strange. The third place to look, DIY style, is to do an XML API pull of the Result Status pages into a spreadsheet (max 250 records per query), which gives items such as ExitStatus, Outcome, ServerStatus and ValidationState.
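
If a spreadsheet is not at hand, the same kind of pull can be inspected with a few lines of Python. This is only a sketch under loud assumptions: the four field names come from the post above, but the `<Results>`/`<Result>`/`<Name>` wrapper elements, the sample task names, and all the values are hypothetical, and no request to the real API is made here. Check the actual XML you download before relying on any of it.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout. Only ExitStatus, Outcome, ServerStatus and
# ValidationState are named in the thread; everything else is made up.
SAMPLE_XML = """
<Results>
  <Result>
    <Name>TASK_A_0</Name>
    <ExitStatus>0</ExitStatus>
    <Outcome>1</Outcome>
    <ServerStatus>5</ServerStatus>
    <ValidationState>1</ValidationState>
  </Result>
  <Result>
    <Name>TASK_B_1</Name>
    <ExitStatus>0</ExitStatus>
    <Outcome>6</Outcome>
    <ServerStatus>5</ServerStatus>
    <ValidationState>0</ValidationState>
  </Result>
</Results>
"""


def parse_results(xml_text):
    """Turn each <Result> element into a dict of tag -> text."""
    root = ET.fromstring(xml_text.strip())
    return [
        {child.tag: (child.text or "").strip() for child in result}
        for result in root.iter("Result")
    ]


def not_valid(rows, valid_state="1"):
    """Keep rows whose ValidationState differs from the assumed 'valid' code."""
    return [row for row in rows if row.get("ValidationState") != valid_state]


rows = parse_results(SAMPLE_XML)
for row in not_valid(rows):
    print(row["Name"], row["ExitStatus"], row["Outcome"], row["ValidationState"])
```

A row with a clean ExitStatus but a non-valid ValidationState would point away from the application and toward the server side, which is the distinction this kind of pull is useful for.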
----------------------------------------
[Edit 1 times, last edit by SekeRob* at Aug 12, 2016 5:42:40 PM]
[Aug 12, 2016 5:39:21 PM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1684
Re: 2 HST tasks reported as error today

Again "SIGSEGV: segmentation violation" after 8+ hours:
HST1_006904_000016_AC0012_T300_F00029_S00009
----------------------------------------
[Aug 14, 2016 12:12:22 AM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Re: 2 HST tasks reported as error today

Anne-Sophie,
I took a look at those two results, and the issue wasn't an application error; it was on validation. When we get output files back, we run several checks on them and compare them to results from another machine. These two failed that process for some reason. It could have been a corrupt file or something similar, but unfortunately the results have been deleted. If you see any more of these, post back here and we can investigate further and perhaps catch one before it is deleted.

Thanks,
armstrdj
[Aug 15, 2016 3:07:36 PM]
asdavid
Re: 2 HST tasks reported as error today

I will. I have some more HST tasks to run (I hope they will be valid ;) ).
I was thinking that validation checks give invalid, not error. Thanks for your feedback.
----------------------------------------
Anne-Sophie

[Aug 16, 2016 7:41:28 AM]
armstrdj
Re: 2 HST tasks reported as error today

There are different possible outcomes from validation. Most of the time a result passes all checks and is then compared to another result. If the two are equal, the result is marked valid; if they are not, it goes into a wait state until another result arrives to compare against. When a third result is returned, the two matching results are marked valid and the different one is marked invalid. In a few cases the initial checks on an output file fail, and that is when the result is marked as an error.
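
The flow described above can be sketched roughly as follows. This is an illustration only, not the actual World Community Grid validator: the function name `validate`, the state label "pending" for the wait state, the exact-equality comparison, and the `passes_checks` callback are all assumptions made for the example.

```python
from collections import Counter


def validate(results, passes_checks, quorum=2):
    """Classify returned results along the lines described above.

    results:       dict of result name -> output (any hashable value)
    passes_checks: callable(output) -> bool, standing in for the initial
                   file checks (hypothetical placeholder)
    quorum:        number of matching outputs needed to call them valid
    """
    states = {}
    candidates = {}
    for name, output in results.items():
        if passes_checks(output):
            candidates[name] = output
        else:
            states[name] = "error"  # failed the initial inspection

    # Find the most common output among results that passed the checks.
    has_quorum = False
    agreed = None
    if candidates:
        agreed, count = Counter(candidates.values()).most_common(1)[0]
        has_quorum = count >= quorum

    for name, output in candidates.items():
        if not has_quorum:
            states[name] = "pending"  # wait for another result to compare
        elif output == agreed:
            states[name] = "valid"
        else:
            states[name] = "invalid"
    return states


def non_empty(output):
    """Toy stand-in for the server-side file checks."""
    return bool(output)


two = validate({"r0": "abc", "r1": "abd"}, non_empty)
three = validate({"r0": "abc", "r1": "abd", "r2": "abc"}, non_empty)
broken = validate({"r0": "", "r1": "abc"}, non_empty)
print(two, three, broken, sep="\n")
```

In this sketch, two differing results both stay pending, a third matching result settles the comparison, and an output that fails the initial checks is marked as an error without ever being compared, which is the case that applied to the two tasks in this thread.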

Thanks,
armstrdj
[Aug 16, 2016 12:49:10 PM]
KerSamson
Re: 2 HST tasks reported as error today

Again "SIGSEGV: segmentation violation" after 10+ hours
HST1_007008_000032_AT0023_T300_F00034_S00009
...
boring, boring, boring,
Yves
----------------------------------------
[Aug 17, 2016 6:48:08 AM]