Total posts in this thread: 48
This topic has been viewed 9915 times and has 47 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Faulty WUs

Thank you Jean,

I'm not too concerned about the answers here, and yours is certainly very welcome to me. I have forgotten what these forums are like after a few years out of the picture, let alone the growing complexity of the infrastructure behind the crunching.
I did have in mind a simple "red-yellow-green" color code graphic for status of any given aspect of the environment. I can be a bit too simple-minded at times.

santé!
csw
[Apr 10, 2009 4:30:39 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Faulty WUs

For other details we will have to wait for what the techs will say.


This is a major, routine, confirmed recurrence.

Collect more donations! Hire more techs! :)

Thanks again,
csw
[Apr 10, 2009 4:37:52 PM]
Trotador
Senior Cruncher
Joined: Mar 26, 2009
Post Count: 154
Status: Offline
Re: Faulty WUs

Last report from my units: the one that managed to start (X0000093940941200710181340_ 2--) has finished properly and has been validated. However, the two units that arrived afterwards (and before I deselected HCC in my projects list) failed at the very beginning with just the same message. The units that have failed are:

X0000057651494200509200956_ 2--
X0000057670270200509201107_ 3--
X0000057640362200509190817_ 0--
X0000057670639200509201101_ 3--
X0000057670328200509201106_ 2--
X0000057670304200509201107_ 2--
X0000057621102200510171120_ 2--
X0000057621346200510171116_ 0--

What .log file should I look at and dump here?
[Apr 10, 2009 4:47:48 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Faulty WUs

Trotador,

I've got a handful of HCC WUs running, but haven't dug up their IDs. By now, this issue has been escalated and I'm sure all the king's men are on top of it. I'm watching my coffee pot more closely than I am watching this thread :)

arrivederci
csw
[Apr 10, 2009 4:58:52 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Status: Offline
Re: Faulty WUs

What .log file should I look at and dump here?

Trotador,
Thank you for asking.
The problem seems to be clearly on the techs' side right now.
If you really want to help, you can check, by reading the first two posts of this thread, whether your failing HCC WUs have failed in the same way as described in those posts. If so, you don't need to post any more logs. But if there are significant differences, please describe them.
The Result Log can be seen by clicking on "Error" in your Results Status page.

Regarding your WU which has succeeded, can you please confirm that it had been sent to you or to your peers (click on the WU name in your Results Status page to see the details) before 3:15 UTC?

Thanks for your help. Jean.

Edit: Changed "three" to "two". I had forgotten that you were the third original poster! I am getting tired... :)
----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
----------------------------------------
[Edit 1 times, last edit by JmBoullier at Apr 10, 2009 5:15:35 PM]
[Apr 10, 2009 5:08:40 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Faulty WUs

Result Log:

<core_client_version>6.5.0</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1240211337.136000
Skipping: /computation_deadline
Invalid Parameter - /..
<CImgIOException>

CImg<unsigned char>::get_load_convert() : Failed to open image '../../projects/www.worldcommunitygrid.org/X0000057681117200509190906_X0000057681117200509190906.jp2.gzb'.

Path of 'convert' : "convert.exe"
Path of temporary filename : "C:\WINDOWS\Temp\CImg0590.ppm"

I don't see that mentioned (or I missed it), but, that raises a question, why is BOINC going out of the "sandbox" and could that be the source of the problem ... permissions don't allow wandering this far afield ...

It looks like the error happens almost immediately, so aside from loading the servers some, it looks to me like a non-issue ... heck, put 190 hours into a task and have it bomb and you have an issue to raise ... this is fly-droppings in the pepper ...
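
For anyone who wants to compare their own Result Log against the one quoted in this post, here is a minimal sketch of how that check could be done. It only looks for marker strings copied from the log above; the function names and the regular expression are my own illustrative assumptions, not part of BOINC or any World Community Grid tool.

```python
import re

# Marker strings copied from the Result Log quoted in this post.
FAILURE_MARKERS = (
    "exit code 3 (0x3)",
    "Unrecognized XML in parse_init_data_file: computation_deadline",
    "<CImgIOException>",
)

# Hypothetical pattern: captures the image path named in the CImg error line.
IMAGE_PATH_RE = re.compile(r"Failed to open image '([^']+)'")


def matches_known_failure(log_text):
    """Return True if every marker from the quoted log appears in log_text."""
    return all(marker in log_text for marker in FAILURE_MARKERS)


def failed_image_path(log_text):
    """Return the image path named in the CImg error, or None if absent."""
    match = IMAGE_PATH_RE.search(log_text)
    return match.group(1) if match else None
```

Paste the text of a Result Log into a string and pass it to `matches_known_failure`; if it returns False, the log differs from the one above in at least one marker and is probably worth describing here.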
[Apr 10, 2009 10:07:47 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Faulty WUs

As you say, on the scale of problems, this rates low.

I noticed the temp path, but there are so many error messages that I decided it was useless to speculate.
[Apr 10, 2009 10:13:02 PM]
rkar22
Cruncher
Joined: Nov 17, 2004
Post Count: 48
Status: Offline
Re: Faulty WUs

I got 2 "repair" units for those faulty WUs earlier today, and 2 more just a few minutes ago (all errored out in the by now well-known way), but, surprise, a brand new WU came in as well:

X0000057640603200510172147_ 0-- In Progress 4/10/09 22:07:09 4/20/09 22:07:09 0.00 0.0 / 0.0
X0000057640603200510172147_ 1-- In Progress 4/10/09 22:06:52 4/20/09 22:06:52 0.00 0.0 / 0.0

I forced it to start, just to see whether it is another of the faulty ones, but no, it started consuming CPU time. So maybe (hopefully!) the situation is back to normal.
[Apr 10, 2009 10:26:42 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Faulty WUs

What the heck, I'm going to speculate anyway.

It seems clear that one of the workunit parameters is malformed. This means that the input path gets mangled, so the software tries to use the default, which (unsurprisingly) doesn't work.

Hopefully the problem was limited to a single batch.
[Apr 10, 2009 10:37:27 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Status: Offline
Re: Faulty WUs

Path of 'convert' : "convert.exe"
Path of temporary filename : "C:\WINDOWS\Temp\CImg0590.ppm"

I don't see that mentioned (or I missed it), but, that raises a question, why is BOINC going out of the "sandbox" and could that be the source of the problem ... permissions don't allow wandering this far afield ...

Indeed we (rkar22 and I) had those two lines in our logs.
But anyway the problem is not in our machines; there is nothing we can do ourselves.
As Didactylos says the application is apparently looking for something which it has no chance to find where it looks.

Regarding the app going out of its normal playing field, it's true, and for each of these failing WUs ZAAV asked me for an unusual authorization. But even answering "Allow" every time did not change the end result.
At least it's good to see that ZAAV is doing its job. :)

Cheers. Jean.
----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
[Apr 10, 2009 10:56:00 PM]