Thread Status: Active
Total posts in this thread: 19
Posts: 19   Pages: 2   [ 1 2 | Next Page ]
This topic has been viewed 2898 times and has 18 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Beta WU crashed at end of run

wcg_beta_img_5.17_i686-pc-linux-gnu BETA_X0000045460272200502081427_ 1-- appears to have made an illegal memory reference at the end of its run. (At least, it was due to end at roughly that time.)

<core_client_version>5.8.15</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
About to call graphics init
dlopen() failed: libGL.so.1: cannot open shared object file: No such file or directory
No graphics.
INFO: No state to restore. Start from the beginning.
ERROR: Restoring checkpoint failed. Unable to restore state!
In ExtractGlcmFeatures: End of 0 iteration of outer loop.
In ExtractGlcmFeatures: End of 1 iteration of outer loop.
[... boring part edited to save space ... ]
In ExtractGlcmFeatures: End of 23 iteration of outer loop.
In ExtractGlcmFeatures: End of 24 iteration of outer loop.

</stderr_txt>
]]>
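
For anyone wanting to compare several of these result logs, here is a minimal sketch in Python that pulls out the signal number and the last outer-loop iteration reported before the crash. The function name `summarize_result_log` and the regular expressions are my own assumptions based only on the log text shown above; this is not a BOINC or WCG tool.

```python
import re

def summarize_result_log(text):
    """Return (signal_number, last_iteration) from a result log.

    Either element is None if the corresponding line is not present.
    The patterns are guesses based on the log lines quoted in this thread.
    """
    sig = re.search(r"process got signal (\d+)", text)
    iters = re.findall(r"End of (\d+) iteration of outer loop", text)
    return (
        int(sig.group(1)) if sig else None,
        int(iters[-1]) if iters else None,
    )

# Shortened copy of the log from this post.
sample = """<message>
process got signal 11
</message>
<stderr_txt>
In ExtractGlcmFeatures: End of 23 iteration of outer loop.
In ExtractGlcmFeatures: End of 24 iteration of outer loop.
</stderr_txt>"""

print(summarize_result_log(sample))
```

Running the same function over logs from different machines would show whether they all stop after the same iteration.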

Any idea what went wrong, gurus?
[Oct 28, 2007 2:36:59 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Beta WU crashed at end of run

It's Beta, so the gurus will likely say: What's the status code on the Result Status page? If Pending Validation, all is 'normal' in the result log. Else, leave it to the techs to analyse so they can fix it before the project launch.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Oct 28, 2007 7:34:11 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta WU crashed at end of run

Sorry if it wasn't explicit, but I wouldn't have posted were it not that it returned Error.

I was only posting as I'm curious as to whether they know what went wrong. Also curious as to whether we get credit in these cases.
[Oct 28, 2007 12:25:00 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Beta WU crashed at end of run

On 'error/invalid/valid' results: No, the beta policy is that they get the same treatment on credits as regular work.

Added: https://secure.worldcommunitygrid.org/ms/device/viewBetaProfiles.do
In either case, you will receive credit and points for Beta Test work just as you would for any other project.

----------------------------------------
[Edit 1 times, last edit by Sekerob at Oct 28, 2007 1:09:53 PM]
[Oct 28, 2007 12:32:43 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta WU crashed at end of run

"Else, leave it to the techs to analyse so they can fix it before the project launch."


Well, it's a shame they didn't. A project WU crashes exactly the same way. So what's the point of beta if you ignore the results?
[Nov 4, 2007 6:14:10 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta WU crashed at end of run

My reading tells me that signal 11 is usually caused by a hardware error. (I'm fairly sure segfaults generate a different error.) So, the first thing to do is to check the status of the other work units in your quorum. If they are returned as valid (or pending validation) then clearly you are alone in having this problem.

If that is the case, then it is time to do some hardware diagnostics.
[Nov 4, 2007 6:23:47 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta WU crashed at end of run

"My reading tells me that signal 11 is usually caused by a hardware error. (I'm fairly sure segfaults generate a different error.)"


Signal 11 is, by definition, a segmentation violation (SIGSEGV). It means that a process tried to access memory outside its address space, or tried to write to a read-only location. Yes, this can be caused by hardware faults.

However, there are 3 reasons to believe this is not a hardware issue:
1) As far as I can remember, these two machines, which have devoted between them almost a year of CPU time to WCG, have never had an error on any other project.
2) They both crashed at exactly the same point of these WUs.
3) One of them is a dual-processor machine, and the other WU it was working on at the time was unaffected.

If the same project crashes in the same place on two different machines which are (apart from that one project) 100% reliable, I find the hardware fault hypothesis to be highly unlikely, to say the least.
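
As an aside on the definition of signal 11 above, here is a minimal Python sketch (assuming Linux or another POSIX system) that shows how a parent process can tell that a child was killed by SIGSEGV. The helper name `run_and_report` is made up for illustration, and this is not how the BOINC client itself is implemented; it only demonstrates the signal number.

```python
import signal
import subprocess
import sys

def run_and_report(cmd):
    """Run cmd and describe how it ended (POSIX only)."""
    proc = subprocess.run(cmd)
    if proc.returncode < 0:
        # On POSIX, a negative return code means "killed by signal -returncode".
        sig = signal.Signals(-proc.returncode)
        return f"killed by signal {sig.value} ({sig.name})"
    return f"exited with status {proc.returncode}"

# Child process that sends SIGSEGV to itself, standing in for a real crash.
child = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
]
print(run_and_report(child))
```

The report says only that the process received the signal. It does not say whether the bad memory access came from a software bug or from faulty hardware, which is why the circumstantial evidence above matters.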
[Nov 5, 2007 3:38:28 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta WU crashed at end of run

Yes, signal 11 is SIGSEGV - I just thought BOINC logged them differently. Maybe I've been in Windows-land for too long....

Please will you confirm that the other copies of this work unit succeeded? If they failed, the work unit is automatically reported to WCG. If not, then we need to work out what is special about the computer on which it failed.

I know you want to rule out hardware issues, but it is the first thing we have to check. Do you overclock?
[Nov 5, 2007 8:53:28 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta WU crashed at end of run

"Please will you confirm that the other copies of this work unit succeeded? If they failed, the work unit is automatically reported to WCG. If not, then we need to work out what is special about the computer on which it failed.

"I know you want to rule out hardware issues, but it is the first thing we have to check. Do you overclock?"


Other copies of the WU succeeded.

The one that failed on beta is a stock standard dual-P3/866 Dell server running slightly underclocked at 860.9 MHz.

The one which failed most recently is an ancient Celeron overclocked to 75MHz FSB rather than 66MHz.
[Nov 5, 2007 10:26:32 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta WU crashed at end of run

I will ask the techs to look this over.

Be advised, though, that they will check the invalid return rate for the project, and if it is low then looking into this will take a low priority. So far, you are the only member to experience this problem.

Meanwhile, you can deselect the project - unless you feel like attaching a debugger and trying to get a stack trace to help the techs....
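
If you do want to try the debugger route, one possible approach is sketched below in Python. It assumes gdb is installed and that you have permission to attach to the science application's process. The helper name `gdb_backtrace_command` and the PID 12345 are placeholders. The sketch only builds the command line and does not run it. The gdb options used (`-p`, `-batch`, `-ex`) are standard, but whether the resulting trace is useful depends on whether the binary carries debug symbols, which I have not checked.

```python
import shlex

def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to pid, lets the process
    run until it stops (for example on SIGSEGV), then prints backtraces
    for all threads and exits."""
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process
        "-batch",                       # exit after the commands finish
        "-ex", "continue",              # resume until a signal stops it
        "-ex", "thread apply all bt",   # print a backtrace for every thread
    ]

cmd = gdb_backtrace_command(12345)  # 12345 is a placeholder PID
print(shlex.join(cmd))
```

You would start this while the work unit is running and leave it attached until the crash happens, then post whatever gdb prints.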
[Nov 5, 2007 10:38:44 AM]