Total posts in this thread: 11
This topic has been viewed 3360 times and has 10 replies
widdershins
Veteran Cruncher
Scotland
Joined: Apr 30, 2007
Post Count: 677
Second Linux Beta for CEP

Well, I got 4 beta units yesterday for the second try at getting CEP to run on Linux. All four errored out, and it looks as though the copies sent out to others have gone the same way.

Oh well, third time lucky.
[Dec 20, 2008 11:02:38 AM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Re: Second Linux Beta for CEP

widdershins,
Does the Status Log of your failed Beta WU look the same as what I posted in the thread "Beta Testing The Clean Energy Project for Linux and Mac", or is it different? If it is different, please post yours and say a little more about your crunching environment.

Thanks. Jean.
----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
[Dec 20, 2008 11:54:30 AM]
Cazfi
Cruncher
Joined: Jan 1, 2005
Post Count: 2
Re: Second Linux Beta for CEP

I got only one beta WU this time, but it segfaulted just like all the previous ones. I have yet to see a workunit that doesn't error out for everybody in the quorum.
<core_client_version>6.2.12</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Calling gridPlatform.init()
Calling initGraphics()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
Calling initGraphics()
SIGSEGV: segmentation violation
Stack trace (9 frames):
[0x86d68eb]
[0x8740300]
[0xf7ffd400]
[0x872cd2f]
[0x8708eca]
[0x86d219a]
[0x804d60b]
[0x87423fa]
[0x8048131]

Exiting...

</stderr_txt>
]]>
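
A side note on the message "process exited with code 193 (0xc1, -63)": the three numbers appear to be the same one-byte value shown as unsigned decimal, as hex, and as a signed 8-bit integer. The arithmetic can be sketched in Python as below. The function name `exit_code_views` is made up for illustration and is not part of BOINC, and the sketch says nothing about what code 193 means to the application.

```python
def exit_code_views(code):
    """Return (unsigned, hex string, signed 8-bit) views of a one-byte exit code."""
    unsigned = code & 0xFF  # keep only the low byte
    # Reinterpret the byte as two's complement: values >= 128 are negative.
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return unsigned, hex(unsigned), signed


print(exit_code_views(193))
```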

[Dec 20, 2008 3:20:17 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Re: Second Linux Beta for CEP

"I'm yet to see workunit that doesn't error out on everybody in the quorum."

You are right. For the Linux environment I don't remember having seen a happy cruncher saying "Mine was valid!". At least all of mine in the other thread ran to the end; maybe that means the techs are making some progress toward a solution. And maybe one of my peers in the quorum will be more successful than me. For the time being I am still the only one who has returned results for all three of them.

Cheers. Jean.
[Dec 20, 2008 3:39:29 PM]
widdershins
Veteran Cruncher
Scotland
Joined: Apr 30, 2007
Post Count: 677
Re: Second Linux Beta for CEP

"widdershins,
Does the Status Log of your failed Beta WU look the same as what I posted in the thread "Beta Testing The Clean Energy Project for Linux and Mac", or is it different? If it is different, please post yours and say a little more about your crunching environment.

Thanks. Jean."

Substitute just the word Kubuntu for Ubuntu and change the unit numbers, and you could use your post for each of my units and my PC setup.
[Dec 20, 2008 4:02:18 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Re: Second Linux Beta for CEP

OK, thank you.
Not having too many different kinds of errors is rather good.
However, we don't have much feedback from testers either, and two of my WUs have their repair WU "Waiting to be sent". I don't know whether the techs are holding them or we do not have enough testers...

Cheers. Jean.
[Dec 20, 2008 4:20:19 PM]
widdershins
Veteran Cruncher
Scotland
Joined: Apr 30, 2007
Post Count: 677
Re: Second Linux Beta for CEP

I haven't received any more WUs, and all of the ones I returned also have their resend units held up as "Waiting to be sent". It seems as though two copies were sent out initially, and then similar errors started coming back from different people.

It looks like the techs reacted quickly and put the brakes on the Betas being sent out to stop people wasting valuable crunching time whilst they fix whatever is wrong.
[Dec 20, 2008 6:02:46 PM]
BobCat13
Senior Cruncher
Joined: Oct 29, 2005
Post Count: 295
Re: Second Linux Beta for CEP

The techs may already be aware of these, but I'm going to post them anyway. I received two BETA 6.23 CEP for Linux tasks, and both ended in error status.

The first one had just checkpointed at 50% when I decided to test the ability to restart from a checkpoint. I let it run to 50.031%, then stopped the daemon, waited a couple of minutes and started the daemon again. This resulted in an immediate error with the following stderr.txt:

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Calling gridPlatform.init()
Calling initGraphics()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
Calling initGraphics()
SIGSEGV: segmentation violation
Stack trace (7 frames):
[0x86d68eb]
[0x8740300]
[0xffffe400]
[0x804fe4f]
[0x804dae2]
[0x87423fa]
[0x8048131]

Exiting...

</stderr_txt>
]]>
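
The two stack traces posted so far in this thread (one with 9 frames, one with 7) can be compared mechanically. Below is a rough Python sketch that pulls the bracketed addresses out of a stderr dump and lists the ones both traces have in common. The helper names `extract_frames` and `shared_frames` are invented for this example. Matching addresses are only a hint that the same code paths were involved; they do not prove it.

```python
import re

# A frame line in these dumps looks like "[0x86d68eb]".
FRAME_RE = re.compile(r"^\[(0x[0-9a-fA-F]+)\]$")


def extract_frames(stderr_text):
    """Return the frame addresses found in a stderr dump, in order."""
    frames = []
    for line in stderr_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1).lower())
    return frames


def shared_frames(a, b):
    """Addresses present in both traces, in the order they appear in `a`."""
    in_b = set(b)
    return [addr for addr in a if addr in in_b]


# Frames copied from the two traces posted in this thread.
trace_9 = """
[0x86d68eb]
[0x8740300]
[0xf7ffd400]
[0x872cd2f]
[0x8708eca]
[0x86d219a]
[0x804d60b]
[0x87423fa]
[0x8048131]
"""
trace_7 = """
[0x86d68eb]
[0x8740300]
[0xffffe400]
[0x804fe4f]
[0x804dae2]
[0x87423fa]
[0x8048131]
"""

common = shared_frames(extract_frames(trace_9), extract_frames(trace_7))
print(common)
```

For these two traces the shared addresses are the top two frames and the bottom two frames; the frames in between differ.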


The second task ran to completion, as I didn't stop the client. Upon reaching 100 percent, the following messages were listed:

Sat 20 Dec 2008 01:06:18 PM EST|World Community Grid|Computation for task BETA_E000055_090A_000d0s00y_2 finished
Sat 20 Dec 2008 01:06:18 PM EST|World Community Grid|Output file BETA_E000055_090A_000d0s00y_2_2 for task BETA_E000055_090A_000d0s00y_2 absent

Watching the project's directory, I noticed that the _2 output file was not changed upon completion, whereas all of the other output files were, and there are now two _3 output files (note the timestamps of the files):

140967 2008-12-20 13:06 BETA_E000055_090A_000d0s00y_2_0
20 2008-12-20 13:06 BETA_E000055_090A_000d0s00y_2_1
306858 2008-12-20 12:17 beta_e000055_090a_000d0s00y_2_2
322136 2008-12-20 13:06 beta_e000055_090a_000d0s00y_2_3
357704 2008-12-20 13:06 BETA_E000055_090A_000d0s00y_2_3
40436 2008-12-20 13:06 BETA_E000055_090A_000d0s00y_2_4

I am just guessing here, but I think the first _3 file listed (in lowercase) is actually the _2 file and was improperly renamed. All of the files were in lowercase until completion when they were changed to uppercase, other than the _2 and the first _3.
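
The kind of check described above, looking for output files that exist only under a differently cased name, can be sketched in Python. This is an illustration only: it assumes the expected output names are the task name followed by `_0` through `_4`, which is inferred from the listing above rather than from any project documentation, and `check_outputs` is a made-up helper name. It flags case mismatches but cannot tell whether a file was renamed incorrectly.

```python
def check_outputs(task, present, count=5):
    """Compare expected output names (task_0 .. task_{count-1}) with files on disk.

    Returns (absent, case_only): the expected names with no exact match, and
    for each of those the files that match only when case is ignored.
    """
    expected = [f"{task}_{i}" for i in range(count)]
    present_set = set(present)
    absent = [name for name in expected if name not in present_set]
    case_only = {
        name: [f for f in present if f.lower() == name.lower() and f != name]
        for name in absent
    }
    return absent, case_only


# File names copied from the directory listing above.
files = [
    "BETA_E000055_090A_000d0s00y_2_0",
    "BETA_E000055_090A_000d0s00y_2_1",
    "beta_e000055_090a_000d0s00y_2_2",
    "beta_e000055_090a_000d0s00y_2_3",
    "BETA_E000055_090A_000d0s00y_2_3",
    "BETA_E000055_090A_000d0s00y_2_4",
]
absent, case_only = check_outputs("BETA_E000055_090A_000d0s00y_2", files)
print(absent)
print(case_only)
```

On this listing, the only expected name without an exact match is the `_2` file, and a lowercase file with the same name is still present, which agrees with the client's "absent" message on a case-sensitive filesystem.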
[Dec 20, 2008 6:52:39 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Second Linux Beta for CEP

I tried stopping and restarting one of mine, and it blew up exactly the same way as BobCat13's. The error trace is identical to his, even the stack trace addresses.

The log says:
21-Dec-2008 09:15:38 [World Community Grid] Restarting task BETA_E000055_079A_000d0s00n_0 using beta6 version 623
21-Dec-2008 09:15:39 [World Community Grid] Computation for task BETA_E000055_079A_000d0s00n_0 finished
21-Dec-2008 09:15:39 [World Community Grid] Output file BETA_E000055_079A_000d0s00n_0_2 for task BETA_E000055_079A_000d0s00n_0 absent
21-Dec-2008 09:15:39 [World Community Grid] Output file BETA_E000055_079A_000d0s00n_0_3 for task BETA_E000055_079A_000d0s00n_0 absent
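
For anyone collecting these reports from several machines, the "Output file ... absent" messages can be pulled out of a client log with a small script. The following Python sketch assumes only the message wording seen in this thread; `absent_outputs` is a hypothetical helper name, and the pattern has not been checked against other client versions.

```python
import re

# Matches the message wording quoted in this thread, whatever precedes it.
ABSENT_RE = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_text):
    """Map each task name to the output files the client reported absent."""
    result = {}
    for line in log_text.splitlines():
        match = ABSENT_RE.search(line)
        if match:
            output_file, task = match.groups()
            result.setdefault(task, []).append(output_file)
    return result


# Log lines copied from the post above.
log = """\
21-Dec-2008 09:15:38 [World Community Grid] Restarting task BETA_E000055_079A_000d0s00n_0 using beta6 version 623
21-Dec-2008 09:15:39 [World Community Grid] Computation for task BETA_E000055_079A_000d0s00n_0 finished
21-Dec-2008 09:15:39 [World Community Grid] Output file BETA_E000055_079A_000d0s00n_0_2 for task BETA_E000055_079A_000d0s00n_0 absent
21-Dec-2008 09:15:39 [World Community Grid] Output file BETA_E000055_079A_000d0s00n_0_3 for task BETA_E000055_079A_000d0s00n_0 absent
"""

report = absent_outputs(log)
print(report)
```

Because the pattern ignores everything before "Output file", it also matches the pipe-separated message format shown earlier in the thread.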
[Dec 20, 2008 10:22:37 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Re: Second Linux Beta for CEP

Thank you, BobCat13 and Kremmen, for highlighting this specific failure when restarting a WU. I hope that will help the techs find what is wrong. It looks like a problem with writing one of the necessary files, but I am just guessing...

Cheers. Jean.
[Dec 20, 2008 11:06:21 PM]