| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 11
|
|
| Author |
|
|
widdershins
Veteran Cruncher Scotland Joined: Apr 30, 2007 Post Count: 677 Status: Offline Project Badges:
|
Well I got 4 beta units yesterday for the second try at getting CEP to run on Linux. All four errored out and it looks as though the copies sent out to others have gone the same way.
Oh well, third time lucky. |
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
widdershins,
Does the Status Log of your errored Beta WU look the same as the one I posted in the thread "Beta Testing The Clean Energy Project for Linux and Mac", or is it different? If it is different, please post yours and say a little more about your crunching environment. Thanks. Jean. |
||
|
|
Cazfi
Cruncher Joined: Jan 1, 2005 Post Count: 2 Status: Offline Project Badges:
|
I got only one beta WU this time, but it segfaulted just like all the previous ones. I'm yet to see a workunit that doesn't error out on everybody in the quorum.
<core_client_version>6.2.12</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Calling gridPlatform.init()
Calling initGraphics()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
Calling initGraphics()
SIGSEGV: segmentation violation
Stack trace (9 frames):
[0x86d68eb]
[0x8740300]
[0xf7ffd400]
[0x872cd2f]
[0x8708eca]
[0x86d219a]
[0x804d60b]
[0x87423fa]
[0x8048131]
Exiting...
</stderr_txt>
]]>
[Edit 1 times, last edit by Cazfi at Dec 20, 2008 3:21:49 PM] |
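For anyone puzzled by the three numbers in "process exited with code 193 (0xc1, -63)": they appear to be the same byte shown three ways (decimal, hex, and signed 8-bit). A minimal Python sketch of that arithmetic follows; the function name describe_exit_code is made up for illustration and is not part of BOINC or the WCG agent.

```python
def describe_exit_code(code):
    """Return (hex string, signed 8-bit value) for an exit code in 0..255.

    Hypothetical helper for illustration only.
    """
    if not 0 <= code <= 255:
        raise ValueError("expected an exit code in the range 0..255")
    # Values of 128 and above wrap to negative when read as a signed byte.
    signed = code - 256 if code >= 128 else code
    return hex(code), signed


print(describe_exit_code(193))
```

This only shows how the three figures relate; it says nothing about why the science application crashed.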
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
"I'm yet to see a workunit that doesn't error out on everybody in the quorum."
You are right. For the Linux environment I don't remember having seen a happy cruncher saying "Mine was valid!". At least all mine in the other thread have run to the end; maybe that means the techs are making some progress toward a solution. And maybe one of my peers in the quorum will be more successful than me. For the time being I am still the only one who has returned results for all three of them. Cheers. Jean. |
||
|
|
widdershins
Veteran Cruncher Scotland Joined: Apr 30, 2007 Post Count: 677 Status: Offline Project Badges:
|
"widdershins, Is the Status Log of your Beta WU in error looking the same as what I have posted in this thread Beta Testing The Clean Energy Project for Linux and Mac or is it different? If so please post yours and say a little more about your crunching environment. Thanks. Jean."
Substitute just the word Kubuntu for Ubuntu and change the unit numbers, and you could use your post for each of my units and my PC setup. |
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
OK, thank you.
Not having too many different kinds of errors is rather good. However, we don't have much feedback from testers either, and two of my WUs have their repair WU "Waiting to be sent". I don't know if the techs are holding them or if we don't have enough testers... Cheers. Jean. |
||
|
|
widdershins
Veteran Cruncher Scotland Joined: Apr 30, 2007 Post Count: 677 Status: Offline Project Badges:
|
I haven't received any more WUs, and all of the ones I returned have their resend units held up as "Waiting to be sent" too. It seems as though two copies were sent out initially, and then similar errors started coming back from different people.
It looks like the techs reacted quickly and put the brakes on the Betas being sent out to stop people wasting valuable crunching time whilst they fix whatever is wrong. |
||
|
|
BobCat13
Senior Cruncher Joined: Oct 29, 2005 Post Count: 295 Status: Offline Project Badges:
|
The techs may already be aware of these, but I'm going to post them anyway. 2 BETA 6.23 CEP for Linux tasks received, both resulting in error status.
The first one had just checkpointed at 50%, when I decided to test the ability to restart from a checkpoint. I let it run to 50.031% and then stopped the daemon, waited a couple of minutes and started the daemon. This resulted in an immediate error with the following stderr.txt:
<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Calling gridPlatform.init()
Calling initGraphics()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
Calling initGraphics()
SIGSEGV: segmentation violation
Stack trace (7 frames):
[0x86d68eb]
[0x8740300]
[0xffffe400]
[0x804fe4f]
[0x804dae2]
[0x87423fa]
[0x8048131]
Exiting...
</stderr_txt>
]]>
The second task ran to completion as I didn't stop the client. Upon reaching 100 percent, the following message is listed:
Sat 20 Dec 2008 01:06:18 PM EST|World Community Grid|Computation for task BETA_E000055_090A_000d0s00y_2 finished
Sat 20 Dec 2008 01:06:18 PM EST|World Community Grid|Output file BETA_E000055_090A_000d0s00y_2_2 for task BETA_E000055_090A_000d0s00y_2 absent
Watching the project's directory I noticed that the _2 output file was not changed upon completion when all of the other output files were and there are now two _3 output files (note the timestamps of the files):
140967 2008-12-20 13:06 BETA_E000055_090A_000d0s00y_2_0
    20 2008-12-20 13:06 BETA_E000055_090A_000d0s00y_2_1
306858 2008-12-20 12:17 beta_e000055_090a_000d0s00y_2_2
322136 2008-12-20 13:06 beta_e000055_090a_000d0s00y_2_3
357704 2008-12-20 13:06 BETA_E000055_090A_000d0s00y_2_3
 40436 2008-12-20 13:06 BETA_E000055_090A_000d0s00y_2_4
I am just guessing here, but I think the first _3 file listed (in lowercase) is actually the _2 file and was improperly renamed. All of the files were in lowercase until completion when they were changed to uppercase, other than the _2 and the first _3. |
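The case mix-up in that listing can be checked mechanically. Below is a rough Python sketch, assuming the client expects the uppercase names _2_0 through _2_4 (that expectation is inferred from the "absent" message, not confirmed by the techs). The function names find_case_mismatches and find_strays are hypothetical.

```python
def find_case_mismatches(expected, present):
    """Map each expected name that is missing to files matching it only when case is ignored."""
    present_set = set(present)
    by_lower = {}
    for name in present:
        by_lower.setdefault(name.lower(), []).append(name)
    mismatches = {}
    for name in expected:
        if name in present_set:
            continue  # exact name exists, nothing to report
        candidates = by_lower.get(name.lower(), [])
        if candidates:
            mismatches[name] = candidates
    return mismatches


def find_strays(expected, present):
    """List files on disk that are not among the expected names."""
    expected_set = set(expected)
    return [name for name in present if name not in expected_set]


task = "BETA_E000055_090A_000d0s00y_2"
# Assumption: five output files, numbered 0 to 4, named in uppercase.
expected = [f"{task}_{i}" for i in range(5)]
# File names copied from the directory listing above.
present = [
    "BETA_E000055_090A_000d0s00y_2_0",
    "BETA_E000055_090A_000d0s00y_2_1",
    "beta_e000055_090a_000d0s00y_2_2",
    "beta_e000055_090a_000d0s00y_2_3",
    "BETA_E000055_090A_000d0s00y_2_3",
    "BETA_E000055_090A_000d0s00y_2_4",
]

print(find_case_mismatches(expected, present))
print(find_strays(expected, present))
```

On a case-sensitive Linux filesystem the lowercase _2_2 does not satisfy a lookup for the uppercase name, which would be consistent with the "absent" message. The check cannot tell whether the lowercase _2_3 really holds the _2 data; that part remains a guess.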
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I tried stopping and restarting one of mine, and it blew up exactly the same way as BobCat13's. The error trace is identical (even the stack trace addresses) to his.
log says:
21-Dec-2008 09:15:38 [World Community Grid] Restarting task BETA_E000055_079A_000d0s00n_0 using beta6 version 623
21-Dec-2008 09:15:39 [World Community Grid] Computation for task BETA_E000055_079A_000d0s00n_0 finished
21-Dec-2008 09:15:39 [World Community Grid] Output file BETA_E000055_079A_000d0s00n_0_2 for task BETA_E000055_079A_000d0s00n_0 absent
21-Dec-2008 09:15:39 [World Community Grid] Output file BETA_E000055_079A_000d0s00n_0_3 for task BETA_E000055_079A_000d0s00n_0 absent |
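For testers with several failed units, the "absent" lines can be pulled out of a saved message log with a short script. This is only a sketch: it assumes the wording "Output file X for task Y absent" seen in the logs posted here, and the names ABSENT_RE and absent_outputs are made up for the example.

```python
import re

# Pattern based on the message wording seen in this thread (an assumption,
# not a documented log format).
ABSENT_RE = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_text):
    """Group the output files reported as absent by the task they belong to."""
    result = {}
    for output_file, task in ABSENT_RE.findall(log_text):
        result.setdefault(task, []).append(output_file)
    return result


sample = """\
21-Dec-2008 09:15:39 [World Community Grid] Computation for task BETA_E000055_079A_000d0s00n_0 finished
21-Dec-2008 09:15:39 [World Community Grid] Output file BETA_E000055_079A_000d0s00n_0_2 for task BETA_E000055_079A_000d0s00n_0 absent
21-Dec-2008 09:15:39 [World Community Grid] Output file BETA_E000055_079A_000d0s00n_0_3 for task BETA_E000055_079A_000d0s00n_0 absent
"""

for task, files in absent_outputs(sample).items():
    print(task, files)
```

The same pattern also matches the pipe-separated log format in BobCat13's post, since it only looks at the message text.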
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
Thank you BobCat13 and Kremmen for highlighting this specific failure when restarting a WU. I hope that will help the techs find what is wrong. It looks like a problem with writing one of the necessary files, but I am just guessing...
Cheers. Jean. |
||
|
|
|