Thread Status: Active
Total posts in this thread: 6
This topic has been viewed 1455 times and has 5 replies
KLT Associates
Cruncher
USA
Joined: Dec 1, 2008
Post Count: 16
Tasks that fail

We hosted a task that ran for over 42 hours and failed to finalize its results. It then started over, ran for another 42 hours, and again failed to finalize its results. When it started over yet again, we aborted the task.

Apparently we don't get credit for those 84 wasted CPU hours???
[Dec 22, 2008 12:06:09 PM]
bieberj
Senior Cruncher
United States
Joined: Dec 2, 2004
Post Count: 406
Re: Tasks that fail

The best you can do is to post the contents of the Messages tab for the entire work unit here and see if it is a known problem. At the least, a community advisor will address it.
[Dec 22, 2008 2:16:13 PM]
KLT Associates
Cruncher
USA
Joined: Dec 1, 2008
Post Count: 16
Re: Tasks that fail

Here is a sample of the messages:

12/21/2008 12:17:56 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 12:17:56 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 12:17:56 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
12/21/2008 1:07:14 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 1:07:14 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 1:07:14 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
12/21/2008 1:55:52 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 1:55:52 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 1:55:52 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
12/21/2008 2:45:46 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 2:45:46 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 2:45:46 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
12/21/2008 3:35:56 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 3:35:56 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 3:35:56 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
12/21/2008 4:25:03 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 4:25:03 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 4:25:03 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
12/21/2008 5:13:47 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 5:13:47 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 5:13:47 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
12/21/2008 6:03:37 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 6:03:37 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 6:03:37 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
12/21/2008 6:51:54 AM|World Community Grid|Task E000044_589A_00055i00x_1 exited with zero status but no 'finished' file
12/21/2008 6:51:54 AM|World Community Grid|If this happens repeatedly you may need to reset the project.
12/21/2008 6:51:55 AM|World Community Grid|Restarting task E000044_589A_00055i00x_1 using cep1 version 619
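
For anyone who wants to spot this loop without reading the whole Messages tab by eye, here is a minimal Python sketch that counts how often each task logged the "no 'finished' file" message. It assumes the messages have been saved as plain text in the same `timestamp|project|message` layout shown above. The function names and the threshold of 3 are made up for illustration and are not part of BOINC or WCG.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the message seen in the log above and captures the task name.
NO_FINISHED = re.compile(
    r"Task (\S+) exited with zero status but no 'finished' file"
)


def parse_line(line):
    """Split one 'timestamp|project|message' line.

    Returns (datetime, project, message), or None if the line does not
    have three fields or the timestamp is not in the expected format.
    """
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return None
    stamp, project, message = parts
    try:
        # Assumed format, taken from the sample: 12/21/2008 1:07:14 AM
        when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    except ValueError:
        return None
    return when, project, message


def count_failed_exits(lines):
    """Count 'no finished file' exits per task name."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        match = NO_FINISHED.search(parsed[2])
        if match:
            counts[match.group(1)] += 1
    return counts


def looping_tasks(lines, threshold=3):
    """Return task names that hit the message at least `threshold` times.

    The default threshold is an arbitrary illustration, not a BOINC rule.
    """
    counts = count_failed_exits(lines)
    return sorted(task for task, n in counts.items() if n >= threshold)
```

Calling `looping_tasks(open("messages.txt"))` on a saved copy of the log (the file name is hypothetical) would list the tasks that keep restarting, which is a cue to abort them early rather than after 40+ hours.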
[Dec 22, 2008 5:13:02 PM]
E. Frijters
Senior Cruncher
The Netherlands
Joined: Apr 26, 2007
Post Count: 228
Re: Tasks that fail

I encountered the same thing. I recently aborted 2 WUs that had run 80 hours or more before I recognised the behaviour (because of the weekend). The same message appeared.

I don't mind wasting a short-running WU, but wasting 80 hours is a bit harsh when you think of the number of HCC WUs I could have processed.
----------------------------------------
Former grid.org slave


[Dec 22, 2008 5:21:27 PM]
KLT Associates
Cruncher
USA
Joined: Dec 1, 2008
Post Count: 16
Re: Tasks that fail

Thanks for sharing that. We are new at WCG. At least now we know it's not something uniquely wrong on our end. We've decided not to take any more WUs from CEP until someone from CEP clears this up for us.
[Dec 22, 2008 6:24:02 PM]
E. Frijters
Senior Cruncher
The Netherlands
Joined: Apr 26, 2007
Post Count: 228
Re: Tasks that fail

Welcome KLT Associates!

I just wanted to let you know that this is not common here at WCG. It is probably some bug that got past the scientists without being squashed...

The service towards crunchers is the best I've seen in grid computing so far...

(so I expect a solution soon)

[update]Please follow this link:
https://secure.worldcommunitygrid.org/forums/wcg/viewthread?thread=22828
[/update]
----------------------------------------
[Edit 2 times, last edit by E. Frijters at Dec 24, 2008 11:59:43 AM]
[Dec 23, 2008 1:54:30 PM]