Thread Status: Active
Total posts in this thread: 13
This topic has been viewed 2509 times and has 12 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Concerning the 'Bad' WUs [Resolved]

If one of the bad WUs is on a remote machine that I cannot access, will it abort automatically once it is past its sell-by date?

I am thinking about the problem of it running endlessly...
----------------------------------------
[Edit 2 times, last edit by Former Member at Jan 13, 2009 7:34:34 AM]
[Jan 13, 2009 6:26:55 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Concerning the 'Bad' WUs

The techs are working out which tasks they can abort safely.
[Jan 13, 2009 6:30:44 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Concerning the 'Bad' WUs

The client contacts the servers every 3 days and WCG is sending a mass cancel per knreed. See known issues forum. Besides, all jobs have a wallclock limit as I understand it.
[Jan 13, 2009 6:32:44 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Concerning the 'Bad' WUs

The client contacts the servers every 3 days and WCG is sending a mass cancel per knreed. See known issues forum. Besides, all jobs have a wallclock limit as I understand it.


Does it contact the servers every three days even if there is nothing to report and no new tasks to request?
[Jan 13, 2009 7:23:20 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Concerning the 'Bad' WUs

Exactly.
[Jan 13, 2009 7:24:29 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Concerning the 'Bad' WUs

Cheers, resolved.
[Jan 13, 2009 7:33:30 AM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1410
Status: Offline
Re: Concerning the 'Bad' WUs

The client contacts the servers every 3 days and WCG is sending a mass cancel per knreed. See known issues forum. Besides, all jobs have a wallclock limit as I understand it.

It is not a wallclock limit but a CPU-time limit, or more precisely the rsc_fpops_bound: the maximum number of floating-point operations.
And the CPU time stays at ZERO :( and I do not know whether fpops are counted at all with this bug.
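
To make the arithmetic concrete, here is a minimal sketch of how an fpops bound could translate into a CPU-time cap. It assumes the client simply divides rsc_fpops_bound by the host's benchmarked speed; that division, the function names, and every number below are assumptions for illustration, not taken from the BOINC source or from WCG.

```python
def max_cpu_seconds(rsc_fpops_bound: float, host_fpops_per_sec: float) -> float:
    """CPU-time cap implied by an fpops bound (assumed: bound / host speed)."""
    if host_fpops_per_sec <= 0:
        raise ValueError("host speed must be positive")
    return rsc_fpops_bound / host_fpops_per_sec


def would_abort(cpu_seconds_used: float, cap_seconds: float) -> bool:
    """True once the accumulated CPU time exceeds the cap."""
    return cpu_seconds_used > cap_seconds


# Hypothetical values, chosen only to make the numbers easy to follow.
estimated_fpops = 3.0e13          # stand-in for rsc_fpops_est
bound = 10 * estimated_fpops      # stand-in for rsc_fpops_bound at 10x the estimate
host_speed = 1.5e9                # stand-in for the benchmarked fpops per second

cap = max_cpu_seconds(bound, host_speed)
print(f"CPU-time cap: {cap / 3600:.1f} h")

# A task whose CPU-time counter is stuck at zero never reaches the cap.
print(would_abort(0.0, cap))
```

If the cap really is compared against accumulated CPU time, this would explain the worry above: a counter stuck at zero can never trip it, no matter how long the task sits there.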
[Jan 13, 2009 8:23:14 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Concerning the 'Bad' WUs

No, really, there is a wallclock/date kill switch besides the fpops control. I think I read that it is 3 weeks, but I can't find anything explicit at the moment.

Edit: I can't see anything about wallclock in client_state.xml, but I find that Human Proteome Folding 2 has a stop setting of 8x the estimated fpops and FightAIDS has one of 10x. There is nothing there for the other projects, but I think Didactylos wrote that it was 10x by default.
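
For anyone who wants to check the multiplier on their own machine, a rough sketch follows. It assumes each <workunit> entry in client_state.xml carries <name>, <rsc_fpops_est> and <rsc_fpops_bound> fields, and that the file parses as well-formed XML, which may not hold for every client version. The sample data and workunit names are made up.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for a fragment of client_state.xml.
SAMPLE = """<client_state>
  <workunit>
    <name>hypothetical_hpf2_wu</name>
    <rsc_fpops_est>1.0e13</rsc_fpops_est>
    <rsc_fpops_bound>8.0e13</rsc_fpops_bound>
  </workunit>
  <workunit>
    <name>hypothetical_faah_wu</name>
    <rsc_fpops_est>2.0e13</rsc_fpops_est>
    <rsc_fpops_bound>2.0e14</rsc_fpops_bound>
  </workunit>
</client_state>"""


def fpops_ratios(xml_text: str) -> dict:
    """Map each workunit name to rsc_fpops_bound / rsc_fpops_est."""
    root = ET.fromstring(xml_text)
    ratios = {}
    for wu in root.iter("workunit"):
        name = wu.findtext("name")
        est = wu.findtext("rsc_fpops_est")
        bound = wu.findtext("rsc_fpops_bound")
        if name is None or est is None or bound is None:
            continue  # skip entries that lack any of the three fields
        est_value = float(est)
        if est_value > 0:
            ratios[name] = float(bound) / est_value
    return ratios


for wu_name, ratio in fpops_ratios(SAMPLE).items():
    print(f"{wu_name}: bound is {ratio:g}x the estimate")
```

To run it against a real file, read the file's text and pass it to fpops_ratios instead of SAMPLE.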
----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 13, 2009 8:39:06 AM]
[Jan 13, 2009 8:30:12 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Concerning the 'Bad' WUs

lavaflow, you are thinking of the United Devices system, which had a 3 week limit in addition to the runtime limit.

In the BOINC system, things can go very wrong when tasks don't run properly. BOINC can be too trusting. It will run tasks beyond their deadline.
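
Since the client may not stop an overdue task by itself, one way to spot such tasks is to compare each deadline with the current time. This is only a sketch: it assumes each <result> entry in client_state.xml has a <name> and a <report_deadline> given in seconds since the Unix epoch, and the sample data and task names are invented.

```python
import time
import xml.etree.ElementTree as ET

# Invented stand-in for a fragment of client_state.xml.
SAMPLE = """<client_state>
  <result>
    <name>hypothetical_task_a</name>
    <report_deadline>1231000000.000000</report_deadline>
  </result>
  <result>
    <name>hypothetical_task_b</name>
    <report_deadline>1233000000.000000</report_deadline>
  </result>
</client_state>"""


def overdue_results(xml_text: str, now: float = None) -> list:
    """Return (task name, days past deadline) for every overdue task."""
    now = time.time() if now is None else now
    root = ET.fromstring(xml_text)
    overdue = []
    for res in root.iter("result"):
        name = res.findtext("name")
        deadline = res.findtext("report_deadline")
        if name is None or deadline is None:
            continue  # skip entries without a name or a deadline
        seconds_late = now - float(deadline)
        if seconds_late > 0:
            overdue.append((name, seconds_late / 86400.0))
    return overdue


# Fixed "now" so the example is reproducible.
for task, days in overdue_results(SAMPLE, now=1231864000.0):
    print(f"{task} is {days:.1f} days past its deadline")
```

This only reports which tasks are late; aborting them still has to happen through the client or from the server side.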
[Jan 13, 2009 1:35:25 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Concerning the 'Bad' WUs

lavaflow, you are thinking of the United Devices system, which had a 3 week limit in addition to the runtime limit.

In the BOINC system, things can go very wrong when tasks don't run properly. BOINC can be too trusting. It will run tasks beyond their deadline.


So you are saying it will continue to run a WU even though it is days beyond its 'return' date... If so, this is NOT resolved.
[Jan 13, 2009 2:05:36 PM]