Total posts in this thread: 11
This topic has been viewed 1907 times and has 10 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Time outs and Slow Computers.

Hi everyone, I have a question (or two).

Do we get points for Work Units that "Time Out"?

[Jan 27, 2005 4:21:31 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Time outs and Slow Computers.

Do we get points for Work Units that "Time Out"?


I have had a look at this, and the grid.org community FAQ states:


What are WU timeouts

A WU may or may not have a timeout associated with it. All WUs for a specific protein are split into batches. A batch may or may not have a timeout associated with it, but all WUs within a batch share whatever timeouts have been defined for that batch.

There are 2 timings that the agent checks against:

- CPU time: The time spent processing a molecule as shown on the UD agent Task CPU Time

- Wallclock time: The length of time a WU remains valid on your PC. Timing starts from the date/time the WU was downloaded.

There are several known timeout ranges although the first seems to be the only one used:

[snip - different timeouts for us]

Both CPU and wallclock timeouts apply to a WU.

If a WU takes longer than the CPU timeout to process (Task CPU time) the WU is aborted.

If the WU has been on your system for longer than the wallclock time, it will be aborted. If it has not been processed yet, it will be aborted as soon as processing starts (i.e. the UD Monitor situation).
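
As a rough illustration of the two checks described above, here is a minimal Python sketch. The function name, parameter names and example limits are assumptions made for illustration only; the FAQ does not show the agent's actual code, and the real timeout values were snipped above.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_abort(cpu_time: timedelta,
                 downloaded_at: datetime,
                 now: datetime,
                 cpu_timeout: Optional[timedelta],
                 wallclock_timeout: Optional[timedelta]) -> bool:
    """Return True if either timeout defined for the batch has been exceeded."""
    # A batch may have no timeout at all; None stands for "no limit" here.
    if cpu_timeout is not None and cpu_time > cpu_timeout:
        return True
    # Wallclock timing starts from the moment the WU was downloaded.
    if wallclock_timeout is not None and now - downloaded_at > wallclock_timeout:
        return True
    return False


# Hypothetical limits, chosen only for this example.
downloaded = datetime(2005, 1, 10, 12, 0, 0)
expired = should_abort(cpu_time=timedelta(hours=30),
                       downloaded_at=downloaded,
                       now=datetime(2005, 1, 25, 12, 0, 0),
                       cpu_timeout=timedelta(hours=100),
                       wallclock_timeout=timedelta(days=14))
print(expired)
```

In this example the WU has used little CPU time but has sat on the PC for 15 days, so the wallclock check alone triggers the abort.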

[snip]



What happens when a WU times out

An aborted WU is returned to UD where it may be split up into smaller WUs and sent out again. However, if there have been enough completed results for that WU from other PCs it won't be split or sent out again.

For an aborted WU you will get credit for the CPU time and the corresponding points for that time. You will not be credited with a result in your stats as the WU will not have been completed.
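
The crediting rule in the paragraph above could be sketched as follows. This is only an illustration of the wording: the `MemberStats` class, the `credit_work_unit` function and the points-per-hour rate are hypothetical, since the FAQ does not say how points are actually calculated.

```python
from dataclasses import dataclass


@dataclass
class MemberStats:
    cpu_hours: float = 0.0
    points: float = 0.0
    results: int = 0


def credit_work_unit(stats: MemberStats, cpu_hours: float,
                     points_per_cpu_hour: float, completed: bool) -> MemberStats:
    """Credit CPU time and points; count a result only if the WU completed."""
    # CPU time and the corresponding points are credited either way.
    stats.cpu_hours += cpu_hours
    stats.points += cpu_hours * points_per_cpu_hour
    # An aborted WU was never completed, so it does not count as a result.
    if completed:
        stats.results += 1
    return stats


# Hypothetical rate of 10 points per CPU hour, used only for this example.
stats = MemberStats()
credit_work_unit(stats, cpu_hours=3.5, points_per_cpu_hour=10.0, completed=False)
print(stats)
```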

You may also see a large download occurring when a WU aborts (700k+). This is because an aborted WU is treated the same as a corrupted WU. This will cause a new copy of the THINK program to be downloaded with the next WU.


So it would appear that with their client you get credit for the CPU time spent on aborted WUs (whether aborted manually or by a timeout), but you will not be credited with a result.

I cannot see any reason why our client would be any different but one of the tech support guys will have to confirm.

It may also be beneficial if some of the detail contained within the grid.org FAQ could be replicated here (once all the detail has been checked). Obviously, permission to use it would need to be sought.
[Jan 27, 2005 9:25:38 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Time outs and Slow Computers.

Hi Stuart,
I think you may have misinterpreted the wording here.
I believe you do not get your points if you manually abort a Work Unit yourself.
I think the Work Unit has to expire under the 14 days + 1 sec rule for you to get credit for it.
It has to send the expiration info back to the server to get credit.
I will attempt to get this clarified for you.
----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 27, 2005 10:19:50 AM]
[Jan 27, 2005 10:17:37 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Time outs and Slow Computers.

Graham,

It is more than possible that I have misunderstood, but the FAQ states:

For an aborted WU you will get credit for the CPU time and the corresponding points for that time. You will not be credited with a result in your stats as the WU will not have been completed.


so I was assuming that this applied whether the abort was manual or not.

Whilst you are seeking clarification, can you try to find out whether WUs are reassigned when they are aborted for whatever reason, as per the FAQ:

An aborted WU is returned to UD where it may be split up into smaller WUs and sent out again. However, if there have been enough completed results for that WU from other PCs it won't be split or sent out again.

[Jan 27, 2005 10:42:29 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Time outs and Slow Computers.

For an aborted WU you will get credit for the CPU time and the corresponding points for that time. You will not be credited with a result in your stats as the WU will not have been completed.
so I was assuming that this applied whether the abort was manual or not.

I read this to mean aborted by Rosetta.exe, not aborted by the user.
A good point to get clarified, though!
Regards
[Jan 27, 2005 10:50:43 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Time outs and Slow Computers.

Stuart, Graham,

We will see what is true on Monday, when my WU hits the 14 days + 1 sec limit and is returned before reaching completion.
It is now at 65.4% after 259 hrs 39 min of runtime, as of 12:45 CET (11:45 GMT).

You see, Graham, I behaved...
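
For anyone wanting to check the arithmetic behind "returned before reaching completion", here is a small Python sketch using the figures quoted in this post. It assumes progress is linear in runtime and that the reported runtime counts against the 14-day limit; neither is confirmed in this thread, and the function name is made up for the example.

```python
def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Estimate total runtime, assuming progress is linear in runtime."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


elapsed = 259 + 39 / 60          # 259 hrs 39 min, from the post
total = projected_total_hours(elapsed, 0.654)
limit = 14 * 24                  # 14 days expressed in hours

print(f"Projected total: {total:.1f} hrs, limit: {limit} hrs")
print("Finishes before the limit" if total <= limit else "Times out first")
```

Under those assumptions the WU would need roughly 397 hours in total, more than the 336 hours in 14 days.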
[Jan 27, 2005 11:47:51 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Time outs and Slow Computers.

Frans,

That will answer the question for automatic aborts but not for manual ones. I may try that with one after my current WU is finished.

I will let it run for a period (up to about 10%), kill it and see if my points go up.

I would also like to know if aborted WUs are reallocated.
[Jan 27, 2005 1:29:11 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Time outs and Slow Computers.

Back in November 2004, we had a bug in Rosetta. When it decided that a particular Work Unit would not converge, it aborted the Work Unit. The bug kept it from reporting the reason back to the server. Pretty soon we were mostly running copies of non-convergent Work Units as the server tried to get results on them. There are probably all sorts of different cases, but it is obvious that the server reacts quickly to try to get results.

Also, I seem to recall a staff post saying that a manual abort was not awarded points while a program-induced abort was, but there are a number of different cases and my memory is shaky. Try it and we will find out for ourselves!!

Lawrence
[Jan 27, 2005 4:54:15 PM]
RT
Master Cruncher
USA - Texas - DFW
Joined: Dec 22, 2004
Post Count: 2636
Status: Offline
Re: Time outs and Slow Computers.

Very good info. Suitable for a FAQ article when answers are given.
----------------------------------------
One of your friends in Texas
RT Website Hosting

[Jan 27, 2005 4:56:46 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
OK I Did it!

OK, I did it.
I killed two in quick succession.
The second one showed a 101-point gain at 3.5 hrs.
Much as I hate to admit it, this may show that slower crunchers that are going to time out
may just as well do a manual return so that a faster PC gets another chance at the Unit. (I got points.)
Waiting 2 weeks for a timeout is (sorry) a waste of a cruncher that
would otherwise be doing a Unit better suited to its capacity.
But be sure that the wisdom needed to offer this as advice is not with this poster!
After the three timeout posts in a row, I wanted to say "dump them"
but needed an opinion.
So I got points. But does that hurt the project or help it?
That is the motivation for me.

Bob
----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 29, 2005 8:26:14 PM]
[Jan 29, 2005 8:25:25 PM]