Thread Status: Active
Total posts in this thread: 5
This topic has been viewed 855 times and has 4 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Filter out bad WUs before posting them

I spent over 67 hours running a WU (on my machine alone), and it is now listed as an error on every computer that ran it. No science, no credits, just wasted computer time. Is there a way to filter out bad WUs before they are sent out? Or should we just switch to other projects?

WU: 10001358-10001450

<message>
Maximum CPU time exceeded
</message>
<stderr_txt>
About to call graphics init
Failed to get VersionInfo size: 1812

sad
[Mar 28, 2007 1:15:26 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Filter out bad WUs before posting them

WCG sends out 'Cancel' messages to hosts so that the bad work units are aborted, either manually (5.4.x) or automatically (5.8.x). That only works if the client initiates contact with the server, for instance through a scheduled connection or by hitting the Update button in the Projects tab of BOINC!

Further coding has been added for a future release to improve the process.

It's unfortunate, but if members do not read the fora or the message logs, or are simply off-line for longer periods, it becomes very hard to do anything to stop these rogue processes.

sorry

Added: Obviously, the recommendation is to upgrade post-haste to the latest version 5.8.15 (with WCG skin for Windows):
  • BOINC 5.8.15 installation files for Linux
  • BOINC 5.8.15 installation files for Mac
  • BOINC 5.8.15 installation files for Windows

----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Mar 28, 2007 2:01:36 PM]
[Mar 28, 2007 1:33:59 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Filter out bad WUs before posting them

And if the machine is off site? Do all of the "projects" have this problem, or is it just Genome Comparison?
In other words, which projects don't have this problem, so that I can run only those?
Thanks in advance!
[Mar 28, 2007 11:57:14 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Filter out bad WUs before posting them

Bad work units on any WCG project are very rare. Unfortunately, there is no way to detect them beforehand.

However, when it does happen, WCG can act fast. They look at the failed work units and fix the problem. As far as I know, the recent issue with Genome Comparison was already fixed several days ago.

The bad work units should time out automatically (as yours did), so the worst possible case is you lose a little crunching time. And for that, I apologise.
[Mar 29, 2007 12:06:36 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Filter out bad WUs before posting them

..... And if the machine is off site?.....
Thanks in advance!


The logical conclusion would be that off-site machines have a contact schedule, and that on a scheduled contact they fetch the abort message from the project server for the bad work units, whether those are 'started/in progress' or in 'ready to run' state. AGAIN, that happens only when the host contacts the server. The thing to do here is to upgrade those off-site machines to the suggested version ASAP, so that the abort is carried out automatically.

If such machines go on-line on a schedule, I would propose setting the 'connect to' interval either to the WCG default of 0.3 days (every 7.2 hours) or, if you are concerned about keeping the machine crunching while, for instance, the network is down for a longer period, to 1 day. Obviously, in the worst case, that would let bad WUs run amok for the length of that interval.
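
For anyone who wants to check the arithmetic for their own setting, here is a rough sketch. It is not part of BOINC; the function name is made up for illustration, and it assumes the host really does contact the server once per 'connect to' interval.

```python
HOURS_PER_DAY = 24


def connect_interval_hours(days: float) -> float:
    """Convert a 'connect to' interval given in days into hours.

    Hypothetical helper for illustration only, not a BOINC setting or API.
    """
    return days * HOURS_PER_DAY


# Assuming one server contact per interval, the interval is also roughly the
# longest a bad WU can keep running before a cancel message reaches the host.
for days in (0.3, 1.0):
    hours = connect_interval_hours(days)
    print(f"{days} days -> up to {hours:.1f} hours between contacts")
```

So a longer interval buys more work in the buffer during an outage, at the price of a longer wait before a cancel can arrive.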
[Edit 2 times, last edit by Sekerob at Mar 29, 2007 6:11:06 PM]
[Mar 29, 2007 8:26:55 AM]