Total posts in this thread: 106
This topic has been viewed 548858 times and has 105 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

nanoprobe, what's the deadline on these?

September 10th

And FWIW it's happening on 3 different crunchers.

Now it gets even stranger. One cruncher stopped 6 Leishmaniasis WUs that had been running for a total of 4 hours and started 6 new ones. All with the Sept. 10th due date.


Why this could happen depends on:

What's your connect setting?
What's your additional buffer setting?
What's the Switch between apps time setting?

These 3 work together and can push the panic state envelope. With the initial run time estimates for DSFL being uncertain, the totals could have been inflated. Certainly the first 6 I received had a 4 day deadline. The later ones had 10 days, and all of them came with 8:20 hours TTC (time to completion). They completed in ~4:30, and after half a dozen results the ones waiting in the cache still show 6:42 TTC. Everything runs orderly here with a 2 day cache (to force a fetch at all, normally 1 day) and connect 0.0 days.

--//--
[Sep 1, 2011 5:28:44 AM]
KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

That type of behavior happens to me all the time. BOINC is supposed to complete the units with the earliest deadline when it goes into panic mode, but it does a really, really bad job of determining which work units to crunch.

Typically, this will occur after a particularly long-running result. This pushes the expected run time of everything in the queue much higher, triggering high-priority running. From this point it is speculation on my part, but it appears that BOINC calculates the number of results that it expects to be able to complete before the deadline (based on estimated run times) and jumps ahead of everything that it assumes will not meet the deadline. This results in it abandoning its currently running units, jumping to a seemingly random point in the queue, and crunching from that point. Once the estimated run times get back to normal, it will pick up the units in progress and start back at the top of the queue.
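
To make the deadline reasoning above concrete, here is a minimal sketch of an earliest-deadline-first projection. It is not BOINC's actual scheduler code; the `Task` class, the `tasks_at_risk` function, and all of the numbers are hypothetical and only illustrate how inflated estimates alone can push a queue into "at risk" territory.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    est_remaining_h: float  # estimated CPU hours still needed
    deadline_h: float       # hours from now until the deadline


def tasks_at_risk(tasks, n_cpus, safety_margin_h=0.0):
    """Return names of tasks projected to miss their deadline.

    Tasks are taken in earliest-deadline-first order and each one is
    handed to whichever CPU frees up first (a simplified model).
    """
    cpu_free_at = [0.0] * n_cpus
    at_risk = []
    for task in sorted(tasks, key=lambda t: t.deadline_h):
        cpu = min(range(n_cpus), key=lambda k: cpu_free_at[k])
        finish = cpu_free_at[cpu] + task.est_remaining_h
        cpu_free_at[cpu] = finish
        if finish > task.deadline_h - safety_margin_h:
            at_risk.append(task.name)
    return at_risk


# Hypothetical queue: 20 tasks, 4 day (96 h) deadline, 2 CPUs.
normal = [Task(f"wu{i}", 4.5, 96.0) for i in range(20)]
inflated = [Task(f"wu{i}", 12.0, 96.0) for i in range(20)]

print(tasks_at_risk(normal, n_cpus=2))    # nothing at risk
print(tasks_at_risk(inflated, n_cpus=2))  # the tail of the queue is at risk
```

With the same queue and the same deadlines, only the estimates changed, yet the last few tasks now look like they will miss the deadline. That is the kind of situation in which the client would switch to high-priority running.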

Not much that you can do about it, just wait for things to sort out and they will over time. As Sekerob said, new projects are more likely to have inaccurate run time estimates.
----------------------------------------

Distributed computing volunteer since September 27, 2000
[Sep 1, 2011 5:43:26 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

There's an amount of "try" involved, and then it does it with the jobs that are at the end of the queue (which looks random to the viewer when the displayed times of those are the same). If it happens all the time, then the answer is in the combined settings of the 3 items listed plus the inflation effect, which is mostly in play when run times are highly variable (HCMD2, e.g.). That drives the rDCF up the wall.
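
For anyone wondering what the inflation effect looks like in numbers, below is a rough sketch of a duration correction factor (DCF) that jumps up at once after an overrun and drifts down slowly otherwise. This is a simplified model, not the client's source: the function name `update_dcf`, the 0.9/0.1 weights, and the starting value of 1.0 are assumptions made for illustration.

```python
def update_dcf(dcf, raw_estimate_min, actual_min):
    """Return the new correction factor after one finished result.

    Assumed behaviour: an overrun raises the factor immediately,
    while shorter-than-expected results only pull it down by 10%
    of the gap each time.
    """
    ratio = actual_min / raw_estimate_min
    if ratio > dcf:
        return ratio
    return 0.9 * dcf + 0.1 * ratio


RAW_ESTIMATE_MIN = 500  # 8:20, the initial TTC quoted earlier in the thread
dcf = 1.0               # assumed starting value

# Six results that each finish in about 4:30 (270 minutes).
for _ in range(6):
    dcf = update_dcf(dcf, RAW_ESTIMATE_MIN, 270)
print(round(dcf * RAW_ESTIMATE_MIN))  # shown estimate in minutes, roughly 6.5 hours

# One hypothetical 20 hour result then inflates every queued estimate at once.
dcf = update_dcf(dcf, RAW_ESTIMATE_MIN, 1200)
print(dcf * RAW_ESTIMATE_MIN)
```

Under these assumptions the estimate after six results lands in the same ballpark as the 6:42 TTC reported earlier in the thread, though that may be coincidence. The point is the asymmetry: one long result inflates the whole queue immediately, and it takes many normal results to bring the estimates back down.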

Mind you, I've tried to force the issue a few times under 6.12, and it was not easy to get it into panic. The scheduling logic seems to be improved quite a bit, for those with larger settings.

--//--
[Sep 1, 2011 5:59:21 AM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

Hi nanoprobe

Mine have jumped the queue as well

They are working in the same way as the C4CW's

I had HFCC's ticking along nicely with earlier due dates

All now waiting to run

Still great new project smile

I've never had this issue before. DSFL seems to completely ignore the "switch between applications every xxxx minutes" setting by at least 4 days. It's only happening on my Intel crunchers. I've had to manually suspend numerous DSFL WUs to let other projects finish WUs that were stopped with only minutes left to completion. confused Very much a PITA. My 2 AMD quads have not had this issue. thinking
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Sep 1, 2011 10:11:15 AM]
BladeD
Ace Cruncher
USA
Joined: Nov 17, 2004
Post Count: 28976
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

Welcome to WCG, Carlos and the Drug Search for Leishmaniasis (DSFL) project!

I don't remember welcoming a new project to WCG on its FIRST day! wink
----------------------------------------
[Edit 1 times, last edit by BladeD at Sep 1, 2011 10:52:26 AM]
[Sep 1, 2011 10:51:09 AM]
coolstream
Senior Cruncher
SCOTLAND
Joined: Nov 8, 2005
Post Count: 475
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

I have just received two resends of WUs that had previously errored out. They are
https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=329165940
and
https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=329165914

Both had the same exit message (which I have never seen before)
<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
]]>

----------------------------------------

Crunching in memory of my Mum PEGGY, cousin ROPPA and Aunt AUDREY.
[Sep 1, 2011 12:22:20 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

Regrettably, no members except those in the actual quorum can see the details.

Generally that "too many exit(0)s" is a system instability error... too busy and regressing to the previous checkpoint too many times.

6.10.17 hints, if I'm not mistaken, at Linux? It's never been an ultra-stable version. Going by the Launchpad listing of BOINC packages, from Ubuntu Lucid (10.04) on it's at least 6.10.58, as recommended.

Summary: Don't think you'll have trouble with these.

--//--
[Sep 1, 2011 12:36:35 PM]
coolstream
Senior Cruncher
SCOTLAND
Joined: Nov 8, 2005
Post Count: 475
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

Sorry about the links. I keep forgetting that they are https.

It would be very strange if these had been processed on a Linux box because they have been sent to my Win7 64-bit machine. That's unless the 'same platform' rule is being ignored.
----------------------------------------

Crunching in memory of my Mum PEGGY, cousin ROPPA and Aunt AUDREY.
[Sep 1, 2011 12:41:34 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

:D

Ah, all the more reason for the wingman to go get an upgrade, as that version was short-lived on Windows. It's just that 6.10.17 seems to be a frequent wingman on my Linux box, so, [mistaken as always], I guessed it to be a Linux rig. Knowing the platform etc. helps us to not stray into that territory ;>)

--//--
[Sep 1, 2011 1:04:30 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Welcome to the Drug Search for Leishmaniasis (DSFL) project

P.S. AFAIK, Human Proteome Folding 2 is the only project to which "homogeneous redundancy" is not applied. Those "work was available, but only for other platforms" messages gave that away at launch time. I had it yesterday for a little while until uplinger said he'd upped the weight.

--//--
[Sep 1, 2011 1:08:29 PM]