Thread Status: Active
Total posts in this thread: 14
Posts: 14   Pages: 2   [ Previous Page | 1 2 ]
This topic has been viewed 3927 times and has 13 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: what is up with this?

Well, possible, but very unlikely. The last one lasted for 35 s so... yeah. I've gotten yet another Beta WU since my last post (I'm feeling rather lucky); it took about 35 min. So again, I'm uncertain whether those minute-long WUs can be considered grandchildren.
And aren't there posts about some WUs taking several hours?
...
Ah. http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=25710 I'd say this is one of the grandchildren, even if it's reported from a slow computer.

Judging by the WU names, I have even seen the 4th generation among my most recent Beta WUs:
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM1A.clustersOccur_ 496_ 905378_ 905595_ 905560_ 905581_ 0-- <- grand children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM1A.clustersOccur_ 359_ 655651_ 655861_ 655759_ 655779_ 2-- <- grand children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM1A.clustersOccur_ 325_ 593524_ 593718_ 593605_ 593624_ 1-- <- grand children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM1A.clustersOccur_ 313_ 571676_ 572048_ 571944_ 571978_ 1-- <- grand children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM2A.clustersOccur_ 407_ 141444_ 141477_ 141467_ 141469_ 141469_ 141469_ 0-- <- great grand children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM1A.clustersOccur_ 334_ 610396_ 610552_ 610414_ 610429_ 610422_ 610423_ 1-- <- great grand children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM1A.clustersOccur_ 357_ 651640_ 651839_ 0-- <- children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-ITB5A.clustersOccur_ 4679_ 116991_ 116993_ 1-- <- children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM1A.clustersOccur_ 298_ 544217_ 544432_ 544391_ 544432_ 544425_ 544428_ 1-- <- great grand children
BETA_ CMD2_ 0001-TNR1AA.clustersOccur-UGPA2A.clustersOccur_ 84_ 85355_ 85594_ 0-- <- children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-TPM1A.clustersOccur_ 262_ 478777_ 478946_ 1-- <- children
BETA_ CMD2_ 0001-GPDAA.clustersOccur-PYGM.clustersOccur_ 304_ 0-- <- original
BETA_ CMD2_ 0001-GPDAA.clustersOccur-PYGM.clustersOccur_ 20_ 3-- <- original
BETA_ CMD2_ 0001-GPDAA.clustersOccur-PYGM.clustersOccur_ 305_ 1-- <- original
BETA_ CMD2_ 0001-GPDAA.clustersOccur-PYGM.clustersOccur_ 302_ 0-- <- original
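
The labels in the list above follow a visible pattern: every generation seems to add one more pair of position numbers before the final number. Here is a minimal Python sketch of that reading. It is only inferred from the names in this post, not from any official description of the naming scheme. The function name `generation_of`, the interpretation of the first trailing number as an index and the last one as a copy number, and the treatment of the spaces after underscores as display artifacts are all assumptions.

```python
import re

# Labels as used in this post; deeper generations would need more entries.
GENERATION_LABELS = ["original", "children", "grandchildren", "great-grandchildren"]


def generation_of(wu_name: str) -> int:
    """Guess the generation of a WU from its name (0 = original, 1 = child, ...).

    Assumption: the name ends in _<index>(_<start>_<end>)*_<copy>, so each
    extra pair of numbers means one more generation.
    """
    # Drop the spaces shown after underscores and the trailing "--".
    compact = re.sub(r"\s+", "", wu_name).rstrip("-")
    # Grab the run of "_<digits>" groups at the very end of the name.
    match = re.search(r"((?:_\d+)+)$", compact)
    if match is None:
        raise ValueError(f"no numeric tail in WU name: {wu_name!r}")
    fields = match.group(1).split("_")[1:]
    pair_fields = len(fields) - 2  # everything except index and copy number
    if pair_fields < 0 or pair_fields % 2 != 0:
        raise ValueError(f"unexpected number of fields in: {wu_name!r}")
    return pair_fields // 2


if __name__ == "__main__":
    name = "BETA_ CMD2_ 0001-GPDAA.clustersOccur-PYGM.clustersOccur_ 304_ 0--"
    print(GENERATION_LABELS[generation_of(name)])
```

Applied to the fifteen names listed above, this counting rule gives the same label as the one written next to each name.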

OK, the WU names would get very long, but I would suggest splitting the remaining positions of a WU into fewer WUs (I would use only 2) and, if necessary, creating more generations instead.
This "splitting factor" depends on the number of hard-to-calculate positions and their distribution inside the original WU. Only someone who has more insight into the WUs can decide what the right value is.

Greetings
Thorsten
[May 25, 2009 8:30:16 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: what is up with this?

A beta is a beta: it is there to find out exactly what the best method is for this project. knreed posted somewhere that he'd come full circle and figured out how to do this without great-great-grandchildren and without ending up with single-position, 6-second work units in large numbers ;>)
[May 25, 2009 10:01:16 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: what is up with this?

The fifth WU in my list above was a great-grandchild and lasted only 9 s on my relatively slow machine. It was a one-position WU. I think it might not be possible to avoid these short one-position WUs if two adjacent positions range from very hard to calculate (>1 h) to very easy (a few seconds).
Another suggestion: a WU should always complete at least one position. From this thread it seems that WUs get aborted even if the calculation of the first position has not finished.
Only my 2ct.
Greetings
Thorsten
[May 25, 2009 11:39:30 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: what is up with this?

Workunits always compute at least one position. We only check whether or not to exit the workunit after the computation of a position (think of the position as the smallest indivisible unit of work: after each position, the % complete is updated and the termination check is made).
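
To make the ordering described above concrete, here is a minimal Python sketch of such a loop. It is an illustration, not the actual science application. The post does not say what the exit criterion is, so an elapsed-time limit is used purely as a stand-in, and `run_workunit`, `compute_position`, `report_progress`, `time_limit_s` and `clock` are hypothetical names.

```python
import time
from typing import Callable, List, Sequence


def run_workunit(
    positions: Sequence[int],
    compute_position: Callable[[int], None],
    time_limit_s: float,
    report_progress: Callable[[float], None] = lambda fraction: None,
    clock: Callable[[], float] = time.monotonic,
) -> List[int]:
    """Process positions in order and return the positions left undone.

    The exit check (here: an assumed time limit) runs only after a position
    has been fully computed, so the first position is always completed.
    """
    start = clock()
    for done, position in enumerate(positions, start=1):
        compute_position(position)              # never interrupted mid-position
        report_progress(done / len(positions))  # % complete updated per position
        if clock() - start >= time_limit_s:     # termination check happens here only
            return list(positions[done:])
    return []
```

Even with a limit of zero, the first position is computed before the check is reached; whatever the function returns is what would have to be handed on to a later generation.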

One of the issues that is huge for us but not readily apparent to the members is the time it takes a batch to complete. We get the data from the researchers in a batch and return a completed batch back to them. For many reasons it is good to minimize the time it takes for us to complete a batch. Each generation of workunits that is created based on an original workunit significantly extends the length of time that it will take to complete the batch. In this beta test we wound up having 5 generations of workunits. This was with a limit of 10 children created for each workunit. On the production system, the workunits will have a deadline of 10 days and it is likely to take 5*10+ days to complete a batch (the time it takes to complete a batch is determined by the hard-luck-case workunit that winds up going to computers that don't return the result and is the hardest to compute). 50-60 days is a very long time to wait for the results to complete. If we had placed a limit of 2 children, then it is very likely we would have had considerably more generations and thus an even longer time for the batch to complete.
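
The 5*10 figure above can be written out as a small Python sketch. It only restates the arithmetic from the post under the assumption that, in the hard-luck case, each generation uses up its full deadline before the next one can be issued; the function name `worst_case_batch_days` is made up for illustration, and the result is a lower bound on that case (the post says "5*10+").

```python
def worst_case_batch_days(generations: int, deadline_days: int) -> int:
    """Rough worst case: every generation waits out its whole deadline."""
    if generations < 1 or deadline_days < 0:
        raise ValueError("need at least one generation and a non-negative deadline")
    return generations * deadline_days


if __name__ == "__main__":
    # Numbers from the post: 5 generations in the beta, 10-day deadline in production.
    print(worst_case_batch_days(5, 10))
```

Under the same assumption, every additional generation adds another full deadline to the hard-luck case, which is the cost of a smaller split limit.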

We are likely to leave the maximum split of 10 children per workunit in place. In addition, we are likely going to mark workunits that are great-grandchildren or deeper generations as needing reliable hosts to finish. We expect that this will limit batch completion time to around 30 days.
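
As a rough illustration of the two rules above, here is a Python sketch. The post only gives the limit of 10 children and the "great-grandchildren or lower" rule; how the remaining positions are actually divided is not stated, so the even split into contiguous ranges is an assumption, and `split_remaining` and `needs_reliable_host` are hypothetical names.

```python
from typing import List, Tuple

MAX_CHILDREN = 10             # limit mentioned in the post
RELIABLE_FROM_GENERATION = 3  # 0 = original, 3 = great-grandchild


def split_remaining(first: int, last: int,
                    max_children: int = MAX_CHILDREN) -> List[Tuple[int, int]]:
    """Split the inclusive range [first, last] into at most max_children
    contiguous ranges of nearly equal size (assumed splitting strategy)."""
    count = last - first + 1
    if count <= 0:
        return []
    n = min(max_children, count)
    base, extra = divmod(count, n)
    ranges = []
    start = first
    for i in range(n):
        size = base + (1 if i < extra else 0)  # first `extra` ranges get one more
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def needs_reliable_host(generation: int) -> bool:
    """Great-grandchildren and deeper generations get flagged."""
    return generation >= RELIABLE_FROM_GENERATION
```

With fewer remaining positions than the limit, this sketch produces one-position children, which matches the very short WUs reported earlier in the thread.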
[May 26, 2009 3:00:25 PM]