World Community Grid Forums
Category: Completed Research Forum: FightAIDS@Home Phase 2 Thread: 24 Hour Deadline too Short |
Thread Status: Active Total posts in this thread: 112
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7574 Status: Offline Project Badges: |
I have one machine which runs FAAH2 exclusively. It runs 24/7. A little while back there was a problem with the supply of FAAH2 and I allowed a bunch of SCC1 units to run so the machine would not be idle. Once the supply of FAAH2 units resumed, they, of course, took priority over the SCC1 units. (FAAH2 is restricted to a maximum queue of one work unit per thread at all times.) I figured I would just let BOINC manage the queue, which it did without a hiccup; no units were late. When the deadline for the SCC1 units became less than the deadline for the FAAH2 units, they were run and the FAAH2 units were suspended until the FAAH2 deadline once again became less than that of the SCC1 units. All of the units were processed before their deadlines. I did not have to babysit the machine and manually push any through. It all happened automatically. I am running with only a 24 hour queue length.
I agree with the sentiment that this project is best suited for machines which can run 24/7. Not every project may be suitable for everybody's setup. Cheers
Sgt. Joe
*Minnesota Crunchers* |
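The queue behaviour described in this post, where whichever unit has the nearer deadline runs first, resembles earliest-deadline-first ordering. The Python sketch below illustrates only that idea. It is not BOINC's actual scheduler, which weighs more factors, and the `Task` class, the task names, and the times are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Task:
    name: str           # hypothetical task label
    project: str        # project short name, e.g. "FAAH2" or "SCC1"
    deadline: datetime  # report deadline for the task


def run_order(tasks):
    """Return the tasks sorted so the earliest deadline comes first."""
    return sorted(tasks, key=lambda t: t.deadline)


# Made-up reference time and deadlines, purely for illustration.
now = datetime(2018, 3, 1, 12, 0)
queue = [
    Task("faah2_a", "FAAH2", now + timedelta(hours=24)),
    Task("scc1_a", "SCC1", now + timedelta(hours=10)),
    Task("scc1_b", "SCC1", now + timedelta(hours=30)),
]

for task in run_order(queue):
    print(task.project, task.name, task.deadline)
```

With these example values the SCC1 unit due in 10 hours is listed first, then the FAAH2 unit, then the remaining SCC1 unit.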
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Quick point: everything I've posted in this thread is based on i7 machines that run 24*7*365.
|
||
|
wolfman1360
Senior Cruncher Canada Joined: Jan 17, 2016 Post Count: 176 Status: Offline Project Badges: |
Yes, even an old Core 2 Duo is completing these in time but, as said, the machine needs to be on 24/7.
I don't run it on a few machines for this very reason: if a machine isn't on all the time, the deadline is most likely missed. I wish they had set it longer, too. As someone pointed out, go on a day trip and the work unit is basically gone. And so many people set BOINC up and forget it and don't actually keep an eye on what's being processed. Are the betas still going on or did they end? Are there different numbers of steps for this project, e.g. ending in a certain number = a certain number of steps? I notice that some of the WUs take 4+ hours and some take an hour or 2. Thanks
Crunching for the betterment of human kind and the canines who will always be our best friends.
AWOU! |
||
|
asdavid
Veteran Cruncher FRANCE Joined: Nov 18, 2004 Post Count: 521 Status: Offline Project Badges: |
wolfman1360 wrote: I notice that some of the WUs take 4+ hours and some take an hour or 2.
At the moment, tasks named FAH2_xxxxxx with numbers up to 001769 have 10000 steps. Numbers from 001770 and higher have 50000 steps.
Anne-Sophie
|
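The rule in this post can be written as a small lookup. This is a minimal Python sketch that assumes the six-digit number follows the `FAH2_` prefix directly. The function name `steps_for_task` and the example task names are hypothetical, and the thresholds are only those reported "at the moment" in the post, so they may change.

```python
import re

# Values as reported in the post; they may change over time.
LAST_SHORT_NUMBER = 1769   # highest number reported with 10000 steps
SHORT_STEPS = 10_000
LONG_STEPS = 50_000


def steps_for_task(task_name: str) -> int:
    """Guess the step count from the six-digit number after 'FAH2_'."""
    match = re.match(r"FAH2_(\d{6})", task_name)
    if match is None:
        raise ValueError(f"unrecognised task name: {task_name!r}")
    number = int(match.group(1))
    return SHORT_STEPS if number <= LAST_SHORT_NUMBER else LONG_STEPS


# Hypothetical task names, for illustration only.
print(steps_for_task("FAH2_001769_example"))
print(steps_for_task("FAH2_001770_example"))
```

This would explain the difference in run times, since the higher-numbered tasks do five times as many steps.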
||
|
SuicideCabbage
Cruncher Joined: Jan 13, 2010 Post Count: 1 Status: Offline Project Badges: |
BOINC is about donating "idle" cycles, not about being a supercomputing cluster at the owner's expense. I recently switched a 64-core production server over to this project, but unfortunately am going to have to remove it. The server has work it must do; however, it's on 24/7 anyway, so I may as well do some good with its idle cycles. This is the only project within WCG or the greater BOINC network that I have had issues with meeting the deadline on this server.
I am not going to speak for the scientists behind this project, but my personal belief is that a deadline of 48h would be a happy medium between getting work back and getting work done. If hundreds of computers miss the deadline, the work has to be resent anyway, slowing down progress. If more could successfully return work, that would result in an overall increase in FLOP throughput, even if some individual unit trees are slowed down. Unless there is just not enough work without stringing the trees very quickly; if that's the case, then my entire post is irrelevant. |
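The resend argument can be put as rough arithmetic. The Python sketch below assumes every copy of a unit misses its deadline independently with the same probability, in which case the expected number of copies sent per returned result is 1 / (1 - p). Both the independence assumption and the example probabilities are hypothetical, not measurements from this project.

```python
def expected_sends(miss_probability: float) -> float:
    """Expected copies sent until one is returned on time.

    Assumes each copy misses independently with the same probability.
    """
    if not 0.0 <= miss_probability < 1.0:
        raise ValueError("miss_probability must be in [0, 1)")
    return 1.0 / (1.0 - miss_probability)


# Hypothetical miss rates, for illustration only.
for p in (0.1, 0.3, 0.5):
    print(f"miss rate {p:.0%}: {expected_sends(p):.2f} sends per result")
```

Under this model, anything that lowers the miss rate cuts the number of resends, which is the trade-off against slower individual unit trees that the post describes.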
||
|
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1261 Status: Offline Project Badges: |
I am aware they currently want to get all the work returned from the 2 experiments they have running so that the scientists can see how the new application is working. I think I read this information in the latest update. |
||
|
AgrFan
Senior Cruncher USA Joined: Apr 17, 2008 Post Count: 365 Status: Offline Project Badges: |
SuicideCabbage wrote: BOINC is about donating "idle" cycles, not about being a supercomputing cluster at the owner's expense. I recently switched a 64-core production server over to this project, but unfortunately am going to have to remove it. The server has work it must do; however, it's on 24/7 anyway, so I may as well do some good with its idle cycles. This is the only project within WCG or the greater BOINC network that I have had issues with meeting the deadline on this server.
Are you running only FAAH2? What is your work buffer setting? FAAH2 runs best as a dedicated project with a work buffer of 24 hours or less. Unfortunately, this is how the scientists have configured this project.
[Edit 1 times, last edit by AgrFan at Mar 3, 2018 2:26:33 PM] |
||
|
JimWork
Cruncher Canada Joined: Oct 11, 2005 Post Count: 35 Status: Offline Project Badges: |
Longer Deadlines Don't Work!
Example = Help Stop TB: 10 day lead times, and many end up being "No Reply". I got this WU on Feb 20 and finished it on Feb 24 using 16.44 hrs CPU. My wingman never showed up for the party, and 10 days later it was bounced to another machine which may or may not complete it on time. This is very common for this project. I wish they would cut its lead time to 5 days.

Workunit Status
Project Name: Help Stop TB
Created: 02/20/2018 02:03:07
Name: HST1_014856_000036_AC0001_T350_F00521_S00030
Minimum Quorum: 2
Replication: 2

Result Name: HST1_014856_000036_AC0001_T350_F00521_S00030_2--
OS type / OS version: Microsoft Windows 7 Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
App Version Number: -
Status: In Progress
Sent Time: 3/2/18 02:06:36
Time Due / Return Time: 3/5/18 14:06:36
CPU Time / Elapsed Time (hours): 0.00
Claimed / Granted BOINC Credit: 0.0 / 0.0

Result Name: HST1_014856_000036_AC0001_T350_F00521_S00030_0--
OS type / OS version: Microsoft Windows 7 x64 Edition, Service Pack 1, (06.01.7601.00)
App Version Number: -
Status: No Reply
Sent Time: 2/20/18 02:06:35
Time Due / Return Time: 3/2/18 02:06:35
CPU Time / Elapsed Time (hours): 0.00
Claimed / Granted BOINC Credit: 0.0 / 0.0

Result Name: HST1_014856_000036_AC0001_T350_F00521_S00030_1--
OS type / OS version: Microsoft Windows 8.1 Core x64 Edition, (06.03.9600.00)
App Version Number: 726
Status: Pending Validation
Sent Time: 2/20/18 02:06:31
Time Due / Return Time: 2/21/18 00:50:36
CPU Time / Elapsed Time (hours): 16.44
Claimed / Granted BOINC Credit: 390.5 / 0.0

[Edit 1 times, last edit by JimWork at Mar 3, 2018 5:35:56 PM] |
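The deadline windows in this workunit listing can be checked with a little date arithmetic. This is a minimal Python sketch, assuming the timestamps are in month/day/year order (which matches the "Feb 20" in the post). The helper name `window_days` is made up for illustration.

```python
from datetime import datetime

# Assumed timestamp layout: month/day/two-digit year, 24-hour time.
FMT = "%m/%d/%y %H:%M:%S"


def window_days(sent: str, due: str) -> float:
    """Days between the sent time and the due time."""
    delta = datetime.strptime(due, FMT) - datetime.strptime(sent, FMT)
    return delta.total_seconds() / 86400


# Sent and due times copied from the "No Reply" and "In Progress" results.
first_copy = window_days("2/20/18 02:06:35", "3/2/18 02:06:35")
resent_copy = window_days("3/2/18 02:06:36", "3/5/18 14:06:36")
print(first_copy, resent_copy)  # 10.0 3.5
```

By these timestamps the original copy had a 10 day window, while the resent copy was given 3.5 days.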
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Jim,
While I, too, think that 24hrs is a little too short, it's not my decision! I can only assume that the techs have modelled the situation and found this way to be the most efficient within the constraints that currently exist. As crunchers we have to either live with that or decide not to take part in this project. However, as to your example of long deadlines not working, I see no issue. All I see is the system working as designed. Indeed, the tree of re-tries is still some way short of its limit. (At which point, as I understand it, the WU gets processed on a WCG server machine as a last resort.) I'll agree that until we had the "server aborted" mechanism it did seem somewhat inefficient but, as of right now, I see nothing to complain about. If the scientists are happy that the techs are giving them the results they want in a timely manner, great! Just my view. |
||
|
JimWork
Cruncher Canada Joined: Oct 11, 2005 Post Count: 35 Status: Offline Project Badges: |
Apis,
Everything you said resonates with me. I only wanted to chime in and offer a small note to the clarion of opinions. My comment is a simple ping but not a harsh gong. I heed the knell of the scientists and their pedantic designs. I humbly submit to their script of call & method ringing. Carry on and keep crunching ;-) |
||
|
|