Thread Status: Active
Total posts in this thread: 112
This topic has been viewed 26235 times and has 111 replies
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1317
Status: Offline
Re: 24 Hour Deadline too Short

Back in the days of the 100,000 step jobs, the time allowance was, if I recall correctly, 4 days. Even then I'd get the odd retry after someone had failed to return a job on time...

Now, we have two batches of data being run through which are, as far as I am aware, identical except for the number of steps per task - one lot does 10,000 steps and the other 50,000 steps.

A 2-day limit might seem reasonable for the 50,000-step jobs, but would be inappropriate for the 10,000-step jobs, as it would take up to 5 times as long to achieve the same total number of steps.

I think I saw something about there being 2.5 million steps in total (though I've been unable to find said reference, if it ever existed outside my imagination) -- and the highest iteration number I've seen in a 50,000 step job so far is 044 so that's not disproved yet. If it is right, the shorter jobs will need 250 task iterations to get the required number of steps, and I don't think it requires much imagination to work out what would happen if people with mixed workloads and largish buffers could spend 2 days on a 10,000 step task!
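
A quick back-of-envelope sketch of that arithmetic, assuming the 2.5 million step total (an unconfirmed recollection, as noted above) is right. The function names are made up for illustration, and the worst case ignores reissued tasks:

```python
# Assumption: 2.5 million steps per simulation (unconfirmed figure from the post).
TOTAL_STEPS = 2_500_000


def iterations_needed(steps_per_task: int, total_steps: int = TOTAL_STEPS) -> int:
    """Number of sequential tasks needed to reach total_steps, rounded up."""
    return -(-total_steps // steps_per_task)


def worst_case_days(steps_per_task: int, deadline_days: float) -> float:
    """Elapsed days if every task in the chain is returned right at its deadline."""
    return iterations_needed(steps_per_task) * deadline_days


if __name__ == "__main__":
    for steps in (10_000, 50_000):
        print(
            f"{steps}-step tasks: {iterations_needed(steps)} iterations, "
            f"up to {worst_case_days(steps, 2)} days with a 2-day deadline"
        )
```

On those assumptions, a 2-day limit could stretch the 10,000-step chain to 500 days, against 100 days for the 50,000-step chain.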

So Sgt. Joe (as quoted by Shaky Jake recently) has the right of it, I would think. They've already added a limit on the number of concurrent FAH2 tasks one can have as part of an effort to stop overruns or contention with other projects, and apart from putting different time limits on the two different types of job I don't see what else they can actually do if they want the results in a timely fashion!

And whilst I do appreciate that those limits aren't encouraging to crunchers with slow machines, it doesn't make sense to hamstring a project just so under-resourced systems can have a go (witness the warnings about CEP2...). My ancient Core 2 Duo MacBook Pro (clocked at about 2GHz) would take 14 to 16 hours to chew through a 50,000 step job (having taken about 28 hours on the two 100,000 step jobs it got sent without my say-so), so I excluded FAH2 from the list of jobs it could have (just as I had previously not let it get CEP2 tasks). Sometimes, a little common sense goes a long way!

Cheers - Al.
[Jun 12, 2018 4:07:37 AM]
Macroman
Advanced Cruncher
Joined: Jun 4, 2005
Post Count: 112
Status: Offline
Re: 24 Hour Deadline too Short

It's not just people with slow machines that dislike the short deadline.

I'm running two machines on the grid 24-7 and it's unacceptable to me to have any of my units refused as "too late". I am finding that it's basically impossible to fulfill this desire if I run FAH2 in conjunction with any other tasks.

Worse yet, more often than not it's tasks for the other projects that get this treatment because the FAH2 units push to the front of the line time after time with their short deadlines.

Accordingly I have joined the club in deciding to turn this project off on my systems.

Mark
[Jun 13, 2018 3:16:54 AM]
Jim1348
Veteran Cruncher
USA
Joined: Jul 13, 2009
Post Count: 1066
Status: Offline
Re: 24 Hour Deadline too Short

I run FAH2 along with all other projects on a Ryzen 1700 with the default buffer of 0.1+0.5 days and don't have a problem.
I sometimes see the FAH2 running high priority, but that is about it.
[Jun 13, 2018 4:50:13 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1317
Status: Offline
Re: 24 Hour Deadline too Short

Macroman wrote:

"It's not just people with slow machines that dislike the short deadline.

I'm running two machines on the grid 24-7 and it's unacceptable to me to have any of my units refused as 'too late'. I am finding that it's basically impossible to fulfill this desire if I run FAH2 in conjunction with any other tasks.

Worse yet, more often than not it's tasks for the other projects that get this treatment because the FAH2 units push to the front of the line time after time with their short deadlines.

Accordingly I have joined the club in deciding to turn this project off on my systems.

Mark"

Mark,

The reason I didn't comment on it being an issue for faster machines is that I have never experienced "Too Late" situations on any of my more modern systems (with or without FAH2) except on occasions where I've had a system go offline for one reason or another (such as a catastrophic CPU failure) and I've been unable to fix the issue within the several days that anything but FAH2 gives me. Like Jim1348, whose post follows yours, I don't run large buffers (as "always online" is not a problem for my systems) so even if tasks for other projects take longer than their estimates I'm unlikely to have any time out!

My main workstation runs a buffer of 0.6+0.1 days, allocating 6 cores; it does WCG and CPDN CPU jobs and its [slowish] NVIDIA GPU does SETI, Einstein and Milkyway@Home. My local "server" runs 0.45+0.05 (and only does WCG work), and my daily-use laptop (on 12..18 hours a day) is 0.35+0.05 days (again WCG work only). I find that if I allow MCM work on a machine, I don't get FAH2 work at every top-up anyway! (I might let them all pick up an extra half-day of work to tide me round the down-time scheduled for today, though :) )

Now, I realize that if a user is doing CPU work for projects other than WCG, that may introduce other work access and scheduling issues, but when I was doing SETI@Home and POEM@Home on a fairly slow (and not always on) laptop along with WCG I still never had any Too Late items...

So I'm left intrigued as to the circumstances in which a shorter-deadline task such as FAH2 can cause other projects to miss deadlines. I'd like to know because I'd like to try to avoid such issues without having to drop projects!

Good luck with whatever projects you do.

Cheers - Al.
[Jun 14, 2018 3:36:25 AM]
Brummig
Cruncher
Joined: Sep 19, 2016
Post Count: 26
Status: Offline
Re: 24 Hour Deadline too Short

One of the machines on which I run WCG is both fast and modern, and like all my machines it runs with a minimal buffer (my other WCG PC host is older, but still very much usable). However, FAH2, with its absurdly short deadline, gave me endless problems with disruption of other BOINC projects, and missed deadlines. And despite what is being said here, I often found myself with multiple FAH2 tasks scrambling over each other and other projects, desperately trying to meet the deadline. Sometimes I would only be a couple of hours over, but by then the task had already been sent to another host. I gave up, and dumped FAH2. Good luck with all those unmanaged hosts that accept tasks and rarely if ever do anything useful.

BOINC was set up to use spare CPU cycles donated by volunteers on millions of ordinary computing platforms around the world. The FAH2 project should either work within the BOINC paradigm, just like lots of other projects do, or buy themselves their own full-time supercomputer. As it is, IMHO they are abusing BOINC.
[Jun 14, 2018 9:52:42 AM]
Aurum
Master Cruncher
The Great Basin
Joined: Dec 24, 2017
Post Count: 2391
Status: Offline
Re: 24 Hour Deadline too Short

Right on Brummig!!!
FAH2 asked for help so I'm trying again to run their WUs. BoincTasks 1.79 shows that many FAH2 WUs are "Running High Priority" even though they will have no problem completing well before their deadline (since I reduced buffer to 0.5+0.25 days). When CPUs are RHP then WCG takes over all the CPU cores leaving none for my GPU WUs. When I see that happening I perform numerous abortions to get my GPUs back to work.
It's just rude that FAH2 project staff ignore donors & continue to require a ridiculous 24 hour deadline. WCG should not allow it either. The convention should be 7 days with 3 days being the lowest allowable.
----------------------------------------

...KRI please cancel all shadow-banning
----------------------------------------
[Edit 1 times, last edit by Aurum420 at Nov 10, 2018 1:19:26 PM]
[Nov 10, 2018 1:17:44 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7844
Status: Offline
Re: 24 Hour Deadline too Short

I believe this subject has been pretty thoroughly vetted. The scientists have said this is the model which works best for them to get their results in a timely fashion. I have run this project in the past (I am not running it currently) and it is my opinion that this project runs best when it is the only project running on a machine. And, it runs by design with a short queue (loosened a slight bit lately per Lavaflow's request). If this project is disrupting your other projects, you could limit it to only using a specified number of cores, leaving some cores available for your GPUs. The bottom line is that if this project does not work for you in your environment, don't run it. After all, it is your resources you are donating to various causes and these projects should not cause you undue stress.
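
One way to limit the project to a set number of cores is a per-app `<max_concurrent>` entry in an app_config.xml file placed in the project's folder. The sketch below only renders such a fragment. The app name "fahb" and the limit of 4 are assumptions for illustration, so check the actual app name in your own client before using it:

```python
import xml.etree.ElementTree as ET


def app_limit_config(app_name: str, max_concurrent: int) -> str:
    """Render an app_config.xml fragment capping one app's concurrent tasks."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumptions: "fahb" as the FAH2 app name, 4 as the number of cores to allow.
    print(app_limit_config("fahb", 4))
```

The client has to re-read its config files (or be restarted) before a change like this takes effect.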
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Nov 10, 2018 3:49:37 PM]
Aurum
Master Cruncher
The Great Basin
Joined: Dec 24, 2017
Post Count: 2391
Status: Offline
Re: 24 Hour Deadline too Short

I found the cure for FAH2: turn it off. If the FAH2 scientists think this is an appropriate way to treat people paying to do their work for them for free then they're jerks.
----------------------------------------

...KRI please cancel all shadow-banning
[Nov 10, 2018 6:45:39 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: 24 Hour Deadline too Short

Was already explained a few days ago and repeated ad nauseam. The tasks have a dependency and a sequence of hundreds. One result forms the base for generating the next step set. With 24 hours deadline and a sequence of 300, they know they'll have a complete simulation series in at most 300 days for that target. If they'd allow the common 7-10 days deadline, they'd not have a simulation complete until 2025.
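
For anyone who wants to check the numbers, here is a rough sketch of that worst case. It assumes a chain of 300 dependent tasks, each returned right at its deadline with no reissues, and uses the date of this post as a hypothetical start date:

```python
from datetime import date, timedelta

SEQUENCE_LENGTH = 300        # dependent tasks in one chain, per the post
START = date(2018, 11, 10)   # assumption: start counting from the date of this post


def worst_case_days(sequence_length: int, deadline_days: int) -> int:
    """Elapsed days if every task in the chain takes its full deadline."""
    return sequence_length * deadline_days


def projected_finish(start: date, sequence_length: int, deadline_days: int) -> date:
    """Calendar date on which the whole chain would finish in the worst case."""
    return start + timedelta(days=worst_case_days(sequence_length, deadline_days))


if __name__ == "__main__":
    for deadline in (1, 7, 10):
        days = worst_case_days(SEQUENCE_LENGTH, deadline)
        finish = projected_finish(START, SEQUENCE_LENGTH, deadline)
        print(f"{deadline}-day deadline: up to {days} days, finishing by {finish}")
```

With 7 to 10 day deadlines the worst case comes to 2100 to 3000 days, which is in line with the "not until 2025" estimate.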

Enough said.
----------------------------------------
[Edit 1 times, last edit by Former Member at Nov 10, 2018 7:30:18 PM]
[Nov 10, 2018 7:29:00 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: 24 Hour Deadline too Short

Sgt. Joe wrote:

"And, it runs by design with a short queue ( loosened a slight bit lately per Lavaflow's request)"

Not yet, from what I can tell, but I devised a way around it. Set cc_config.xml with <ncpus>9</ncpus> to make WCG think there's a nine-core machine asking for work, and set app_config.xml with <project_max_concurrent>8</project_max_concurrent> to cap WCG at 8 concurrent jobs, which works as long as you are only computing for WCG.
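
For reference, a minimal sketch that renders the two fragments described above. The values 9 and 8 are the ones from this post (for an eight-core machine), and the helper names are made up for illustration. It assumes the usual layout, with `<ncpus>` inside the `<options>` section of cc_config.xml and `<project_max_concurrent>` directly under `<app_config>`:

```python
import xml.etree.ElementTree as ET


def cc_config(ncpus: int) -> str:
    """Render a cc_config.xml fragment that overrides the reported CPU count."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    return ET.tostring(root, encoding="unicode")


def app_config(project_max_concurrent: int) -> str:
    """Render an app_config.xml fragment capping the project's concurrent tasks."""
    root = ET.Element("app_config")
    ET.SubElement(root, "project_max_concurrent").text = str(project_max_concurrent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(cc_config(9))   # goes in the BOINC data directory
    print(app_config(8))  # goes in the WCG project folder
```

The point of the mismatch is that the client asks for nine cores' worth of work but only ever runs eight tasks, so there is always one waiting to start.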

Now the pause is a few seconds between one fahb finishing and the next one starting.
[Nov 11, 2018 1:48:53 PM]