This topic has been viewed 1609 times and has 13 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Job Run Times

Recently I have noticed that jobs are now being assigned that have estimated run times of 4 to 6 hours. Why such long jobs? One of these jobs was showing as running, but with no change in CPU time or any other reference. When I checked this on the console, this job had been running for over 40 hours of CPU time.

I went back to check to see what my chosen parameters of reference are and found that they had all changed so that I now was being assigned all types of jobs, not just the two I selected. Is this a common occurrence?

I have found that long running jobs seem to be the ones that get into difficulty. My solution is to abort any job I see in that kind of time frame. I hope this is just an anomaly and will not last for long.

I realize that I have a fast machine, but that should not change the job sets.

Mac Pro Quad-Core Intel Xeon 3 GHz 5 GB memory
[May 12, 2007 3:12:49 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Job Run Times

Hi stupidname,

All jobs get an estimated number of iops/flops when sent to a client. On the basis of these estimates, the client computes the expected duration from past performance parameters. The problem is that non-deterministic calculations can end up doing many more operations than expected, extending the duration considerably. 4-6 hours for FA@H and HPF2 sounds quite reasonable for the present work, as they have had more attempts and seeds packed in since early this week. GC sizes remain unchanged, but fluctuate quite a bit in duration based on complexity.
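As a rough illustration of how an estimate like this can be formed, here is a minimal sketch. It assumes the client simply divides the estimated operation count by the host's measured speed and scales the result by a correction factor learned from earlier jobs. The function name, parameter names, and numbers are hypothetical and are not taken from the BOINC source.

```python
def estimated_runtime_seconds(est_flops, host_flops_per_sec, correction=1.0):
    """Estimate run time from an operation count and a measured host speed.

    est_flops:          operations the server expects the job to need (assumed)
    host_flops_per_sec: speed measured on this machine (assumed)
    correction:         ratio of actual to estimated time seen on past jobs
    """
    if host_flops_per_sec <= 0:
        raise ValueError("host speed must be positive")
    return est_flops / host_flops_per_sec * correction


# Hypothetical job: 4.0e13 operations on a host measured at 2.5e9 flops/s.
seconds = estimated_runtime_seconds(4.0e13, 2.5e9)
print(f"estimated: {seconds / 3600:.1f} hours")

# If past jobs took twice as long as estimated, the correction doubles it.
seconds_corrected = estimated_runtime_seconds(4.0e13, 2.5e9, correction=2.0)
print(f"corrected: {seconds_corrected / 3600:.1f} hours")
```

The estimate is only as good as the operation count it starts from, which is why a non-deterministic job can run far past it.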

Your preferences are not being changed and if they were, that could have been a program fluke, but given that there are several places to select choices - My Projects (for overall control) and Device Profile (for client level control), it's probable you looked at the other.

Even the long-running jobs are fine 99.999% of the time; it's the patience that is being tested very hard. Aborting a job does not help anyone, as you lose the computing time and the WCG technicians will not know why it happened. We want you to post the BOINC message log that shows any anomaly. Also, we'd like you to look at the "Result Status" page and click on the suspect job's Work Unit Name (first column) to see the quorum. If any other jobs of the standard set of 3 (all projects except HPF2, which has a standard of 19 presently) were returned, and they were returned without error, it's more likely that a random or local problem arose. Again, WCG technicians like to see a description, a piece of the message log, and client information, such as whether the CPU is/was clocking up time for the job. Which project was running for 40 hours, how far had the job progressed, and were graphics viewed at any time?
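The quorum check described above can be sketched as a small helper. This is only an illustration: the status strings are assumptions (only "Pending Validation" is mentioned in this thread), and the function names are hypothetical rather than part of any WCG or BOINC tool.

```python
# Status strings are assumed for illustration; check the Result Status page
# for the wording actually shown.
RETURNED_OK = {"Valid", "Pending Validation"}


def others_returned_ok(other_statuses):
    """Return True if any other quorum member returned a result without error."""
    return any(status in RETURNED_OK for status in other_statuses)


def suggest_next_step(other_statuses):
    """Turn the quorum picture into a hint about where the problem lies."""
    if others_returned_ok(other_statuses):
        return "others finished cleanly: more likely a random or local problem"
    return "no clean result from the quorum yet: cannot tell"


# Hypothetical quorum of 3: the other two copies of the work unit.
print(suggest_next_step(["Pending Validation", "In Progress"]))
print(suggest_next_step(["Error", "In Progress"]))
```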

And yes, FA@H and HPF2

Sekerob

PS Where HPF2 normally runs 6+ hours on my C2D, yesterday I had a few which took 4.5 hours..... no duration is fixed.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[May 12, 2007 5:54:56 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Job Run Times

Hello stupidname,
I went back to check to see what my chosen parameters of reference are and found that they had all changed so that I now was being assigned all types of jobs, not just the two I selected. Is this a common occurrence?

The only change that has recently been made is for users who had selected 'Help Defeat Cancer'. When the project ended they were reset to 'All Projects'. But we have been wrestling with the database since early in May when we started being hit with errors after an update, so we may have done something to your profile by mistake.

What changes do you see in your profile?

Lawrence
[May 12, 2007 12:20:30 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Job Run Times

.... forgot about that one, but only if HDC was an exclusive project!

Added: About 2,000 members had that profile condition.
----------------------------------------
[Edit 1 times, last edit by Sekerob at May 12, 2007 12:33:26 PM]
[May 12, 2007 12:27:23 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Job Run Times

Thanks for taking the time to provide some clues to me. I appreciate the time being given.

The job at 40+ hours was still clocking time, as revealed on the Console. It was not doing anything as shown by BOINC. To determine all the details as you suggest is very difficult and time consuming. The name of the job shown on the Console is only a high-level description. In this case I had six WCG jobs running, all showing the same name. One has to inspect each job and then compare the CPU times showing in BOINC, by way of elimination, to determine which job is actually at fault. Now, after doing all that, I am not going to go to the WCG web site, which I find to be very confusing and difficult to follow, to do another elimination routine to see what that site might be telling me. When I made this post I had just gone through hours of the WCG web site not allowing access. The actual contact to get logged in would just time out, with the web site asking me to try later.

I might be retired but I am not looking for a full-time job. I run other jobs under BOINC and have found that when things go screwy I just suspend that site and run something else until things sort themselves out.

Right now I have 8 jobs running, all WCG jobs, and only one in the queue for some strange reason (usually there are four or six in the queue). The one in the queue is estimated at 4:49:14 hours. I will watch it if I can identify it when it is running.
[May 12, 2007 4:11:58 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Job Run Times

Hello stupidname,
I will watch it if I can identify it when it is running.

That will not be easy unless you use BOINC Manager. Otherwise, all you will see is a number of same-name application programs. The easy way to see what went wrong is to read the error messages stored in the 'Results Status' page with each result.

Do it the easy way. If it has been running too long, abort it and then look at the 'Results Status' page to see if anybody else in your quorum had that problem.

Lawrence
[May 12, 2007 4:58:49 PM]
twilyth
Master Cruncher
US
Joined: Mar 30, 2007
Post Count: 2130
Status: Offline
Re: Job Run Times

I'm running the UD agent on an old AMD K6 overclocked to about 900MHz. I've just dl'd a job that is at 2% after 2 hours (roughly). That means I'm looking at a run time of at least 4 days - probably more like 5 or 6 since it's actually reading 1% and I'm rounding up.
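The arithmetic behind that estimate can be sketched as a simple linear extrapolation. This assumes progress advances at a constant rate, which real work units need not do; the function name is hypothetical and the inputs are the rough figures quoted above.

```python
def projected_total_hours(elapsed_hours, percent_done):
    """Linearly extrapolate total run time from elapsed time and progress."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    return elapsed_hours * 100.0 / percent_done


# About 2 hours elapsed; progress shows 1% and is rounded up to 2% in the post.
optimistic = projected_total_hours(2, 2)   # if the job is really at 2%
pessimistic = projected_total_hours(2, 1)  # if the job is really at 1%

print(f"at 2%: {optimistic:.0f} h, about {optimistic / 24:.1f} days")
print(f"at 1%: {pessimistic:.0f} h, about {pessimistic / 24:.1f} days")
```

Under this assumption the true figure lies somewhere between the two projections, depending on how close the job really is to 2%.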

I have to run UD agent because BOINC seems to cause problems as a screen saver on older machines with AGP (ATI brand) graphics cards and I need the screen saver to run so I can tell at a glance if everything is copasetic. Of the 6 machines running now, only about half are used for other purposes and I don't always remember to check the device statistics to make sure everyone is working.
[May 12, 2007 5:11:43 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Job Run Times

Hello twilyth,
Regardless of the official minimum requirements, I would only run the project with the shortest time (Genome Comparison). HCMD is working through really big proteins now, and will continue to do so until the next phase, which is months away. Even so, it sounds as though you will get your current work unit done in time, whatever project it is for.

Lawrence
[May 12, 2007 5:19:58 PM]
twilyth
Master Cruncher
US
Joined: Mar 30, 2007
Post Count: 2130
Status: Offline
Re: Job Run Times

Hello twilyth,
Regardless of the official minimum requirements, I would only run the project with the shortest time (Genome Comparison). HCMD is working through really big proteins now, and will continue to do so until the next phase, which is months away. Even so, it sounds as though you will get your current work unit done in time, whatever project it is for.

Lawrence

Yeah - if it takes a week to run - I don't really mind. It's nice to see the stats grow on a daily basis, but getting a big bump every once in a while is worth waiting for.

Now if I could just get real time stats downloaded directly into my brain . . . Ahhhh . . . Nirvana!!!
[May 12, 2007 5:35:47 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Job Run Times

I think so far one (female) has indicated that she is perfectly okay with the extremely long run times...... don't you love the satisfaction of working through a 100 hour unit, hibernate, resume, hibernate, resume, and that enormous flush on the 7th day, leaving all your teammates in the exhaust fumes as you're speeding by riding that ol' P3?

BTW Lawrence, southpaws first check the Result Status page of BOINC to see if others returned a WU without error and how long it took them. If I see others delivering with Pending Validation, I'm running on.

cheers

PS, my slowest/longest/toughest record stands at 110 hours on UD agent for an HCMD.... took an occasional peek at the graphics screen to see if the percentages moved. There you can see even 0.1%. With BOINC one even sees 0.001% in the Tasks tab section.
[May 12, 2007 5:38:49 PM]