Total posts in this thread: 98
This topic has been viewed 8494 times and has 97 replies.
mclaver
Veteran Cruncher
Joined: Dec 19, 2005
Post Count: 566
Re: Monster WU on the loose...

OK, may as well use this as a learning experience.

You guys keep talking about WU app versions. How do you tell?

My current WU, according to the /var/lib/boinc-client/stdoutae.txt file, is:

CMD2_0001-1HCI_A.clustersOccur-1YDI_A.clustersOccur_6739_3

Could someone parse that?


If it is Running or Ready to start, go to the Tasks tab in BOINC Manager, highlight the WU, and click on Properties. The application is shown on the first line.

If it is complete (in Error, Pending, or Valid), go to My Grid on the website, choose Result Status in the left column, find the result you want, and click on the WU under Result Name.

Hope this helps.

- Mitch
[Jun 7, 2009 9:29:47 PM]
Van Fanel
Cruncher
Joined: Dec 27, 2006
Post Count: 42
Re: Monster WU on the loose...

You guys keep talking about WU app versions. How do you tell?


The app version can be seen in the Application column of the Tasks view (assuming you're using the Advanced View of the BOINC Manager).

An interesting observation: in all my Pending Validation results where my wingmen had multiple errors with long CPU times, they were on 613 and I was on 611.
But the WU above, which is Pending and took 41.26 hours (my longest WU on this machine), completed under 613.


Exactly my point. All monster WUs I've seen so far are resends from previous aborted/error/inconclusive/invalid calculations. The original WUs were crunched using 6.11, and we can see that they took a normal amount of time to complete. However, the resends are running with app version 6.13 and, at least on my Linux machine, they are taking enormous amounts of time to complete, above 40 hours. Some of them are even reaching the maximum CPU time and ending in Error, not validating, and being resent again.
[Jun 7, 2009 11:58:20 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Monster WU on the loose...

Van Fanel
I am using BOINC 6.2.18. I do not see any version for the app there.

mclaver
I just had that bugger error out (good, at least to me; the ETA was still going up) and checked it on the website. It was V613, and so was the other one that quit on error.

The one I aborted and the one that has not been verified (ended on 5/31) are both 611.

I have 1 WU running now from HCC. I just looked, as I have looked before, and cannot find any "properties" to click on. I have checked all the menus on the bar and there is no right click.

It is interesting how screwed up things are. This WU has run for an hour and is 33% done. The ETA, calculated on past results, is 185 hours. I think it will come in under that, at least a little. :-)
[Jun 8, 2009 12:46:11 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Monster WU on the loose...

Hello slade52,
I read the description of 'properties' with astonishment, then realized that Sekerob was talking about BOINC 6.6. It must be a new feature. Sounds useful, but we do not have it in WCG-supplied BOINC.

Lawrence
[Jun 8, 2009 1:04:22 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Monster WU on the loose...

lawrencehardin
Yes, I thought it would be cool. The BOINC version in Synaptic was this 6.2.18 (Ubuntu 9.04), so I guess I am out of luck on that one.

I dropped HCMD2 until this is cured. This is time that could be used to do something other than entertain us by letting us guess whether they will finish in a week or not.

I'll sign on again when whoever needs to gets their ducks in a row.
[Jun 8, 2009 1:16:36 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Monster WU on the loose...

Hi, I appear to have 2 of these monster work units. I am running app version 6.13 under BOINC 6.2.15 on Slackware Linux 12.2 (32-bit).

The first has been going for about 112 hours and is at 17.222%; the second has been going for 20 hours and is at 10.909%.

the first is: CMD2_0001-1HCI_A.clustersOccur-1ZXC_B.clustersOccur_77_3
https://secure.worldcommunitygrid.org/ms/devi...us.do?workunitId=70558702
the second: CMD2_0001-1HCI_A.clustersOccur-1RKC_A.clustersOccur_6167_4
https://secure.worldcommunitygrid.org/ms/devi...us.do?workunitId=70699835

I am going to abort the first. Should I cancel the second as well?
----------------------------------------
[Edit 5 times, last edit by Former Member at Jun 8, 2009 3:43:41 PM]
[Jun 8, 2009 3:10:01 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Monster WU on the loose...

Hello slade52,
I read the description of 'properties' with astonishment, then realized that Sekerob was talking about BOINC 6.6. It must be a new feature. Sounds useful, but we do not have it in WCG-supplied BOINC.

Lawrence

What was I thinking... The reply was shaped towards the post which mentioned 6.6, so I have edited the version number in so those on 6.2 don't go on a goose chase (no idea if this properties feature appeared in the intermediate 6.4 version; I have not looked).

edit:

There are 2 properties screens in 6.6

1. In the Tasks window of BOINC Manager, providing job information, including CPU time. The 6.6 BM task list does not show CPU time, but rather the elapsed (wall-clock) time the task has been active since start, regardless of whether it is getting 100% or some other percentage of the CPU, as long as it has more than 0 CPU time.

2. In the Projects window, providing information on debt status, rDCF, etc. Regrettably it does not show information such as <on_frac>, which also feeds into the work fetch policy, but it's progress and requires less file searching to see why work fetch is, for instance, behind... see the high rDCF in the screenshot.


----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jun 8, 2009 8:13:01 PM]
[Jun 8, 2009 7:58:51 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Monster WU on the loose...

Well, I still have 2 running:
1: 259 hours, at 10.9%, to be returned before 11-6 22:15
2: 224 hours, at 19.4%, to be returned before 13-6 8:32

What should I do with these 'things'?
They will definitely not be ready before the deadline, and they have already been running for more than 10 days!
What's the sense of leaving these running?
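
For a rough sense of how far past the deadline these would run, the figures quoted above can be extrapolated linearly. This is only a sketch: it assumes progress grows at a constant rate, which BOINC tasks do not guarantee, and the function name `estimate_total_hours` is made up for illustration.

```python
def estimate_total_hours(elapsed_hours: float, percent_done: float) -> float:
    """Linearly extrapolate total run time from elapsed time and progress.

    Assumption: progress grows at a constant rate, which is only a rough
    approximation for real work units.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    return elapsed_hours * 100.0 / percent_done


# Figures quoted above: (elapsed hours, percent done).
tasks = {"WU 1": (259, 10.9), "WU 2": (224, 19.4)}

for label, (elapsed, pct) in tasks.items():
    total = estimate_total_hours(elapsed, pct)
    remaining = total - elapsed
    print(f"{label}: ~{total:.0f} h total, "
          f"~{remaining:.0f} h ({remaining / 24:.0f} days) still to go")
```

On those numbers, both tasks would need more than a month of further run time, so neither could make a deadline that is only days away.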
[Jun 9, 2009 7:21:03 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Monster WU on the loose...

I'd abort. It seems only Linux gets these monsters served, which raises the question of whether 6.13, with its different compiler settings, has caused it to crawl.

The name of the result tells whether it's a child task or not. Child tasks show 1, 2, or more numbers at the end to tell which positions are in it.
[Jun 9, 2009 7:33:10 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Monster WU on the loose...

Here's the current state of the one I aborted. My wingman ran until he hit the CPU limit, and now 3 other machines are running it.

CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_2648_4-- - In Progress 8/06/09 08:00:38 18/06/09 10:40:55 0.00 0.0 / 0.0
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_2648_3-- - In Progress 8/06/09 07:58:13 13/06/09 22:22:13 0.00 0.0 / 0.0
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_2648_2-- - No Reply 2/06/09 16:58:49 8/06/09 07:22:49 0.00 0.0 / 0.0
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_2648_1-- 613 Error 27/05/09 21:14:43 7/06/09 01:42:16 238.45 2,866.0 / 0.0
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_2648_0-- 613 Aborted 27/05/09 21:14:24 2/06/09 16:52:13 132.97 1,009.2 / 0.0
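
The listing above suggests that the last `_N` in a result name is the copy number of that work unit: copies 0 and 1 were sent together, and 2, 3 and 4 went out later as replacements. Here is a minimal sketch that splits a name on that assumption. The function name `split_result_name`, the `initial_copies=2` default and the reading of the suffix are all guesses from this one listing, not documented WCG behaviour.

```python
from typing import NamedTuple


class ResultName(NamedTuple):
    workunit: str    # everything before the final "_N"
    copy: int        # trailing number, assumed to be the copy index
    is_resend: bool  # True if the copy index is beyond the initial batch


def split_result_name(name: str, initial_copies: int = 2) -> ResultName:
    """Split a result name into work unit name and trailing copy number.

    Assumption: the name ends in "_<digits>" and that number counts copies
    from 0; initial_copies=2 is inferred from the listing, not documented.
    """
    workunit, sep, suffix = name.strip().rpartition("_")
    if not sep or not suffix.isdecimal():
        raise ValueError(f"no trailing copy number in {name!r}")
    copy = int(suffix)
    return ResultName(workunit, copy, copy >= initial_copies)


for name in (
    "CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_2648_0",
    "CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_2648_4",
    "CMD2_0001-1HCI_A.clustersOccur-1YDI_A.clustersOccur_6739_3",
):
    parsed = split_result_name(name)
    kind = "resend" if parsed.is_resend else "initial"
    print(parsed.workunit, parsed.copy, kind)
```

Read this way, the `_3` and `_4` results reported earlier in the thread as running for days would be resends, which fits the observation that the monsters are resends.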
[Jun 9, 2009 1:09:18 PM]