Total posts in this thread: 18
This topic has been viewed 40038 times and has 17 replies
Dark Angel
Veteran Cruncher
Australia
Joined: Nov 11, 2005
Post Count: 728
Status: Offline
Re: Large number of "Invalid" units

All my machines run Ubuntu, some 64-bit and some 32-bit. These issues are confined to the single-core machines. The other 32-bit machine is multi-core and had no trouble.
----------------------------------------

Currently being moderated under false pretences
[Jul 10, 2010 12:02:22 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Large number of "Invalid" units

"Deliberately huge" is when the scientists take a fair selection across the body of work to Beta test as many variations as possible. In production, we now see a gradual building of the run times. The initial jobs on my quad took 1.5 hours, and after 6 days I now see a constant stream of ever-increasing durations. The longest now stands at 4.5 hours (device dependent, of course). For a slow device that probably means that 12 hours is already being reached.


Of course, to avoid RC=0x4, the WU needs to reach 12 hours before getting to job #4. We are now starting to get some WUs which are big enough for this on an Athlon, but it may never happen on an Athlon XP unless it can be severely underclocked, which rather defeats the purpose overall.
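
The timing argument above can be put into a few lines of Python. This is only a sketch: it assumes a WU runs its jobs back to back and is cut off once a fixed 12-hour cap is reached, which is how the thread describes it but is not confirmed by the techs. The function name and the per-job durations are made up for illustration.

```python
TIME_LIMIT_HOURS = 12.0  # run-time cap mentioned in the thread


def hits_limit_before_job(job_hours, jobs_before, limit=TIME_LIMIT_HOURS):
    """Return True if the cap is reached before the job that follows
    the first `jobs_before` jobs gets a chance to start."""
    elapsed = sum(job_hours[:jobs_before])
    return elapsed >= limit


# Hypothetical per-job durations in hours; not measured values.
fast_device = [0.5] * 16
slow_device = [4.5] * 16

for label, hours in (("fast", fast_device), ("slow", slow_device)):
    print(label, hits_limit_before_job(hours, jobs_before=3))
```

With these invented numbers the slow device runs out of time after three jobs, while the fast one carries on into the next job.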
[Jul 14, 2010 8:51:39 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Large number of "Invalid" units

Another data point: I've had one WU with RC=0x4 marked as valid!

The wingman was slower and didn't make it to job #4 at all!

Makes me more curious about the results validation side. Maybe a result with RC=0x4 is actually ok, but the validator isn't happy with it when another, longer, result doesn't include that? Alternatively, maybe a result with RC=0x4 is not ok, but gets passed by the validator anyhow when the wingman is slower?
[Jul 21, 2010 10:44:34 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Large number of "Invalid" units

Maybe it's time for WCG to bite the bullet and add a CPU speed requirement for CEP2 alongside the bandwidth minimum. After all, what's the value of jobs that don't even get past the infant steps in a task?
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Jul 22, 2010 7:34:03 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Large number of "Invalid" units

Or even better, quit trying to force quotas via time limits.
[Jul 22, 2010 8:05:58 AM]
Randzo
Senior Cruncher
Slovakia
Joined: Jan 10, 2008
Post Count: 339
Status: Offline
Re: Large number of "Invalid" units

I agree more with Fredski.
And I am quite interested in what the techs or scientists have to say about RC=0x4 being marked as valid.
[Jul 22, 2010 10:50:58 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Large number of "Invalid" units

Well, so far WCG is not budging on limiting the upper run times. As we progress into the research, the average toughness of the tasks will increase to the point where quads too will need 12 hours. I don't think it will happen that P3/P4s are allowed to run all 16 jobs within a task to the end over several days [let alone the part-time devices], with checkpoints that are multiple hours apart (nothing can be done about that short of crippling the system for users by saving chunks of hundreds of megabytes to disk at short intervals). Then on each boot the task will step back those hours and, without LAIM on (recommended for this project), it will also skip back to the last checkpoint whenever the user has set the client to pause when the computer is in use. This needs a hard think, and I don't have all the facts available for consideration.
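
To illustrate the checkpoint problem described above, here is a rough sketch. It assumes checkpoints fall at fixed intervals of CPU time and that every interruption throws the task back to the last one; in the real application checkpoints are presumably tied to job progress, so treat this as a simplification. `saved_progress` and the session lengths are invented for the example.

```python
import math


def saved_progress(session_hours, checkpoint_interval_hours):
    """Hours of work that survive a series of interrupted sessions.

    Each session resumes from the last checkpoint; anything computed
    after that checkpoint is lost when the session ends.
    """
    saved = 0.0
    for session in session_hours:
        reached = saved + session
        completed_checkpoints = math.floor(reached / checkpoint_interval_hours)
        saved = completed_checkpoints * checkpoint_interval_hours
    return saved


# Hypothetical part-time device: three 2-hour sessions, 3-hour checkpoints.
print(saved_progress([2.0, 2.0, 2.0], 3.0))
# Same checkpoint spacing, but two 4-hour sessions.
print(saved_progress([4.0, 4.0], 3.0))
```

Under these assumptions a device whose sessions are shorter than the checkpoint spacing never banks any progress at all, which is the worry for part-time machines.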
[Edit 1 times, last edit by Sekerob at Jul 22, 2010 11:57:47 AM]
[Jul 22, 2010 11:56:41 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Large number of "Invalid" units

Yet another valid one!
E200195_750_A.20.C20H14.60.set1d06_1-- 619 Valid 7/24/10 01:11:26 7/24/10 23:33:15 11.70 46.7 / 46.8
E200195_750_A.20.C20H14.60.set1d06_0-- 619 Valid 7/24/10 01:07:01 7/25/10 01:52:43 7.29 46.9 / 46.8
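
For anyone wanting to compare such result lines in bulk, a small parser along these lines might help. The column meanings (application version, sent time, return time, CPU hours, claimed / granted credit) are my reading of the two lines above, not something stated in the thread, and `parse_result_line` is a hypothetical helper.

```python
import re
from datetime import datetime

# Assumed layout: name-- version status sent returned cpu_hours claimed / granted
LINE_RE = re.compile(
    r"^(?P<name>\S+?)--\s+"
    r"(?P<app_version>\d+)\s+"
    r"(?P<status>\w+)\s+"
    r"(?P<sent>\d+/\d+/\d+ \d+:\d+:\d+)\s+"
    r"(?P<returned>\d+/\d+/\d+ \d+:\d+:\d+)\s+"
    r"(?P<cpu_hours>\d+\.\d+)\s+"
    r"(?P<claimed>\d+\.\d+)\s*/\s*(?P<granted>\d+\.\d+)$"
)
TIME_FORMAT = "%m/%d/%y %H:%M:%S"


def parse_result_line(line):
    """Split one result line into typed fields (field names are guesses)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised result line: " + line)
    fields = match.groupdict()
    return {
        "name": fields["name"],
        "app_version": int(fields["app_version"]),
        "status": fields["status"],
        "sent": datetime.strptime(fields["sent"], TIME_FORMAT),
        "returned": datetime.strptime(fields["returned"], TIME_FORMAT),
        "cpu_hours": float(fields["cpu_hours"]),
        "claimed": float(fields["claimed"]),
        "granted": float(fields["granted"]),
    }


lines = [
    "E200195_750_A.20.C20H14.60.set1d06_1-- 619 Valid 7/24/10 01:11:26 7/24/10 23:33:15 11.70 46.7 / 46.8",
    "E200195_750_A.20.C20H14.60.set1d06_0-- 619 Valid 7/24/10 01:07:01 7/25/10 01:52:43 7.29 46.9 / 46.8",
]
for result in map(parse_result_line, lines):
    print(result["name"], result["cpu_hours"], result["returned"] - result["sent"])
```
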

On this occasion, the _1 WU had the standard:
"Application exited with RC = 0x4"

The _0 WU had
"Application exited with RC = 0x84"

(somewhat earlier, it had
"Starting job 2,CPU time has been restored to 1167.478515.
Error reading in TMP file 44/0 (1024): No such file or directory",
yet appeared to continue normally at that point)
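
One arithmetic observation on the two return codes, for what it's worth: they differ in a single bit. Whether the application actually treats that bit as a flag is not stated anywhere in this thread, so the snippet below only shows the bit pattern and claims nothing about its meaning.

```python
rc_result_1 = 0x04  # "_1" result, as quoted above
rc_result_0 = 0x84  # "_0" result, as quoted above

# XOR leaves only the bits in which the two codes differ.
difference = rc_result_0 ^ rc_result_1
print(hex(difference))

# Masking off that one bit leaves the same low-order value in both codes.
same_low_bits = (rc_result_0 & 0x7F) == rc_result_1
print(same_low_bits)
```
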
[Jul 25, 2010 7:46:11 PM]