Thread Status: Active
Total posts in this thread: 18
This topic has been viewed 39996 times and has 17 replies
Dark Angel
Veteran Cruncher
Australia
Joined: Nov 11, 2005
Post Count: 728
Status: Offline
Large number of "Invalid" units

I have three machines, all native Linux (i.e. not running in VMs), all 32-bit, all with 1 GB or more of RAM, that have returned what I would consider a significant number of invalid units on this project.
I have already taken them off the project, and they are running HCC with no trouble.
Does this project not like Athlon XP processors, or did I miss something in the project requirements?

My Opteron systems (64bit, Ubuntu 10.04 Desktop) and Sossaman (32bit, Ubuntu 9.10 Server) are not having the same issue.
----------------------------------------

Currently being moderated under false pretences
[Jul 1, 2010 12:16:46 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Large number of "Invalid" units

Please post a copy of the Result log, which you can open by clicking on the Invalid status. If the logs report different error codes, post copies of each variation. I expect to see something along the lines of RC = 0x4 or similar, as discussed in the Beta forum: http://www.worldcommunitygrid.org/forums/wcg/...238_lastpage,yes#lastpost
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Jul 1, 2010 7:07:04 AM]
Dark Angel
Veteran Cruncher
Australia
Joined: Nov 11, 2005
Post Count: 728
Status: Offline
Re: Large number of "Invalid" units

From an Athlon XP3000+ (running stock)
Result Name: E200008_ 102_ A.14.C11H9NOSi.34.0.set1d06_ 1--
<core_client_version>6.10.56</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[00:46:40] Number of jobs = 16
[00:46:40] Starting job 0,CPU time has been restored to 0.000000.
[00:46:40] Starting new Job
[00:46:40] Qink name = fldman
[00:46:40] Qink name = gesman
[00:46:40] Qink name = scfman
[00:48:21] Qink name = anlman
[00:48:22] End of Job
[00:48:24] Finished Job #0
[00:48:24] Starting job 1,CPU time has been restored to 67.692230.
[00:48:24] Starting new Job
[00:48:24] Qink name = fldman
[00:48:25] Qink name = gesman
[00:48:25] Qink name = scfman
[00:53:22] Qink name = anlman
[00:53:33] End of Job
[00:53:36] Finished Job #1
[00:53:36] Starting job 2,CPU time has been restored to 331.732731.
[00:53:36] Starting new Job
[00:53:36] Qink name = fldman
[00:53:37] Qink name = gesman
[00:53:37] Qink name = scfman
[00:57:09] Qink name = anlman
[00:57:09] Qink name = drvman
[00:58:40] Qink name = optman
[00:58:40] Qink name = fldman
[00:58:40] Qink name = gesman
[00:58:41] Qink name = scfman
[01:05:41] Qink name = anlman
[01:05:41] Qink name = drvman
[01:07:12] Qink name = optman
[01:07:12] Qink name = fldman
[01:07:12] Qink name = gesman
[01:07:12] Qink name = scfman
[01:13:39] Qink name = anlman
[01:13:40] Qink name = drvman
[01:15:11] Qink name = optman
[01:15:11] Qink name = fldman
[01:15:11] Qink name = gesman
[01:15:12] Qink name = scfman
[01:20:58] Qink name = anlman
[01:20:58] Qink name = drvman
[01:22:30] Qink name = optman
[01:22:30] Qink name = fldman
[01:22:30] Qink name = gesman
[01:22:30] Qink name = scfman
[01:27:26] Qink name = anlman
[01:27:26] Qink name = drvman
[01:28:58] Qink name = optman
[01:28:58] Qink name = fldman
[01:28:58] Qink name = gesman
[01:28:58] Qink name = scfman
[01:33:23] Qink name = anlman
[01:33:23] Qink name = drvman
[01:34:54] Qink name = optman
[01:34:54] Qink name = anlman
[01:35:05] End of Job
[01:35:08] Finished Job #2
[01:35:08] Starting job 3,CPU time has been restored to 2501.900358.
[01:35:08] Starting new Job
[01:35:08] Qink name = fldman
[01:35:09] Qink name = gesman
[01:35:09] Qink name = scfman
[01:40:45] Qink name = anlman
[01:40:53] End of Job
[01:40:55] Finished Job #3
[01:40:55] Starting job 4,CPU time has been restored to 2797.742847.
[01:40:56] Starting new Job
[01:40:56] Qink name = fldman
[01:40:56] Qink name = gesman
[01:40:56] Qink name = scfman
Application exited with RC = 0x4
[01:40:59] Finished Job #4
[01:40:59] Starting job 5,CPU time has been restored to 2799.742972.
[01:40:59] Skipping Job #5
[01:40:59] Starting job 6,CPU time has been restored to 2799.742972.
[01:40:59] Skipping Job #6
[01:40:59] Starting job 7,CPU time has been restored to 2799.742972.
[01:40:59] Skipping Job #7
[01:40:59] Starting job 8,CPU time has been restored to 2799.742972.
[01:40:59] Skipping Job #8
[01:40:59] Starting job 9,CPU time has been restored to 2799.742972.
[01:40:59] Skipping Job #9
[01:40:59] Starting job 10,CPU time has been restored to 2799.742972.
[01:40:59] Skipping Job #10
[01:40:59] Starting job 11,CPU time has been restored to 2799.742972.
[01:40:59] Skipping Job #11
[01:40:59] Starting job 12,CPU time has been restored to 2799.742972.
[01:40:59] Starting new Job
[01:40:59] Qink name = fldman
[01:41:01] Qink name = gesman
[01:41:01] Qink name = scfman
[02:08:16] Qink name = anlman
[02:11:09] End of Job
[02:11:12] Finished Job #12
[02:11:12] Starting job 13,CPU time has been restored to 4413.347816.
[02:11:12] Starting new Job
[02:11:12] Qink name = fldman
[02:11:14] Qink name = gesman
[02:11:14] Qink name = scfman
Application exited with RC = 0x4
[02:11:23] Finished Job #13
[02:11:23] Starting job 14,CPU time has been restored to 4421.172305.
[02:11:23] Skipping Job #14
[02:11:23] Starting job 15,CPU time has been restored to 4421.172305.
[02:11:23] Skipping Job #15
called boinc_finish
Exiting 0

</stderr_txt>
]]>

There are multiple invalid results, but all of them show the same error code across all three machines:
Application exited with RC = 0x4
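
For anyone who wants to check their own result logs for the same pattern, here is a rough Python sketch. It is not an official WCG or BOINC tool. It assumes the stderr text uses exactly the line formats seen in the log above ("Starting job N", "Application exited with RC = ...", "Skipping Job #N", "Finished Job #N"); the function name summarize_stderr and the SAMPLE excerpt are made up for illustration.

```python
import re

# Line formats are taken from the stderr log posted above; other application
# versions may word these lines differently (assumption).
START_RE = re.compile(r"Starting job (\d+),")
RC_RE = re.compile(r"Application exited with RC = (0x[0-9A-Fa-f]+)")
SKIP_RE = re.compile(r"Skipping Job #(\d+)")
FINISH_RE = re.compile(r"Finished Job #(\d+)")

# Hypothetical sample: a shortened excerpt of the log above.
SAMPLE = """\
[01:35:08] Starting job 3,CPU time has been restored to 2501.900358.
[01:40:53] End of Job
[01:40:55] Finished Job #3
[01:40:55] Starting job 4,CPU time has been restored to 2797.742847.
Application exited with RC = 0x4
[01:40:59] Finished Job #4
[01:40:59] Starting job 5,CPU time has been restored to 2799.742972.
[01:40:59] Skipping Job #5
"""


def summarize_stderr(text):
    """Return (failed, skipped, completed) for one stderr log.

    failed maps job number -> RC string; skipped and completed are lists
    of job numbers in the order they appear.
    """
    failed = {}
    skipped = []
    completed = []
    current_rc = None  # RC seen since the latest "Starting job" line
    for line in text.splitlines():
        if START_RE.search(line):
            current_rc = None  # a new job begins, forget any earlier RC
            continue
        m = RC_RE.search(line)
        if m:
            current_rc = m.group(1)
            continue
        m = SKIP_RE.search(line)
        if m:
            skipped.append(int(m.group(1)))
            continue
        m = FINISH_RE.search(line)
        if m:
            job = int(m.group(1))
            if current_rc is not None:
                failed[job] = current_rc
            else:
                completed.append(job)
    return failed, skipped, completed


if __name__ == "__main__":
    failed, skipped, completed = summarize_stderr(SAMPLE)
    print("failed:", failed)
    print("skipped:", skipped)
    print("completed:", completed)
```

Pasting a full result log into SAMPLE (or reading it from a file) should show at a glance which jobs hit the RC and which were skipped afterwards.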
[Jul 2, 2010 4:10:51 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Large number of "Invalid" units

Yes, that's exactly what happens a significant proportion of the time with plain Athlon (and P3) machines. It was reported by several users during beta.
[Jul 5, 2010 5:24:53 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Large number of "Invalid" units

What's more, having a look through the (still available) results, I don't believe my Athlon was ever successful unless it ran out of CPU time before Job #4.

The WUs in the beta were (deliberately?) huge, so plenty hit 12 hours before Job #4. The production WUs on my Athlon are so far 100% getting to the Job #4/RC=0x4 point and I'm betting they will all be marked Invalid.
[Jul 6, 2010 9:54:40 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Large number of "Invalid" units

"Deliberately huge" is what you get when the scientists take a fair selection across the body of work to Beta test as many variations as possible. In production, we now see run times building gradually. The initial jobs on my quad took 1.5 hours, and after 6 days I now see a constant stream of ever-increasing durations. The longest now stands at 4.5 hours (device dependent, of course). For a slow device, that probably means 12 hours is already being reached. I don't know if a minimum number of jobs within a task is needed; if so, the distribution would then have to be looked into... something for the longer term on the techs' development list: send the lighter stuff to the ponies and, vice versa, the heavy stuff to the thoroughbreds. (Now, that is not intended to be read the way you could interpret it ;-)

edit: spelling
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jul 6, 2010 10:09:48 AM]
[Jul 6, 2010 10:07:52 AM]
WBT112
Cruncher
Joined: Sep 9, 2007
Post Count: 28
Status: Offline
Re: Large number of "Invalid" units

A bit off-topic, but since I also have some Invalid WUs here (Debian on 2 VMs, though it is better now): is the run time for "Invalid" units credited? I couldn't find anything, only that you get half of the points. Thanks :)
[Jul 7, 2010 10:38:56 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Large number of "Invalid" units

Into the 3rd week of RC=0x4 errors now. Obviously, the best approach in the short term is to avoid running this project on affected machines. However, if we all do that, there will be no errors, so the techs will have no incentive to ever fix it.
[Jul 8, 2010 8:48:11 PM]
Dark Angel
Veteran Cruncher
Australia
Joined: Nov 11, 2005
Post Count: 728
Status: Offline
Re: Large number of "Invalid" units

I gave up running this on my single core machines and have cut back to a single dual core due to the large uploads choking my connection. It's fast enough to handle things, but not to run big uploads AND anything else at the same time.
[Jul 9, 2010 2:23:40 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Large number of "Invalid" units

Have you tried installing Ubuntu on the single core machines or is there something that would prevent that? I have either duos or quads, no single core, and I have not installed Ubuntu so I do not have any experience with CEP2.
[Jul 9, 2010 3:48:51 AM]