Thread Status: Locked
Total posts in this thread: 177
This topic has been viewed 506565 times and has 176 replies
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Linux Only Beta Test version 6.16

Do you have a client between 6.4 and 6.10.56? If so, select the task and hit the Properties button on the left. What CPU and elapsed times are reported? If elapsed is lower than CPU, then you have confirmed a client issue which is fixed in the 6.10.57 client.

As for the percentage possibly being incorrect (it is an estimate anyhow, as it is not possible to compute progress exactly on non-deterministic tasks), see the post by armstrdj.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Jun 19, 2010 10:44:59 AM]
X-Files 27
Senior Cruncher
Canada
Joined: May 21, 2007
Post Count: 391
Status: Offline
Re: Linux Only Beta Test version 6.16

One of my WUs (BETA_A.22.C17H10N2S3.8.3.set1d06) was stopped within job #15 after 12 hours by the time limit. The wingman was stopped after 12 minutes within job #2 with the message:
Application exited with RC = 0x100
[10:07:39] Finished Job #2
[10:07:39] Starting job 3,CPU time has been restored to 616.423288.
[10:07:39] Skipping Job #3
...
[10:07:39] Starting job 15,CPU time has been restored to 616.423288.
[10:07:39] Skipping Job #15
called boinc_finish

Both jobs became valid.
It's a pity that one task is stopped by an error so early while the other continues. Guess in production only the first two jobs can be validated while all others will be redone by the next WU. A waste of nearly 12 hours work for 13 jobs...
Or did I get something wrong?

I have the same kind of error -> Application exited with RC = 0x100
resultId=383625736, 383625744 = PV
I was crunching through a VM at first, then moved the work to a physical machine. All 4 tasks that had been running in the VM called boinc_finish once they ran on the physical machine.
resultId=383625711, 383625678 = Valid
----------------------------------------

[Jun 19, 2010 3:55:12 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Linux Only Beta Test version 6.16

Reads to me as an arbitrary rule, simply using the task that got deeper in, unless I have misunderstood. If one task gets further than the other, then in fairness the one getting further should have more credit than the shorter one. So far I've only seen equal credit, but I have not looked into any job quorum detail to see if it is handled pro rata.

As for this assumption:

A waste of nearly 12 hours work for 13 jobs...


No. From reading uplinger, the task that gets the most done is taken for assimilation and sent to the scientists. The validation is done on the part that both tasks completed, i.e. the full 12-hour task is not going to waste... the difference is not sent out in another job.
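
To make that rule concrete, here is a toy sketch of it as I read it. It is not the actual WCG validator; the list-of-job-outputs representation and the function name `validate_pair` are made up for illustration.

```python
def validate_pair(result_a, result_b):
    """Compare two results over the jobs that both of them completed.

    Each result is a list of per-job outputs (hypothetical representation).
    Returns (valid, assimilated), where `assimilated` is the longer result.
    """
    common = min(len(result_a), len(result_b))
    # Validation only looks at the overlapping part of the two results.
    valid = common > 0 and result_a[:common] == result_b[:common]
    # The result that got further is the one kept for assimilation.
    assimilated = max(result_a, result_b, key=len) if valid else None
    return valid, assimilated


long_task = ["job0", "job1", "job2", "job3"]  # stopped by the time limit
short_task = ["job0", "job1"]                 # stopped early by an error

valid, kept = validate_pair(long_task, short_task)
print(valid, len(kept))
```

Under these assumptions the short task still validates the long one, and none of the long task's extra jobs are thrown away.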
----------------------------------------
[Jun 19, 2010 4:13:44 PM]
X-Files 27
Senior Cruncher
Canada
Joined: May 21, 2007
Post Count: 391
Status: Offline
Re: Linux Only Beta Test version 6.16

WU: 162578776
Maximum disk usage exceeded

BETA_A.32.C29H18N2S.1.4.set1d06_3 -- -   In Progress 19/06/10 17:30:49 22/06/10 09:20:05 0.00 0.0 / 0.0
BETA_A.32.C29H18N2S.1.4.set1d06_2 -- 617 Error       19/06/10 09:34:38 19/06/10 17:30:40 6.46 150.4 / 0.0 <-mine
BETA_A.32.C29H18N2S.1.4.set1d06_0 -- 617 Error       19/06/10 02:25:43 19/06/10 09:34:28 6.64 134.1 / 0.0
BETA_A.32.C29H18N2S.1.4.set1d06_1 -- -   In Progress 19/06/10 02:25:15 24/06/10 02:25:15 0.00 0.0 / 0.0
----------------------------------------

[Jun 19, 2010 5:57:04 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Linux Only Beta Test version 6.16

What was the value printed in the client message log at which the task got sent off?
----------------------------------------
[Jun 19, 2010 6:20:07 PM]
X-Files 27
Senior Cruncher
Canada
Joined: May 21, 2007
Post Count: 391
Status: Offline
Re: Linux Only Beta Test version 6.16

What was the value printed in the client message log at which the task got sent off?

World Community Grid 06/19/2010 1:29:30 PM Aborting task BETA_A.32.C29H18N2S.1.4.set1d06_2: exceeded disk limit: 1557.92MB > 1500.00MB
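
For anyone wanting to pull these figures out of a longer message log, a minimal sketch follows. The regular expression is written against the wording of the line quoted above only; other client versions may phrase the message differently, and `parse_disk_abort` is just an illustrative helper name.

```python
import re

# Pattern based on the abort message exactly as quoted in this post.
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>\d+\.\d+)MB > (?P<limit>\d+\.\d+)MB"
)


def parse_disk_abort(line):
    """Return (task, used_mb, limit_mb), or None if the line does not match."""
    m = ABORT_RE.search(line)
    if m is None:
        return None
    return m.group("task"), float(m.group("used")), float(m.group("limit"))


line = ("World Community Grid 06/19/2010 1:29:30 PM Aborting task "
        "BETA_A.32.C29H18N2S.1.4.set1d06_2: exceeded disk limit: "
        "1557.92MB > 1500.00MB")
task, used, limit = parse_disk_abort(line)
print(f"{task} went {used - limit:.2f}MB over the {limit:.0f}MB limit")
```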
----------------------------------------

[Jun 19, 2010 6:42:04 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Linux Only Beta Test version 6.16

Hmmmm, that's 1.5 GB. Could you have a BOINC setting that limits the disk use permitted?

Edit: the largest VM use by a CEP2 model has been 360MB over the first 16 processed here on my quad.
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jun 19, 2010 6:49:06 PM]
[Jun 19, 2010 6:47:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Linux Only Beta Test (CEP2)

New error... or at least I skimmed this thread and did not see anyone else mention it.


Result Name: BETA_A.31.C27H16N2S2.3.2.set1d06_1

<core_client_version>6.10.36</core_client_version>

<![CDATA[
<message>
Maximum disk usage exceeded
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[00:17:37] Number of jobs = 16
[00:17:37] Starting job 0,CPU time has been restored to 0.000000.
[00:17:37] Starting new Job
[00:17:37] Qink name = fldman
[00:17:37] Qink name = gesman
[00:17:37] Qink name = scfman
[00:20:55] Qink name = anlman
[00:20:59] End of Job
[00:21:02] Finished Job #0
[00:21:02] Starting job 1,CPU time has been restored to 128.414478.
[00:21:02] Starting new Job
[00:21:02] Qink name = fldman
[00:21:03] Qink name = gesman
[00:21:03] Qink name = scfman
[00:31:11] Qink name = anlman
[00:32:24] End of Job
[00:32:28] Finished Job #1
[00:32:28] Starting job 2,CPU time has been restored to 593.229815.
[00:32:28] Starting new Job
[00:32:28] Qink name = fldman
[00:32:29] Qink name = gesman
[00:32:29] Qink name = scfman
[00:40:26] Qink name = anlman
[00:40:26] Qink name = drvman
[00:42:41] Qink name = optman
[00:42:41] Qink name = fldman
[00:42:41] Qink name = gesman
[00:42:42] Qink name = scfman
[00:56:51] Qink name = anlman
[00:56:51] Qink name = drvman
[00:58:55] Qink name = optman
[00:58:55] Qink name = fldman
[00:58:55] Qink name = gesman
[00:58:56] Qink name = scfman
[01:12:36] Qink name = anlman
[01:12:36] Qink name = drvman
[01:14:36] Qink name = optman
[01:14:36] Qink name = fldman
[01:14:36] Qink name = gesman
[01:14:37] Qink name = scfman
[01:28:14] Qink name = anlman
[01:28:14] Qink name = drvman
[01:30:13] Qink name = optman
[01:30:13] Qink name = fldman
[01:30:13] Qink name = gesman
[01:30:14] Qink name = scfman
[01:42:38] Qink name = anlman
[01:42:38] Qink name = drvman
[01:44:36] Qink name = optman
[01:44:36] Qink name = fldman
[01:44:36] Qink name = gesman
[01:44:37] Qink name = scfman
[01:57:12] Qink name = anlman
[01:57:12] Qink name = drvman
[01:59:10] Qink name = optman
[01:59:10] Qink name = fldman
[01:59:10] Qink name = gesman
[01:59:11] Qink name = scfman
[02:11:44] Qink name = anlman
[02:11:44] Qink name = drvman
[02:13:41] Qink name = optman
[02:13:41] Qink name = fldman
[02:13:41] Qink name = gesman
[02:13:41] Qink name = scfman
[02:26:13] Qink name = anlman
[02:26:14] Qink name = drvman
[02:28:15] Qink name = optman
[02:28:16] Qink name = fldman
[02:28:16] Qink name = gesman
[02:28:16] Qink name = scfman
[02:40:46] Qink name = anlman
[02:40:46] Qink name = drvman
[02:42:44] Qink name = optman
[02:42:44] Qink name = fldman
[02:42:44] Qink name = gesman
[02:42:45] Qink name = scfman
[02:55:14] Qink name = anlman
[02:55:15] Qink name = drvman
[02:57:10] Qink name = optman
[02:57:10] Qink name = fldman
[02:57:10] Qink name = gesman
[02:57:11] Qink name = scfman
[03:09:34] Qink name = anlman
[03:09:34] Qink name = drvman
[03:11:33] Qink name = optman
[03:11:33] Qink name = fldman
[03:11:33] Qink name = gesman
[03:11:34] Qink name = scfman
[03:22:33] Qink name = anlman
[03:22:33] Qink name = drvman
[03:24:36] Qink name = optman
[03:24:37] Qink name = fldman
[03:24:37] Qink name = gesman
[03:24:38] Qink name = scfman
[03:35:25] Qink name = anlman
[03:35:25] Qink name = drvman
[03:37:22] Qink name = optman
[03:37:22] Qink name = fldman
[03:37:22] Qink name = gesman
[03:37:23] Qink name = scfman
[03:48:01] Qink name = anlman
[03:48:01] Qink name = drvman
[03:49:54] Qink name = optman
[03:49:54] Qink name = fldman
[03:49:54] Qink name = gesman
[03:49:56] Qink name = scfman
[03:58:45] Qink name = anlman
[03:58:45] Qink name = drvman
[04:00:45] Qink name = optman
[04:00:45] Qink name = fldman
[04:00:45] Qink name = gesman
[04:00:46] Qink name = scfman
[04:09:29] Qink name = anlman
[04:09:30] Qink name = drvman
[04:11:24] Qink name = optman
[04:11:24] Qink name = fldman
[04:11:24] Qink name = gesman
[04:11:25] Qink name = scfman
[04:19:54] Qink name = anlman
[04:19:54] Qink name = drvman
[04:21:53] Qink name = optman
[04:21:53] Qink name = fldman
[04:21:53] Qink name = gesman
[04:21:53] Qink name = scfman
[04:29:39] Qink name = anlman
[04:29:39] Qink name = drvman
[04:31:34] Qink name = optman
[04:31:34] Qink name = anlman
[04:32:42] End of Job
[04:32:47] Finished Job #2
[04:32:47] Starting job 3,CPU time has been restored to 9886.533018.
[04:32:47] Starting new Job
[04:32:47] Qink name = fldman
[04:32:48] Qink name = gesman
[04:32:48] Qink name = scfman
[04:43:20] Qink name = anlman
[04:44:27] End of Job
[04:44:30] Finished Job #3
[04:44:30] Starting job 4,CPU time has been restored to 10340.549996.
[04:44:30] Starting new Job
[04:44:30] Qink name = fldman
[04:44:31] Qink name = gesman
[04:44:31] Qink name = scfman
[04:53:14] Qink name = anlman
[04:54:25] End of Job
[04:54:27] Finished Job #4
[04:54:27] Starting job 5,CPU time has been restored to 10728.335043.
[04:54:27] Starting new Job
[04:54:28] Qink name = fldman
[04:54:28] Qink name = gesman
[04:54:29] Qink name = scfman
[05:03:29] Qink name = anlman
[05:04:36] End of Job
[05:04:39] Finished Job #5
[05:04:39] Starting job 6,CPU time has been restored to 11128.047277.
[05:04:39] Starting new Job
[05:04:39] Qink name = fldman
[05:04:40] Qink name = gesman
[05:04:40] Qink name = scfman
[05:13:18] Qink name = anlman
[05:14:25] End of Job
[05:14:28] Finished Job #6
[05:14:28] Starting job 7,CPU time has been restored to 11512.523827.
[05:14:28] Starting new Job
[05:14:28] Qink name = fldman
[05:14:29] Qink name = gesman
[05:14:29] Qink name = scfman
[05:26:49] Qink name = anlman
[05:27:50] End of Job
[05:27:53] Finished Job #7
[05:27:53] Starting job 8,CPU time has been restored to 12050.212085.
[05:27:53] Starting new Job
[05:27:53] Qink name = fldman
[05:27:54] Qink name = gesman
[05:27:54] Qink name = scfman
[05:36:20] Qink name = anlman
[05:37:34] End of Job
[05:37:37] Finished Job #8
[05:37:37] Starting job 9,CPU time has been restored to 12431.083183.
[05:37:37] Starting new Job
[05:37:37] Qink name = fldman
[05:37:38] Qink name = gesman
[05:37:38] Qink name = scfman
[05:46:19] Qink name = anlman
[05:48:01] End of Job
[05:48:03] Finished Job #9
[05:48:03] Starting job 10,CPU time has been restored to 12839.568083.
[05:48:04] Starting new Job
[05:48:04] Qink name = fldman
[05:48:04] Qink name = gesman
[05:48:05] Qink name = scfman
[06:09:22] Qink name = anlman
[06:11:12] End of Job
[06:11:16] Finished Job #10
[06:11:16] Starting job 11,CPU time has been restored to 13766.408181.
[06:11:17] Starting new Job
[06:11:17] Qink name = fldman
[06:11:18] Qink name = gesman
[06:11:18] Qink name = scfman
[06:21:19] Qink name = anlman
[06:23:07] End of Job
[06:23:09] Finished Job #11
[06:23:09] Starting job 12,CPU time has been restored to 14232.743287.
[06:23:10] Starting new Job
[06:23:10] Qink name = fldman
[06:23:14] Qink name = gesman
[06:23:15] Qink name = scfman
[07:19:17] Qink name = anlman

</stderr_txt>
]]>



[ZoSo@AX4P3000 ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 52G 32G 18G 65% /


Fri 18 Jun 2010 08:44:28 PM EDT Starting BOINC client version 6.10.36 for i686-pc-linux-gnu
Fri 18 Jun 2010 08:44:28 PM EDT log flags: file_xfer, sched_ops, task, checkpoint_debug
Fri 18 Jun 2010 08:44:28 PM EDT Libraries: libcurl/7.19.7 NSS/3.12.6.2 zlib/1.2.3 libidn/1.9 libssh2/1.2.2
Fri 18 Jun 2010 08:44:28 PM EDT Data directory: /var/lib/boinc
Fri 18 Jun 2010 08:44:28 PM EDT Processor: 4 AuthenticAMD AMD Phenom(tm) II X4 940 Processor [Family 16 Model 4 Stepping 2]
Fri 18 Jun 2010 08:44:28 PM EDT Processor: 512.00 KB cache
Fri 18 Jun 2010 08:44:28 PM EDT Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 \
clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow \
constant_tsc nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_l
Fri 18 Jun 2010 08:44:28 PM EDT OS: Linux: 2.6.32.12-115.fc12.i686.PAE
Fri 18 Jun 2010 08:44:28 PM EDT Memory: 3.93 GB physical, 9.77 GB virtual
Fri 18 Jun 2010 08:44:28 PM EDT Disk: 51.18 GB total, 18.36 GB free
Fri 18 Jun 2010 08:44:28 PM EDT Local time is UTC -4 hours
Fri 18 Jun 2010 08:44:28 PM EDT No usable GPUs found
Fri 18 Jun 2010 08:44:28 PM EDT WCG URL http://www.worldcommunitygrid.org/; Computer ID 1210272; resource share 100
Fri 18 Jun 2010 08:44:28 PM EDT WCG General prefs: from World Community Grid (last modified 21-Feb-2010 05:32:37)
Fri 18 Jun 2010 08:44:28 PM EDT WCG Computer location: DDDT2
Fri 18 Jun 2010 08:44:28 PM EDT General prefs: using separate prefs for DDDT2
Fri 18 Jun 2010 08:44:28 PM EDT Reading preferences override file
Fri 18 Jun 2010 08:44:28 PM EDT Preferences:
Fri 18 Jun 2010 08:44:28 PM EDT max memory usage when active: 3622.96MB
Fri 18 Jun 2010 08:44:28 PM EDT max memory usage when idle: 4005.39MB
Fri 18 Jun 2010 08:44:32 PM EDT max disk usage: 10.00GB
Fri 18 Jun 2010 08:44:32 PM EDT (to change, visit the web site of an attached project,
Fri 18 Jun 2010 08:44:32 PM EDT or click on Preferences)
Fri 18 Jun 2010 08:44:32 PM EDT Not using a proxy
Fri 18 Jun 2010 08:44:32 PM EDT Suspending computation - initial delay



Sat 19 Jun 2010 12:17:01 AM EDT WCG Starting BETA_A.31.C27H16N2S2.3.2.set1d06_1
Sat 19 Jun 2010 12:17:01 AM EDT WCG Starting task BETA_A.31.C27H16N2S2.3.2.set1d06_1 using beta11 version 617
Sat 19 Jun 2010 12:21:02 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 12:32:29 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 04:32:47 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 04:44:31 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 04:54:28 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 05:04:40 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 05:14:28 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 05:27:54 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 05:37:38 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 05:48:05 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 06:11:17 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 06:23:10 AM EDT WCG [checkpoint_debug] result BETA_A.31.C27H16N2S2.3.2.set1d06_1 checkpointed
Sat 19 Jun 2010 07:43:43 AM EDT WCG Aborting task BETA_A.31.C27H16N2S2.3.2.set1d06_1: exceeded disk limit: 1501.22MB > 1500.00MB
Sat 19 Jun 2010 07:43:44 AM EDT WCG Computation for task BETA_A.31.C27H16N2S2.3.2.set1d06_1 finished
Sat 19 Jun 2010 07:43:44 AM EDT WCG Output file BETA_A.31.C27H16N2S2.3.2.set1d06_1_0 for task BETA_A.31.C27H16N2S2.3.2.set1d06_1 absent
Sat 19 Jun 2010 07:43:44 AM EDT WCG Output file BETA_A.31.C27H16N2S2.3.2.set1d06_1_1 for task BETA_A.31.C27H16N2S2.3.2.set1d06_1 absent
Sat 19 Jun 2010 07:43:44 AM EDT WCG Output file BETA_A.31.C27H16N2S2.3.2.set1d06_1_2 for task BETA_A.31.C27H16N2S2.3.2.set1d06_1 absent


The disk limit must be set in the work unit itself, because as you can see by the above I have BOINC set to a 10GB limit and there are 18GB free on the root partition.
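
One way to check this, as far as I know: BOINC workunits carry their own disk bound, and I am assuming here that client_state.xml in the data directory records it as an `<rsc_disk_bound>` element (in bytes) under each `<workunit>`. The fragment below is made up for illustration (the workunit name and value are not from my machine), and the conversion assumes 1 MB = 1024*1024 bytes.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the shape of client_state.xml; values are illustrative.
SAMPLE = """
<client_state>
  <workunit>
    <name>EXAMPLE_WU</name>
    <rsc_disk_bound>1572864000.000000</rsc_disk_bound>
  </workunit>
</client_state>
"""


def disk_bounds_mb(xml_text):
    """Map workunit name -> disk bound in MB (assuming 1 MB = 1024*1024 bytes)."""
    root = ET.fromstring(xml_text)
    bounds = {}
    for wu in root.iter("workunit"):
        name = wu.findtext("name")
        bound = wu.findtext("rsc_disk_bound")
        if name and bound:
            bounds[name] = float(bound) / (1024 * 1024)
    return bounds


print(disk_bounds_mb(SAMPLE))
```

If the real file shows a bound of about 1500MB for the BETA workunits, that would confirm the limit comes from the work unit rather than from my preferences.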

I don't understand how it figures CPU time... in the Results Status it shows 4.88 hours, but the log shows it ran for about 7.5 hours. (???)
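
The stderr output above does record CPU time at each job start, so it can at least be tallied per job. A rough sketch, assuming the "Starting job N,CPU time has been restored to X" wording stays as shown; the helper name is my own:

```python
import re

# Matches the job-start lines exactly as they appear in the stderr output above.
RESTORE_RE = re.compile(
    r"Starting job (\d+),CPU time has been restored to (\d+\.\d+)"
)


def cpu_seconds_per_job(stderr_text):
    """Return {job: cpu_seconds} for every job followed by a later job start."""
    marks = [(int(j), float(t)) for j, t in RESTORE_RE.findall(stderr_text)]
    return {job: nxt - t for (job, t), (_, nxt) in zip(marks, marks[1:])}


excerpt = """\
[00:17:37] Starting job 0,CPU time has been restored to 0.000000.
[00:21:02] Starting job 1,CPU time has been restored to 128.414478.
[00:32:28] Starting job 2,CPU time has been restored to 593.229815.
[04:32:47] Starting job 3,CPU time has been restored to 9886.533018.
"""
for job, secs in cpu_seconds_per_job(excerpt).items():
    print(f"job {job}: {secs / 60:.1f} CPU minutes")
```

Job 2, for example, ran from 00:32:28 to 04:32:47 on the wall clock but accumulated noticeably less CPU time than that, so the gap is already visible inside the log.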

I see it's been [re]sent to another wingman; both of them still say 'In Progress'.
HTH; Thanks.
[Jun 19, 2010 6:51:14 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Linux Only Beta Test (CEP2)

Think you missed the last few posts during the skimming ;P

Edit: 6.10.36 shows wall-clock/elapsed time. The task properties show both elapsed time and CPU time. At, for instance, a 60% throttle, or when the system is very busy, that differential can get very big. I presently see a 1:20 hour difference on my quad for a task, whilst watching streaming WSC football matches.
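
As a quick sanity check of that differential, the sketch below compares the 4.88 CPU hours reported in Results Status with the wall-clock span between the "Starting" and "Aborting task" lines in the message log quoted earlier. The timestamps are copied from that log with the "EDT" suffix dropped; the function name is only illustrative, and the format string assumes an English locale.

```python
from datetime import datetime

# Format of the client log timestamps quoted above, without the timezone suffix.
FMT = "%a %d %b %Y %I:%M:%S %p"


def cpu_efficiency(start, end, cpu_hours):
    """Fraction of wall-clock time spent on the CPU between two log timestamps."""
    elapsed = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return cpu_hours / (elapsed.total_seconds() / 3600)


ratio = cpu_efficiency("Sat 19 Jun 2010 12:17:01 AM",   # task started
                       "Sat 19 Jun 2010 07:43:43 AM",   # task aborted
                       4.88)                            # CPU hours in Results Status
print(f"CPU time was {ratio:.0%} of elapsed time")
```

A ratio well below 100% is what a throttle or a busy system would produce.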
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jun 19, 2010 6:57:10 PM]
[Jun 19, 2010 6:52:51 PM]
X-Files 27
Senior Cruncher
Canada
Joined: May 21, 2007
Post Count: 391
Status: Offline
Re: Linux Only Beta Test version 6.16

BOINC max disk usage is 100GB. That should be enough.

I know that 1500MB is the requirement as I got this message when BOINC max disk usage was around 900MB.
----------------------------------------

[Jun 19, 2010 6:55:36 PM]