World Community Grid Forums
Thread Status: Active Total posts in this thread: 4
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
The task below finished some 40 hours ago and managed to upload parts _3 and _4, but keeps failing on _0, _1 and _2 with the error given in the topic title. No transient errors or other burps are logged apart from the perpetual backoffs and, at the start, "HTTP::init_post2: couldn't get file size". The wingman reported successfully.
E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1 -- - In Progress 11/27/10 04:12:58 12/7/10 04:12:58 0.00 0.0 / 0.0 < stuck in upload
E200642_054_A.27.C20H11N3OS3.157.3.set1d06_0 -- 637 Pending Validation 11/27/10 04:03:13 11/28/10 04:48:50 3.96 125.0 / 0.0

No issues are logged on prior or subsequent tasks. Since then only the backoffs are logged, with none of the other messages to say why it is backing off. Please advise.

28-Nov-2010 01:10:02 [WCG] Computation for task E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1 finished
28-Nov-2010 01:10:02 [WCG] [dcf] DCF: 0.944812->0.905610, raw_ratio 0.552786, adj_ratio 0.585075
28-Nov-2010 01:10:02 [WCG] Starting CMD2_1033-2JS4_A.clustersOccur-2OXE_A.clustersOccur_6_100762_102316_0
28-Nov-2010 01:10:02 [WCG] Starting task CMD2_1033-2JS4_A.clustersOccur-2OXE_A.clustersOccur_6_100762_102316_0 using hcmd2 version 614
HTTP::init_post2: couldn't get file size
28-Nov-2010 01:10:04 [WCG] Backing off 1 min 0 sec on upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_0
28-Nov-2010 01:10:04 [WCG] Started upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_1
28-Nov-2010 01:10:04 [WCG] Started upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_2
28-Nov-2010 01:10:04 [WCG] Started upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_3
28-Nov-2010 01:10:06 [WCG] [sched_op_debug] Starting scheduler request
28-Nov-2010 01:10:07 [WCG] Sending scheduler request: To fetch work.
28-Nov-2010 01:10:07 [WCG] Requesting new tasks
28-Nov-2010 01:10:07 [WCG] [sched_op_debug] CPU work request: 17237.38 seconds; 0.00 CPUs
HTTP::init_post2: couldn't get file size
28-Nov-2010 01:10:08 [WCG] Project file upload handler is missing
28-Nov-2010 01:10:08 [WCG] Backing off 1 min 0 sec on upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_1
28-Nov-2010 01:10:08 [WCG] Started upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_4
HTTP::init_post2: couldn't get file size
28-Nov-2010 01:10:09 [WCG] Project file upload handler is missing
28-Nov-2010 01:10:09 [WCG] Backing off 1 min 0 sec on upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_2
28-Nov-2010 01:10:09 [WCG] Finished upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_3
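For anyone wanting to pick the stuck parts out of a long message log, here is a minimal Python sketch that tallies the last upload event per file. It only assumes the three message wordings visible in the log above ("Started upload of", "Finished upload of", "Backing off ... on upload of"); the function names are made up for illustration and are not part of BOINC.

```python
import re

# Matches the three upload-related messages seen in the log excerpt.
# The pattern is applied to the whole text rather than line by line,
# so it also copes with a log pasted as one long run of text.
UPLOAD_EVENT = re.compile(
    r"(?P<stamp>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[[^\]]*\] "
    r"(?P<event>Started upload of|Finished upload of|"
    r"Backing off (?P<delay>.+?) on upload of) "
    r"(?P<name>\S+)"
)


def last_upload_state(log_text):
    """Return {file name: (timestamp, state)} from the last event per file."""
    states = {}
    for m in UPLOAD_EVENT.finditer(log_text):
        event = m.group("event")
        if event.startswith("Started"):
            state = "started"
        elif event.startswith("Finished"):
            state = "finished"
        else:
            state = "backing off " + m.group("delay")
        # Later events overwrite earlier ones for the same file.
        states[m.group("name")] = (m.group("stamp"), state)
    return states


def stuck_uploads(log_text):
    """File names whose most recent logged upload event is a backoff."""
    states = last_upload_state(log_text)
    return sorted(
        name for name, (_, state) in states.items()
        if state.startswith("backing off")
    )
```

Feeding it the text of the message log and calling `stuck_uploads(text)` gives the list of files whose latest entry is a backoff; it says nothing about why the server refused them.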
WCG
Please help to make the Forums an enjoyable experience for All! |
anhhai
Veteran Cruncher Joined: Mar 22, 2005 Post Count: 839 Status: Offline Project Badges:
Sekerob, did you somehow force extra logging to get that message, or is that standard? If it is standard, then you have probably recreated the problem several of us are seeing.
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I did have a problem doing an upload so I waited half a day and retried the transfer for one part of a WU - all proceeded as it should.
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
FAO Techs,

Any chance of you researching the result that kicked off this thread? It's starting to get creepy. As alluded to, parts _3 and _4 uploaded fine but _0, _1 and _2 are still stuck. Only after a client restart does an upload attempt give the missing file upload handler message; after that there are only the backoffs every few hours. Today I forced a benchmark and a client restart, and now it logged that it had deleted the 3 files, except they're still there trying to upload.

02-Dec-2010 08:32:42 [---] Benchmark results:
02-Dec-2010 08:32:42 [---] Number of CPUs: 4
02-Dec-2010 08:32:42 [---] 2222 floating point MIPS (Whetstone) per CPU
02-Dec-2010 08:32:42 [---] 12018 integer MIPS (Dhrystone) per CPU
02-Dec-2010 08:32:42 [---] [dcf] scaling all duration correction factors by 0.998066
02-Dec-2010 08:32:43 [---] Resuming computation
02-Dec-2010 08:33:04 [---] Received signal 15
02-Dec-2010 08:33:05 [---] Exit requested by user
02-Dec-2010 08:33:06 [---] [error] Deleting file E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_0 while in use
02-Dec-2010 08:33:06 [---] [error] Deleting file E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_1 while in use
02-Dec-2010 08:33:06 [---] [error] Deleting file E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_2 while in use
02-Dec-2010 08:34:04 [---] Starting BOINC client version 6.10.58 for x86_64-pc-linux-gnu
02-Dec-2010 08:34:04 [---] Config: don't use coprocessors
02-Dec-2010 08:34:04 [---] Config: report completed tasks immediately
02-Dec-2010 08:34:04 [---] Config: ignoring NVIDIA GPU 0
02-Dec-2010 08:34:04 [---] Config: GUI RPC allowed from any host
02-Dec-2010 08:34:04 [---] Config: GUI RPC allowed from:
02-Dec-2010 08:34:04 [---] Config: 127.0.0.1 localhost
02-Dec-2010 08:34:04 [---] log flags: file_xfer, sched_ops, task, checkpoint_debug, dcf_debug, sched_op_debug
02-Dec-2010 08:34:04 [---] Libraries: libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15
02-Dec-2010 08:34:04 [---] Data directory: /var/lib/boinc-client
02-Dec-2010 08:34:05 [---] Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU @ 2.40GHz [Family 6 Model 15 Stepping 7]
02-Dec-2010 08:34:05 [---] Processor: 4.00 MB cache
02-Dec-2010 08:34:05 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdc
02-Dec-2010 08:34:05 [---] OS: Linux: 2.6.32-26-generic
02-Dec-2010 08:34:05 [---] Memory: 2.89 GB physical, 2.99 GB virtual
02-Dec-2010 08:34:05 [---] Disk: 68.24 GB total, 34.27 GB free
02-Dec-2010 08:34:05 [---] Local time is UTC +1 hours
02-Dec-2010 08:34:05 [WCG] URL http://www.worldcommunitygrid.org/; Computer ID 1292373; resource share 500
02-Dec-2010 08:34:05 [WCG] General prefs: from WCG (last modified 01-Dec-2010 08:55:05)
02-Dec-2010 08:34:05 [WCG] Computer location: school
02-Dec-2010 08:34:05 [---] General prefs: using separate prefs for school
02-Dec-2010 08:34:05 [---] Reading preferences override file
02-Dec-2010 08:34:05 [---] Preferences:
02-Dec-2010 08:34:05 [---] max memory usage when active: 2511.26MB
02-Dec-2010 08:34:05 [---] max memory usage when idle: 2954.42MB
02-Dec-2010 08:34:05 [---] max disk usage: 10.00GB
02-Dec-2010 08:34:05 [---] max download rate: 163840 bytes/sec
02-Dec-2010 08:34:05 [---] max upload rate: 131072 bytes/sec
02-Dec-2010 08:34:05 [---] (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
02-Dec-2010 08:34:05 [---] Not using a proxy
Initialization completed
02-Dec-2010 08:34:05 [---] Suspending computation - initial delay
02-Dec-2010 08:34:50 [WCG] Restarting task HFCC_L3_01060949_L3_0001_0 using hfcc version 611
02-Dec-2010 08:34:50 [WCG] Restarting task HFCC_L3_01063460_L3_0001_0 using hfcc version 611
02-Dec-2010 08:34:50 [WCG] Restarting task CMD2_1047-2Q5H_A.clustersOccur-2V5Y_A.clustersOccur_71_155529_155743_155617_155680_1 using hcmd2 version 614
02-Dec-2010 08:34:50 [WCG] Restarting task CMD2_1042-2K5F_A.clustersOccur-2DYP_D.clustersOccur_0_17767_25578_1 using hcmd2 version 614
HTTP::init_post2: couldn't get file size
02-Dec-2010 08:36:33 [WCG] Backing off 3 min 40 sec on upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_0
02-Dec-2010 08:36:38 [WCG] Started upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_1
HTTP::init_post2: couldn't get file size
02-Dec-2010 08:36:40 [WCG] Project file upload handler is missing
02-Dec-2010 08:36:40 [WCG] Backing off 2 hr 33 min 35 sec on upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_1
02-Dec-2010 08:36:44 [WCG] Started upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_2
HTTP::init_post2: couldn't get file size
02-Dec-2010 08:36:45 [WCG] Project file upload handler is missing
02-Dec-2010 08:36:45 [WCG] Backing off 3 hr 38 min 33 sec on upload of E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1_2

On inspection, the slot is now gone, so the real situation currently seems to be that client_state.xml is holding (held) information on a job for which the data is no longer there. Going forward, I aborted the 3 uploads in the Transfers tab, after which the result in the Tasks view changed to "Ready to Report". On the Result Status page, the log looks identical to the wingman's, whose result is now marked "Inconclusive" while mine is marked "Error".

E200642_054_A.27.C20H11N3OS3.157.3.set1d06_1 -- 637 Error 11/27/10 04:12:58 12/2/10 07:50:48 4.37 80.2 / 0.0 <
E200642_054_A.27.C20H11N3OS3.157.3.set1d06_0 -- 637 Inconclusive 11/27/10 04:03:13 11/28/10 04:48:50 3.96 125.0 / 0.0
E200642_054_A.27.C20H11N3OS3.157.3.set1d06_2 -- - Waiting to be sent - - 0.00 0.0 / 0.0

Both have Rc = 0x100 recorded at the end of job #11 (12).
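The suspicion that client_state.xml still lists transfers for files that are no longer on disk can be checked with a rough Python sketch like the one below. It rests on assumptions not confirmed in this thread: that each file appears as a `<file_info>` element with a `<name>` child, that a pending transfer shows up as a `<persistent_file_xfer>` child, that the state file parses as well-formed XML, and that the output files live directly in the project directory. The function name is hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def pending_transfers_missing_on_disk(state_xml, project_dir):
    """List files with a pending transfer record but no file on disk.

    Assumed layout (not confirmed by the thread): <file_info> elements
    with a <name> child, and a <persistent_file_xfer> child whenever a
    transfer is still queued for that file.
    """
    root = ET.fromstring(state_xml)
    missing = []
    for info in root.iter("file_info"):
        name = info.findtext("name")
        # Skip entries without a name or without a queued transfer.
        if name is None or info.find("persistent_file_xfer") is None:
            continue
        name = name.strip()
        if not os.path.isfile(os.path.join(project_dir, name)):
            missing.append(name)
    return missing
```

Read the state file with the client stopped and pass its text plus the project directory path; any names returned would be transfers the client still tracks without data behind them. This only reads the file and does not change anything.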