| World Community Grid Forums
Thread Status: Active Total posts in this thread: 115
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
WCG are still sending out CEP WUs from bad batches
Hang on to those suspended WUs a bit longer, maybe. Continuing from my post above: another copy of the CEP WU that I had been keeping Ready to Run/Suspended, and that I aborted manually today, has been sent to another cruncher. The new copy was sent out 47 min after I aborted mine. Where are the (insert your own adjectives) techs?

goaltender says: "I wish I would have tuned in to this forum earlier in order to select other projects"

It's a good idea to check the forums whenever you notice something unusual or undesirable happening, but that assumes you have enough spare time to search for the relevant thread. We can arrange in our My Forum Profile to watch selected threads or forum sections and have emails sent to us when there are new messages. One would expect that monitoring the Member News and Known Issues sections would alert us to issues like this one with CEP. However, in this case nothing has been posted there. Do you think that's good enough?
||
|
|
Greg Lyke
Advanced Cruncher Joined: May 30, 2008 Post Count: 50 Status: Offline |
For whatever it's worth, WCG (or the great computer in the sky) came through and aborted all my suspended CEP units in the 765+ range (both the ones that had been started and the ones that hadn't). As I have switched everything to work on different projects, is it safe to pick up some new CEPs and not have to worry about a large percentage of them erroring out?
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Well, I was just adding new task errors to my original post back on page one... but I'm not sure they're being noted so here are more I've seen...
Output file E000770_460C_003009010_0_0 for task E000770_460C_003009010_0 exceeds size limit. File size: 18600719.000000 bytes. Limit: 15000000.000000 bytes
Output file E000771_457C_002x0370k_0_0 for task E000771_457C_002x0370k_0 exceeds size limit. File size: 16897523.000000 bytes. Limit: 15000000.000000 bytes
Output file E000771_490C_003e0300h_1_0 for task E000771_490C_003e0300h_1 exceeds size limit. File size: 31961951.000000 bytes. Limit: 15000000.000000 bytes
Output file E000772_183C_005v09306_1_0 for task E000772_183C_005v09306_1 exceeds size limit. File size: 17145561.000000 bytes. Limit: 15000000.000000 bytes
Output file E000772_436C_003e0660a_0_0 for task E000772_436C_003e0660a_0 exceeds size limit. File size: 36141323.000000 bytes. Limit: 15000000.000000 bytes
Output file E000773_445C_002x07514_0_0 for task E000773_445C_002x07514_0 exceeds size limit. File size: 17934179.000000 bytes. Limit: 15000000.000000 bytes
Output file E000773_882C_002y0020h_0_0 for task E000773_882C_002y0020h_0 exceeds size limit. File size: 41483822.000000 bytes. Limit: 15000000.000000 bytes
Output file E000774_328C_00300280x_1_0 for task E000774_328C_00300280x_1 exceeds size limit. File size: 15901861.000000 bytes. Limit: 15000000.000000 bytes

I.e. some are still going over as of 774, and I see I've picked up a reissued 768 (about 8 hours left to run on that one). Again: AMD dual core, 2 GB, Win7 RC, WCG's 6.2.28 version of BOINC.
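For anyone who wants to see how far over the limit these output files are, here is a rough Python sketch that pulls the task name, file size and limit out of messages shaped like the ones quoted above. The function name `parse_size_errors` and the dictionary keys are made up for illustration, and the pattern only assumes the exact wording seen in this post; it is not an official BOINC or WCG tool.

```python
import re

# Pattern for the "exceeds size limit" message as it is worded in this thread.
# Assumption: other client versions may word it differently.
MESSAGE_RE = re.compile(
    r"Output file (?P<file>\S+) for task (?P<task>\S+) exceeds size limit\.\s*"
    r"File size: (?P<size>[\d.]+) bytes\.\s*"
    r"Limit: (?P<limit>[\d.]+) bytes"
)


def parse_size_errors(text):
    """Return one dict per size-limit message found in text."""
    rows = []
    for m in MESSAGE_RE.finditer(text):
        size = float(m.group("size"))
        limit = float(m.group("limit"))
        rows.append({
            "task": m.group("task"),
            "size_bytes": int(size),
            "limit_bytes": int(limit),
            "over_by_bytes": int(size - limit),  # how far past the limit
            "ratio": size / limit,               # size as a multiple of the limit
        })
    return rows


# Two of the messages from the post, joined the way the forum shows them.
SAMPLE = (
    "Output file E000770_460C_003009010_0_0 for task E000770_460C_003009010_0 "
    "exceeds size limit. File size: 18600719.000000 bytes. Limit: 15000000.000000 bytes "
    "Output file E000773_882C_002y0020h_0_0 for task E000773_882C_002y0020h_0 "
    "exceeds size limit. File size: 41483822.000000 bytes. Limit: 15000000.000000 bytes"
)

if __name__ == "__main__":
    for row in parse_size_errors(SAMPLE):
        print(f'{row["task"]}: {row["over_by_bytes"]} bytes over '
              f'({row["ratio"]:.2f}x the limit)')
```

Pasting all eight messages into `SAMPLE` would give the same breakdown for each task listed in the post.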
||
|
|
Los Alcoholicos.scorpionv
Cruncher Joined: Jul 28, 2007 Post Count: 13 Status: Offline Project Badges:
|
Well, I switched from CEP -> HPF2 yesterday. Still 6 pages of CEP errored work units. Hope it will be fixed soon...
|
||
|
|
mclaver
Veteran Cruncher Joined: Dec 19, 2005 Post Count: 566 Status: Offline Project Badges:
|
"Well, I switched from CEP -> HPF2 yesterday. Still 6 pages of CEP errored work units. Hope it will be fixed soon..."

It looks like I am having the same problem as everyone else. Over the last couple of days I have had many errors. In each case multiple wingmen have received errors too. I do not think anyone was able to process these WUs successfully. From 6/27 to 6/30 I have had 24 WUs in Error, and only 6 either Valid or PV. These are on different machines, different operating systems (Windows and Ubuntu) and different processors. Since everyone is getting errors, it seems like a bad batch of CEP WUs.

| Result Name | Device Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| E000772_427C_002y0470c_2-- | winfast-6000 | Error | 6/30/09 14:40:49 | 7/1/09 01:23:28 | 7.04 | 129.5 / 0.0 |
| E000775_052C_002y0820z_0-- | Abit-Q6600 | Error | 6/30/09 12:30:33 | 7/1/09 02:43:31 | 9.32 | 142.1 / 0.0 |
| E000771_496C_003e0360h_2-- | Photo-PC | Error | 6/30/09 04:44:17 | 6/30/09 12:49:44 | 4.55 | 106.5 / 0.0 |
| E000773_682C_002x05204_1-- | ECS-AMD9500 | Error | 6/29/09 22:40:03 | 6/30/09 16:57:44 | 9.34 | 115.6 / 0.0 |
| E000766_767C_002y0070i_5-- | ASUS-i7-965 | Error | 6/29/09 15:09:27 | 6/30/09 02:24:38 | 7.05 | 179.8 / 0.0 |
| E000768_375C_005v03505_4-- | fox-amd-9950 | Error | 6/29/09 11:36:33 | 6/29/09 23:10:21 | 7.95 | 139.7 / 0.0 |
| E000765_080C_002x00005_6-- | EricClaver-PC | Error | 6/29/09 11:32:40 | 6/29/09 20:26:06 | 5.90 | 146.8 / 0.0 |
| E000765_470C_003e0700x_6-- | MSI-I7-920 | Error | 6/29/09 09:56:55 | 6/29/09 23:38:39 | 9.17 | 131.1 / 0.0 |
| E000772_201C_005v0810v_1-- | MSI-I7-920 | Error | 6/29/09 09:31:52 | 6/29/09 23:21:38 | 9.79 | 140.9 / 0.0 |
| E000770_315C_002y0750w_2-- | MSI-I7-920 | Error | 6/29/09 08:53:50 | 6/29/09 22:49:10 | 9.97 | 143.5 / 0.0 |
| E000767_854C_003007408_3-- | ECS-AMD9500 | Error | 6/29/09 08:45:33 | 6/29/09 22:40:03 | 9.27 | 108.5 / 0.0 |
| E000768_128C_002y0180j_3-- | ASUS-i7-965 | Error | 6/29/09 04:25:11 | 6/29/09 14:31:10 | 6.72 | 169.6 / 0.0 |
| E000766_994C_002x00410_3-- | fox-amd-9950 | Error | 6/29/09 04:21:41 | 6/29/09 16:42:11 | 7.88 | 138.6 / 0.0 |
| E000766_950C_002x0900g_3-- | MSI-I7-920 | Error | 6/29/09 00:46:02 | 6/29/09 13:16:01 | 9.64 | 132.4 / 0.0 |
| E000769_966C_003e0460l_2-- | Tania-PC | Error | 6/29/09 00:33:08 | 6/29/09 17:58:04 | 9.80 | 112.7 / 0.0 |
| E000766_560C_003e09009_3-- | ECS-AMD9500 | Error | 6/28/09 15:39:46 | 6/29/09 06:24:57 | 9.18 | 111.7 / 0.0 |
| E000766_832C_003006211_5-- | Abit-Q6600 | Error | 6/28/09 10:09:31 | 6/29/09 18:19:42 | 8.21 | 125.2 / 0.0 |
| E000769_212C_002x04208_1-- | winfast-6000 | Error | 6/28/09 03:57:10 | 6/28/09 16:52:26 | 7.09 | 130.5 / 0.0 |
| E000766_001C_002x01104_3-- | MSI-I7-920 | Error | 6/28/09 02:32:48 | 6/28/09 15:49:23 | 9.43 | 130.8 / 0.0 |
| E000768_435C_002y0350r_1-- | asus-amd6000 | Error | 6/27/09 20:15:06 | 6/28/09 07:20:32 | 6.95 | 127.0 / 0.0 |
| E000765_658C_002x0880l_2-- | chaintech-4200 | Error | 6/27/09 18:27:13 | 6/28/09 10:11:12 | 9.45 | 127.1 / 0.0 |
| E000767_868C_005v0080t_1-- | Photo-PC | Error | 6/27/09 15:06:27 | 6/27/09 23:36:40 | 5.15 | 119.6 / 0.0 |
| E000767_425C_005v0850d_1-- | MSI-I7-920 | Error | 6/27/09 11:17:43 | 6/27/09 23:49:57 | 9.47 | 129.6 / 0.0 |
| E000766_320C_003e0700p_1-- | GIGA-Q9450 | Error | 6/27/09 01:16:38 | 6/27/09 12:21:51 | 6.62 | 120.6 / 0.0 |

Result Log

    <core_client_version>6.6.28</core_client_version>
    <![CDATA[
    <stderr_txt>
    Calling initGraphics()
    INFO: No state to restore. Start from the beginning.
    Calling initGraphics()
    called boinc_finish
    </stderr_txt>
    <message>
    <file_xfer_error>
      <file_name>E000772_427C_002y0470c_2_0</file_name>
      <error_code>-131</error_code>
    </file_xfer_error>
    </message>
    ]]>
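If you have a pile of these Result Logs to go through, a small script can pick out the file name and error code from each one. The sketch below is only an illustration: the helper names `extract_xfer_errors` and `client_version` are made up, it uses plain regular expressions because the log as posted is not a single well-formed XML document, and reading -131 as "output file too big" is an assumption based on the size-limit messages reported elsewhere in this thread.

```python
import re

# The Result Log from the post above, laid out one element per line.
RESULT_LOG = """<core_client_version>6.6.28</core_client_version>
<![CDATA[
<stderr_txt>
Calling initGraphics()
INFO: No state to restore. Start from the beginning.
Calling initGraphics()
called boinc_finish
</stderr_txt>
<message>
<file_xfer_error>
  <file_name>E000772_427C_002y0470c_2_0</file_name>
  <error_code>-131</error_code>
</file_xfer_error>
</message>
]]>"""

# Matches one <file_xfer_error> element and captures its name and code.
XFER_RE = re.compile(
    r"<file_xfer_error>\s*<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*</file_xfer_error>"
)

VERSION_RE = re.compile(r"<core_client_version>([^<]+)</core_client_version>")


def extract_xfer_errors(log_text):
    """Return a list of (file_name, error_code) pairs found in the log."""
    return [(m.group("name").strip(), int(m.group("code")))
            for m in XFER_RE.finditer(log_text)]


def client_version(log_text):
    """Return the core client version string, or None if it is absent."""
    m = VERSION_RE.search(log_text)
    return m.group(1).strip() if m else None


if __name__ == "__main__":
    print("client:", client_version(RESULT_LOG))
    for name, code in extract_xfer_errors(RESULT_LOG):
        # Assumption: -131 is treated here as "output file too big".
        print(f"{name}: file transfer error {code}")
```

Run over logs from several machines, this makes it easy to see whether they all fail with the same code, which is what you would expect from a bad batch rather than a local hardware problem.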
||
|
|
mclaver
Veteran Cruncher Joined: Dec 19, 2005 Post Count: 566 Status: Offline Project Badges:
|
It looks like there are other problems as I look around.
I had six WUs marked too late, even though they were returned in less than 24 hours. I also had a bunch aborted, after considerable CPU time. I have lost 42 WUs in the last couple of days, with a considerable loss of work, some as long as 10 hours of CPU time and 169 points. That explains why my total points per day has gone down. The good news is that I have only three CEP WUs in progress.

| Result Name | Device Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| E000767_098C_002x0580c_6-- | Foxconn-6400 | Too Late | 6/30/09 04:34:25 | 6/30/09 15:46:14 | 6.36 | 127.1 / 0.0 |
| E000768_278C_002y09800_4-- | Foxconn-6400 | Too Late | 6/29/09 21:35:17 | 6/30/09 07:39:07 | 6.49 | 129.7 / 0.0 |
| E000766_022C_002x0520h_5-- | ASUS-i7-965 | Too Late | 6/29/09 21:13:57 | 6/30/09 07:55:04 | 7.03 | 179.2 / 0.0 |
| E000769_854C_005v0240k_3-- | winfast-6000 | Too Late | 6/28/09 22:50:59 | 6/29/09 06:41:51 | 3.84 | 70.6 / 0.0 |
| E000766_785C_003003508_2-- | chaintech-4200 | Too Late | 6/28/09 00:50:03 | 6/28/09 17:22:02 | 9.57 | 128.8 / 0.0 |

| Result Name | Device Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| E000771_715C_003e0150g_3-- | fox-amd-9950 | Aborted | 6/30/09 13:03:38 | 6/30/09 14:21:25 | 0.00 | 0.0 / 0.0 |
| E000766_319C_00300790x_4-- | ASUS-i7-965 | Aborted | 6/30/09 12:59:50 | 6/30/09 13:40:25 | 0.00 | 0.0 / 0.0 |
| E000771_559C_002y06915_3-- | ECS-AMD9500 | Aborted | 6/30/09 10:06:37 | 6/30/09 16:58:04 | 3.93 | 47.9 / 0.0 |
| E000770_387C_005v02714_3-- | fox-amd-9950 | Aborted | 6/30/09 10:02:47 | 6/30/09 14:21:25 | 1.26 | 22.2 / 0.0 |
| E000769_053C_002x0230y_5-- | fox-amd-9950 | Aborted | 6/30/09 09:55:20 | 6/30/09 14:21:25 | 1.90 | 33.4 / 0.0 |
| E000772_215C_002y0350w_2-- | EricClaver-PC | Aborted | 6/30/09 07:39:24 | 6/30/09 15:08:38 | 4.21 | 104.7 / 0.0 |
| E000771_492C_003e0320h_2-- | ASUS-i7-965 | Aborted | 6/30/09 06:50:33 | 6/30/09 13:40:25 | 4.13 | 103.3 / 0.0 |
| E000771_330C_002y03008_2-- | ASUS-i7-965 | Aborted | 6/30/09 06:50:12 | 6/30/09 13:40:25 | 4.75 | 118.8 / 0.0 |
| E000771_698C_002x0880q_2-- | ASUS-i7-965 | Aborted | 6/30/09 04:44:35 | 6/30/09 13:40:25 | 6.42 | 160.8 / 0.0 |
| E000769_839C_003e0090n_5-- | MSI-I7-920 | Aborted | 6/30/09 00:36:47 | 6/30/09 14:02:28 | 8.43 | 116.8 / 0.0 |
| E000769_035C_003e08501_5-- | chaintech-4200 | Aborted | 6/30/09 00:33:27 | 6/30/09 16:46:56 | 0.00 | 0.0 / 0.0 |
||
|
|
gordoma
Veteran Cruncher Windsor, UK Joined: Jul 21, 2005 Post Count: 729 Status: Offline Project Badges:
|
"The good news is that I have only three CEP WUs in progress."

You're lucky*. I was only running CEP and DDDT, and due to the temporary "dry hopper" at DDDT my queues were full of CEP when this happened. Took a big hit yesterday!

It looks like many of the faulty WUs have been pulled by the project, hence some showing as "aborted by project" and others showing "too late". I don't want to assume that all faulty units have been pulled and that it's safe to return to CEP until we have the all clear. As Sek said, it is safer to switch to something else until we hear it's been resolved.

(*reminds me of Python's Four Yorkshiremen sketch!)
||
|
|
KodeX
Advanced Cruncher Germany Joined: Aug 17, 2006 Post Count: 96 Status: Offline Project Badges:
|
I am also getting the same error:
<core_client_version>6.6.20</core_client_version>
WU: E000773_966C_003e0360q

I took CEP out of my active projects.

[Edit 1 times, last edit by KodeX at Jul 1, 2009 11:14:29 AM]
||
|
|
Viktors
Former World Community Grid Tech Joined: Sep 20, 2004 Post Count: 653 Status: Offline Project Badges:
|
We have cancelled E000765 through E000785 work units because of a problem of exceeding a file size. These will be rebuilt as new work units by the Harvard team to avoid this problem. Sorry for the troubles. New work units should be available now.
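To check whether a work unit still sitting in your queue falls inside the cancelled range, something like the following Python sketch would do. It assumes that the six digits after the leading "E" in a CEP work unit name are the batch number, which is how posters in this thread have been reading names such as E000773_966C_003e0360q; that naming scheme is inferred, not documented here, and the function names are invented for the example.

```python
import re

# Range announced in the post above: E000765 through E000785, inclusive.
CANCELLED_FIRST = 765
CANCELLED_LAST = 785

# Assumption: a CEP work unit name starts with "E" plus a six-digit batch number.
NAME_RE = re.compile(r"^E(\d{6})_")


def batch_number(wu_name):
    """Return the batch number from a work unit name, or None if it does not fit."""
    # The results page inserts spaces after underscores; remove them first.
    compact = wu_name.replace(" ", "")
    m = NAME_RE.match(compact)
    return int(m.group(1)) if m else None


def in_cancelled_range(wu_name):
    """True if the work unit belongs to one of the cancelled batches."""
    n = batch_number(wu_name)
    return n is not None and CANCELLED_FIRST <= n <= CANCELLED_LAST


if __name__ == "__main__":
    for name in ("E000773_966C_003e0360q", "E000772_ 427C_ 002y0470c_ 2--"):
        status = "cancelled batch" if in_cancelled_range(name) else "not in cancelled range"
        print(f"{name}: {status}")
```

Anything that reports "not in cancelled range" should be from a batch outside 765 to 785, though as others have said it is worth waiting for the all clear before loading up on CEP again.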
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"We have cancelled E000765 through E000785 work units because of a problem of exceeding a file size. These will be rebuilt as new work units by the Harvard team to avoid this problem. Sorry for the troubles. New work units should be available now."

Will credit be granted for the problem work units that were completed?