Thread Status: Active
Total posts in this thread: 24
Posts: 24   Pages: 3
This topic has been viewed 4951 times and has 23 replies
foxfire
Advanced Cruncher
United States
Joined: Sep 1, 2007
Post Count: 121
Re: Error

Over the past 3 days I've gotten errors on:

OS            Errors
Linux         68
Win7 (64)     16
WinXP (32)    2

I think the only reason I'm seeing more on Linux is that I have more PCs running it than the other OSes, and Linux processes the work units faster.
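
To make that reasoning concrete, raw error counts only become comparable once they are divided by the number of machines per OS. Below is a minimal Python sketch of that normalization. The error counts are the ones reported above; the host counts are purely hypothetical placeholders, since the actual number of machines per OS is not stated in this post.

```python
def errors_per_host(errors, hosts):
    """Divide each OS's error count by the number of hosts running that OS."""
    return {os_name: errors[os_name] / hosts[os_name] for os_name in errors}


# Error counts as reported in the post above.
errors = {"Linux": 68, "Win7 (64)": 16, "WinXP (32)": 2}

# HYPOTHETICAL host counts, for illustration only (not given in the post).
hosts = {"Linux": 17, "Win7 (64)": 4, "WinXP (32)": 1}

for os_name, rate in errors_per_host(errors, hosts).items():
    print(f"{os_name}: {rate:.1f} errors per host")
```

With real host counts substituted in, a similar per-host rate across the OSes would support the idea that the fault is not OS-specific.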
[Mar 25, 2014 1:58:39 PM]
Dayle Diamond
Senior Cruncher
Joined: Jan 31, 2013
Post Count: 452
Re: Error

Correct me if I'm wrong, but FAAH/FAHV work units are run without a duplicate on computers where the last few days' results have been reliable.

So while these work units aren't directly wasting any time, there will be plenty of computers that are no longer deemed trustworthy, and that will halve crunching efficiency on any affected system.
[Mar 25, 2014 3:41:46 PM]
Tex1954
Cruncher
Joined: Nov 3, 2005
Post Count: 3
Re: Error

"I experienced this issue only on Linux. Is anybody getting this error on Windows?"


Read up a little... my errors are on Windows 7 64-bit machines.


:)
[Mar 25, 2014 6:17:13 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Error

Sorry for misleading...

I hit plenty of error WUs on my Windows machines last night, on both Vista and 7. I realised this by tracking the scheduling history. Actually, no faah WUs have been downloaded on my Windows machines for a week, only FAHVs, while no FAHVs have been sent to my Linux machines in this period.

It seems the team has stopped sending out new faah WUs. I can see no faahs on the Windows machines, and all faahs on the Linux machines are re-sent WUs, which may end up with an error.

Kiyo.
[Mar 26, 2014 1:17:25 AM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Re: Error

Yes, there was an issue with the way work units were loaded. We have stopped sending out new work for those and I have recently cleared out the error work units so we can rebuild them. My plan at the moment is to resume the project during normal hours.

Note: This is only for FightAIDS@Home - Autodock version.

Thank you for your patience,
-Uplinger
[Mar 26, 2014 6:50:56 AM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Re: Error

I have re-enabled FightAIDS@Home - Autodock; work should start being distributed shortly.

Thanks again for your patience and participation,
-Uplinger
[Mar 26, 2014 3:59:55 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Error

Now I'm getting faah WUs. Hope they run with no problem.

Thanks. Kiyo.
[Mar 26, 2014 11:18:54 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Error

I only rarely visit specific project forums, but this one required going to page 3 before I found the thread I was looking for; 16 sticky posts occupy the top of the faah forum. How effective is that?

So this error is (or these errors are) prolific. I only just now looked at a set-and-forget box and saw that it has reached its daily quota of one, so the idle cores have been munching on SIMAP jobs as the backup project. I don't care so much what they compute, as long as they compute when left on.

This is the error logged:

Result Name: FAHV_ x3NF6_ B_ IN_ Y3a_ rig_ 0206730_ 3091_ 1--
<core_client_version>7.2.34</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:

These are some event log extracts:

1332 World Community Grid 27-03-2014 09:40 Started download of fahv_image06_7.20.tga
1333 World Community Grid 27-03-2014 09:40 Finished download of fahv_image06_7.20.tga
1334 World Community Grid 27-03-2014 09:40 [error] Unable to verify fahv_image06_7.20.tga using certificates
1335 World Community Grid 27-03-2014 09:40 [error] Checksum or signature error for fahv_image06_7.20.tga

and

1408 World Community Grid 27-03-2014 15:21 Requesting new tasks for CPU
1409 World Community Grid 27-03-2014 15:21 [sched_op] CPU work request: 43200.00 seconds; 2.00 devices
1410 World Community Grid 27-03-2014 15:21 Scheduler request completed: got 0 new tasks
1411 World Community Grid 27-03-2014 15:21 [sched_op] Server version 701
1412 World Community Grid 27-03-2014 15:21 No tasks sent
1413 World Community Grid 27-03-2014 15:21 No tasks are available for FightAIDS@Home - Vina
1414 World Community Grid 27-03-2014 15:21 No tasks are available for FightAIDS@Home - AutoDock
1415 World Community Grid 27-03-2014 15:21 No tasks are available for the applications you have selected.
1416 World Community Grid 27-03-2014 15:21 This computer has finished a daily quota of 1 tasks
1417 World Community Grid 27-03-2014 15:21 Project requested delay of 121 seconds
1418 World Community Grid 27-03-2014 15:21 [sched_op] Deferring communication for 00:02:01
1419 World Community Grid 27-03-2014 15:21 [sched_op] Reason: requested by project

Hitting update 6 hours later produced the same log response; I presume the node gets another chance some time after midnight. As the project is now on a very long back-off counter, please do not rush to fix.
[Mar 27, 2014 8:29:30 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Error

The 24 hours have passed and nothing has improved, with the quota set to 1 again, though 2 fahv tasks were sent. They failed the crc/md5 checks, probably because a new version is being pushed in relation to the new wcg logo:
World Community Grid 3/28/2014 8:35:06 AM [sched_op] Starting scheduler request
World Community Grid 3/28/2014 8:35:06 AM Sending scheduler request: To fetch work.
World Community Grid 3/28/2014 8:35:06 AM Requesting new tasks for CPU
World Community Grid 3/28/2014 8:35:06 AM [sched_op] CPU work request: 16901.49 seconds; 2.00 devices
World Community Grid 3/28/2014 8:35:10 AM Scheduler request completed: got 2 new tasks
World Community Grid 3/28/2014 8:35:10 AM [sched_op] Server version 701
World Community Grid 3/28/2014 8:35:10 AM Project requested delay of 121 seconds
World Community Grid 3/28/2014 8:35:10 AM [sched_op] estimated total CPU task duration: 123417 seconds
World Community Grid 3/28/2014 8:35:10 AM [sched_op] Deferring communication for 00:02:01
World Community Grid 3/28/2014 8:35:10 AM [sched_op] Reason: requested by project
World Community Grid 3/28/2014 8:35:12 AM Started download of wcgrid_fahv_vina_7.20_windows_intelx86
World Community Grid 3/28/2014 8:35:12 AM Started download of wcgrid_fahv_vina_prod_32.exe.7.20
World Community Grid 3/28/2014 8:35:15 AM Finished download of wcgrid_fahv_vina_7.20_windows_intelx86
World Community Grid 3/28/2014 8:35:15 AM Finished download of wcgrid_fahv_vina_prod_32.exe.7.20
World Community Grid 3/28/2014 8:35:15 AM Started download of wcgrid_fahv_graphics_prod_32.exe.7.20
World Community Grid 3/28/2014 8:35:15 AM Started download of fahv_image01_7.20.tga
World Community Grid 3/28/2014 8:35:15 AM [error] Unable to verify wcgrid_fahv_vina_7.20_windows_intelx86 using certificates
World Community Grid 3/28/2014 8:35:15 AM [error] Checksum or signature error for wcgrid_fahv_vina_7.20_windows_intelx86
World Community Grid 3/28/2014 8:35:15 AM [error] Unable to verify wcgrid_fahv_vina_prod_32.exe.7.20 using certificates
World Community Grid 3/28/2014 8:35:15 AM [error] Checksum or signature error for wcgrid_fahv_vina_prod_32.exe.7.20
World Community Grid 3/28/2014 8:35:17 AM [sched_op] Deferring communication for 00:01:56
World Community Grid 3/28/2014 8:35:17 AM [sched_op] Reason: Unrecoverable error for task FAHV_x3NF6_B_IN_Y3b_rig_0206858_0226_2
World Community Grid 3/28/2014 8:35:17 AM [sched_op] Deferring communication for 00:02:56
World Community Grid 3/28/2014 8:35:17 AM [sched_op] Reason: Unrecoverable error for task FAHV_x3NF6_B_IN_Y3b_rig_0206814_0843_2
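
For anyone who wants to pull the failing file names out of a saved copy of the event log rather than reading it by eye, here is a minimal Python sketch. It assumes the log lines have the same shape as the ones pasted above; the function name `files_failing_signature` is made up for illustration and is not part of any BOINC tool.

```python
import re

# Matches lines like "[error] Checksum or signature error for <file>" as seen in the log above.
SIG_ERROR = re.compile(r"\[error\] Checksum or signature error for (\S+)")


def files_failing_signature(log_text):
    """Return file names reported with a checksum/signature error, in order, without duplicates."""
    seen = []
    for line in log_text.splitlines():
        match = SIG_ERROR.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Three lines copied from the event log above.
sample = """\
World Community Grid 3/28/2014 8:35:15 AM [error] Unable to verify wcgrid_fahv_vina_7.20_windows_intelx86 using certificates
World Community Grid 3/28/2014 8:35:15 AM [error] Checksum or signature error for wcgrid_fahv_vina_7.20_windows_intelx86
World Community Grid 3/28/2014 8:35:15 AM [error] Checksum or signature error for wcgrid_fahv_vina_prod_32.exe.7.20
"""

print(files_failing_signature(sample))
```

The "Unable to verify ... using certificates" lines are deliberately ignored here, since each one is followed by a matching checksum/signature line for the same file.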


SIMAP continues to fill the gap. I will learn in another 24 hours whether the technicians got around to fixing this, and the overabundance of old-news stickies.

cheerio
[Mar 28, 2014 10:36:14 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Error

Is anyone on the wcg team even reading? 24 hours later it is exactly the same thing: not one but seven faah tasks were assigned, all going out with the same download failure error.

faah763212_ ZINC01646278_ x3NF8ledgfA0335_ 00_ 1-- 2718223 Error 3/29/14 07:43:00 3/29/14 07:48:52 0.00 / 0.00 46.6 / 0.0
faah763210_ ZINC00615883_ x3NF8ledgfA0333_ 00_ 1-- 2718223 Error 3/29/14 07:39:22 3/29/14 07:43:00 0.00 / 0.00 46.6 / 0.0
faah763210_ ZINC04522231_ x3NF8ledgfA0333_ 00_ 1-- 2718223 Error 3/29/14 07:39:22 3/29/14 07:43:00 0.00 / 0.00 46.6 / 0.0
faah763183_ ZINC01706126_ x3NF8ledgfA0306_ 00_ 1-- 2718223 Error 3/29/14 06:52:50 3/29/14 06:58:57 0.00 / 0.00 48.8 / 0.0
faah763181_ ZINC01676213_ x3NF8ledgfA0304_ 00_ 1-- 2718223 Error 3/29/14 06:48:36 3/29/14 06:50:44 0.00 / 0.00 48.8 / 0.0
faah763180_ ZINC01196937_ x3NF8ledgfA0303_ 00_ 0-- 2718223 Error 3/29/14 06:44:47 3/29/14 06:48:36 0.00 / 0.00 48.8 / 0.0
faah763180_ ZINC01621981_ x3NF8ledgfA0303_ 00_ 0-- 2718223 Error 3/29/14 06:44:47 3/29/14 06:48:36 0.00 / 0.00 48.8 / 0.0


Result Log

Result Name: faah763212_ ZINC01646278_ x3NF8ledgfA0335_ 00_ 1--
<core_client_version>7.2.47</core_client_version>
<![CDATA[
<message>
app_version download error: couldn't get input files:
<file_xfer_error>
<file_name>wcgrid_faah_7.16_windows_intelx86</file_name>
<error_code>-123 (no signature)</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>wcgrid_faah_autodock_prod_graphics.exe.7.16</file_name>
<error_code>-123 (no signature)</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>faah_image01_7.16.tga</file_name>
<error_code>-123 (no signature)</error_code>
</file_xfer_error>

</message>
]]>
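
The same information can be extracted from a result log programmatically. The sketch below is a rough illustration in Python, assuming the log text has the `<file_xfer_error>` layout pasted above. Because the pasted fragment is not a complete XML document, it uses a regular expression rather than an XML parser; the function name `transfer_errors` is hypothetical.

```python
import re

# One <file_xfer_error> entry: a file name followed by an error code, as in the result log above.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(.*?)</file_name>\s*"
    r"<error_code>(.*?)</error_code>\s*"
    r"</file_xfer_error>",
    re.DOTALL,
)


def transfer_errors(result_log):
    """Return a list of (file_name, error_code) pairs found in the result log text."""
    return XFER_ERROR.findall(result_log)


# Two entries copied from the result log above.
sample = """\
<file_xfer_error>
<file_name>wcgrid_faah_7.16_windows_intelx86</file_name>
<error_code>-123 (no signature)</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>faah_image01_7.16.tga</file_name>
<error_code>-123 (no signature)</error_code>
</file_xfer_error>
"""

for name, code in transfer_errors(sample):
    print(f"{name}: {code}")
```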

Scanning the forums, there appears to be a longer history of problems with application content delivery. Regrettably, only faah tasks are being sent, no fahv, so it is not possible to tell whether just this one application is affected or more. It is the weekend, meaning this device will most probably be doing SIMAP for the next 72 hours at least; never an incident there.

For the sake of going through the motions, I did a project reset, detached, and installed the 7.2.47 client over 7.0.34, but none of it had any effect on the issue.

Oh, the Linux node with 7.2.42 now has the same problem, but on there wcg is just a sideshow, as four more active projects are attached to that agent.

cheerios
[Mar 29, 2014 9:45:26 AM]