Total posts in this thread: 6
This topic has been viewed 964 times and has 5 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Scheduler Request Detaches 1200 Workunits

A scheduler request caused about 1200 workunits to become detached. The message says the results are "no longer usable", which is normally associated with missing the deadline. All of the workunits had been in the queue for less than 72 hours and had a 10-day deadline. The result status shows them as detached. The server was never reset or detached from the project. Times shown are local; the first entry, 02:05:05, corresponds to 07:05:05 UTC.
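For anyone lining these entries up against server-side times, here is a minimal sketch of the conversion and of the deadline margin described above. It assumes the client's zone is UTC-5, which is only inferred from the statement that 02:05:05 local is 07:05:05 UTC; the names `LOCAL_TZ` and `to_utc` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is UTC-5, inferred from the post
# (02:05:05 local is stated to be 07:05:05 UTC).
LOCAL_TZ = timezone(timedelta(hours=-5))

def to_utc(stamp: str) -> datetime:
    """Convert a client log timestamp such as '15-Mar-2016 02:05:05' to UTC."""
    local = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S").replace(tzinfo=LOCAL_TZ)
    return local.astimezone(timezone.utc)

# First log entry of the thread.
print(to_utc("15-Mar-2016 02:05:05").isoformat())

# Tasks queued for at most 72 hours against a 10-day deadline
# still had at least this much time left.
margin = timedelta(days=10) - timedelta(hours=72)
print(margin)
```

With at least seven days left on every task, a missed deadline does not explain the "no longer usable" messages.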

15-Mar-2016 02:05:05 [World Community Grid] Sending scheduler request: To report completed tasks.
15-Mar-2016 02:05:05 [World Community Grid] Reporting 23 completed tasks
15-Mar-2016 02:05:05 [World Community Grid] Not requesting tasks: too many runnable tasks
15-Mar-2016 02:11:28 [---] Project communication failed: attempting access to reference site
15-Mar-2016 02:11:28 [World Community Grid] Scheduler request failed: Timeout was reached
15-Mar-2016 02:11:41 [---] Internet access OK - project servers may be temporarily down.
15-Mar-2016 02:13:13 [World Community Grid] Sending scheduler request: To report completed tasks.
15-Mar-2016 02:13:13 [World Community Grid] Reporting 23 completed tasks
15-Mar-2016 02:13:13 [World Community Grid] Not requesting tasks: too many runnable tasks
15-Mar-2016 02:13:18 [World Community Grid] Scheduler request completed
15-Mar-2016 02:17:53 [World Community Grid] Computation for task OET1_0001874_xSDGP-S_rig_85705_1 finished
15-Mar-2016 02:17:53 [World Community Grid] Starting task OET1_0001874_xSDGP-S_rig_96729_0
15-Mar-2016 02:17:55 [World Community Grid] Started upload of OET1_0001874_xSDGP-S_rig_85705_1_r1997059196_0
15-Mar-2016 02:17:59 [World Community Grid] Finished upload of OET1_0001874_xSDGP-S_rig_85705_1_r1997059196_0
15-Mar-2016 02:18:52 [World Community Grid] Computation for task OET1_0001874_xSDGP-S_rig_84929_1 finished
15-Mar-2016 02:18:52 [World Community Grid] Starting task OET1_0001874_xSDGP-S_rig_96765_0
15-Mar-2016 02:18:54 [World Community Grid] Started upload of OET1_0001874_xSDGP-S_rig_84929_1_r1185439569_0
15-Mar-2016 02:18:58 [World Community Grid] Finished upload of OET1_0001874_xSDGP-S_rig_84929_1_r1185439569_0
15-Mar-2016 02:19:00 [World Community Grid] Sending scheduler request: To fetch work.
15-Mar-2016 02:19:00 [World Community Grid] Reporting 2 completed tasks
15-Mar-2016 02:19:00 [World Community Grid] Requesting new tasks for CPU
15-Mar-2016 02:19:17 [World Community Grid] Scheduler request completed: got 120 new tasks
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_83453_1 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_86657_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_88587_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_88601_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_89133_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_90881_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_89120_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_88745_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_9113_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_90724_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_87968_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_91695_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_92256_0 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_91565_0 is no longer usable

Subsequent scheduler request:

15-Mar-2016 02:20:47 [World Community Grid] Finished download of 22e7f3376ec8594e3a4cffd2de5e08c8.zip
15-Mar-2016 02:20:47 [World Community Grid] Finished download of 562d23e68d48441dd6cb5be41e92f348.pdbqt
15-Mar-2016 02:20:47 [World Community Grid] Finished download of 86b88e161732423a78b22ab26fdd115e.job
15-Mar-2016 02:20:47 [World Community Grid] Finished download of 74e1fb7d495e353b4b7c3a0ede921d68.zip
15-Mar-2016 02:20:47 [World Community Grid] Finished download of 6d7da58e3dae64f3fc9704f995df1852.pdbqt
15-Mar-2016 02:21:19 [World Community Grid] Fetching scheduler list
15-Mar-2016 02:21:22 [World Community Grid] Master file download succeeded
15-Mar-2016 02:21:27 [World Community Grid] Sending scheduler request: To report completed tasks.
15-Mar-2016 02:21:27 [World Community Grid] Reporting 1000 completed tasks
15-Mar-2016 02:21:27 [World Community Grid] Requesting new tasks for CPU
15-Mar-2016 02:21:46 [World Community Grid] Scheduler request completed: got 123 new tasks
15-Mar-2016 02:21:53 [World Community Grid] Started download of oet1.xMBGP-S_rig.pdbqt
15-Mar-2016 02:21:53 [World Community Grid] Started download of 65eec4b29e47473eac8e767b6e8899ee.job
15-Mar-2016 02:21:53 [World Community Grid] Started download of 04b7004d48ab68d37999e73da5cfff3f.zip
15-Mar-2016 02:21:53 [World Community Grid] Started download of 8b620622bf63f99c9657ddd1c3757fba.pdbqt
15-Mar-2016 02:21:53 [World Community Grid] Started download of 1ff35b987047e15a9ea05d525d190f5d.job
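With roughly 1200 affected tasks, counting messages by hand is not practical. Below is a rough sketch of a tally script for a saved copy of the client log. The line layout (timestamp, project in square brackets, message) is taken from the excerpts above; the function names and category labels are hypothetical and not part of BOINC.

```python
import re
from collections import Counter

# Layout assumed from the excerpts: "15-Mar-2016 02:19:17 [Project] message"
LINE_RE = re.compile(
    r"^(?P<stamp>\d{1,2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] (?P<message>.*)$"
)

def classify(message: str) -> str:
    """Map a log message to a coarse category (labels are illustrative)."""
    if message.endswith("is no longer usable"):
        return "no_longer_usable"
    if message.startswith("Resent lost task"):
        return "resent_lost"
    if message.startswith("Scheduler request failed"):
        return "request_failed"
    if message.startswith("Scheduler request completed"):
        return "request_completed"
    return "other"

def summarize(log_text: str) -> Counter:
    """Count log lines per category, skipping lines that do not match the layout."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[classify(match.group("message"))] += 1
    return counts

# Sample lines copied from the log above.
sample = """\
15-Mar-2016 02:11:28 [World Community Grid] Scheduler request failed: Timeout was reached
15-Mar-2016 02:19:17 [World Community Grid] Scheduler request completed: got 120 new tasks
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_83453_1 is no longer usable
15-Mar-2016 02:19:17 [World Community Grid] Result OET1_0001874_xSDGP-S_rig_86657_0 is no longer usable
"""
print(summarize(sample))
```

Running `summarize` over the full log file would show whether the number of "no longer usable" lines matches the number of results marked detached on the result status page.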
----------------------------------------
[Edit 1 times, last edit by Doneske at Mar 15, 2016 4:31:41 PM]
[Mar 15, 2016 4:26:51 PM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: Scheduler Request Detaches 1200 Workunits

What does the distribution show on the Result Status (RS) pages for the 'detached' units? If the server somehow got a signal that the inventory on your device was lost due to a detach, the tasks get re-issued and your copies are then marked as no longer considered.

I have not seen this line before:

Not requesting tasks: too many runnable tasks

Dipping into the source code, it says:
The branch, master has been updated
via 4d47e2f client: don't request work from a project w/ > 1000 runnable jobs
from 4cb34a1 client: don't apply CPU throttling to apps that use < .5 CPUs (like GPU, NCI)
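Read literally, that commit message describes a simple threshold test. The following is only a sketch of that reading in Python, not the actual BOINC client code; the names `MAX_RUNNABLE_JOBS` and `should_request_work` are hypothetical, and the boundary behavior at exactly 1000 is an assumption based on the "> 1000" wording.

```python
# Threshold taken from the commit message quoted above; the name is hypothetical.
MAX_RUNNABLE_JOBS = 1000

def should_request_work(runnable_jobs: int) -> bool:
    """Return False once a project has more than MAX_RUNNABLE_JOBS runnable jobs."""
    return runnable_jobs <= MAX_RUNNABLE_JOBS

for n in (23, 1000, 1200):
    if should_request_work(n):
        print(n, "-> request work")
    else:
        print(n, "-> Not requesting tasks: too many runnable tasks")
```

Under this reading, a host holding about 1200 runnable tasks would report finished work but skip the work request, which matches the first two scheduler requests in the log.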


Is there a second cap in addition to 35 per device-core?
[Mar 15, 2016 4:44:53 PM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: Scheduler Request Detaches 1200 Workunits

Oh yes, since that change was checked in during 2013, what client version are you running?
[Mar 15, 2016 4:48:17 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Scheduler Request Detaches 1200 Workunits

1000 is the maximum number of tasks a client can have, regardless of the number of CPUs.

Additional "funky" scheduler activity on a second server:

14-Mar-2016 21:55:40 [World Community Grid] Started upload of OET1_0001872_xSDGP-S_rig_58569_0_r2062771933_0
14-Mar-2016 21:55:44 [World Community Grid] Finished upload of OET1_0001872_xSDGP-S_rig_58569_0_r2062771933_0
14-Mar-2016 21:55:47 [World Community Grid] Sending scheduler request: To fetch work.
14-Mar-2016 21:55:47 [World Community Grid] Reporting 1 completed tasks
14-Mar-2016 21:55:47 [World Community Grid] Requesting new tasks for CPU
14-Mar-2016 22:00:53 [---] Project communication failed: attempting access to reference site
14-Mar-2016 22:00:53 [World Community Grid] Scheduler request failed: Failure when receiving data from the peer
14-Mar-2016 22:00:56 [---] Internet access OK - project servers may be temporarily down.
14-Mar-2016 22:02:23 [World Community Grid] Sending scheduler request: To fetch work.
14-Mar-2016 22:02:23 [World Community Grid] Reporting 1 completed tasks
14-Mar-2016 22:02:23 [World Community Grid] Requesting new tasks for CPU
14-Mar-2016 22:02:27 [World Community Grid] Scheduler request completed: got 15 new tasks
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_57379_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_58400_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_58447_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_58797_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_58853_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_58924_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_59009_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_59055_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_5917_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_59217_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_59268_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_59302_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_5934_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_59329_0
14-Mar-2016 22:02:27 [World Community Grid] Resent lost task OET1_0001883_xZAGP-FW_rig_5939_0
14-Mar-2016 22:02:29 [World Community Grid] Started download of d664c7caf137ccfb5f3b39de4a5c7b67.job
14-Mar-2016 22:02:29 [World Community Grid] Started download of 8dc8dd40c15f54a38b0d485306059830.zip
14-Mar-2016 22:02:29 [World Community Grid] Started download of 0fa271ec8862f18b3876a680dfd6371b.pdbqt
14-Mar-2016 22:02:29 [World Community Grid] Started download of 2b36622c1f99dccae6ce07d68dfda98d.job
14-Mar-2016 22:02:29 [World Community Grid] Started download of 815d78cf2246a9300698e5bae37d8fb9.zip
14-Mar-2016 22:02:29 [World Community Grid] Started download of 68ec4f3eba099de7ac03b619b8282c00.pdbqt
14-Mar-2016 22:02:29 [World Community Grid] Started download of e33de0c58ba31147dca8906f88e22988.job
14-Mar-2016 22:02:29 [World Community Grid] Started download of 4ec61ea9f9d0ffb42629ac151356a8cc.zip
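One way to think about the "Resent lost task" lines is as a set difference between what the server believes is in progress on the host and what the client reported holding. The sketch below is a toy model of that idea only; it is not BOINC's scheduler code, the function name is made up, and the two sets are illustrative values built from task names in the log rather than the real inventories.

```python
def find_lost_tasks(server_in_progress: set, client_reported: set) -> set:
    """Tasks the server has on record for the host but the client did not list."""
    return server_in_progress - client_reported

# Illustrative inventories only (task names taken from the log above).
server_in_progress = {
    "OET1_0001872_xSDGP-S_rig_58569_0",
    "OET1_0001883_xZAGP-FW_rig_57379_0",
    "OET1_0001883_xZAGP-FW_rig_58400_0",
}
client_reported = {
    "OET1_0001872_xSDGP-S_rig_58569_0",
}

for name in sorted(find_lost_tasks(server_in_progress, client_reported)):
    print("Resent lost task", name)
```

In this model, a request that failed partway through could leave the two sides with different views of the host's inventory, which would be one possible explanation for the resends seen after the failed request at 22:00:53.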
[Mar 15, 2016 4:49:23 PM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: Scheduler Request Detaches 1200 Workunits

Ah, found a "too many runnable tasks" occurrence at Universe, which happened before this was logged...

Scheduler request failed: HTTP internal server error
[Mar 15, 2016 4:52:39 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Scheduler Request Detaches 1200 Workunits

The "detached" results were resent to other machines.

The client version is 7.6.22 on all machines except my one Windows machine, which runs 7.6.29.
[Mar 15, 2016 4:54:13 PM]