World Community Grid Forums
Category: Active Research | Forum: Mapping Cancer Markers | Thread: Incredibly Quick Deadline
Thread Status: Active Total posts in this thread: 14
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2084 Status: Offline Project Badges: |
Sightings of Incredibly Quick Deadlines for tasks that are part of an MCM1 workunit can be posted here.

(Output generated by 'wcgstats -frSJ= 310330324')
workunit 310330324
MCM1_0199264_3862_0 CentOS Linux Valid 2023-05-25T06:00:42 2023-05-25T07:38:15 1.62/1.62
Details:
--------------------------------------------------------------------------------
MCM1_0199264_3862_0 CentOS Linux Valid 2023-05-25T06:00:42 2023-05-25T07:38:15 1.62/1.62

Deadline: 47 seconds.

(Please note: This thread is only meant for MCM1, not for other projects.)
||
|
Cyclops
Senior Cruncher Joined: Jun 13, 2022 Post Count: 295 Status: Offline |
Hi adriverhoef, thanks for bringing this to our attention. We are looking into this and will post a follow-up if we find anything that can help you avoid this issue in the future.
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2084 Status: Offline Project Badges: |
My advice would be not to give this issue higher priority than necessary, Cyclops. At the moment we volunteers are just trying to work out whether there is a repeating pattern in what we detect (e.g. do they all occur on CentOS Linux?), and the frequency of the issue seems to be < 1%.

Thanks for your reply,
Adri
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2084 Status: Offline Project Badges: |
Been sifting through another 1260 workunits and found these 7 cases, so I'd say it's still a minor issue as the rate of this problem is < 1%.

(Output generated by 'wcgstats -frSJ* MCM1_0199251_7888')
workunit 310248783
MCM1_0199251_7888_0 CentOS Linux Valid 2023-05-25T02:25:44 2023-05-25T04:14:00 1.80/1.80
Details:
--------------------------------------------------------------------------------
MCM1_0199251_7888_0 CentOS Linux Valid 2023-05-25T02:25:44 2023-05-25T04:14:00 1.80/1.80
Deadline: 45 seconds

(Output generated by 'wcgstats -frSJ* MCM1_0199303_3064')
workunit 310774610
MCM1_0199303_3064_0 Linux Fedora In Progr. 2023-05-26T02:30:00 2023-06-01T02:30:00 0.00/0.00
Details:
--------------------------------------------------------------------------------
MCM1_0199303_3064_1 CentOS Linux P. Valid. 2023-05-26T02:30:00 2023-05-26T04:06:52 1.61/1.61
Deadline: 16 seconds

(Output generated by 'wcgstats -frSJ* MCM1_0199289_0947')
workunit 310575477
MCM1_0199289_0947_0 CentOS Linux Valid 2023-05-25T17:20:48 2023-05-25T18:55:20 1.54/1.54
Details:
--------------------------------------------------------------------------------
MCM1_0199289_0947_0 CentOS Linux Valid 2023-05-25T17:20:48 2023-05-25T18:55:20 1.54/1.54
Deadline: 1 minute and 24 seconds

(Output generated by 'wcgstats -frSJ* MCM1_0199281_8043')
workunit 310504470
MCM1_0199281_8043_0 CentOS Linux Valid 2023-05-25T14:08:05 2023-05-25T15:52:04 1.69/1.69
Details:
--------------------------------------------------------------------------------
MCM1_0199281_8043_0 CentOS Linux Valid 2023-05-25T14:08:05 2023-05-25T15:52:04 1.69/1.69
Deadline: 2 minutes, 7 seconds

(Output generated by 'wcgstats -frSJ* MCM1_0199267_1592')
workunit 310308321
MCM1_0199267_1592_0 CentOS Linux Valid 2023-05-25T05:01:40 2023-05-25T06:31:26 1.49/1.49
Details:
--------------------------------------------------------------------------------
MCM1_0199267_1592_0 CentOS Linux Valid 2023-05-25T05:01:40 2023-05-25T06:31:26 1.49/1.49
Deadline: 4 minutes, 21 seconds

(Output generated by 'wcgstats -frSJ* MCM1_0199244_2395')
workunit 310308325
MCM1_0199244_2395_0 CentOS Linux Valid 2023-05-25T05:01:40 2023-05-25T07:22:26 2.34/2.34
Details:
--------------------------------------------------------------------------------
MCM1_0199244_2395_0 CentOS Linux Valid 2023-05-25T05:01:40 2023-05-25T07:22:26 2.34/2.34
Deadline: 4 minutes, 21 seconds (same as MCM1_0199267_1592_0 above)

(Output generated by 'wcgstats -frSJ* MCM1_0199176_0097')
workunit 309563894
MCM1_0199176_0097_0 CentOS Linux Valid 2023-05-23T19:26:58 2023-05-23T21:55:07 2.46/2.46
Details:
--------------------------------------------------------------------------------
MCM1_0199176_0097_0 CentOS Linux Valid 2023-05-23T19:26:58 2023-05-23T21:55:07 2.46/2.46
Deadline: 3 minutes, 26 seconds

Adri
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7576 Status: Offline Project Badges: |
MCM1_0199303_3064_0 Linux Fedora In Progr. 2023-05-26T02:30:00 2023-06-01T02:30:00 0.00/0.00
MCM1_0199303_3064_1 CentOS Linux P. Valid. 2023-05-26T02:30:00 2023-05-26T04:06:52 1.61/1.61
MCM1_0199303_3064_2 Linux In Progr. 2023-05-26T02:30:23 2023-06-01T02:30:23 0.00/0.00

This instance does not appear to be a case of a quick deadline. The two "In Progress" cases show a deadline which is 6 days away. The common factor in a number of these cases is the machine running CentOS, which exhibits a remarkably short turnaround.

The issue I originally referenced was: why are 3 work units being issued in such quick succession when only 2 are normally necessary?

It leads me to wonder if the due time is changed by the server after the work unit has been returned. If you could catch some of these while they are listed as "In Progress", record the due date, and then see what the due date is when the work unit is returned, it may shed some light on the situation.

Cheers
Sgt. Joe
*Minnesota Crunchers*
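A minimal sketch of the check suggested above, assuming the task lines keep the layout shown in this thread: name, OS, status, sent time, then due time for tasks still in progress (or return time for finished ones), then a trailing pair of numbers. That reading of the columns is inferred from the listings here, not from any wcgstats documentation, and the helper names `parse_task_line` and `window` are made up for illustration.

```python
from datetime import datetime, timedelta

# Status labels as they appear in the listings quoted in this thread.
# "P. Valid." must come before "Valid" so the longer label is matched first.
STATUSES = ("In Progr.", "P. Valid.", "No Reply", "Valid")

# Assumption: a normal MCM1 deadline is 6 days, as noted in the post above.
NORMAL_WINDOW = timedelta(days=6)


def parse_task_line(line):
    """Split one task line into its parts (hypothetical helper)."""
    fields = line.split()
    middle = " ".join(fields[1:-3])  # OS name followed by the status label
    status = next(s for s in STATUSES if middle.endswith(s))
    return {
        "name": fields[0],
        "os": middle[: -len(status)].strip(),
        "status": status,
        "sent": datetime.fromisoformat(fields[-3]),
        "due_or_returned": datetime.fromisoformat(fields[-2]),
    }


def window(task):
    """Time between the two timestamps on the line."""
    return task["due_or_returned"] - task["sent"]


in_progress = parse_task_line(
    "MCM1_0199303_3064_0 Linux Fedora In Progr. "
    "2023-05-26T02:30:00 2023-06-01T02:30:00 0.00/0.00"
)
print(in_progress["os"], "|", in_progress["status"])  # Linux Fedora | In Progr.
print(window(in_progress))                            # 6 days, 0:00:00

# Only tasks still in progress show a due time, so only those can be flagged.
if in_progress["status"] == "In Progr." and window(in_progress) < NORMAL_WINDOW:
    print("short deadline:", in_progress["name"])
```

Saving the parsed due time while a task is still "In Progr." and comparing it with whatever is shown once the task has been returned would give the before-and-after record described above.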
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 869 Status: Offline Project Badges: |
Adri,
Oh, look -- they're all CentOS Linux again... :-) And so was the one I saw recently (result extracted from my MCM1 wingman database) -- here's the key information:

MCM1_0199015_3293_0
workunit 307378752

I wonder if they are all from the same system[1], as that might help narrow down any possible causes; I also see a fair number of Detached wingmen on the same version of CentOS (which may or may not be related?)

By the way, I turned up 101 _0 results that had non-standard deadlines (excluding those that got 26-day or 40-day deadlines because of the recent downtime) out of a sample of 22924 different MCM1 work units since 2022-07-10; the vast majority of them are actually a few minutes under 6 days, so although they are odd they aren't problematic. Only 3 of them had deadlines under 1 day (and the two not cited above were from relatively early on.)

So, as you say, a minor issue -- definitely something to be curious about, though!

Cheers - Al

[1] At CPDN, Einstein, MilkyWay or TN-Grid I could find out for sure (as long as the workunit hadn't been assimilated) -- here I can only approximate. WCG has always been a lot more privacy-conscious than most other BOINC sites (because of its desire to be acceptable to corporate environments, I presume.) Not that I would publish the information if it were available, but I might refer it to the SysAdmin...
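To put the counts quoted in this thread side by side, here is a small arithmetic sketch. The numbers are the ones posted above (7 of 1260, 101 of 22924, 3 of 22924); the function name `rate_percent` is just an illustrative choice.

```python
def rate_percent(hits, total):
    """Share of hits in total, expressed as a percentage."""
    return 100.0 * hits / total


# Quick-deadline cases in the batch of 1260 workunits
print(f"{rate_percent(7, 1260):.3f}%")     # 0.556%

# Non-standard _0 deadlines in the sample of 22924 workunits
print(f"{rate_percent(101, 22924):.3f}%")  # 0.441%

# Of those, deadlines under 1 day
print(f"{rate_percent(3, 22924):.3f}%")    # 0.013%
```

All three figures sit below the 1% mark mentioned earlier, though the two samples cover different periods and were counted with different criteria, so they are not directly comparable.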
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 869 Status: Offline Project Badges: |
Sgt. Joe,
I didn't see your post until I'd replied to Adri -- I might've made a combined reply otherwise :-)

If I understand your post correctly, you are wondering whether the Due time Adri (and I) might be capturing is different from that in place for that task at issue time. If the system received the task with that bizarre Due time it should probably have aborted it without starting it, so that's a reasonable point!

However, I suspect that (as with grace day adjustments) the client would have been told the correct (6 day) deadline but the server tagged it incorrectly in the BOINC database for some reason -- it must have done that by the time the retry was issued (as that is done by the transitioner looking at the due time stored in the database), so that due time must have already been wrong within 23 seconds in this particular case. I can't immediately think of a triggering event that might suddenly choose to reduce that task's deadline after the request has been processed, so I'm willing to bet it was wrong all the time :-)

By the way, I suspect that there are some systems out there that operate very short queues, so I'm not that surprised by the rapid turnaround -- my two Ryzen systems can comfortably process an MCM1 unit in under two hours (even a VMethod=NFCV case!) but my typical turnaround is between 6 and 12 hours because I run a small (but significant) buffer...

Cheers - Al.

P.S. Same thing seems to be happening over at SCC1 - an extra oddity there is that I've been seeing a lot of wingmen on the same CentOS version being marked Detached (but they all have reasonable Due time values!)
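The 23-second figure can be reproduced from the send times in Sgt. Joe's listing: tasks _0 and _1 went out at 02:30:00 and the retry _2 at 02:30:23. A small sketch of that arithmetic follows; the function name `retry_gap_seconds` is hypothetical, and tying the 16-second deadline Adri reported to the task that triggered the retry is an assumption, not something the listings state.

```python
from datetime import datetime


def retry_gap_seconds(first_sent, retry_sent):
    """Seconds between the original send time and the retry's send time."""
    start = datetime.fromisoformat(first_sent)
    retry = datetime.fromisoformat(retry_sent)
    return (retry - start).total_seconds()


gap = retry_gap_seconds("2023-05-26T02:30:00", "2023-05-26T02:30:23")
print(gap)  # 23.0

# Assumption: the 16-second deadline reported for this workunit belongs to
# the task whose expiry caused the retry to be issued.
reported_deadline_seconds = 16
print(reported_deadline_seconds <= gap)  # True
```

A stored deadline of 16 seconds would have expired before the retry went out 23 seconds later, which fits the idea that the database value was already wrong at issue time.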
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2084 Status: Offline Project Badges: |
Sgt.Joe:
It leads me to wonder if the due time is changed by the server after the work unit has been returned. If you could catch some of these while they are listed as "In Progress", record the due date, and then see what the due date is when the work unit is returned, it may shed some light on the situation.

I tried to do what you advised, Sgt.Joe. A few minutes ago, two tasks arrived here, both _2, one of them due to a Detached task _0, the other one due to a No Reply task _0. The latter case is now showing as:

(Output generated by 'wcgstats -frSJ* SCC1_0004157_MyoD1-B_62249')
workunit 298912296
SCC1_0004157_MyoD1-B_62249_0 CentOS Linux No Reply 2023-05-26T18:13:32 2023-05-26T18:13:43 0.00/0.00
Details:
--------------------------------------------------------------------------------
SCC1_0004157_MyoD1-B_62249_0 CentOS Linux No Reply 2023-05-26T18:13:32 2023-05-26T18:13:43 0.00/0.00

Task _0 received a No Reply after 11 seconds. The other two are now In Progress.

Adri
[Edit 2 times, last edit by adriverhoef at May 26, 2023 6:26:07 PM]
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2084 Status: Offline Project Badges: |
My task (_2) started to run immediately an hour ago and has finished just now.
The workunit is now displaying as:

(Output generated by 'wcgstats -frSJ* SCC1_0004157_MyoD1-B_62249')
workunit 298912296
SCC1_0004157_MyoD1-B_62249_0 CentOS Linux Valid 2023-05-26T18:13:32 2023-05-26T18:39:06 0.42/0.42
Details:
--------------------------------------------------------------------------------
SCC1_0004157_MyoD1-B_62249_0 CentOS Linux Valid 2023-05-26T18:13:32 2023-05-26T18:39:06 0.42/0.42

Adri
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7576 Status: Offline Project Badges: |
I have checked all 176 SCC units I currently have in progress and noted none of the above anomalies.

I have put together a spreadsheet so I can easily download all of the in-progress work units and check whether something is out of whack. I will do this for both MCM and SCC for a few days to see if anything shows up that looks odd. I would not discount the possible CentOS influence, but at this point it is pure speculation.

Cheers
Sgt. Joe
*Minnesota Crunchers*
||
|
|