World Community Grid Forums
Category: Active Research Forum: Smash Childhood Cancer Thread: Is there any way of finding your wingman’s host-Id? |
Thread Status: Active Total posts in this thread: 65
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 873 Status: Offline Project Badges: |
Adri -- thanks for confirming that someone else is seeing the "not going to send to a Ryzen at the moment" behaviour. I wonder if it also happens to Windows Ryzen users when the availability situation is reversed :-)
Without detailed access to the WCG BOINC database or their project configuration files I don't think it'll be possible to find a reason for this, but I am beginning to suspect that there's a situation in which some SCC1 retries fail a CPU hardware match check if there's an AMD processor asking and there's an Intel task for that WU still regarded as live.[*1]
I'd be interested to know if anyone has seen the same sort of issue with non-SCC1 work; as I mentioned above, I don't recall having seen it elsewhere -- "other platforms" usually seems to apply on all my systems at once for MCM1 if/when Linux work is in short supply because of bulk retries...
Cheers - Al.
[*1] I haven't got time to do the sort of code dive that would be needed to see if that's a possibility, but I know that the platform-checking code does have some CPU-specific stuff in it that may or may not be exercised for SCC1 :-) |
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 747 Status: Offline Project Badges: |
A lot of the repair work units for that 5.15.107+ person are stuck in "waiting to be sent" status again. A ton of new work is being issued, but it's kind of backwards that new work is prioritized over letting the system catch up on the repair work, especially if it takes 1-3 weeks before it's sent back out (like last time).
Are the techs aware of this behavior?
|
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7581 Status: Recently Active Project Badges: |
You are probably right. The blizzard of resends seems to have ceased for the moment, so there could still be a bunch in the waiting to be sent penalty box. Maybe on Monday someone will notice and release them. In the meantime the supply of both MCM and SCC appears to be steady.
Cheers
Edit: spelling
Sgt. Joe
----------------------------------------
*Minnesota Crunchers* [Edit 1 times, last edit by Sgt.Joe at Oct 1, 2023 1:30:03 AM] |
||
|
Jake1402
Senior Cruncher USA Joined: Dec 30, 2005 Post Count: 181 Status: Offline Project Badges: |
Quote: "You mean here: https://boinc.berkeley.edu/download_all.php where I read
These versions may not be current.
7.16.6 Development version (MAY BE UNSTABLE - USE ONLY FOR TESTING)
7.4.22 Recommended version
where the first line makes everything suspect."
Yep... you will notice the date of April 2020. I have been running this for several years and it works fine for me. Also I installed the one in Software Sources which is the "development" version. It appears nobody is developing/updating Linux in Berkeley. YMMV
Join the Chicago-IL-USA team!
2 AMD FX 8320/AMD R9 270X/Win 10 2 AMD FX 8320/AMD RX 560/Linux Mint 20.3 (both computers DOA) Intel Pentium G240/Win 10 |
||
|
Bryn Mawr
Senior Cruncher Joined: Dec 26, 2018 Post Count: 337 Status: Offline Project Badges: |
Quote: "You mean here: https://boinc.berkeley.edu/download_all.php where I read These versions may not be current. 7.16.6 Development version (MAY BE UNSTABLE - USE ONLY FOR TESTING) 7.4.22 Recommended version where the first line makes everything suspect. yep...you will notice the date of April 2020. I have been running this for several years and it works fine for me. Also I installed the one in Software Sources which is the "development" version. It appears nobody is developing/updating Linux in Berkeley. YMMV"
The most up-to-date version is 7.20.5 in costamagnagianfranco’s ppa. It is stable and works well. |
||
|
Spiderman
Advanced Cruncher United States Joined: Jul 13, 2020 Post Count: 113 Status: Offline Project Badges: |
RE: Linux/Android...
I note a few To-Do Tasks in BOINC's upcoming Hacktoberfest 2023 for Linux/Android, so perhaps we will see updates in the Client generated from that in coming weeks/months... https://github.com/orgs/BOINC/projects/11/views/1 |
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 873 Status: Offline Project Badges: |
hchc, Sgt. Joe -- regarding the Waiting to be sent issue...
[Some of this may well have been said elsewhere...]
Given that this doesn't seem to happen for MCM1, and that it didn't seem to happen for SCC1 until fairly recently, I suspect it's an unintended consequence of whatever they did to try to ensure new SCC1 work went out [over weekends?] when there were a fair number of posts about not being able to get work a few weeks ago... It's behaving as if precedence is being given to WU numbers in [more or less] descending order -- the usual would be the opposite (and retries of the oldest WUs would tend to clear out first.)
This probably wouldn't be a severe problem if the number of tasks failing because of time-outs was low, but even without the recent cluster incident I was still seeing fairly high numbers of retries (and I wonder if it was worse on the Windows side? -- I have no data for that...)
I notice that since the new SCC1 work units created on 30th September have started being issued I haven't seen a single SCC1 "Waiting to be sent" retry go out[*1]; these are all week-old WUs (or older)... Over the preceding few days, the total count of Waiting tasks seen was going down faster than new ex-cluster No Reply tasks were coming in.
At the time of writing, I have 175 stuck (I'll bet Sgt. Joe has a lot more than that!), and I wonder if they'll still be stuck when this current set of new tasks is all issued. By then, some of the WUs will be well over a week old and may struggle to clear out because there will be missed deadlines for tasks from the new WUs, even without the 5.15.107+ devices (which don't appear to be picking up new work as far as I can tell[*2], thank goodness!) Let's just hope there's not a repeat of the cluster issue (or anything else that has the same effect...)
And in closing, some data relating to the original, narrower, topic of this thread: since mid-September I have processed tasks from 4819 SCC1 WUs and there were 1577 No Reply responses from 5.15.107+ systems in that set, with 1456 different device names. If I also consider MCM1 tasks there were a total of 2594 distinct devices! (And that may be the end of it; I think those systems last picked up work on 24th when the new work ran out...)
Cheers - Al.
[*1] I have API scripts that collect WU progress daily, so I'm not depending on noticing changes via the Web site :-)
[*2] I'd love to know what happened there, but I don't reckon the cluster manager(s) will come and post about it :-) |
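The post does not say how the device count was produced. As a rough sketch only, a count of this kind could be made from collected result records along the following lines. The record layout and the keys `device_name`, `status` and `os_version` are hypothetical, as are the sample rows; they are not taken from the WCG API or from Al's scripts.

```python
from collections import Counter

# Hypothetical sample records; real data would come from whatever a
# collection script stores. None of these keys or values are from WCG.
results = [
    {"workunit": "WU_A", "status": "No Reply", "os_version": "5.15.107+", "device_name": "node-001"},
    {"workunit": "WU_B", "status": "No Reply", "os_version": "5.15.107+", "device_name": "node-002"},
    {"workunit": "WU_C", "status": "No Reply", "os_version": "5.15.107+", "device_name": "node-001"},
    {"workunit": "WU_D", "status": "Valid", "os_version": "5.15.107+", "device_name": "node-003"},
    {"workunit": "WU_E", "status": "No Reply", "os_version": "6.1.0", "device_name": "node-004"},
]


def no_reply_devices(records, os_marker):
    """Count No Reply results per device name for hosts whose OS version contains os_marker."""
    return Counter(
        r["device_name"]
        for r in records
        if r["status"] == "No Reply" and os_marker in r["os_version"]
    )


per_device = no_reply_devices(results, "5.15.107+")
total_no_reply = sum(per_device.values())   # all matching No Reply results
distinct_devices = len(per_device)          # how many different device names produced them
print(total_no_reply, distinct_devices)
```

Counting per device rather than just building a set keeps both numbers available: the total of No Reply responses and the number of distinct device names behind them.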
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7581 Status: Recently Active Project Badges: |
Quote: "At the time of writing, I have 175 stuck (I'll bet Sgt. Joe has a lot more than that!)"
I currently have 2449 awaiting validation, which is down considerably from the 4000+ I had recently, so apparently progress is being made.
Edit: The oldest ones are from Sept. 21 and are from the notorious cluster group. I randomly checked those from Sept. 21 and they were in the "waiting to be sent" category.
Cheers
Sgt. Joe
----------------------------------------
*Minnesota Crunchers* [Edit 1 times, last edit by Sgt.Joe at Oct 1, 2023 6:01:49 PM] |
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 747 Status: Offline Project Badges: |
Most of my "Waiting to be sent" are from September 19-23 issue date from the cluster.
Quote: "[*2] I'd love to know what happened there, but I don't reckon the cluster manager(s) will come and post about it :-)"
If they're a big enough volunteer to have those resources, one would think they participate in the forums, but I dunno.
Quote: "with 1456 different device names."
How can you tell that? Very interesting.
|
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2089 Status: Offline Project Badges: |
Al wrote:
"hchc, Sgt. Joe -- regarding the Waiting to be sent issue... I notice that since the new SCC1 work units created on 30th September have started being issued I haven't seen a single SCC1 "Waiting to be sent" retry go out"
They seemingly tend to go out. However … (did anyone say 'slow'?)
This morning I was able to catch 16 workunits in a row (using my 'wcgstats' script) that were "Waiting to be sent" by the looks of their _2 retry:
* Showing page 67/68 of all SCC1-tasks with status ’P/Q’ on all of your devices:
It must be about 12 hours later now and only the last two of them have moved to "In Progress":
<13> SCC1_0004366_KLF15-B_71882_0 Linux No Reply 2023-09-19T07:05:13 2023-09-25T07:05:13
Is there a chronological order in which these last two _2 tasks were finally sent, compared to their workunit-ID? To find that out, we need to inspect the JSON-datafile that was generated by 'wcgstats'. (Spoiler: the answer is no.)
Extracting task-ID, workunit-ID, taskname and Sent date from the JSON-datafile for these 16 tasks looks like this:
651074596 383041947 SCC1_0004395_KLF15-B_8143_1 2023-09-19T09:14:17+0000
The workunit-IDs for these last two tasks ('71869_1' and '71882_1') are 382878850 and 382878859. So there are 8 workunits between them. Also, there are about 8,000 workunits between '71882_1' and '79211_1'. (Their workunit-IDs are 382878859 and 382886826, respectively.)
The distance between '71882_1' and '79211_1' is small compared with the distance between '71882_1' and '8143_1' (382878859 and 383041947, respectively). So, one could say that it should take relatively little work/time to skip to the next "Waiting to be sent".
Adri |
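The gaps quoted in the post can be checked with a few lines of Python. This is only a sketch of the arithmetic: the four workunit-IDs are the ones given in the post, while the helper name `ids_between` is made up for illustration. It also assumes every ID in a range counts as one workunit, which may overstate things if the numbering is shared with other projects.

```python
# Workunit-IDs as quoted in the post, keyed by the task-name suffix.
workunit_ids = {
    "71869_1": 382878850,
    "71882_1": 382878859,
    "79211_1": 382886826,
    "8143_1": 383041947,
}


def ids_between(a, b):
    """Number of workunit-IDs strictly between the two named tasks."""
    return abs(workunit_ids[a] - workunit_ids[b]) - 1


gap_small = ids_between("71869_1", "71882_1")    # the two tasks that moved to "In Progress"
gap_medium = ids_between("71882_1", "79211_1")   # the "about 8,000" in the post
gap_large = ids_between("71882_1", "8143_1")     # the far larger distance used for comparison
print(gap_small, gap_medium, gap_large)
```

The middle gap comes out just under 8,000 and the last one is roughly twenty times larger, which is the comparison the post relies on.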
||
|
|