Total posts in this thread: 65
This topic has been viewed 90164 times and has 64 replies
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 873
Status: Offline
Re: Is there any way of finding your wingman’s host-Id?

Adri -- thanks for confirming that someone else is seeing the "not going to send to a Ryzen at the moment" behaviour. I wonder if it also happens to Windows Ryzen users when the availability situation is reversed :-)

Without detailed access to the WCG BOINC database or their project configuration files I don't think it'll be possible to find a reason for this, but I am beginning to suspect that there's a situation in which some SCC1 retries fail a CPU hardware match check if there's an AMD processor asking and there's an Intel task for that WU still regarded as live.[*1]

I'd be interested to know if anyone has seen the same sort of issue with non-SCC1 work; as I mentioned above, I don't recall having seen it elsewhere -- "other platforms" usually seems to apply on all my systems at once for MCM1 if/when Linux work is in short supply because of bulk retries...

Cheers - Al.

[*1] I haven't got time to do the sort of code dive that would be needed to see if that's a possibility, but I know that the platform-checking code does have some CPU-specific stuff in it that may or may not be exercised for SCC1 :-)
[Sep 30, 2023 8:15:27 AM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 747
Status: Offline
Re: Is there any way of finding your wingman’s host-Id?

A lot of the repair work units for that 5.15.107+ person are stuck in "waiting to be sent" status again. A ton of new work is being issued, but it's kind of backwards that new work is prioritized over letting the system catch up on the repair work, especially if it takes 1-3 weeks before it's sent back out (like last time).

Are the techs aware of this behavior?
----------------------------------------
  • i3-8100 (Coffee Lake, 4C/4T) @ 3.6 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • E5800 (Wolfdale, 2C/2T) @ 3.2 GHz

[Sep 30, 2023 10:23:38 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7581
Status: Recently Active
Re: Is there any way of finding your wingman’s host-Id?

You are probably right. The blizzard of resends seems to have ceased for the moment, so there could still be a bunch in the waiting to be sent penalty box. Maybe on Monday someone will notice and release them. In the meantime the supply of both MCM and SCC appears to be steady.
Cheers

Edit:spelling
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 1 times, last edit by Sgt.Joe at Oct 1, 2023 1:30:03 AM]
[Oct 1, 2023 1:29:25 AM]
Jake1402
Senior Cruncher
USA
Joined: Dec 30, 2005
Post Count: 181
Status: Offline
Re: Is there any way of finding your wingman’s host-Id?

You mean here:

https://boinc.berkeley.edu/download_all.php

where I read

These versions may not be current.

7.16.6 Development version (MAY BE UNSTABLE - USE ONLY FOR TESTING)
7.4.22 Recommended version

where the first line makes everything suspect.


yep...you will notice the date of April 2020. I have been running this for several years and it works fine for me. Also I installed the one in Software Sources which is the "development" version. It appears nobody is developing/updating Linux in Berkeley.

YMMV
----------------------------------------
Join the Chicago-IL-USA team!
2 AMD FX 8320/AMD R9 270X/Win 10
2 AMD FX 8320/AMD RX 560/Linux Mint 20.3 (both computers DOA)
Intel Pentium G240/Win 10
[Oct 1, 2023 2:36:48 AM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 337
Status: Offline
Re: Is there any way of finding your wingman’s host-Id?

You mean here:

https://boinc.berkeley.edu/download_all.php

where I read

These versions may not be current.

7.16.6 Development version (MAY BE UNSTABLE - USE ONLY FOR TESTING)
7.4.22 Recommended version

where the first line makes everything suspect.


yep...you will notice the date of April 2020. I have been running this for several years and it works fine for me. Also I installed the one in Software Sources which is the "development" version. It appears nobody is developing/updating Linux in Berkeley.

YMMV


The most up to date version is 7.20.5 in costamagnagianfranco’s ppa. It is stable and works well.
[Oct 1, 2023 7:01:43 AM]
Spiderman
Advanced Cruncher
United States
Joined: Jul 13, 2020
Post Count: 113
Status: Offline
Re: Is there any way of finding your wingman’s host-Id?

RE: Linux/Android...

I note a few To-Do Tasks in BOINC's upcoming Hacktoberfest 2023 for Linux/Android, so perhaps we will see updates in the Client generated from that in coming weeks/months...

https://github.com/orgs/BOINC/projects/11/views/1
[Oct 1, 2023 1:38:00 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 873
Status: Offline
Re: Is there any way of finding your wingman’s host-Id?

hchc, Sgt. Joe -- regarding the Waiting to be sent issue...

[Some of this may well have been said elsewhere...]

Given that this doesn't seem to happen for MCM1, and that it didn't seem to happen for SCC1 until fairly recently, I suspect it's an unintended consequence of whatever they did to try to ensure new SCC1 work went out [over weekends?] when there were a fair number of posts about not being able to get work a few weeks ago...

It's behaving as if precedence is being given to WU numbers in [more or less] descending order -- the usual would be the opposite (and retries of the oldest WUs would tend to clear out first.) This probably wouldn't be a severe problem if the number of tasks failing because of time-outs was low, but even without the recent cluster incident I was still seeing fairly high numbers of retries (and I wonder if it was worse on the Windows side? -- I have no data for that...)
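
To illustrate why the ordering matters, here is a toy sketch in Python. It is not WCG's or BOINC's actual scheduling code; the function name `dispatch`, the WU numbers and the arrival rates are all made up for the illustration. It only shows that when new work arrives as fast as tasks are sent, a "highest WU number first" rule never reaches the old retries, while "lowest first" clears them immediately.

```python
import heapq

def dispatch(pending, new_per_tick, slots_per_tick, ticks, newest_first):
    """Toy model: each tick, new WUs arrive and `slots_per_tick` tasks are sent.

    All names and numbers are hypothetical; this is not the real scheduler.
    """
    sign = -1 if newest_first else 1          # heapq pops the smallest key
    heap = [(sign * wu, wu) for wu in pending]
    heapq.heapify(heap)
    next_wu = max(pending) + 1000             # new work gets higher WU numbers
    sent = []
    for _ in range(ticks):
        for _ in range(new_per_tick):         # new work units arrive
            heapq.heappush(heap, (sign * next_wu, next_wu))
            next_wu += 1
        for _ in range(slots_per_tick):       # tasks are handed out
            if heap:
                sent.append(heapq.heappop(heap)[1])
    return sent

old_retries = [100, 101, 102]                 # old WUs waiting for a retry
oldest_first = dispatch(old_retries, 2, 2, 5, newest_first=False)
newest_first = dispatch(old_retries, 2, 2, 5, newest_first=True)

print("oldest first:", oldest_first)
print("newest first:", newest_first)
```

In the second run none of the three old retries is ever sent, which is the kind of starvation that would keep week-old WUs in "Waiting to be sent" for as long as new work keeps flowing.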

I notice that since the new SCC1 work units created on 30th September have started being issued I haven't seen a single SCC1 "Waiting to be sent" retry go out[*1]; these are all week-old WUs (or older)... Over the preceding few days, the total count of Waiting tasks seen was going down faster than new ex-cluster No Reply tasks were coming in.

At the time of writing, I have 175 stuck (I'll bet Sgt. Joe has a lot more than that!), and I wonder if they'll still be stuck when this current set of new tasks is all issued. By then, some of the WUs will be well over a week old and may struggle to clear out because there will be missed deadlines for tasks from the new WUs, even without the 5.15.107+ devices (which don't appear to be picking up new work as far as I can tell[*2], thank goodness!)

Let's just hope there's not a repeat of the cluster issue (or anything else that has the same effect...)

And in closing, some data relating to the original, narrower, topic of this thread: since mid-September I have processed tasks from 4819 SCC1 WUs and there were 1577 No Reply responses from 5.15.107+ systems in that set, with 1456 different device names. If I also consider MCM1 tasks there were a total of 2594 distinct devices! (And that may be the end of it; I think those systems last picked up work on 24th when the new work ran out...)

Cheers - Al.

[*1] I have API scripts that collect WU progress daily, so I'm not depending on noticing changes via the Web site :-)

[*2] I'd love to know what happened there, but I don't reckon the cluster manager(s) will come and post about it :-)
[Oct 1, 2023 2:35:59 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7581
Status: Recently Active
Re: Is there any way of finding your wingman’s host-Id?

At the time of writing, I have 175 stuck (I'll bet Sgt. Joe has a lot more than that!)

I currently have 2449 awaiting validation, which is down considerably from the 4000+ I had recently, so apparently progress is being made.
Edit: The oldest ones are from Sept. 21 and are from the notorious cluster group. I randomly checked those from Sept. 21 and they were in the "waiting to be sent" category.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 1 times, last edit by Sgt.Joe at Oct 1, 2023 6:01:49 PM]
[Oct 1, 2023 5:45:22 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 747
Status: Offline
Re: Is there any way of finding your wingman’s host-Id?

Most of my "Waiting to be sent" tasks have issue dates of September 19-23 and come from the cluster.

[*2] I'd love to know what happened there, but I don't reckon the cluster manager(s) will come and post about it :-)

If they're a big enough volunteer to have those resources, one would think they participate in the forums, but I dunno.

with 1456 different device names.

How can you tell that? Very interesting.
----------------------------------------
  • i3-8100 (Coffee Lake, 4C/4T) @ 3.6 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • E5800 (Wolfdale, 2C/2T) @ 3.2 GHz

[Oct 1, 2023 7:04:23 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2089
Status: Offline
Re: Is there any way of finding your wingman’s host-Id?

Al wrote:
hchc, Sgt. Joe -- regarding the Waiting to be sent issue...

I notice that since the new SCC1 work units created on 30th September have started being issued I haven't seen a single SCC1 "Waiting to be sent" retry go out


They seemingly tend to go out. However … (did anyone say 'slow'?)

This morning I was able to catch 16 workunits in a row (using my 'wcgstats' script) that were "Waiting to be sent" by the looks of their _2 retry:
* Showing page 67/68 of all SCC1-tasks with status ’P/Q’ on all of your devices:
<1> SCC1_0004395_KLF15-B_8143_0 Linux No Reply 2023-09-19T09:14:13 2023-09-25T09:14:13
<1> * SCC1_0004395_KLF15-B_8143_1 Fedora Linux Pending Validation 2023-09-19T09:14:17 2023-09-19T12:00:58
<1> SCC1_0004395_KLF15-B_8143_2 Waiting to be sent

<2> SCC1_0004394_KLF15-B_23120_0 Linux No Reply 2023-09-19T09:03:29 2023-09-25T09:03:29
<2> * SCC1_0004394_KLF15-B_23120_1 Fedora Linux Pending Validation 2023-09-19T09:03:49 2023-09-19T12:00:58
<2> SCC1_0004394_KLF15-B_23120_2 Waiting to be sent

<3> SCC1_0004395_KLF15-B_3079_0 Linux No Reply 2023-09-19T08:51:07 2023-09-25T08:51:07
<3> * SCC1_0004395_KLF15-B_3079_1 Fedora Linux Pending Validation 2023-09-19T08:51:19 2023-09-19T11:40:08
<3> SCC1_0004395_KLF15-B_3079_2 Waiting to be sent

<4> SCC1_0004394_KLF15-B_17709_0 Linux No Reply 2023-09-19T08:47:03 2023-09-25T08:47:03
<4> * SCC1_0004394_KLF15-B_17709_1 Fedora Linux Pending Validation 2023-09-19T08:47:09 2023-09-19T11:42:09
<4> SCC1_0004394_KLF15-B_17709_2 Waiting to be sent

<5> SCC1_0004393_KLF15-B_14810_0 Linux No Reply 2023-09-19T08:47:03 2023-09-25T08:47:03
<5> * SCC1_0004393_KLF15-B_14810_1 Fedora Linux Pending Validation 2023-09-19T08:47:09 2023-09-19T11:40:07
<5> SCC1_0004393_KLF15-B_14810_2 Waiting to be sent

<6> SCC1_0004392_KLF15-B_27835_0 Linux No Reply 2023-09-19T08:38:38 2023-09-25T08:38:38
<6> * SCC1_0004392_KLF15-B_27835_1 Fedora Linux Pending Validation 2023-09-19T08:38:49 2023-09-19T11:27:38
<6> SCC1_0004392_KLF15-B_27835_2 Waiting to be sent

<7> SCC1_0004392_KLF15-B_23667_0 Linux No Reply 2023-09-19T08:24:09 2023-09-25T08:24:09
<7> * SCC1_0004392_KLF15-B_23667_1 Fedora Linux Pending Validation 2023-09-19T08:24:17 2023-09-19T11:25:30
<7> SCC1_0004392_KLF15-B_23667_2 Waiting to be sent

<8> SCC1_0004394_KLF15-B_11401_0 Linux No Reply 2023-09-19T08:23:26 2023-09-25T08:23:26
<8> * SCC1_0004394_KLF15-B_11401_1 Fedora Linux Pending Validation 2023-09-19T08:23:36 2023-09-19T19:47:01
<8> SCC1_0004394_KLF15-B_11401_2 Waiting to be sent

<9> SCC1_0004394_KLF15-B_11509_0 Linux No Reply 2023-09-19T08:23:27 2023-09-25T08:23:27
<9> * SCC1_0004394_KLF15-B_11509_1 Fedora Linux Pending Validation 2023-09-19T08:23:36 2023-09-19T19:13:26
<9> SCC1_0004394_KLF15-B_11509_2 Waiting to be sent

<10> SCC1_0004394_KLF15-B_11513_0 Linux No Reply 2023-09-19T08:23:26 2023-09-25T08:23:26
<10> * SCC1_0004394_KLF15-B_11513_1 Fedora Linux Pending Validation 2023-09-19T08:23:36 2023-09-19T19:47:01
<10> SCC1_0004394_KLF15-B_11513_2 Waiting to be sent

<11> SCC1_0004392_KLF15-B_21962_0 Linux No Reply 2023-09-19T08:21:02 2023-09-25T08:21:02
<11> * SCC1_0004392_KLF15-B_21962_1 Fedora Linux Pending Validation 2023-09-19T08:22:09 2023-09-19T11:25:30
<11> SCC1_0004392_KLF15-B_21962_2 Waiting to be sent

<12> SCC1_0004394_KLF15-B_7390_0 Linux No Reply 2023-09-19T08:13:14 2023-09-25T08:13:14
<12> * SCC1_0004394_KLF15-B_7390_1 Fedora Linux Pending Validation 2023-09-19T08:13:49 2023-09-19T11:06:47
<12> SCC1_0004394_KLF15-B_7390_2 Waiting to be sent

<13> SCC1_0004366_KLF15-B_94026_0 Linux No Reply 2023-09-19T07:23:41 2023-09-25T07:23:41
<13> * SCC1_0004366_KLF15-B_94026_1 Fedora Linux Pending Validation 2023-09-19T07:24:09 2023-09-19T19:13:26
<13> SCC1_0004366_KLF15-B_94026_2 Waiting to be sent

<14> SCC1_0004366_KLF15-B_79211_0 Linux No Reply 2023-09-19T07:08:51 2023-09-25T07:08:51
<14> * SCC1_0004366_KLF15-B_79211_1 Fedora Linux Pending Validation 2023-09-19T07:09:14 2023-09-19T10:14:40
<14> SCC1_0004366_KLF15-B_79211_2 Waiting to be sent

<15> SCC1_0004366_KLF15-B_71882_0 Linux No Reply 2023-09-19T07:05:13 2023-09-25T07:05:13
<15> * SCC1_0004366_KLF15-B_71882_1 Fedora Linux Pending Validation 2023-09-19T07:05:21 2023-09-19T18:21:01
<15> SCC1_0004366_KLF15-B_71882_2 Waiting to be sent

* Showing page 68/68 of all SCC1-tasks with status ’P/Q’ on all of your devices:
<1> SCC1_0004366_KLF15-B_71869_0 Linux No Reply 2023-09-19T07:05:13 2023-09-25T07:05:13
<1> * SCC1_0004366_KLF15-B_71869_1 Fedora Linux Pending Validation 2023-09-19T07:05:21 2023-09-19T18:21:01
<1> SCC1_0004366_KLF15-B_71869_2 Waiting to be sent



It must be about 12 hours later now and only the last two of them have moved to "In Progress":

<13> SCC1_0004366_KLF15-B_71882_0 Linux No Reply 2023-09-19T07:05:13 2023-09-25T07:05:13
<13> * SCC1_0004366_KLF15-B_71882_1 Fedora Linux Pending Validation 2023-09-19T07:05:21 2023-09-19T18:21:01
<13> SCC1_0004366_KLF15-B_71882_2 Linux openSU In Progress 2023-10-01T12:27:27 2023-10-04T12:27:27

<14> SCC1_0004366_KLF15-B_71869_0 Linux No Reply 2023-09-19T07:05:13 2023-09-25T07:05:13
<14> * SCC1_0004366_KLF15-B_71869_1 Fedora Linux Pending Validation 2023-09-19T07:05:21 2023-09-19T18:21:01
<14> SCC1_0004366_KLF15-B_71869_2 Linux Debian In Progress 2023-10-01T12:50:00 2023-10-04T12:50:00


Is there a chronological order in which these last two _2 tasks were finally sent, compared to their workunit-ID?
To find that out, we need to inspect the JSON-datafile that was generated by 'wcgstats'. (Spoiler: the answer is no.)

Extracting task-ID, workunit-ID, taskname and Sent date from the JSON-datafile for these 16 tasks looks like this:
651074596 383041947 SCC1_0004395_KLF15-B_8143_1 2023-09-19T09:14:17+0000
651058096 383033482 SCC1_0004394_KLF15-B_23120_1 2023-09-19T09:03:49+0000
651035282 383014248 SCC1_0004395_KLF15-B_3079_1 2023-09-19T08:51:19+0000
651023406 383007277 SCC1_0004394_KLF15-B_17709_1 2023-09-19T08:47:09+0000
651023407 383007289 SCC1_0004393_KLF15-B_14810_1 2023-09-19T08:47:09+0000
651011170 383000750 SCC1_0004392_KLF15-B_27835_1 2023-09-19T08:38:49+0000
650988287 382982289 SCC1_0004392_KLF15-B_23667_1 2023-09-19T08:24:17+0000
650986388 382980579 SCC1_0004394_KLF15-B_11401_1 2023-09-19T08:23:36+0000
650986502 382980851 SCC1_0004394_KLF15-B_11509_1 2023-09-19T08:23:36+0000
650986559 382980850 SCC1_0004394_KLF15-B_11513_1 2023-09-19T08:23:36+0000
650980244 382976349 SCC1_0004392_KLF15-B_21962_1 2023-09-19T08:22:09+0000
650966117 382964294 SCC1_0004394_KLF15-B_7390_1 2023-09-19T08:13:49+0000
650881874 382907946 SCC1_0004366_KLF15-B_94026_1 2023-09-19T07:24:09+0000
650854836 382886826 SCC1_0004366_KLF15-B_79211_1 2023-09-19T07:09:14+0000
650844789 382878859 SCC1_0004366_KLF15-B_71882_1 2023-09-19T07:05:21+0000
650844807 382878850 SCC1_0004366_KLF15-B_71869_1 2023-09-19T07:05:21+0000
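
For the two _2 retries that did go out, the comparison can be sketched in a few lines of Python. The workunit-IDs come from the extract above and the Sent times from the two "In Progress" lines; the variable names are only for illustration, and two data points are of course far too few to prove anything about the general ordering.

```python
from datetime import datetime

# (task number, workunit-ID from the extract, Sent time of the _2 retry)
retries = [
    ("71882", 382878859, "2023-10-01T12:27:27"),
    ("71869", 382878850, "2023-10-01T12:50:00"),
]

# Order the retries by the moment they were sent out.
by_sent = sorted(retries, key=lambda r: datetime.fromisoformat(r[2]))
sent_order_ids = [r[1] for r in by_sent]

# Would the same order result from sorting on workunit-ID (oldest first)?
ascending = sent_order_ids == sorted(sent_order_ids)

print(sent_order_ids)
print("sent in ascending workunit-ID order:", ascending)
```

The retry for the higher workunit-ID (382878859) went out about 22 minutes before the one for the lower workunit-ID (382878850), so these two were not sent oldest-first.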


The workunit-IDs for these last two tasks ('71869_1' and '71882_1') are 382878850 and 382878859, so there are 8 workunits between them.
Also, there are about 8,000 workunits between '71882_1' and '79211_1'. (Their workunit-IDs are 382878859 and 382886826, respectively.)
The distance between '71882_1' and '79211_1' is small compared with the distance between '71882_1' and '8143_1' (382878859 and 383041947, respectively).
So one could say that, relatively speaking, it shouldn't take much work or time to skip ahead to the next "Waiting to be sent" task.
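
The distances mentioned above can be checked with a small Python sketch that uses only the workunit-IDs from the extract (the dictionary and the helper `gap` are just illustrative names):

```python
# Workunit-IDs copied from the JSON extract in this post.
wu_ids = {
    "71869_1": 382878850,
    "71882_1": 382878859,
    "79211_1": 382886826,
    "8143_1": 383041947,
}

def gap(a, b):
    """Absolute difference between the workunit-IDs of two tasks."""
    return abs(wu_ids[b] - wu_ids[a])

between_last_two = gap("71869_1", "71882_1") - 1   # IDs strictly in between
to_next = gap("71882_1", "79211_1")
to_newest = gap("71882_1", "8143_1")

print(between_last_two)                  # 8
print(to_next)                           # 7967
print(to_newest)                         # 163088
print(round(to_next / to_newest, 3))     # 0.049
```

So the step from '71882_1' to the next waiting workunit ('79211_1') covers roughly 5% of the whole range up to '8143_1'.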

Adri
[Oct 1, 2023 11:58:44 PM]