Thread Status: Active
Total posts in this thread: 65
This topic has been viewed 90159 times and has 64 replies
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 337
Is there any way of finding your wingman’s host-Id?

I’m having a build up of pending validation tasks and well over half of them are SCC WUs on my slowest cruncher (a 4th gen i3 laptop so definitely slow).

Looking at them in a bit more detail, almost all of them appear to have the same wingman, who does not appear to be returning any of them as complete.

Is there any way I can find a host-id to confirm that it is the same host and carry my research forwards?
[Sep 18, 2023 5:28:22 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12146
Re: Is there any way of finding your wingman’s host-Id?

We all get PV delays from time to time. There are quite a few in my results listing, but I can't identify the wingman's machine - just the OS of it.

After 6 days it will either be returned or sent to another cruncher.

Mike
[Sep 19, 2023 12:15:34 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7580
Re: Is there any way of finding your wingman’s host-Id?

I’m having a build up of pending validation tasks and well over half of them are SCC WUs on my slowest cruncher (a 4th gen i3 laptop so definitely slow).

Mike is right. With these small SCC units there is a buildup of these. I have over 1000 in that category and am sure they will be validated eventually. Just keep on crunching.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Sep 19, 2023 1:07:21 AM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 337
Re: Is there any way of finding your wingman’s host-Id?

We all get PV delays from time to time. There are quite a few in my results listing, but I can't identify the wingman's machine - just the OS of it.

After 6 days it will either be returned or sent to another cruncher.

Mike


I’m quite used to getting PVs, but this is unusual. Normally they are slightly skewed towards my faster machines, where the wingman takes longer than I do to process; this time they are heavily skewed to my slowest machine. Normally they tend to be MCM, but these are mostly SCC. Normally there is a spread of wingmen; in this instance almost all are a single OS, Linux 5.15.107+ (rather than, for example, Ubuntu 22.04 LTS, which is where 5.15 went).

It just feels different and I’m curious.
----------------------------------------
[Edit 1 times, last edit by Bryn Mawr at Sep 19, 2023 2:53:12 AM]
[Sep 19, 2023 2:51:53 AM]
supdood
Senior Cruncher
USA
Joined: Aug 6, 2015
Post Count: 333
Re: Is there any way of finding your wingman’s host-Id?

I'm seeing the same thing. I usually have very low PV numbers as I now run only two old, slow laptops and am often the one keeping others' tasks in PV. Now I have a bunch (both MCM and SCC) in PV all with Linux 5.15.107+. I wonder if someone spun up a large compute core and got way too many tasks as an initial download while WCG learns the system.
----------------------------------------
Crunch with BOINC team USA
www.boincusa.com

----------------------------------------
[Edit 2 times, last edit by supdood at Sep 19, 2023 11:36:51 AM]
[Sep 19, 2023 11:35:50 AM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 337
Re: Is there any way of finding your wingman’s host-Id?

I'm seeing the same thing. I usually have very low PV numbers as I now run only two old, slow laptops and am often the one keeping others' tasks in PV. Now I have a bunch (both MCM and SCC) in PV all with Linux 5.15.107+. I wonder if someone spun up a large compute core and got way too many tasks as an initial download while WCG learns the system.


This is my suspicion. It appears to be coming back for more tasks frequently without completing any.
[Sep 19, 2023 12:51:23 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12146
Re: Is there any way of finding your wingman’s host-Id?

I'm seeing the same thing. I usually have very low PV numbers as I now run only two old, slow laptops and am often the one keeping others' tasks in PV. Now I have a bunch (both MCM and SCC) in PV all with Linux 5.15.107+. I wonder if someone spun up a large compute core and got way too many tasks as an initial download while WCG learns the system.


This is my suspicion. It appears to be coming back for more tasks frequently without completing any.


That also points to a new machine starting up or restarting: cache set to unlimited, but restricted on each download.

Mike
[Sep 19, 2023 2:48:49 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 873
Re: Is there any way of finding your wingman’s host-Id?

I've been looking through my wingman data for the period from 10th September onwards, as that's when I first saw wingmen with O/S 5.15.107+ (and nothing further)... There is evidence that the systems involved are part of a large node of some form.[*1]

I'm not sure how much of it is a case of "more than can be run" and how much is that there may be a huge number of individual hosts being set up -- I've seen 616 of these as wingmen across MCM1 and SCC1 between 10th and 19th September and there were 591 distinct device names! -- If nothing else, that must be causing havoc with the non-BOINC database :-)
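For anyone wanting to run a similar tally on their own results, here is a minimal Python sketch. It assumes the wingman records have already been collected into a list of dictionaries; the field names (`os`, `device`, `status`) and the sample values are hypothetical placeholders, not the actual WCG API schema.

```python
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only
# and do not reflect the real WCG API output.
SAMPLE_RECORDS = [
    {"project": "SCC1", "os": "Linux 5.15.107+", "device": "node-001", "status": "In Progress"},
    {"project": "MCM1", "os": "Linux 5.15.107+", "device": "node-002", "status": "In Progress"},
    {"project": "SCC1", "os": "Linux 5.15.107+", "device": "node-001", "status": "No Reply"},
    {"project": "MCM1", "os": "Ubuntu 22.04 LTS", "device": "desk-01", "status": "Valid"},
]


def summarize_wingmen(records, os_name):
    """Count tasks, distinct device names and statuses for one reported O/S."""
    matching = [r for r in records if r["os"] == os_name]
    return {
        "tasks": len(matching),
        "distinct_devices": len({r["device"] for r in matching}),
        "by_status": Counter(r["status"] for r in matching),
    }


if __name__ == "__main__":
    print(summarize_wingmen(SAMPLE_RECORDS, "Linux 5.15.107+"))
```

A task count close to the distinct-device count, as in the 616 versus 591 above, is what suggests many separate hosts rather than a few hosts hoarding work.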

The comments by others regarding picking up [far] more work than is being run are accurate if my wingmen's return rates (almost none!) are anything to go by. As there is no evidence so far that these systems will return "Not Started by Deadline", it is possible that they aren't staying on-line, and that new tasks are being picked up by new nodes!

Someone who knows a lot more than I do about cloud computing and/or Docker (and such like) may have a better idea of what might actually be going on :-) -- I suspect the WCG folks might need to be looking at this anyway.

Unfortunately, I'm now starting to see these systems picking up retries themselves :-(

Cheers - Al.

P.S. Unless I happen to catch one of these when it returns something with the client version in it, I can't tell whether choice of client might have anything to do with the "problem"...

[*1] The relevant information is available via the API [as device names rather than host IDs], but publishing it would be against forum rules -- there have been [temporary] bans issued for showing host names other than one's own in the past!

[Edited to alter the counts to reflect work returned on 19th September.]
----------------------------------------
[Edit 1 times, last edit by alanb1951 at Sep 20, 2023 5:30:04 AM]
[Sep 19, 2023 11:09:51 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7580
Re: Is there any way of finding your wingman’s host-Id?

I am seeing quite a large jump in pending validation and verification work units for SCC. I am seeing almost no resends. I am wondering if the volume of these really short work units is tending to overload the validator(s). The last 3 days have shown over 1 million (10^6) work units being returned for SCC.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Sep 20, 2023 1:11:19 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 873
Re: Is there any way of finding your wingman’s host-Id?

I am seeing quite a large jump in pending validation and verification work units for SCC. I am seeing almost no resends. I am wondering if the volume of these really short work units is tending to overload the validator(s). The last 3 days have shown over 1 million (10^6) work units being returned for SCC.
Cheers

If this is on your Linux systems I suspect you are simply seeing a consequence of the issues with those systems that report their O/S as Linux 5.15.107+ :-(

I've just processed my wingman data for 19th September, and it shows a continuing trend of giving me _1 tasks as wingman to a _0 task on one of those systems. Based on what's going on with MCM1 tasks those systems don't seem to return much, if anything; I'm starting to see their MCM1 tasks going No Reply (at last, but there are so many of them...)

Those systems, being new and not [yet] sending work back to validate, will always need an initial wingman for SCC1, and if you draw that short straw and then return your _1 task in a timely fashion, it'll end up sat at Pending Validation for 6 or more days. I've currently got 435 SCC1 tasks Pending Validation and over 400 of them are waiting for a "5.15.107+" system!

As for Pending Verification -- I currently have 6 SCC1 tasks Pending Verification; since SCC1 work resumed I've had 1349 tasks where I got first call, and 1312 of them validated without a second opinion. Of the other 37, most had a Pending Verification phase, but I reckon that's well within expectations. Fortunately, I haven't drawn one of those systems for my verification task yet :-) -- however, the latest one has been stuck at Waiting to send for over 5 hours, so I suspect there's a little congestion in the system as the number of delayed tasks builds up and queries take longer...
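To put the SCC1 figures quoted above into proportion, here is a short Python sketch of the arithmetic. The counts are the ones given in this post; "over 400" is treated as a lower bound of 400, which is an assumption, and the helper name `percent` is purely illustrative.

```python
def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


first_call = 1349        # SCC1 tasks where I got first call
validated_alone = 1312   # validated without a second opinion
needed_second = first_call - validated_alone

pending_validation = 435  # SCC1 tasks currently Pending Validation
waiting_on_new = 400      # "over 400", so this is a lower bound (assumption)

print("Needed a second opinion:", needed_second)
print("Validated alone (%):", percent(validated_alone, first_call))
print("Pending on 5.15.107+ systems, at least (%):",
      percent(waiting_on_new, pending_validation))
```

On these numbers roughly 97% of first-call tasks validated without a wingman, while at least 92% of the Pending Validation backlog is tied to the "5.15.107+" systems.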

Frustrating, isn't it??? :-)

Cheers - Al.
[Sep 20, 2023 5:26:47 AM]