Thread Status: Active
Total posts in this thread: 81
Posts: 81   Pages: 9
This topic has been viewed 326634 times and has 80 replies
Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1271
Status: Offline
Re: Are all MCM1 assimilators running?

Personally, I would say no, the assimilators are not running. I still have over 5000 results waiting to be deleted, and this number hasn't changed for a long time.
----------------------------------------

[Feb 2, 2024 10:35:37 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2138
Status: Offline
Re: Are all MCM1 assimilators running?

Purging is still not happening. So, the "solution" Tiger Lily talks about here obviously was not the right solution to the problem.
[Feb 7, 2024 5:15:50 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1944
Status: Offline
Re: Are all MCM1 assimilators running?

Purging is still not happening. So, the "solution" Tiger Lily talks about here obviously was not the right solution to the problem.
+1

Unfortunately, for almost two years now, talk has been cheap... sad


Ralf
----------------------------------------

[Feb 7, 2024 10:33:49 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 787
Status: Offline
Re: Are all MCM1 assimilators running?

I haven't seen any movement.
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

[Feb 9, 2024 7:18:52 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2145
Status: Recently Active
Re: Are all MCM1 assimilators running?

The tech team is still tinkering with the assimilators. MCM1-assimilation has come to a halt at the moment.
Five days ago (last Monday), I didn't see any FileDeleteState = 2 (so no assimilation taking place) for about 20 hours (Feb 5 19:00 to Feb 6 15:00). Then assimilation resumed, until it ran out of FileDeleteState = 2 again at Feb 8 06:00. (Times are UTC.) Also, all MCM1 workunit IDs that I saw with FileDeleteState = 2 satisfied Workunit-ID modulo 4 = 1. Of course, I expect that volunteers with a larger number of valid tasks than I have(*1) may observe slightly different results.

Apparently, getting the assimilators in fully functioning condition turns out to be tricky.

The oneliner that I ran was this one:
ls -tr wcgresults.2024-02-* | while read -r f; do ls -l "$f"; sed 's/ *<Result>//' "$f" | perl -w00ne 'print "$a\n" if /<FileDeleteState>2</ && /<AppName>mcm1</ && (($a) = /(..)<\/WorkunitId>/)' | sort | uniq -c; done
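
For readers who prefer something more explicit than the one-liner, here is a rough Python sketch of the same tally. It assumes, without confirmation from the thread, that each saved result is a `<Result>...</Result>` block containing `<AppName>`, `<FileDeleteState>` and `<WorkunitId>` tags; the function names and the sample data are made up for illustration. The one-liner only prints the last two digits of each workunit ID, which is enough to work out the remainder modulo 4 because 100 is divisible by 4; the sketch computes the remainder directly.

```python
import re
from collections import Counter

# Assumed layout (not confirmed by the thread): one <Result>...</Result>
# block per result, holding <AppName>, <FileDeleteState> and <WorkunitId>.
RESULT_RE = re.compile(r"<Result>(.*?)</Result>", re.DOTALL)


def tag(block, name):
    """Return the text of the first <name>...</name> in block, or None."""
    m = re.search(rf"<{name}>\s*(.*?)\s*</{name}>", block, re.DOTALL)
    return m.group(1) if m else None


def tally_mod(text, app="mcm1", state="2", modulus=4):
    """Count results of `app` in FileDeleteState `state` by WorkunitId % modulus."""
    counts = Counter()
    for block in RESULT_RE.findall(text):
        if tag(block, "AppName") != app:
            continue
        if tag(block, "FileDeleteState") != state:
            continue
        wu = tag(block, "WorkunitId")
        if wu is not None and wu.isdigit():
            counts[int(wu) % modulus] += 1
    return counts


# Made-up sample data, only to show the call.
TEMPLATE = (
    "<Result><AppName>{}</AppName><FileDeleteState>{}</FileDeleteState>"
    "<WorkunitId>{}</WorkunitId></Result>"
)
sample = "\n".join(
    TEMPLATE.format(*row)
    for row in [
        ("mcm1", 2, 100001),  # counted, 100001 % 4 == 1
        ("mcm1", 2, 100005),  # counted, 100005 % 4 == 1
        ("mcm1", 0, 100002),  # skipped: wrong FileDeleteState
        ("arp1", 2, 100003),  # skipped: wrong application
    ]
)
print(dict(tally_mod(sample)))
```

If every counted result lands in remainder class 1, as in the observation above, the tally has a single key.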

Adri
[*1] Currently:
$ wcgstats -wsV -aMCM1 -m0 -P1 
* Let's try to locate the workunit.
Loading results ...
There are 78123 pages of results available.

[Feb 10, 2024 1:36:15 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 929
Status: Recently Active
Re: Are all MCM1 assimilators running?

I've been seeing the same as Adri (but on a mere 15000 or so results...)

I was wondering if WCG had done a tweak to the standard BOINC assimilator wrapper (which, unlike [say] the validator wrapper, doesn't seem to have a way of constraining the [range of] workunits selected) as it kept working upwards. I was also wondering what was causing certain WUs to provoke assimilator problems, and whether it was known to be limited to older WUs...

If, for instance, the assimilator wrapper had been modified to also accept a lowest acceptable WU number (or range of WU numbers) there might have been the option of starting the other assimilators with a high enough WU number to get some assimilations done[*1]. However, if it were that simple that would surely have been done already :-) -- so it looks like we wait, and hope that the resumption of ARP1 (and, perhaps, some more SCC1?) might take the pressure off MCM1 a bit.
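
To make the idea concrete, here is a toy Python sketch of a selection rule that combines a modulus/remainder split with a lowest (and optionally highest) acceptable WU number. It is not BOINC or WCG code: `Shard`, `next_batch` and all parameter names are hypothetical, and a real assimilator would query the database rather than filter an in-memory list.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Shard:
    """Hypothetical selection rule for one assimilator instance."""
    modulus: int = 1
    remainder: int = 0
    min_wu_id: Optional[int] = None  # the "lowest acceptable WU number" idea
    max_wu_id: Optional[int] = None

    def accepts(self, wu_id: int) -> bool:
        """True if this instance should handle the given workunit ID."""
        if wu_id % self.modulus != self.remainder:
            return False
        if self.min_wu_id is not None and wu_id < self.min_wu_id:
            return False
        if self.max_wu_id is not None and wu_id > self.max_wu_id:
            return False
        return True


def next_batch(pending: Iterable[int], shard: Shard, limit: int) -> List[int]:
    """Pick up to `limit` pending workunit IDs for this shard, lowest first."""
    picked = [wu for wu in sorted(pending) if shard.accepts(wu)]
    return picked[:limit]


pending = range(1, 21)
# One instance working upwards from the oldest workunit in its class.
print(next_batch(pending, Shard(modulus=4, remainder=1), limit=3))
# Another instance started above an assumed problem range.
print(next_batch(pending, Shard(modulus=4, remainder=2, min_wu_id=10), limit=3))
```

The second call shows the point of the floor: that instance skips everything below WU 10 and can make progress even if older workunits in its class are the ones causing trouble.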

[Apologies for armchair SysAdmin mode...]

Cheers - Al.

P.S. I suspect that anything that facilitates a significant reduction in the number of MCM1 WUs to be queried might reduce the possibility of a recurrence of the issues that bit WCG/IBM a fair while ago...

*1 Another hack would be to make a huge increase in the number of WUs to be considered by a single pass of the assimilator. Probably a bad move from a performance standpoint :-(
[Feb 10, 2024 5:40:20 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7633
Status: Offline
Re: Are all MCM1 assimilators running?

I am not a sysadmin, but some alternative actions may be possible: stream separation on an old/new basis, or the use of holding corrals based on selective criteria, may be helpful. Even restricting flow may help to stave off overload. We may not get all we want, but we may get some. Without knowing the setup and any of the throughput parameters, any of this is pure speculation. Sooner or later the techs in the back room will determine the nature of the problem and formulate some fix.
Good luck and Godspeed.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Feb 10, 2024 12:13:29 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 929
Status: Recently Active
Re: Are all MCM1 assimilators running?

Sgt. Joe,
Without knowing the setup and any of the throughput parameters any of this is pure speculation. Sooner or later the techs in the back room will determine the nature of the problem and formulate some fix.
Agreed! My "problem" (if it is such) is that before I retired I had many occasions when I was trying to fix problem systems that seemed to resist every common-sense attempt at solution, so (knowing the spaghetti nature of some of the BOINC server code) I have real sympathy for the WCG folks if they have strange errors to deal with, and am frustrated that there's nothing I can do to help!

Ah, well, it will (as you say) get sorted eventually (or the system will break completely...)

Cheers - Al.

P.S. I often had to stop and remind myself that if a fix was urgent, perfect was the enemy of good...
[Feb 11, 2024 4:00:28 AM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 342
Status: Recently Active
Re: Are all MCM1 assimilators running?


P.S. I often had to stop and remind myself that if a fix was urgent, perfect was the enemy of good...


Agree, when the system was down then a fix that relieved the symptoms without negative consequences went in.

That being said, the system is not down and appears to be able to cope with the backlog so is a fix urgent?

Sometimes it can take longer to fix a bad fix than to fix the original problem.
[Feb 11, 2024 11:21:07 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 929
Status: Recently Active
Re: Are all MCM1 assimilators running?


P.S. I often had to stop and remind myself that if a fix was urgent, perfect was the enemy of good...


Agree, when the system was down then a fix that relieved the symptoms without negative consequences went in.

That being said, the system is not down and appears to be able to cope with the backlog so is a fix urgent?
That might depend on whether there are eventually knock-on consequences caused by the huge numbers of tasks effectively in limbo. Only time will tell...
Sometimes it can take longer to fix a bad fix than to fix the original problem.
Agreed - that's why I always like[d] problem avoidance using existing mechanisms whilst working on a longer-term solution :-)

It will be interesting to see if there will be an "after action report" once this does get resolved. I suspect it would make for interesting reading :-)

Cheers - Al.
[Feb 11, 2024 8:41:54 PM]