Thread Status: Active
Total posts in this thread: 352
This topic has been viewed 30173 times and has 351 replies
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2494
Status: Offline
Re: Project Status (First Post Updated)

Blame me, it must be my fault. I've been on a break from WCG for a month, not running anything. Today though, I decided to download a bunch of work, which I managed to do without any problems. A few hours later though, the problem started.

So, I guess I broke the system... Sorry!! (rolling eyes)
[Jun 28, 2025 8:37:53 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1293
Status: Offline
Re: Project Status (First Post Updated)

Thanks for the detailed info Al.

Thank you, Grumpy, for taking responsibility for your actions (laughing)
----------------------------------------
[Edit 1 times, last edit by Unixchick at Jun 28, 2025 10:59:02 PM]
[Jun 28, 2025 10:58:38 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline
Re: Project Status (First Post Updated)

I have turned off one machine with 8 threads because it ran out of ARP and MCM. I would normally switch to a back-up project but it is hot here.

My other 8 thread machine is down to 3 ARP and no MCM. I am continuing this machine so as to know when to switch the other one back on and to complete the last 3 ARP.

Mike
[Jun 29, 2025 12:11:55 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2494
Status: Offline
Re: Project Status (First Post Updated)

No news on their FB page either. The last update there is from April. Maybe they all went on a 5-week summer vacation (cool)
[Jun 29, 2025 1:19:00 PM]
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 442
Status: Offline
Re: Project Status (First Post Updated)

... hmmmm pondering the situation:

What, if anything, should we tell the volunteers?

1.) nothing.

2.) We are working on the problem. We will fill you in on the details when resolved. ETA: _______.

3.) We are using this issue as a catalyst to migrate to the new configuration which will improve the reliability of the system. We will fill you in on the details when completed. ETA: _____. Please be patient.


As you can probably surmise, lack of transparency is like a vacuum - it sucks the motivation and good will of the volunteers right out of the picture.
Sad, but true.
[Jun 29, 2025 2:07:36 PM]
GB033533
Senior Cruncher
UK
Joined: Dec 8, 2004
Post Count: 206
Status: Offline
Re: Project Status (First Post Updated)

My last two work units started getting this error a few hours ago:

Task MCM1_0235471_5882_0 postponed for 600 seconds: Waiting to acquire lock

which repeated every 10 minutes, and then finally ended in 'Computation error'. I'd never seen this behaviour before. Has anyone else experienced this?
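
For anyone wanting to check their own event log for the same thing, here is a minimal Python sketch that pulls the task name and delay out of a message like the one above and confirms that 600 seconds matches the 10-minute repeat. The function name and the regular expression are assumptions based on this single quoted line, not an official BOINC message format.

```python
import re

# Pattern inferred from the single log line quoted above (an assumption,
# not an official BOINC message format).
POSTPONED = re.compile(
    r"Task (?P<task>\S+) postponed for (?P<seconds>\d+) seconds: (?P<reason>.+)"
)

def parse_postponed(line):
    """Return (task, delay in minutes, reason), or None if the line does not match."""
    m = POSTPONED.search(line)
    if m is None:
        return None
    return m.group("task"), int(m.group("seconds")) / 60, m.group("reason").strip()

line = "Task MCM1_0235471_5882_0 postponed for 600 seconds: Waiting to acquire lock"
print(parse_postponed(line))
```

Counting how many times the same task name turns up would show how many 10-minute retries happened before the computation error.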

Now dry like everyone else, waiting to report completed work units and for the issues to be resolved. Ho-hum.
[Jun 29, 2025 2:19:37 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1293
Status: Offline
Re: Project Status (First Post Updated)

I'm down to my last few ARP WUs. I'm grateful we had a good burst of WUs before this breakdown.

I think the techs have been patching an old system for as long as they can. We know the new system is coming soon. There might be a limit to what time and resources can do. The techs have been very good lately getting systems back up and going, so something else is going on this weekend.

They have been working hard to improve the software system they were given, and soon they will get some new hardware. It is amazing what they have done considering their funding is not where it needs to be. I'm personally glad they have put their resources into tech, not communication. That said, I do hope we get an update soon.

It is hot where I am too, so I'll give my system a rest once I've finished my queue in about 6 hours, and hope that it is an easy fix for the techs on Monday morning.
[Jun 29, 2025 2:49:28 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1317
Status: Offline
Re: Project Status (First Post Updated)

One effect of whatever has happened this time is that neither the last midnight stats run nor today's midday stats run seems to have taken place (or the web site can't find it...) -- the ARP1 project results page has been stuck at the "statistics are being updated" holding page for over 12 hours, reporting that it should be finished in 1 minute! Yes, right... (smile)

Also, at the moment (15:40 UTC) the three ARP1 status files are all empty. On one or two other occasions, failure to run the relevant scripts left the old versions behind, but in this case it remains to be seen whether those will acquire content later in the day (as has happened a couple of times in the past...). If that doesn't happen, let's hope things are resolved by the time Monday's status files are due...

Cheers - Al.
[Jun 29, 2025 3:43:06 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1293
Status: Offline
Re: Project Status (First Post Updated)

Interesting observations Al.

We were expecting some maintenance time around early July... and here we are. Could this be a moving of the files and database to a new system??

Truthfully I was expecting an announcement of the downtime and upgrade, but as I said in an earlier email, while I really really like the info, the time is better spent on doing.
Once we have a nice new shiny system, then they can put some effort into PR and recruiting some new people.
----------------------------------------
[Edit 1 times, last edit by Unixchick at Jun 29, 2025 3:54:43 PM]
[Jun 29, 2025 3:52:52 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1317
Status: Offline
Re: Project Status (First Post Updated)

"Could this be a moving of the files and database to a new system??"
Unfortunately, I suspect it's just a crash that they've not been able to service for one reason or another.

If they need to migrate file store (rather than just moving the server nodes to new hardware) I'd have thought they'd want to do a controlled shutdown akin to those for the IBM->Krembil migration and the recent Christmas->New Year data centre shutdown. New WCG was fairly good about announcing the latter, so I'd expect the same again...

As it is there are probably hundreds of thousands of MCM1 results on the upload servers waiting to be reported -- not a good scenario if moving stuff around...

Ah, well, we will presumably find out on Monday (evening by my U.K. clock!)

Oh, and as I used to be a tech troubleshooter (amongst other duties), often working solo, I agree with the idea that sometimes it's more important to be doing than to be talking about it (wink)

Cheers - Al.

P.S. That ongoing "The statistics update will finish in about 1 minutes" appeals to my strange sense of humour (smile)
----------------------------------------
[Edit 2 times, last edit by alanb1951 at Jun 29, 2025 8:10:49 PM]
[Jun 29, 2025 8:01:25 PM]