World Community Grid Forums
Thread Status: Active Total posts in this thread: 27
WCGAdmin
World Community Grid Admin Joined: Jun 9, 2020 Post Count: 171 Status: Offline |
The system outage experienced due to scheduled maintenance is now complete. We are aware of problems with OPNG work units and are investigating this issue.
https://www.worldcommunitygrid.org/about_us/article.s?articleId=799 |
||
|
Bryn Mawr
Senior Cruncher Joined: Dec 26, 2018 Post Count: 345 Status: Offline |
WCGAdmin wrote: "The system outage experienced due to scheduled maintenance is now complete. We are aware of problems with OPNG work units and are investigating this issue. https://www.worldcommunitygrid.org/about_us/article.s?articleId=799"
Can you please confirm whether you have paused the defective OPNG WUs or ALL WUs? I am out of work, reporting no tasks available. |
||
|
TigerLily
Senior Cruncher Joined: May 26, 2023 Post Count: 280 Status: Offline |
Hi Bryn,
The tech team has paused distribution of work units for SCC1, OPN1, and OPNG until they identify what the issue is. MCM work units should still be available but I have forwarded your post to the team to investigate what the problem might be. |
||
|
Aperture_Science_Innovators
Advanced Cruncher United States Joined: Jul 6, 2009 Post Count: 139 Status: Offline |
TigerLily wrote: "Hi Bryn, The tech team has paused distribution of work units for SCC1, OPN1, and OPNG until they identify what the issue is. MCM work units should still be available but I have forwarded your post to the team to investigate what the problem might be."
Thanks for checking with the techs. The only WUs I've gotten in about four hours are ~10 MCM resends, nothing new. |
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 802 Status: Offline |
All I'm seeing is a bunch of "Server is out of disk space" messages, so nothing is uploading.
|
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 802 Status: Offline |
Is there any word on the major incident that happened starting last week that took all systems down (WCG BOINC, WCG website, WCG forums) even before the planned maintenance period?
It's not safe that all of those systems tend to go down at the exact same time since they should (ideally) be pretty isolated from one another.
I tried checking Twitter for updates, but apparently Twitter requires an account to even see public posts now, which is inconvenient. I block Facebook in NoScript for privacy reasons but used another web browser to check WCG on Facebook, and there weren't any updates. Just a lot of people really frustrated at the lack of communication for 4-5 days and then one troll/disturbed person fighting a lot of people. It was a big turn-off.
I've worked in IT for quite a few years, and any time a Sev 1 ("Severity" aka Priority 1) major incident happens, there are always Service Level Agreements (SLAs) to be met, and people are paged out 24/7 to fix the incident and provide updates and documentation to both management and any stakeholders (executives and customers affected). Even if I was busy and stressed figuring out an issue, I still had to at least update everyone on what on earth was happening, what we were presently doing to fix it, and a rough ETA if we had one. If I ghosted end users for hours, let alone 4-5 days, I'd be fired.
Edited to Add: I understand that because WCG doesn't have the financial resources to support the staffing levels it needs (it's a small team), it's comparing apples to oranges. I think we understand that and empathize with the workload. And to be fair, while we volunteers do contribute lots of expensive, free computing resources, it's not the same as a revenue-generating corporation or government where downtime immediately affects the bottom line of the organization. I think the biggest frustration is just the lack of communication and updates, which doesn't take lots of $$$ at all.
[Edit 7 times, last edit by hchc at Jul 28, 2023 7:04:35 AM] |
||
|
roundup
Veteran Cruncher Switzerland Joined: Jul 25, 2006 Post Count: 835 Status: Offline |
WCGAdmin wrote: "The system outage experienced due to scheduled maintenance is now complete. We are aware of problems with OPNG work units and are investigating this issue. https://www.worldcommunitygrid.org/about_us/article.s?articleId=799"
No word on the reasons for the outages before the scheduled maintenance? No word on the upload still not working properly?
We have many WCG crunchers on our team who have been contributing regularly since the mid-2010s. From these members we hear a similar frustration to that described by hchc here. It is neither expensive nor time-consuming to send out regular notifications on social media about system bugs, with a brief explanation of the problem and an estimate of how long it will take to fix.
You have to understand that the volunteers want to be appreciated, because they invest time and money (hardware, electricity) in WCG. Compared to other worthwhile projects, WCG / Krembil is clearly behind in the quality of communication. In the long run, the consequence will be that even more computing power will be taken away from WCG.
Please take this as friendly feedback and a suggestion for improvement. Thank you very much. |
||
|
thunder7
Senior Cruncher Netherlands Joined: Mar 6, 2013 Post Count: 232 Status: Offline |
TigerLily wrote: "Hi Bryn, The tech team has paused distribution of work units for SCC1, OPN1, and OPNG until they identify what the issue is. MCM work units should still be available but I have forwarded your post to the team to investigate what the problem might be."
So, any news on those disappearing MCM workunits? |
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1951 Status: Offline |
TigerLily wrote: "Hi Bryn, The tech team has paused distribution of work units for SCC1, OPN1, and OPNG until they identify what the issue is. MCM work units should still be available but I have forwarded your post to the team to investigate what the problem might be."
thunder7 wrote: "So, any news on those disappearing MCM workunits?"
They might have to forward this to the tech team first to investigate (though they got the first messages about this more than a week ago, before the latest crash&burn, but hey, better to censor supposedly unruly users than actually react to any notification that there are once again problems rising up).
Ralf |
||
|
TigerLily
Senior Cruncher Joined: May 26, 2023 Post Count: 280 Status: Offline |
thunder7 wrote: "So, any news on those disappearing MCM workunits?"
Hi thunder7,
An MCM1 work unit update was just posted. |