Thread Status: Active
Total posts in this thread: 27
This topic has been viewed 176067 times and has 26 replies
phillipspencer
Advanced Cruncher
France
Joined: Apr 9, 2015
Post Count: 71
Re: Update: July 25 system outage and defective OPNG work units

The server status page is part of the BOINC server software; every other project has it. Like this, for example: https://milkyway.cs.rpi.edu/milkyway/server_status.php. It costs nothing and lets 3rd party sites display the information even if WCG servers are down, like for example this. They just need to use it.

I wonder if it would be as simple as "switching it on", given WCG is not a vanilla BOINC project but set up on a bespoke IBM-designed system.
Given the problems since migration, it might have been simpler and better for the Krembil team to start a vanilla BOINC project for their projects (but, hey, hindsight's wonderful!)
[Aug 2, 2023 6:07:52 AM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 798
Re: Update: July 25 system outage and defective OPNG work units

roundup said:
No word on the reasons for the outages before the scheduled maintenance?
No word on the upload still not working properly?

Exactly. TigerLily, do you have any information? We're about to go into a weekend without acknowledgement about these two things. The silence is really confusing, and quite a lot of users are curious about both of them.

Thank you for your time.

Agreed. Even though the "planned outage" was flagged well in advance, whatever happened was not "as planned" and there is still no clarity on what actually happened.


What bothers me the most is the silence. Not even a "Yes, we hear you, we're looking into the Root Cause Analysis of the July 21-25 unplanned outage of all systems prior to the July 26 scheduled outage. We'll let you know once we have more information." Instead it's just silence, as if they're trying to message that it never happened.
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

[Aug 2, 2023 5:26:10 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 798
Re: Update: July 25 system outage and defective OPNG work units

The server status page is part of the BOINC server software; every other project has it. Like this, for example: https://milkyway.cs.rpi.edu/milkyway/server_status.php. It costs nothing and lets 3rd party sites display the information even if WCG servers are down, like for example this. They just need to use it.

I wonder if it would be as simple as "switching it on", given WCG is not a vanilla BOINC project but set up on a bespoke IBM-designed system.
Given the problems since migration, it might have been simpler and better for the Krembil team to start a vanilla BOINC project for their projects (but, hey, hindsight's wonderful!)

I doubt it's that simple, to be honest. For whatever reason, IBM customized the BOINC implementation quite a bit, to the point that any of the out-of-the-box features might be broken.

1 BOINC point = 7 WCG points, for instance. I don't really see the reason that was changed to begin with.

For over 10 years, I've been wanting the feature to delete devices (with 0 results returned) as well as the ability to merge two devices. I've reinstalled Windows before and the system didn't recognize it as the original device ID and assigned a new one. I'd love to see the stats from both of those combined accurately.
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

[Aug 2, 2023 5:29:03 PM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Re: Update: July 25 system outage and defective OPNG work units

@hchc: As I understand it, the reason for the 1 BOINC Credit = 7 WCG points equation goes back to the early days of WCG. They started out using a work distribution system from United Devices (UD), which had its own credit allocation algorithm. When WCG switched to BOINC, granting 7 points for each new credit most accurately matched the value of the UD points.
In hindsight, I think it would have been better to have divided the UD points by 7, but then many of the crunchers would have whinged about losing 85.7% of their precious points, even though WCG points were not backed by the gold standard way back then either.
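
For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion. It assumes only the 7:1 ratio mentioned above; the function names are made up for illustration and are not part of BOINC or the WCG software.

```python
# Ratio discussed in this thread: 1 BOINC credit = 7 WCG points.
WCG_POINTS_PER_BOINC_CREDIT = 7


def boinc_to_wcg(credits: float) -> float:
    """Convert BOINC credits to WCG points (hypothetical helper)."""
    return credits * WCG_POINTS_PER_BOINC_CREDIT


def wcg_to_boinc(points: float) -> float:
    """Convert WCG points back to BOINC credits (hypothetical helper)."""
    return points / WCG_POINTS_PER_BOINC_CREDIT


def percent_lost_if_divided(ratio: int = WCG_POINTS_PER_BOINC_CREDIT) -> float:
    """Share of a points total that disappears if it is divided by `ratio`."""
    return round(100 * (1 - 1 / ratio), 1)


print(boinc_to_wcg(100))          # 700
print(wcg_to_boinc(700))          # 100.0
print(percent_lost_if_divided())  # 85.7
```

Dividing by 7 keeps 1/7 (about 14.3%) of the total, which is where the 85.7% figure comes from.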
[Aug 3, 2023 3:52:38 PM]
bluestang
Senior Cruncher
USA
Joined: Oct 1, 2010
Post Count: 272
Re: Update: July 25 system outage and defective OPNG work units

@TigerLily what's the latest on OPNG WU issues being fixed and distribution being re-enabled so work can start again?

(and take me off this stupid list where every post of mine must be moderated before actually posting!)
[Aug 15, 2023 5:35:43 PM]
TigerLily
Senior Cruncher
Joined: May 26, 2023
Post Count: 280
Re: Update: July 25 system outage and defective OPNG work units

@TigerLily what's the latest on OPNG WU issues being fixed and distribution being re-enabled so work can start again?
Hi bluestang,

Our team is still working on fixing the error with OPNG work units. We are working with the OPN team to resolve this issue. I will release an update once I have more information about when the issue will be fixed.
[Aug 17, 2023 5:17:24 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 798
Re: Update: July 25 system outage and defective OPNG work units

@Rickjb:

@hchc: As I understand it, the reason for the 1 BOINC Credit = 7 WCG points equation goes back to the early days of WCG. They started out using a work distribution system from United Devices (UD), which had its own credit allocation algorithm. When WCG switched to BOINC, granting 7 points for each new credit most accurately matched the value of the UD points.
In hindsight, I think it would have been better to have divided the UD points by 7, but then many of the crunchers would have whinged about losing 85.7% of their precious points, even though WCG points were not backed by the gold standard way back then either.


Wow that brings me back. I started with distributed.net RC5-64 distributed computing in 1997 and when United Devices/grid.org came along (I think carved from the same team from Austin, Texas), I jumped on.

The grid.org forums were super active, and there was a project scientist in New York working on the Human Proteome Folding (Phase 1) project who was very detailed with the science. Long posts every day. Was really fun. Also FightAIDS@home Phase 1 started on UD.

When IBM WCG was spun off from UD as a non-profit instead of a for-profit, I was too addicted to the stats on UD to want to jump ship. In hindsight, I kind of wish I had switched sooner and earned more badges.

Interesting on the history though... It shouldn't be too difficult to switch back to 1 WCG point = 1 BOINC point and get closer to a vanilla BOINC installation on the back-end. Might be easier to maintain going forward.
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

[Aug 19, 2023 4:47:04 PM]