Thread Status: Active
Total posts in this thread: 39
Cyclops
Senior Cruncher
Joined: Jun 13, 2022
Post Count: 295
Status: Offline
2022-12-12 update (ARP & HTTP errors)

Hi everyone, we are currently working to fix the reported BOINC HTTP errors and investigating what may have caused them, including a possible correlation with ARP workunits.

Last week, we identified an issue with the ARP pipeline where new workunits were not being distributed due to a backlog of volunteers’ completed results accumulating on WCG servers. Normally, completed results are downloaded by the specific research teams, archived to tape in our datacenter, marked as done, and removed from the production system to make room in the pipeline for new work. However, due to a download issue on the ARP side, the backlog of completed WUs started to grow, triggering an automated project pause. We have since been in contact with the ARP team to address the backlog. They confirmed it is only a temporary issue, and the team plans to resume downloading completed results tomorrow. Once that happens, we will be able to revert the changes we put into effect late last week, which increased the upper limit on the backlog we allow before distribution of new ARP workunits is halted. While we did not expect these changes to have any adverse consequences, we are investigating the possibility and examining the specific workunits that have since erred.
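
For readers who want a concrete picture of the backlog limit described above, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class name `BacklogGate`, its methods, and the numbers are hypothetical and are not taken from the WCG or BOINC server code. It shows a counter of completed results waiting for pickup, an upper limit that pauses distribution of new work, and the effect of temporarily raising that limit.

```python
from dataclasses import dataclass


@dataclass
class BacklogGate:
    """Hypothetical gate: pause new work when too many results await pickup."""

    upper_limit: int  # assumed threshold; the real value is not stated in the post
    backlog: int = 0  # completed results not yet downloaded and archived

    def result_returned(self, count: int = 1) -> None:
        # A volunteer returned completed results; they wait on the server.
        self.backlog += count

    def results_archived(self, count: int) -> None:
        # The research team downloaded results; they leave the production system.
        self.backlog = max(0, self.backlog - count)

    def distribution_paused(self) -> bool:
        # New workunits are held back once the backlog reaches the limit.
        return self.backlog >= self.upper_limit


if __name__ == "__main__":
    gate = BacklogGate(upper_limit=100)  # illustrative number only
    gate.result_returned(100)            # downloads stalled, backlog grows
    print("paused:", gate.distribution_paused())

    gate.upper_limit = 200               # temporary increase of the limit
    print("paused after raising limit:", gate.distribution_paused())

    gate.results_archived(100)           # downloads resume, backlog drains
    gate.upper_limit = 100               # revert the temporary change
    print("paused after revert:", gate.distribution_paused())
```

In this sketch, raising the limit only buys room for more results to accumulate; the backlog itself shrinks only when results are downloaded and archived, which is why the change can be reverted once downloads resume.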

We will share a technical breakdown of the situation once we can distinguish coincidence from cause.

If you have any comments or questions, please leave them in this thread for us to answer. Thank you for your support, patience and understanding.

WCG team at Krembil Research Institute
[Dec 12, 2022 9:53:54 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

Well, thanks for the update, better late than never.

And just to add to the issues, it seems external stats haven't been running since yesterday...

Ralf
[Dec 12, 2022 10:16:12 PM]
Cyclops
Senior Cruncher
Joined: Jun 13, 2022
Post Count: 295
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

Noted, we're not really sure if these issues are connected yet. Hopefully we'll have an answer to that soon.
[Dec 12, 2022 10:29:11 PM]
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 297
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

@Cyclops

How will I know if you folks have successfully received information I sent via the CONTACT hyperlink -or- the SUPPORT email?
[Dec 12, 2022 10:38:34 PM]
Cyclops
Senior Cruncher
Joined: Jun 13, 2022
Post Count: 295
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

Hi bfmorse, have you not been receiving any responses to your emails? I've been replying to several of them and have sent some to the tech team. If you haven't seen any responses, then we'll have to check if we have an email issue too.
[Dec 12, 2022 10:46:43 PM]
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 297
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

So far, I have received nothing in my inbox on either account. Will double check junk folders in about 5 hours.
[Dec 12, 2022 11:06:42 PM]
Cyclops
Senior Cruncher
Joined: Jun 13, 2022
Post Count: 295
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

Weird, I'll let the rest of the team know.
[Dec 12, 2022 11:57:16 PM]
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 297
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

Thanks! In the meantime, I’ll continue to crunch.
[Dec 13, 2022 12:03:27 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7662
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

Thanks for the update.
Just as an aside, I was wondering if Cyclops is doing any crunching. If he or she is, then by periodically monitoring their own cruncher(s) they may be able to identify some of the discussed issues more readily. Just curious.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Dec 13, 2022 1:35:43 AM]
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 297
Status: Offline
Re: 2022-12-12 update (ARP & HTTP errors)

The ONLY email from WCG on my system was the one to the Alpha testers sent on 10/1/2022.
[Dec 13, 2022 5:37:03 AM]