World Community Grid Forums
Category: Community Forum: Chat Room Thread: Project Status (First Post Updated) |
Thread Status: Active Total posts in this thread: 131
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7580 Status: Recently Active Project Badges: |
Still dry here at 21:27 UTC
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
I got an MCM resend. It looks like it is ramping up: more is being sent out over time, but it will take a while to fill everyone's caches.
|
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
MCM and OPNG are flowing, but not enough. I get a WU here and there, but rarely. Please bring back ARP soon.
|
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 873 Status: Offline Project Badges: |
Scheduler requests started to report something odd this afternoon, and I happened to catch the transition between "working, but no work available" and "something is wrong" states on one of my systems...
----------------------------------------
Sat 02 Dec 2023 15:17:20 GMT | World Community Grid | Requesting new tasks for CPU and NVIDIA GPU

Ever since then, requests have seen the same response (apart from the occasional HTTP error). Uploads still seem to work, but with scheduler access broken they can't be reported, and no new work can be requested :-(

Other users have already noted this in a thread in the Mapping Cancer Markers forum.

[Edit] A bit of research on this shows that it's a lock-file issue: scheduler requests use a per-host lock file to ensure that there aren't two concurrent requests from one host. The file is created at the start of the request, holds the PID of the scheduler instance, and is deleted at the end of the request. There are two possible error conditions: either the lock file can't be acquired in the first place, or there is an existing lock. Unfortunately, although the message written to the server log distinguishes the two cases, the message sent to the client does not. In this case, I suspect the issue is an inability to create the lock file in the first place :-(

Cheers - Al.
[Edit 1 times, last edit by alanb1951 at Dec 2, 2023 10:02:17 PM] |
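For anyone who wants to picture the lock-file behaviour Al describes, here is a minimal sketch in Python. It is not the real BOINC scheduler code: the function names, the lock directory and the `host_<id>.lock` file naming are all made up for illustration, and the use of an exclusive-create open is an assumption about how such a lock could work. The client-facing text is the one reported elsewhere in this thread.

```python
import os
from enum import Enum


class LockResult(Enum):
    ACQUIRED = "acquired"
    ALREADY_LOCKED = "already_locked"   # a lock file for this host exists
    CANNOT_CREATE = "cannot_create"     # the lock file could not be created


def acquire_host_lock(lock_dir, host_id):
    """Try to create a per-host lock file holding our PID (hypothetical layout)."""
    path = os.path.join(lock_dir, f"host_{host_id}.lock")
    try:
        # O_EXCL makes creation fail if the file is already there.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return LockResult.ALREADY_LOCKED, path
    except OSError:
        # Missing directory, no permission, disk problems, and so on.
        return LockResult.CANNOT_CREATE, path
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return LockResult.ACQUIRED, path


def release_host_lock(path):
    """Delete the lock file at the end of the request."""
    os.remove(path)


def server_log_message(result):
    """The server log can tell the two failures apart."""
    return f"lock result: {result.value}"


def client_message(result):
    """The client sees one text for both failures (assumed, per the post above)."""
    if result is LockResult.ACQUIRED:
        return "ok"
    return "Another scheduler instance is running for this host"
```

In this sketch a second request for the same host gets `ALREADY_LOCKED` until the first one calls `release_host_lock`, while a broken or missing lock directory gives `CANNOT_CREATE`. Both failures map to the same client message, which is why the client log alone cannot tell them apart.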
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
TigerLily ! HELP !
No WUs are flowing, and no completed WUs can be returned. Al's above post would be useful info for the techs. |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12146 Status: Offline Project Badges: |
2 messages received recently.

03/12/2023 02:34:02 | World Community Grid | Another scheduler instance is running for this host
03/12/2023 01:54:34 | World Community Grid | Not requesting tasks: don't need (CPU: not highest priority project; Intel GPU: )
03/12/2023 01:54:39 | World Community Grid | Scheduler request to https://scheduler.worldcommunitygrid.org/boinc/wcg_cgi/fcgi failed: HTTP service unavailable

Mike |
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7580 Status: Recently Active Project Badges: |
I wonder if this particular hiccup is related to the validated-but-not-purged problem. Since my oldest work unit is from Oct. 24, I looked, and there have been 49,999,047 units returned. Makes me wonder if some file ran out of space or somehow caused the system to lock up. It might just be coincidence and have nothing to do with the current problem. However, inquiring minds want to know, even if we are blessed with the current and ongoing lack of communication.

Edit: Makes me wonder if the powers that be prohibit communication on the weekends.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers*
[Edit 1 times, last edit by Sgt.Joe at Dec 3, 2023 3:20:18 AM] |
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
I had a similar thought, Sgt. Joe. Wouldn't we have gotten a disk-full message? I just wondered if it was a different error msg.
I don't think they prohibit weekend communication; they just don't force communication on the weekends. The server room techs don't work weekends, as we have learned in past downtimes. They have stronger personal work/life boundaries than most American tech workers I know. It's refreshing to see. |
||
|
The_Mole
Cruncher Joined: Nov 10, 2007 Post Count: 17 Status: Offline Project Badges: |
Tasks keep trickling in, but it's by far not enough, and it seems like all the hosts that currently request work are bottlenecking the server:
----------------------------------------
Yesterday evening my PC had 82 tasks ready to report, but...

12/2/2023 8:50:52 PM | World Community Grid | Sending scheduler request: To report completed tasks.

Overnight 70 were uploaded, 12 are still left:

12/3/2023 10:53:57 AM | World Community Grid | Scheduler request to https://scheduler.worldcommunitygrid.org/boinc/wcg_cgi/fcgi failed: HTTP service unavailable

(Timestamps are GMT +1)

A couple of days ago, I attached Einstein@home with a resource share of 0. That way it only fills up idle threads without creating a waiting queue, and any WCG tasks will displace them once they come in. |
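To illustrate the resource-share-0 idea, here is a toy model in Python. It is not how the BOINC client is actually implemented: `Project`, `pick_work` and the task counts are hypothetical, and the real client's scheduling is considerably more involved. It only shows the priority order described above, where projects with a nonzero share get the idle threads first and a zero-share backup only fills whatever is left.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    resource_share: float   # 0 marks a backup project in this toy model
    tasks_available: int    # tasks the project can hand out right now


def pick_work(projects, idle_threads):
    """Return {project name: number of tasks to run} for the idle threads."""
    plan = {}
    remaining = idle_threads
    primary = [p for p in projects if p.resource_share > 0]
    backup = [p for p in projects if p.resource_share == 0]
    # Primary projects are served first; backups only see leftover threads.
    for group in (primary, backup):
        for p in group:
            if remaining == 0:
                break
            take = min(p.tasks_available, remaining)
            if take:
                plan[p.name] = take
                remaining -= take
    return plan
```

With WCG nearly dry (say 1 task available) and 8 idle threads, this model gives WCG 1 thread and the backup project 7. Once WCG has enough work for all 8 threads, the backup gets nothing.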
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
I'm also on Einstein at resource share 0, looking for pulsars until WCG gets fixed. I love that BOINC has a mechanism for a backup project or two.
You can look at https://wuprop.boinc-af.org/active_projects.py to find an active project for a backup. I also find it weird that this site found an ARP and OPN ghost WU |
||
|
|