World Community Grid Forums
Category: Community Forum: Chat Room Thread: WHY DOES IT TAKE ME 1/2 HOUR+ TO FIND OUT (OR NOT) IF WCG HAS A PROBLEM? |
Thread Status: Active Total posts in this thread: 10
pgalioni
Cruncher Milky Way Joined: Nov 16, 2004 Post Count: 14 Status: Offline Project Badges: |
Ok, I AM NOT reading through 50 years of posts and 90K threads --
WHY CAN'T THERE BE A WCG PAGE FOR KNOWN PROBLEMS THAT STAYS PINNED AT THE TOP SO IT'S EASY TO FIND!!!!????
I have 3 computers running 24/7 for WCG -- ALL seem to show a slowdown in points per day - or units per day, or however folks measure 'output'. I have one SUPER FAST computer that for several days has had NO units available for download -- because it's so fast I keep 5+5 days of units. Of the other two, one VERY fast still shows work being done but running LOW on work-packets, and it's running with NO WORK LOAD AT ALL! It has 3+5 days of work units saved. The slowest of the bunch shows no problem - though the cache (3+3) is getting lower.
I HAVE NO IDEA WHERE TO EVEN START LOOKING TO SEE IF THIS IS SYSTEM WIDE - OR LOCAL (my ISP or something with MY computers) -- IF ANYONE CAN SEND ME THE NAME OF THE THREAD THAT ***ALWAYS*** HAS WCG ISSUED NOTICES OF SYSTEM PERFORMANCE -- PLEASE LET ME KNOW!!! Please!!! I'll check by on THIS thread -- but to tell ya'll the Truth, I'm STARTING to find it bothersome and burdensome to spend half an hour or more looking for a notice about some kind of system problem -- that's not to say it's not there, just that even in retirement I'm too overwhelmed with the amount of time I'd have to invest in finding out that there's a glitch somewhere.
I VERY RARELY ever complain - and when the Great Migration took place, there was talk of pinning OFFICIAL NOTICES OF KNOWN PROBLEMS at or near the top of the Forum -- and I would also expect to see an Admin post "This problem has now been resolved" and maybe lock the replies - ALL THREADS POSTED THERE WOULD GO THROUGH ADMIN - just to keep order in the threads and answers. Am I crazy in thinking that there could be a pinned 'official notice of current/continuing problems' -- so to find out all I have to do is order by 'newest first' -- and see just WTF is going on - or isn't going on (as the case might be)?
If there's a reason this can't be done, I'd like to know so others don't keep wondering the same thing - I'd sure like to know if it's my ISP, my LAN, or my computers themselves that are the problem. Thanks for letting me know if this is a good, but stupid idea (because it can't be done) or if it's a stupid idea because the most useful place for 'our system is having problem X which causes Y and we are working to fix it' (OR ditto with 'we are awaiting funding in order to fully investigate, or hire someone who can trouble-shoot/work around, the problem') is somewhere else.
Thanks everyone - just hate to see machines capable of MUCH more work not doing as much as they can to help our brothers and sisters live in better circumstances than they do currently.
----------------------------------------
-- PAUL --
THE MORE YOU LEARN, THE MORE YOU WANT TO SCREAM! |
||
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 746 Status: Offline Project Badges: |
First, you are not alone in being frustrated at the lack of communication from WCG, especially when it comes to both system outages and work unit shortages. Many of the regular, most vocal contributors including myself have vented on here the past couple years. You're not alone!
If there are updates, there are usually threads posted (during or a day or two after the fact when systems are up) in the News subforum. Sometimes the outages only affect the WCG Forums (here) but not the BOINC work systems, so it's impossible to get help here. There used to be updates posted on Facebook and/or Twitter, but that's not consistent. I agree the silence can be frustrating.
To throw them a bone, they're severely understaffed and underfunded. Dr. Jurisica is contributing finances from his own lab grants to WCG, when really this burden should fall on Krembil/University Health Network (the largest academic research organization in Canada?), especially since they get all this free advertising and goodwill. Dr. Jurisica has prioritized any funding to the actual systems, but there's not enough to hire more dedicated staff. We have a part-time communications professional (TigerLily at the moment), and she does a good job interfacing between us and the technical system and database admins in the datacenter.
We DO usually get explanations and some technical details about why systems went down, but it can sometimes take a few days or a week or so after the fact before we get any updates. During the outage itself there are no forums, no updates on Facebook/Twitter, and no BOINC notices in the apps (which would be a perfect place to put status updates assuming BOINC is up). It's definitely a pain point. There's also a decades-long feature request for a more out-of-the-box BOINC "Status Page" like other BOINC projects, but IBM heavily customized WCG when they were in charge, so it would take many hundreds of hours of work to undo all that customization.
To recap, you're not alone in being frustrated as hell at the work unit shortages or the lack of communication.
There's good news at least -- MCM work units have been backlogged on the Results page since... November 2023. But the past week the processes have been stable and we're seeing some major progress. November and December 2023 are deleted, and we just have January, February, and March 2024. Hopefully in the next week!
MCM reverted from sarcoma work units (started about 2 years ago) back to ovarian cancer (the target before sarcoma), so that's what we're all crunching now. I don't know when ovarian will be done again and we can finish up sarcoma. And there are a few other cancer targets Dr. Jurisica talked about. I know pancreatic cancer was one of them and I hope it's chosen next. I think maybe prostate and other cancers?
Africa Rainfall Project -- there's talk about it resuming soon, but we'll see. If/when it does, it may drag down the whole system again since the results are massive. Smash Childhood Cancer has been on hold but I hope there's more soon. Honestly I almost think Help Stop TB should be retired since the researchers moved to a new university. Either close the project or give us an update on whether we'll resume in 2024.
Again, I'm just as annoyed or frustrated as you. You are not alone. Lots of people have just left or never came back. I just deal with the frustrations. Maybe if I win the $1 billion lottery I can start my own research organization and then see if I can take on WCG haha.
But for now, all we can do is run backup BOINC projects or other non-BOINC projects. I really like Folding@home -- there's plenty of CPU and GPU work from different researchers around the USA/world. Lots of cancer, influenza, COVID, etc. It's just a totally different system, app, and stats, with a learning curve to use Folding@home's "Advanced" page. The other option is just dealing with the lack of work and saving on heat and electricity as spring and summer come.
Hope any of my rambling helps.
|
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7545 Status: Offline Project Badges: |
hchc: Good response. I also echo those frustrations, but at least it "seems" to be getting better.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12120 Status: Offline Project Badges: |
It seems to me that we find problems before the techs. Not surprising as they do not operate 24/7.
Mike |
||
|
MJH333
Senior Cruncher England Joined: Apr 3, 2021 Post Count: 224 Status: Offline Project Badges: |
"I HAVE NO IDEA WHERE TO EVEN START LOOKING TO SEE IF THIS IS SYSTEM WIDE - OR LOCAL (my ISP or something with MY computers)"
I would suggest you go to the "Recent Threads" page on the Forums and look for the latest thread entitled "Project Status (First Post Updated)", which is currently this one. This thread was created by one of the crunchers here, Unixchick, with assistance from Headcrash and others. The figures in the table in the first post are based on figures from WUProp, so show only a fraction of the actual WUs. But this is the nearest approximation we have to a real-time server status page, given that WCG doesn't have one.
When MCM is running well, the "Number of workunits the last 24 hours" figure for MCM tends to be about 80,000. As I write this, it is around 11,000, so the system is clearly not operating at anywhere near full capacity. And when there are problems, you will usually find them being discussed in that thread.
Note that Unixchick has to create a new "Project Status (First Post Updated)" thread from time to time. This is because Unixchick edits the first post regularly, but there is a limit on the number of times a post can be edited. So, once that limit has been reached, Unixchick creates a new "Project Status (First Post Updated)" thread. That is why I suggest navigating to that thread via the Recent Threads page rather than, e.g., bookmarking the current version of that thread.
Cheers, Mark
[Edit 1 times, last edit by MJH333 at Mar 15, 2024 10:24:06 AM] |
||
|
pgalioni
Cruncher Milky Way Joined: Nov 16, 2004 Post Count: 14 Status: Offline Project Badges: |
Thanks for the time you took to let me know. I recall when The Great Migration took place -- I mostly kept quiet because I knew it would NOT be fun for anyone involved --and picked up on the funding problems then -- and like all projects, it took longer and was underfunded more than most thought it would be.
Didn't know a post could only be edited x number of times before it crashed and burned -- that explains some of the problem with communicating problems. When I had a real job, I rarely - if ever - checked WCG, or read my log. Now that I'm having issues with my ISP - Comcast Xfinity in a RURAL area - I don't know if it's me or them, since recently it's been far more them than anything else - at least I don't yearn for big cities and paved roads!!! Only for speeds somewhere near what they say they deliver.
Thanks again -- not sure why anyone would leave completely - LOL - WHAT A LIE!!! - I was looking at the amount of computing time - and how it's timed, so it would have very little effect if WCG went down; other projects could fill that void, but the frustration I now see can drive people crazy - MY frustration was FINDING IF there was a problem - now I know around where to look - so it's not a half hour, and if I can't find it in 5 min, then screw it - will presume it's them. Again, thanks for your time. L,p
----------------------------------------
-- PAUL --
THE MORE YOU LEARN, THE MORE YOU WANT TO SCREAM! |
||
|
pgalioni
Cruncher Milky Way Joined: Nov 16, 2004 Post Count: 14 Status: Offline Project Badges: |
Yeah, I think he did a good job too - and expresses many of the feelings I've seen here, esp since The Great Migration -- thanks for all the time YOU have put into helping hold this group together - over the years I always see you helping others, or putting others into a better frame of mind. Thanks SarJoe!
----------------------------------------
-- PAUL --
THE MORE YOU LEARN, THE MORE YOU WANT TO SCREAM! |
||
|
pgalioni
Cruncher Milky Way Joined: Nov 16, 2004 Post Count: 14 Status: Offline Project Badges: |
Thanks -- now I know where to start looking - I used to do that, then it seemed that several years ago it all up and fragmented and I was lost again - looking here, poking there, peeking in around another corner of the digest -- Silly me! I thought I'd find the listing under 'official' notices and news. Don't follow on Facebook - though I may start - I loved it when I never had to check on anything - but when I retired, I was playing with super-fast chips and ram and cards. To see if tweaks were working I'd end up in the forums looking high and low and that was about the time of the migration - the WORST time possible to see if I could tweak the performance of my PC. Now I just check my graph and read my log and have a peek at what units are running -- and when I see a blank page with no obvious explanation in the forum -- I do the reload routine and let it 'repair'. Before I realized just how much trouble the Migration caused, I was even reloading the entire program from scratch, and of course that did nothing at all except kill some time that I didn't really want dead. Odd mindset Americans have - Kill it and start again, even if it didn't need killing in the first place. At least I could laugh and learn from my Self - thanks for pointing me in the right direction! Guess I'll go back to Boot and university where I can learn, again, exactly how to hurry up and wait. Thanks for your time!
----------------------------------------
-- PAUL --
THE MORE YOU LEARN, THE MORE YOU WANT TO SCREAM! |
||
|
Barnsley_Tatts
Senior Cruncher Joined: Nov 3, 2005 Post Count: 280 Status: Offline Project Badges: |
It's frustrating. We're all invested in WCG for various reasons with a common goal.
WCG was more stable when IBM had it, but they had many more resources than Krembil. They're trying their best, I'm sure. I also crunch for DENIS@Home. It's quite a small project, and the supply of WUs is often exhausted, especially at weekends. But they do have a status page - something WCG desperately needs. |
||
|
alged
Master Cruncher FRANCE Joined: Jun 12, 2009 Post Count: 2340 Status: Offline Project Badges: |
Thanks very much for the explanations above. As I am not a geek, it is nice to have people who can state the problems well, and sometimes the possible answers.
Staying with WCG at the moment and crunching MCM while waiting for SCC to return. Regards |