Thread Status: Active
Total posts in this thread: 20
Posts: 20   Pages: 2   [ Previous Page | 1 2 ]
This topic has been viewed 6453 times and has 19 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Status

The final, and most important, reason why not is that World Community Grid is so very, very much more complicated than these other projects. WCG has 6 active projects, multiple physical servers, multiple schedulers and whole layers that aren't in the standard BOINC package.

I'll say it's complicated... It even sends you work you didn't ask for and don't want...
----------------------------------------
[Edit 1 times, last edit by Former Member at Dec 14, 2008 4:29:30 PM]
[Dec 14, 2008 4:28:36 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Server Status

Was it "sends" or is it "sent"? Does it still do so after implementation of the "New projects, auto opt-in"? If so, please document.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Dec 14, 2008 4:55:05 PM]
nasher
Veteran Cruncher
USA
Joined: Dec 2, 2005
Post Count: 1423
Status: Offline
Re: Server Status

Yes, it would be nice to be able to see that there is a problem with the validator or such, but it also doesn't help us much, since I'm not able to call anyone at WCG and say, "Hey, you guys have a problem with your...."


@ night owl

there are 2 options you should check on your pages

1) do you have this checked
--- If there is no work available for my computer for the projects I have selected above, please send me work from another project.
2) how about this
--- Please opt me in to new projects as they become available.

Both of these are on the My Projects page. If either of these is checked, then technically you asked for the work by default. If not, then I don't have an answer for ya.
[Dec 14, 2008 9:49:58 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Status

....but if you look at the recent threads on the validator problem, it took hours before it was realized that there was a problem.....a status page takes minutes!

Appears to be happening again, have a couple of Faah jobs, quorum of one pending for three hours.
A server status site is only useful to a point. It only shows that the darn things are turned on, not that they are actually working, which can mask a problem, as I have found on CPDN many a time :D
What is needed is louder bells and whistles ;)
[Dec 16, 2008 10:17:55 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Status

....but if you look at the recent threads on the validator problem, it took hours before it was realized that there was a problem.....a status page takes minutes!

Appears to be happening again, have a couple of Faah jobs, quorum of one pending for three hours.
A server status site is only useful to a point. It only shows that the darn things are turned on, not that they are actually working, which can mask a problem, as I have found on CPDN many a time :D
What is needed is louder bells and whistles ;)

...but the other status pages also show the queues, so you immediately see a build up when the queue just grows and grows.....
[Dec 16, 2008 10:33:38 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Status

...but the other status pages also show the queues, so you immediately see a build up when the queue just grows and grows.....

Yup, agreed, most definitely: the more info the better.
Something's working, as all my jobs just validated :)
Chris.
[Dec 16, 2008 10:45:37 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Server Status

We had a record day yesterday, so I imagine that until the site move has happened we'll be seeing this more often.

Whenever bottlenecks develop, certain non-critical daemons are paused to allow stabilization. Many processes run on a single server, so that makes sense, and size does matter. As mentioned, there's no comparing WCG with any other project; it is second only to SETI in TFLOPS throughput and number of clients served.
[Dec 16, 2008 11:10:23 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Server Status

The server status page is interesting to look at but I'm not sure it would provide as much use as folks think it would.

There are many times throughout the day that we suspend the validators/assimilators/transitioners/file_deleter/db_purge via scripts in order to reduce contention on the database while other processes are executing certain tasks. They are probably off 15% of the day or so.

As far as monitoring the status goes, we have scripts in place that send text messages to several staff members' cell phones, and emails to their work and personal addresses, alerting them when BOINC processes have stopped running. We also have the same thing for a lot of the standard system monitoring (filesystems, whether Apache is running, etc.).

This past weekend's issue was unusual because one of the backend servers completely froze up. Depending on how the server status page was implemented, the page might have continued to report that the processes were running (the default status page provided with BOINC would have shown this).
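
To illustrate the distinction: a page that only asks "is the process listed?" can keep saying yes on a frozen host, whereas a check on how recently the daemon last did something would notice. A minimal Python sketch, assuming each daemon records a timestamp of its last completed pass. The `DaemonReport` structure, the field names and the 300-second threshold are all hypothetical and are not anything WCG or BOINC is known to provide.

```python
from dataclasses import dataclass


@dataclass
class DaemonReport:
    """Hypothetical snapshot of one daemon, as a status page might receive it."""
    name: str
    process_listed: bool    # what a naive status page would look at
    last_heartbeat: float   # Unix time of the last completed pass (assumed to be recorded)


def naive_status(report: DaemonReport) -> str:
    # Only checks that the process exists; a frozen host still looks healthy.
    return "Running" if report.process_listed else "Not running"


def heartbeat_status(report: DaemonReport, now: float, max_age: float = 300.0) -> str:
    # Also requires recent activity, so a hung daemon is reported as stalled.
    if not report.process_listed:
        return "Not running"
    if now - report.last_heartbeat > max_age:
        return "Stalled"
    return "Running"
```

Note that a daemon deliberately suspended by a script would also show as "Not running" here, so a real page would probably need a separate "paused on purpose" state.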

As far as displaying the various queues goes, those queries put extra load on the BOINC database, which is severely taxed right now. In our new environment they might be tolerable.

All of this is not to say that we won't do this someday, but we are going to be focusing on things such as updating the forum software before we work on this. I am very much hoping that we can get the kinks worked out of the new environment quickly and settle into a nice period next year of focusing on improvements rather than the more mundane but absolutely essential work of upgrading hardware and underlying software.
[Dec 16, 2008 2:36:25 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Status

As far as monitoring the status goes, we have scripts in place that send text messages to several staff members' cell phones, and emails to their work and personal addresses, alerting them when BOINC processes have stopped running. We also have the same thing for a lot of the standard system monitoring (filesystems, whether Apache is running, etc.).
....that satisfies me, Kevin. That is a much better solution for identifying any problems and reacting quickly to them.

I remember a problem around the Xmas period some years ago, when both UD and WCG were affected. I cannot remember whether the problems were related or it was just a coincidence that both of you had a problem around the same time.

I do remember giving the 'staff' at UD a rough time as it appeared that they could not be bothered to fix the problem until well after the holidays, but, WCG sent someone in to fix theirs and were back in operation quickly......I also pointed this out to those over at UD at the time!


....and I concur that fixing the forums is much better than the server displays...

Thanks Kevin.
[Dec 16, 2008 3:07:44 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Status

I understand all of this, and I agree that it is a low priority compared to actually getting this thing to run smoothly. I, for one, don't care about the "Server Status" so much as the "Ready to Send" component.

I have the option to get any work available set for WCG and so never have a problem with workload. However, I grab the RTS information from other projects that I run as part of a calculation used to assign a priority designation to each. I then use the designation result to rotate runtimes for them under the BOINC client. Not having the RTS information for WCG skews my results. I have been able to make adjustments, but...
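
For anyone wondering what that kind of adjustment might look like, here is a rough Python sketch. It is not the actual calculation, which is not spelled out here; it simply assumes that more Ready to Send work means a higher rank, and it fills in a missing RTS figure with the median of the known ones. The function names and the median fallback are hypothetical choices.

```python
from statistics import median
from typing import Dict, List, Optional


def fill_missing_rts(rts: Dict[str, Optional[int]]) -> Dict[str, float]:
    """Replace missing RTS figures with the median of the reported ones (an assumption)."""
    known = [value for value in rts.values() if value is not None]
    if not known:
        raise ValueError("no project reports an RTS figure")
    fallback = float(median(known))
    return {
        name: float(value) if value is not None else fallback
        for name, value in rts.items()
    }


def rank_by_rts(rts: Dict[str, Optional[int]]) -> List[str]:
    """Order projects from most to least RTS work; ties are broken by name."""
    filled = fill_missing_rts(rts)
    return sorted(filled, key=lambda name: (-filled[name], name))
```

With a project that publishes no RTS figure, the fallback places it in the middle of the pack regardless of how much work it really has, which is the skew being described.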

There are multiple projects being run under WCG, but if only indicating RTS, a way to do it would be the way CPDN does it. While CPDN also shows the SS information, they do break up the different projects under the RTS showing how much work is available to each.

If it is possible to show only the RTS information, it would provide an indication of how close to completion any project is or if new work has been added. It might help those that like to opt-in to certain projects by indicating to them that it may be time to pick another.

If the information could be incorporated into the main stats page, it would eliminate the need for a new page, but may add to webpage generation times depending on whether the information is live or sampled during a stats update. Heck, it could be made accessible via XML or RSS and kept off the website entirely and then would also be subject to live queries or sampled updates.
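
A sketch of what the XML idea could look like, in Python using only the standard library. The element and attribute names (`ready_to_send`, `project`, `sampled_at`) are made up for illustration and are not an existing WCG or BOINC format. It assumes the counts are sampled during a stats update, so reading the file adds no live database queries.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_rts_xml(counts: Dict[str, int], sampled_at: str) -> str:
    """Serialize per-project Ready to Send counts into a small XML document."""
    root = ET.Element("ready_to_send", {"sampled_at": sampled_at})
    for name, count in sorted(counts.items()):
        project = ET.SubElement(root, "project", {"name": name})
        project.text = str(count)
    return ET.tostring(root, encoding="unicode")


def parse_rts_xml(document: str) -> Dict[str, int]:
    """Read the same hypothetical format back into a name -> count mapping."""
    root = ET.fromstring(document)
    return {p.get("name"): int(p.text) for p in root.findall("project")}
```

A client-side script could then fetch the file on its own schedule and feed the numbers into whatever priority calculation it uses.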

Well, thanks for your time. I hope these suggestions are considered when your priority list gets to the status page. In the meantime, thanks for keeping this place going as well as you have.
[Dec 21, 2008 8:46:55 PM]