| World Community Grid Forums
Thread Status: Active Total posts in this thread: 4
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
my friend winkie (mark) has written a monitoring program for rosetta. i know what you're thinking: what does that have to do with wcg? well, as it happens, the rv (rosettaview) program also monitors wcg. it will alert you if a job has stalled and then, if you want, automatically abort it. along with rv, he has now added a vista gadget.
this is a screenshot of it on one of my machines. if you are interested in using the rosettaview program and the gadget, you can read more about it, and download them free from here |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hi vavega,
The alert part is of particular interest. The HPF2 and other jobs stall ** at times. Suspending WCG in the Projects tab and resuming it (preferably with Leave in Memory off), or exiting BOINC and restarting it, will in a very high percentage of cases pick up from the last checkpoint save and finish the job without further delay. So if rv were able to do exactly that, for some it would be a boon.

thanks for sharing

ciao

** Often these jobs continue to clock CPU time in the Tasks view; only the progress percentage freezes indefinitely, while the projected completion time keeps increasing.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
[Edit 1 times, last edit by Sekerob at May 17, 2008 8:26:22 AM] |
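The stall symptom described above (CPU time still climbing while the progress percentage sits frozen) can be sketched as a simple check over periodic observations. This is only an illustration, not how rv actually works: the `Sample` structure, the `is_stalled` function, and the 30-minute default window are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of a task (hypothetical structure, not a BOINC type)."""
    wall_time: float      # seconds on the wall clock when the sample was taken
    cpu_time: float       # CPU seconds the task has clocked so far
    fraction_done: float  # progress, 0.0 to 1.0


def is_stalled(samples, min_window=1800.0, eps=1e-6):
    """Return True if progress has been frozen for at least min_window
    seconds of wall time while CPU time kept increasing.

    min_window defaults to 30 minutes, an assumed value for illustration.
    """
    if len(samples) < 2:
        return False
    latest = samples[-1]
    # Walk backwards to find the oldest consecutive sample with the same progress.
    oldest_same = latest
    for s in reversed(samples[:-1]):
        if abs(s.fraction_done - latest.fraction_done) > eps:
            break
        oldest_same = s
    frozen_for = latest.wall_time - oldest_same.wall_time
    cpu_burned = latest.cpu_time - oldest_same.cpu_time
    return frozen_for >= min_window and cpu_burned > 0


# Progress stuck at 42% from t=900 s to t=2700 s while CPU time keeps rising.
history = [
    Sample(0, 100, 0.40),
    Sample(900, 950, 0.42),
    Sample(1800, 1800, 0.42),
    Sample(2700, 2650, 0.42),
]
print(is_stalled(history))
```

Requiring that CPU time still increases separates a stuck task from one that is merely suspended or waiting to run, where both progress and CPU time stand still.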
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
vavega,
If you read this, can you pass this on to your friend winkie: is rosettaview known to grab the client_state.xml? Remotely monitored systems report:

27-Jan-2009 05:18:16 [World Community Grid] [checkpoint_debug] result me416_00011_2 checkpointed
27-Jan-2009 05:18:31 [---] Can't rename current state file to previous state file; Cannot access the file. The file is being used by another process. (0x20)
27-Jan-2009 05:18:43 [---] Can't rename state file; Cannot create a file when that file already exists. (0xb7)
27-Jan-2009 05:18:43 [---] [error] Couldn't write state file: rename() failed; giving up
StartServiceCtrlDispatcher being called.
This may take several seconds. Please wait.
27-Jan-2009 08:29:32 [---] Starting BOINC client version 6.2.28 for windows_intelx86

At 5:18 the service was shut down. This repeated twice later at 30-minute intervals, which is the interval I set rv to check job progress. The localhost was not impacted, suggesting that slow network access causes the conflict.

thanks
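One way to test the suspicion that the state-file failures line up with the rv polling interval is to pull the failure timestamps out of the BOINC log and compare the gaps between them. The sketch below assumes the `DD-Mon-YYYY HH:MM:SS [project] message` line layout seen in the excerpt above; the function names and the two-minute tolerance are made up for the example, and the second and third sample lines are invented, since only the first failure appears in the posted log.

```python
import re
from datetime import datetime, timedelta

# Layout assumed from the log excerpt: timestamp, [project], message.
LINE_RE = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(.*?)\] (.*)$"
)


def parse_line(line):
    """Return (timestamp, project, message), or None if the line has no timestamp."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    # %b expects English month abbreviations in the default C locale.
    stamp = datetime.strptime(m.group(1), "%d-%b-%Y %H:%M:%S")
    return stamp, m.group(2), m.group(3)


def state_write_failures(lines):
    """Collect timestamps of lines reporting a failed state-file write."""
    failures = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and "Couldn't write state file" in parsed[2]:
            failures.append(parsed[0])
    return failures


def matches_poll_interval(failures, poll=timedelta(minutes=30),
                          tolerance=timedelta(minutes=2)):
    """True if every gap between consecutive failures is close to the poll interval."""
    gaps = [b - a for a, b in zip(failures, failures[1:])]
    return bool(gaps) and all(abs(g - poll) <= tolerance for g in gaps)


sample_log = [
    "27-Jan-2009 05:18:43 [---] [error] Couldn't write state file: rename() failed; giving up",
    "StartServiceCtrlDispatcher being called.",
    # The next two lines are HYPOTHETICAL, added only to exercise the check.
    "27-Jan-2009 05:48:50 [---] [error] Couldn't write state file: rename() failed; giving up",
    "27-Jan-2009 06:18:41 [---] [error] Couldn't write state file: rename() failed; giving up",
]
print(matches_poll_interval(state_write_failures(sample_log)))
```

A match only shows that the timing is consistent with the monitor's checks; it does not prove which process was holding the file.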
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Bump for those who run Windows and would like to be auto-alerted on stuck jobs on any client in a LAN; it can monitor Linux/Mac clients remotely too. It can even auto-abort long-stuck tasks.