| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 24
|
|
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I had a disk crash on one machine. The disk happened to have WCG data on it. So I took the easy solution for the time being: restoring a 2-day-old backup of the data to another disk and making a symlink so that the data can be accessed by exactly the same path as usual. It tries to re-report the old WUs which are already complete. So far, so good.
Then it does this "Generated new host CPID: blah blah", changes to the wrong "venue", tries to get WUs from the wrong project. Stupid. Annoying. What triggers the generation of a new host ID and how can I get it to not do this? |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Did you come here for support? Leave out all the laments of suck, stupid, hideous, annoying and just give the problem details and we'll get going with it!
There is at least 1 FAQ explaining how to avoid generating a new device. If it did create a new host entry, 1 or more of the 6 elements that WCG tests did not match. http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=16521
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"There is at least 1 FAQ explaining how to avoid generating a new device. If it did create a new host entry, 1 or more of the 6 elements that WCG tests did not match. http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=16521"
No, none of those items have changed. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"Leave out all the laments of suck, stupid, hideous, annoying and just give the problem details and we'll get going with it!"
I quite understand where you're coming from on that, and I'd feel the same way in your position. However, don't forget that WCG isn't there for my personal benefit either.
A hard drive fails. Time to reload from backup: about 3 mins. Everything works perfectly ... except WCG. Okay, maybe it doesn't like the symlinking? Time to re-partition one of the other drives in the machine and reload from backup: about 10 mins. WCG still not happy ...
* creates new CPID for no obvious reason
* doesn't say why it's done it (if the message from the server said what the mismatch was, I could try to make it happy)
Time of mine wasted trying different things: about 2 hours so far. All because software which doesn't benefit me at all behaves in illogical ways, is poorly documented and doesn't explain what it's doing.
I've said it before and I'll say it again: this BOINC setup works great under normal circumstances, but under some failure conditions which should be dead simple to recover from, it's an absolute dog. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
It's working as designed. If you don't like it, take it up with BOINC.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"It's working as designed. If you don't like it, take it up with BOINC."
That appears to be untrue. Nothing whatsoever about the hardware or networking has changed in the 2 days since my last backup, including the 6 items which are supposed to be checked according to the FAQ. However, I have fixed the problem. So, here's a recovery procedure for anyone else who experiences this in future:
1) [optional - so as not to be messy and download tons of WUs we're never going to process] The machine in my case was not using the default profile, but a machine with a new CPID will use the default. Set the default profile not to get any WUs, i.e. set it to AC@H only.
2) Restore data. Remove WU data (i.e. slot/?/*). Remove all entries from client_state.xml concerning currently running jobs, i.e. the "file info", "workunit", "result" and "active task" stanzas. Save this copy of client_state.xml.
3) Increment "rpc_seqno" in the saved copy of client_state.xml and copy the file to the BOINC data directory.
4) Run BOINC. If it reports generating a new CPID, kill it and go to step 3.
Once you hit the right rpc_seqno, the server will supply new copies of the WUs the client was working on previously, and work will continue from the start of those WUs again. |
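Steps 2 and 3 of the procedure above could be scripted roughly as follows. This is only a sketch, not a tested recovery tool: the tag names `file_info`, `workunit`, `result` and `active_task` are assumptions inferred from the stanza names in the post, the helper names `strip_jobs` and `bump_rpc_seqno` are made up for illustration, and it assumes the file parses as well-formed XML. The sample document is invented, not a real client_state.xml.

```python
import xml.etree.ElementTree as ET

# Assumed tag names, guessed from the stanza names mentioned in the post.
JOB_TAGS = {"file_info", "workunit", "result", "active_task"}

def strip_jobs(root):
    """Remove every element whose tag is in JOB_TAGS; return how many were removed."""
    removed = 0
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in JOB_TAGS:
                parent.remove(child)
                removed += 1
    return removed

def bump_rpc_seqno(root, step=1):
    """Add `step` to every <rpc_seqno> value; return the new values."""
    values = []
    for node in root.iter("rpc_seqno"):
        node.text = str(int(node.text.strip()) + step)
        values.append(int(node.text))
    return values

# Invented stand-in for a restored client_state.xml.
SAMPLE = """<client_state>
  <project><rpc_seqno>41</rpc_seqno><hostid>7</hostid></project>
  <file_info><name>wu_1.dat</name></file_info>
  <workunit><name>wu_1</name></workunit>
  <result><name>wu_1_0</name></result>
  <active_task_set><active_task><result_name>wu_1_0</result_name></active_task></active_task_set>
</client_state>"""

root = ET.fromstring(SAMPLE)
removed = strip_jobs(root)
seqnos = bump_rpc_seqno(root)
print(removed, seqnos)
```

On a real machine one would do this on a copy of the file with the client stopped, and write the result back with `ET.ElementTree(root).write(path)`. Note that the sketch bumps every `rpc_seqno` it finds; with more than one project attached you would probably want to restrict it to the WCG project entry.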
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Okay, there are two problems here. First, you are confusing CPID with host ID. They are not the same, or even related. Second, you are mixing up the WCG host merging mechanism (which doesn't apply here) with the BOINC RPC sequence mechanism, which treats restoring from backup in exactly the same way as a cloned installation: it creates a new host ID.
And that's all working as designed. Whether the design is right - that's another question. However, I really don't think the effort you went to is worth it. If one has to restore from backup, work in progress is lost. It makes perfect sense to me. |
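For anyone trying to picture the RPC sequence mechanism described above, here is a toy model. It is not BOINC server code: the class `ToyScheduler`, the method `handle_request` and the exact comparison (a request whose sequence number is lower than the last one recorded counts as a restored or cloned state file) are assumptions made purely for illustration.

```python
class ToyScheduler:
    """Toy model only: remembers the last rpc_seqno seen for each host ID."""

    def __init__(self):
        self.last_seqno = {}  # hostid -> last rpc_seqno received

    def handle_request(self, hostid, rpc_seqno):
        """Return the host ID the client should use from now on."""
        known = self.last_seqno.get(hostid)
        # Unknown host, or a sequence number that went backwards
        # (assumed to mean a restored backup or a cloned installation):
        # hand out a fresh host ID.
        if known is None or rpc_seqno < known:
            hostid = max(self.last_seqno, default=0) + 1
        self.last_seqno[hostid] = rpc_seqno
        return hostid

server = ToyScheduler()
host = server.handle_request(hostid=0, rpc_seqno=0)  # first contact
for seq in range(1, 6):
    server.handle_request(host, seq)                 # normal operation up to 5

restored = server.handle_request(host, 3)   # backup from two days ago
caught_up = server.handle_request(host, 6)  # seqno bumped past the recorded value
print(host, restored, caught_up)
```

In this model the stale sequence number is what triggers the new host ID, and a sequence number at or past the recorded one keeps the old ID, which would be consistent with the "increment rpc_seqno until it works" procedure posted earlier in the thread.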
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"Okay, there are two problems here: you are confusing CPID with host ID. They are not the same, or even related."
They certainly are related. In this situation, when it creates a new CPID, it creates a new hostid, new rpc_seqno, etc. Then you have multiple devices with the same name in the Device Manager, some of them with one profile and some with another. It's a shambles. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Not true. Please look up the details if you want to learn more.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"Not true. Please look up the details if you want to learn more."
Firstly, it is true. Please don't waste my time and try to mislead others by telling me I'm not seeing what I'm seeing.
Secondly, no, I don't really want to waste more time "learning" why. Feel free to research it if you want.
Thirdly, while we're at it, don't tell me that I'm "confusing" CPID with host ID. client_state.xml contains a different host_cpid and hostid than before. There is, self-evidently, a relationship between the two, in that they are both changed by some process that is triggered by the situation and they are the only significant variables that are changed. What the mechanism(s) are, the user shouldn't need to know.
Assuming that the messages from the server are in chronological order, the host ID gets changed first. This is not explicitly reported by the server. It is only implicitly reported if, as in this case, the location (aka venue) changes. The new CPID then gets assigned for this "new" host. However, the CPID is the only one explicitly reported in the log, which is the reason I mentioned it. The change in CPID is reported in an obvious manner to the user, but for the change in host ID, you have to dig around in the client_state file.
[Edit 2 times, last edit by Former Member at Apr 16, 2008 11:46:54 AM] |
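Rather than digging through the client_state file by hand, the comparison described above could be done with a short script. This is a sketch under assumptions: the element names `host_cpid`, `hostid` and `rpc_seqno` are taken from this thread, the helper names `identity` and `changed` are hypothetical, the files are assumed to parse as well-formed XML, and the two sample documents and all values in them are invented.

```python
import xml.etree.ElementTree as ET

# Element names as mentioned in the thread; where they sit in the file is not assumed.
FIELDS = ("host_cpid", "hostid", "rpc_seqno")

def identity(xml_text):
    """Collect the text of every occurrence of each field, wherever it appears."""
    root = ET.fromstring(xml_text)
    return {tag: [n.text.strip() for n in root.iter(tag) if n.text] for tag in FIELDS}

def changed(before_xml, after_xml):
    """Return {field: (old_values, new_values)} for the fields that differ."""
    old, new = identity(before_xml), identity(after_xml)
    return {tag: (old[tag], new[tag]) for tag in FIELDS if old[tag] != new[tag]}

# Invented examples standing in for the backup copy and the current copy.
BEFORE = """<client_state>
  <host_info><host_cpid>aaaa1111</host_cpid></host_info>
  <project><hostid>100</hostid><rpc_seqno>41</rpc_seqno></project>
</client_state>"""

AFTER = """<client_state>
  <host_info><host_cpid>bbbb2222</host_cpid></host_info>
  <project><hostid>101</hostid><rpc_seqno>41</rpc_seqno></project>
</client_state>"""

diff = changed(BEFORE, AFTER)
for tag, (old, new) in diff.items():
    print(f"{tag}: {old} -> {new}")
```

Running it against the backup copy and the post-restore copy of client_state.xml would show at a glance whether the host ID changed along with the CPID.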
||
|
|
|