Total posts in this thread: 43
This topic has been viewed 7171 times and has 42 replies
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3715
Status: Offline
Re: HCMD2: very short tasks

Reading your post gave me the idea that the log file might be the cause of the problem. Could a very big log file be something that BOINC fails to manage correctly?
This is probably the easiest culprit to eliminate: the size of the stdoutdae files is more or less constant, and BOINC simply discards the oldest message lines as needed to stay within that size. So the only difference between the log file of a very busy client (like our multicore ones with HCMD2 currently) and that of a very quiet one, such as a very slow machine, is the number of days of activity that are kept.
Currently the stdoutdae.txt of my quad contains about 15 hours of activity (without checkpoint logging), while that of my eeePC (which I removed from crunching in September 2011 :) ) contains messages covering 15 days with checkpoint logging!

No, I think the problem has to do with communication between the client and the server, i.e. the exchange of client_state.xml files between the two after an update request.
If the latest changes on the server side improve the server's ability to deal with the heavy update activity currently generated by the small grandchildren HCMD2 WUs, these BOINC client failures may be over. Otherwise we will have to wait for the end of the HCMD2 final cleaning, or for a better-designed version of the BOINC client that would simply drop the returned client_state.xml when it is corrupted or incomplete and retry the update request from the beginning.
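
To make the "drop it and retry" idea concrete, here is a minimal sketch in Python. It is not BOINC code: the function names, the `client_state` root tag, and the `fetch` callable are all assumptions made for illustration. It only shows the general pattern of rejecting a truncated or malformed XML reply and asking again.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


def parse_if_complete(text: str, root_tag: str = "client_state") -> Optional[ET.Element]:
    """Return the parsed root, or None if the XML is truncated or malformed.

    The root tag name is an assumption for this sketch.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return None
    return root if root.tag == root_tag else None


def fetch_with_retry(fetch: Callable[[], str], attempts: int = 3) -> Optional[ET.Element]:
    """Call fetch() until it yields a well-formed document or attempts run out.

    `fetch` is a hypothetical stand-in for whatever performs the update request.
    """
    for _ in range(attempts):
        root = parse_if_complete(fetch())
        if root is not None:
            return root  # good reply: use it
        # corrupted or incomplete reply: ignore it and start over
    return None
```

With a first reply cut off mid-document and a second one complete, `fetch_with_retry` would discard the first and return the parsed second reply; if every attempt fails it returns `None`, so the caller can keep its previous state instead of giving up.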
----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
[Apr 5, 2012 1:56:49 AM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1671
Status: Offline
Re: HCMD2: very short tasks

It sounds reasonable.
For my part, for lack of time, I am not very familiar with BOINC's internal mechanisms.
Just for info: within 8 hours, BOINC generated around 12'000 entries in the event log. Because of the ring-buffer behaviour, only the last 2'000 entries remain available.
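
As a rough illustration of what those figures imply, here is a small Python sketch. It assumes a steady logging rate and models the event log as a fixed-capacity buffer; the function name is made up for this example, and nothing here is taken from the BOINC sources.

```python
from collections import deque


def retained_window_hours(entries_per_hour: float, capacity: int) -> float:
    """Hours of history a fixed-size buffer holds at a steady logging rate."""
    return capacity / entries_per_hour


# Figures from the post: about 12'000 entries in 8 hours, 2'000 kept.
rate = 12_000 / 8  # 1500 entries per hour
window = retained_window_hours(rate, 2_000)

# A deque with maxlen behaves like a ring buffer: appending to a full
# buffer silently discards the oldest entry.
events = deque(maxlen=2_000)
for i in range(12_000):
    events.append(f"event {i}")
```

At that rate the 2'000 retained entries cover only about 1 hour 20 minutes, so anything logged earlier in the 8 hours is already gone when the buffer is inspected.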
Yves
----------------------------------------
[Apr 5, 2012 2:20:17 AM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3715
Status: Offline
Re: HCMD2: very short tasks

Bad guess on my part: it just happened again, and this time there was no update request in progress, only a mere upload of an HCMD2 result:
Started upload of CMD2_2175-1NZW_A.clusters.....
Can't open client_state_next.xml: fopen() failed
Couldn't write state file: fopen() failed; giving up


No user activity either and the number of tasks in the cache was the lowest of all these last days, only a little more than 500, and about 30 WUs ready to report.
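
The messages above suggest a "write a next file, then swap it in" pattern for the state file. The following Python sketch shows that general pattern with a retry instead of giving up on the first failed open. It is an illustration under stated assumptions, not BOINC's implementation: the `.next` suffix, the function name, and the `retries` and `delay` parameters are all hypothetical.

```python
import os
import tempfile
import time


def write_state_atomically(path: str, data: str, retries: int = 3, delay: float = 0.1) -> bool:
    """Write data to a temporary 'next' file, then rename it over path.

    Returns False instead of raising when the file cannot be opened or replaced.
    """
    next_path = path + ".next"  # hypothetical name chosen for this sketch
    for attempt in range(retries):
        try:
            with open(next_path, "w", encoding="utf-8") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())  # make sure the bytes reach the disk
            os.replace(next_path, path)  # swap the new file into place
            return True
        except OSError:
            if attempt < retries - 1:
                time.sleep(delay)  # brief pause before trying again
    return False


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "client_state.xml")
    ok = write_state_atomically(target, "<client_state/>", delay=0.0)
    with open(target, encoding="utf-8") as f:
        saved = f.read()
    # Opening a file in a directory that does not exist fails every time.
    missing_dir_ok = write_state_atomically(
        os.path.join(d, "no_such_dir", "state.xml"), "x", delay=0.0
    )
```

The point of the pattern is that the existing state file is never half-written: either the swap happens or the old file stays untouched. Why the open fails in the first place (file handles, permissions, another process holding the file) cannot be told from the messages alone.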
----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
[Apr 5, 2012 11:30:54 AM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: HCMD2: very short tasks

No more WUs? Both of my HCMD2-only machines ran dry last night. Message tab says no work available. :(
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Apr 5, 2012 1:13:57 PM]
Thargor
Veteran Cruncher
UK
Joined: Feb 3, 2012
Post Count: 1291
Status: Offline
Re: HCMD2: very short tasks

Same here, getting no more work-units on one of my HCMD2-only machines - the other downloaded a huge chunk of other WCG units, before I set it to HCMD2-only.

Also showing the project as down for maintenance, with uploads disabled.
----------------------------------------

[Apr 5, 2012 2:09:26 PM]
Pete Broad
Senior Cruncher
Wales
Joined: Jan 3, 2007
Post Count: 167
Status: Offline
Re: HCMD2: very short tasks

I'm still picking the odd one up, I've had 20 or so in the last few hours


Pete
----------------------------------------

[Apr 5, 2012 7:21:56 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: HCMD2: very short tasks

We are still catching up on various backend tasks due to our problems earlier in the week. HCMD2 is loading up work now and since we have resolved our issues, you should have a steady stream moving forward from here.
[Apr 5, 2012 8:08:24 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: HCMD2: very short tasks

We are still catching up on various backend tasks due to our problems earlier in the week. HCMD2 is loading up work now and since we have resolved our issues, you should have a steady stream moving forward from here.

I'm still getting "no work available for CMD2" messages in my logs on 2 machines. :-/
EDIT: For some unknown reason it was asking for GPU tasks. It seems to have straightened itself out and I'm now getting CPU tasks.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


----------------------------------------
[Edit 1 times, last edit by nanoprobe at Apr 6, 2012 8:49:05 PM]
[Apr 6, 2012 3:52:38 PM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1671
Status: Offline
Re: HCMD2: very short tasks

Because of several hang-ups a day on each Linux-based host, I have deselected HCMD2 for the next 7 days. I have to travel on business and will not be able to babysit the hosts.
The hang-up issue seems to appear rarely on Windows-based hosts but too often (around every 6 to 8 hours) on Linux-based hosts.
I am a little bit frustrated since I need around 200 crunching days to reach 30 years on HCMD2.
Cheers,
Yves
----------------------------------------
[Apr 6, 2012 10:31:54 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3715
Status: Offline
Re: HCMD2: very short tasks

A new BOINC client failure last night gives me the opportunity to report another sequence of messages, one that I had already seen some days before I started reporting these failures here:
Computation for task CMD2_2175-1NZW_A.clustersOccur-2PKD_B.clustersOccur_0_7296_9119_7846_8028_1 finished
Signature verification error for wcg_hcmd2_maxdo_6.40_i686-pc-linux-gnu
Can't open client_state_next.xml: fopen() failed
Couldn't write state file: fopen() failed; giving up
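
For anyone who wants to collect these sequences from their own log, here is a small Python sketch that pulls out each "Couldn't write state file" line together with the few lines preceding it. The function name and the `context` parameter are made up for this example; the sample lines are the ones quoted above.

```python
from typing import List

FAILURE = "Couldn't write state file"


def lines_before_failures(log_lines: List[str], context: int = 3) -> List[List[str]]:
    """For each failure line, return it together with up to `context` preceding lines."""
    hits = []
    for i, line in enumerate(log_lines):
        if FAILURE in line:
            hits.append(log_lines[max(0, i - context): i + 1])
    return hits


# The sequence quoted in this post, used as sample input.
sample = [
    "Computation for task CMD2_2175-1NZW_A.clustersOccur-2PKD_B.clustersOccur_0_7296_9119_7846_8028_1 finished",
    "Signature verification error for wcg_hcmd2_maxdo_6.40_i686-pc-linux-gnu",
    "Can't open client_state_next.xml: fopen() failed",
    "Couldn't write state file: fopen() failed; giving up",
]
hits = lines_before_failures(sample)
```

On a real machine the list of lines could come from reading stdoutdae.txt, for example with `open("stdoutdae.txt").read().splitlines()`. Because the match is a substring test, any timestamp prefix on the log lines does not get in the way.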

----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
[Apr 8, 2012 7:29:04 AM]