Thread Status: Active
Total posts in this thread: 36
This topic has been viewed 3018 times and has 35 replies.
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: The same problem Re: Strange faah4454 Wu´s behavior

Hi,

If you go to the beginning of the BOINC Manager message log and copy and paste roughly the first 25 lines, up to where computing resumes (sample below), we can do a basic read of your configuration. Large jobs like the recent FAAH ones have high momentary system demands, so I just want to make sure it's not your setup.

RICE tasks run for 8 CPU hours (9 CPU hours from batch R00200 onward), regardless of the device's power. For many part-time crunchers that's a bonus.

ttyl

29/10/2008 11:46:15||Starting BOINC client version 6.2.25 for windows_intelx86
29/10/2008 11:46:15||log flags: task, file_xfer, sched_ops, checkpoint_debug
29/10/2008 11:46:15||Libraries: libcurl/7.19.0 OpenSSL/0.9.8i zlib/1.2.3
29/10/2008 11:46:15||Running as a daemon
29/10/2008 11:46:15||Data directory: D:\Mijn Documenten\BOINCData
29/10/2008 11:46:15||Running under account boinc_master
29/10/2008 11:46:17||Processor: 2 GenuineIntel Intel(R) Core(TM)2 CPU T5600 @ 1.80GHz [x86 Family 6 Model 15 Stepping 6]
29/10/2008 11:46:17||Processor features: fpu tsc pae nx sse sse2 mmx
29/10/2008 11:46:17||OS: Microsoft Windows XP: Professional x86 Editon, Service Pack 3, (05.01.2600.00)
29/10/2008 11:46:17||Memory: 1.50 GB physical, 2.85 GB virtual
29/10/2008 11:46:17||Disk: 47.98 GB total, 5.58 GB free
29/10/2008 11:46:17||Local time is UTC +1 hours
29/10/2008 11:46:17|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 95711; location: work; project prefs: work
29/10/2008 11:46:17||General prefs: from World Community Grid (last modified 29-Jul-2008 18:31:51)
29/10/2008 11:46:17||Computer location: work
29/10/2008 11:46:17||General prefs: using separate prefs for work
29/10/2008 11:46:17||Reading preferences override file
29/10/2008 11:46:17||Preferences limit memory usage when active to 1380.69MB
29/10/2008 11:46:17||Preferences limit memory usage when idle to 1380.69MB
29/10/2008 11:46:17||Preferences limit disk usage to 5.35GB
29/10/2008 11:47:16|World Community Grid|Restarting task X0000052741460200507050859_0 using hcc1 version 606
29/10/2008 11:47:16|World Community Grid|Restarting task X0000052800872200507051100_1 using hcc1 version 606
29/10/2008 11:52:38||Suspending computation - user request
29/10/2008 11:53:15||Resuming computation
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Oct 31, 2008 7:51:28 AM]
TOMinAZ
Cruncher
United States
Joined: Feb 11, 2007
Post Count: 40
Re: The same problem Re: Strange faah4454 Wu´s behavior

I wasn't sure if it made a difference that I'm not running an FAAH WU at the moment, but here's the info:



10/31/2008 6:58:34 PM||Starting BOINC client version 6.2.19 for windows_intelx86
10/31/2008 6:58:34 PM||log flags: task, file_xfer, sched_ops
10/31/2008 6:58:34 PM||Libraries: libcurl/7.18.0 OpenSSL/0.9.8e zlib/1.2.3
10/31/2008 6:58:34 PM||Data directory: C:\Documents and Settings\All Users\Application Data\BOINC
10/31/2008 6:58:34 PM||Running under account Thomas Ford
10/31/2008 6:58:38 PM||Processor: 1 GenuineIntel Intel(R) Celeron(R) M processor 1.30GHz [x86 Family 6 Model 13 Stepping 8]
10/31/2008 6:58:38 PM||Processor features: fpu tsc pae nx sse sse2 mmx
10/31/2008 6:58:38 PM||OS: Microsoft Windows XP: Home x86 Editon, Service Pack 2, (05.01.2600.00)
10/31/2008 6:58:38 PM||Memory: 1.24 GB physical, 2.34 GB virtual
10/31/2008 6:58:38 PM||Disk: 25.05 GB total, 7.67 GB free
10/31/2008 6:58:38 PM||Local time is UTC -7 hours
10/31/2008 6:58:39 PM|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 487185; location: home; project prefs: home
10/31/2008 6:58:39 PM||General prefs: from World Community Grid (last modified 14-Feb-2007 16:11:20)
10/31/2008 6:58:39 PM||Computer location: home
10/31/2008 6:58:39 PM||General prefs: using separate prefs for home
10/31/2008 6:58:39 PM||Reading preferences override file
10/31/2008 6:58:39 PM||Preferences limit memory usage when active to 953.53MB
10/31/2008 6:58:39 PM||Preferences limit memory usage when idle to 953.53MB
10/31/2008 6:58:39 PM||Preferences limit disk usage to 3.73GB
10/31/2008 6:58:52 PM|World Community Grid|Restarting task R00184_18884c42cd7d09d91a2722205ffc681e_02_000_3 using rice version 617
[Nov 1, 2008 4:19:06 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: The same problem Re: Strange faah4454 Wu´s behavior

Those parameters look healthy enough to tackle any current WCG task.

After 76 million results, one would expect to be running a proven piece of software [FAAH]. That said, when there are problems, it's often the graphics part. In other cases, disabling the graphics and the screensaver made the issues go away. You could test that if you want, and do check that your display card drivers are the latest.

cheers

[Edit: Clarified the 76 million relating specifically to FAAH]
----------------------------------------
[Edit 1 times, last edit by Sekerob at Nov 2, 2008 8:00:20 AM]
[Nov 1, 2008 6:29:40 AM]
TOMinAZ
Cruncher
United States
Joined: Feb 11, 2007
Post Count: 40
Re: The same problem Re: Strange faah4454 Wu´s behavior

The screensaver isn't the issue; I don't use it. So, put that aside.

The issue is that BOINC closes. A few times, when I restart the agent, the WU starts over again, losing all work it has done. I had about 20 hours of CPU time, BOINC closed, I clicked to open it again, and the progress starts at 0. It "forgets" it was running the WU and just starts again.

I left the room, BOINC was running, happily crunching away. I came back, BOINC wasn't running. The WU didn't resume.

I guess you don't understand what the problem is.

FAAH is the only project where this happens, and it started a few weeks ago, shortly after I downloaded the latest version of BOINC.

Again, in short, BOINC mysteriously closes while running FAAH, and I lose any crunching it was working on. Like closing a book you were reading, without a bookmark, and you start from page 1 again because you don't remember where you left off. It's done it a few times, always with FAAH. No screensaver was involved.
[Nov 2, 2008 1:21:25 AM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3715
Re: The same problem Re: Strange faah4454 Wu´s behavior

Hi Tom.
Have you tried to find error messages that were produced before Boinc collapsed?
If not, this is how to proceed.
- Go to your Boinc Data directory.
- Copy the file stdoutdae.txt to a temp/work directory.
- Look at this copy with your favorite editor and search for "client version".
That will find the start of each session still available in that file.
- See if you find unusual error messages in the preceding lines, i.e. at the end of the previous session.
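
The manual search described above can also be sketched as a short Python script. This is only an illustration, assuming the copy of the log is plain text and that every session starts with a line containing "client version", as stated above. The function name `session_tails` and the sample lines are made up for the example and are not real log output.

```python
def session_tails(lines, marker="client version", context=3):
    """For each session start after the first one, return the last
    `context` lines of the session that ended just before it."""
    # Indexes of the lines that open a session (case-insensitive match).
    starts = [i for i, line in enumerate(lines) if marker in line.lower()]
    tails = []
    for prev, start in zip(starts, starts[1:]):
        # Never reach back past the start of the previous session.
        tails.append(lines[max(prev, start - context):start])
    return tails


# Hypothetical sample in the style of stdoutdae.txt (invented for illustration).
sample = [
    "01-Nov-2008 10:00:00 [---] Starting BOINC client version 6.2.19 for windows_intelx86",
    "01-Nov-2008 10:05:00 [World Community Grid] Starting task A",
    "01-Nov-2008 12:00:00 [---] Exit requested by user",
    "01-Nov-2008 13:00:00 [---] Starting BOINC client version 6.2.19 for windows_intelx86",
    "01-Nov-2008 13:01:00 [World Community Grid] Restarting task A",
]

for tail in session_tails(sample, context=2):
    print("--- end of previous session ---")
    for line in tail:
        print(line)
```

To try it on a real copy of the file, the lines could be read with something like `open(path, errors="replace").read().splitlines()`, where `path` points at the copy in the temp/work directory.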

Something else you could do is activate the logging of checkpoints. It seems that you have at least two different problems, i.e. FAAH jobs killing Boinc and FAAH not taking checkpoints, unless the FAAH jobs are failing too early to have ever taken a checkpoint.

To activate the logging of checkpoints-
- edit the file cc_config.xml (also in the Boinc Data directory)
- add the following line
<checkpoint_debug>1</checkpoint_debug>
inside the log flags section, i.e. between <log_flags> and </log_flags>
- save the file
- from the Advanced view of BoincMgr select Advanced>Read configuration file to activate the logging.
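
For those who prefer a script to hand editing, the change described above can be sketched in Python with the standard `xml.etree.ElementTree` module. This is a minimal sketch, not an official BOINC tool: the function name `enable_checkpoint_debug` and the sample file content are assumptions made for the example, and the script only shows where the flag ends up.

```python
import xml.etree.ElementTree as ET


def enable_checkpoint_debug(xml_text):
    """Return cc_config.xml text with <checkpoint_debug>1</checkpoint_debug>
    placed inside the <log_flags> section."""
    root = ET.fromstring(xml_text)
    log_flags = root.find("log_flags")
    if log_flags is None:
        # No log flags section yet: create one under <cc_config>.
        log_flags = ET.SubElement(root, "log_flags")
    flag = log_flags.find("checkpoint_debug")
    if flag is None:
        # Add the flag only if it is not already present.
        flag = ET.SubElement(log_flags, "checkpoint_debug")
    flag.text = "1"
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal file with an empty log flags section.
before = "<cc_config><log_flags/><options></options></cc_config>"
print(enable_checkpoint_debug(before))
```

The point to take away is that the flag is a child of `<log_flags>`, not a sibling of it. After saving the result as cc_config.xml, the configuration file still has to be re-read as described in the last step above.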

Edit: Corrected the name of the file cc_config.xml (and not cc-config.xml)

Cheers. Jean.
----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
----------------------------------------
[Edit 1 times, last edit by JmBoullier at Nov 2, 2008 2:12:57 AM]
[Nov 2, 2008 2:10:23 AM]
TOMinAZ
Cruncher
United States
Joined: Feb 11, 2007
Post Count: 40
Re: The same problem Re: Strange faah4454 Wu´s behavior

Oh my, I'm not a programmer at all. I've never used any kind of code editor.
I'm a theologian, so if we need to pray for it.... LOL.

I did try to find the BOINC data file, but I'm not sure where to look.
I checked under my C drive, then Program Files, then found the folder for BOINC. There's a subfolder called "locale" with many two-letter folder names; it looks like an alphabetic directory. I looked for something called "data", and then for "stdoutdae.txt", but found neither.

I remember that when I restarted the agent, I went to Advanced View, to see if any messages were there, but it hadn't saved anything from before the restart. I think it said it was starting, not resuming. But the last time that happened was the last FAAH WU I ran, before switching projects.

I'll see if one of my geek associates can help me with that, probably Sunday night (in Arizona, it's now Saturday night).
[Nov 2, 2008 3:53:53 AM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3715
Re: The same problem Re: Strange faah4454 Wu´s behavior

OK Tom, no problem. I did not detail everything (some more knowledgeable people feel they are being treated as idiots when you do), but I can.

First, don't be afraid. When I say your favorite editor I mean something like Notepad or WordPad, nothing more sophisticated. You will need it only for viewing the file and for finding (Ctrl+F) the starting line of every session. These lines are the only ones which contain "client version", so you get the idea.

It is normal that the messages shown in the advanced view start with the current session. This is why you will have to look in that stdoutdae.txt file, which usually contains several past sessions together.

The file stdoutdae.txt is in the Boinc data directory which, according to the message log that you have provided, is accessed via the following path:
C:\Documents and Settings\All Users\Application Data\BOINC

If you do not have a temp or work directory of your own for receiving a copy of the file, either create one or simply copy it to the desktop. You can delete the copy when you no longer need it.

To view this copy of the file-
- right click on its icon
- move your mouse down to "Open with"
- select Notepad or WordPad. That's it.

From my own viewpoint prayers are not mandatory, but they cannot hurt. :-)
Good luck. Jean.
[Nov 2, 2008 6:35:58 AM]
TOMinAZ
Cruncher
United States
Joined: Feb 11, 2007
Post Count: 40
Re: The same problem Re: Strange faah4454 Wu´s behavior

OK, I copied it to Notepad.

It looks like every time I find "Client Version", it's preceded by:
"Exit requested by user"
which I take to mean I stopped the program.

The only suspicious message I see is:
26-Aug-2008 02:10:04 [---] [error] Integer benchmark ran only 0.359375 sec; ignoring
26-Aug-2008 02:10:04 [---] [error] CPU benchmarks error

Is that something we're looking for?
[Nov 3, 2008 6:46:13 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3715
Re: The same problem Re: Strange faah4454 Wu´s behavior

It looks like every time I find "Client Version", it's preceded by:
"Exit requested by user"
which I take to mean I stopped the program.

You are right that this is what it should mean, but it does not... :-(
In fact Boinc writes that message whenever it receives a request to stop, whoever or whatever the sender was. That can be you directly, or the system because you asked for a reboot or a shutdown. Or possibly the system because automatic updates are activated and an automatic reboot was triggered after some updates were installed. Or... something else that I am not thinking of right now.

The interesting thing is that if you have this message, at least it means that Boinc has not crashed on an error of its own.
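
That distinction can be sketched in Python as a rough check on a copy of stdoutdae.txt. It is only a sketch under assumptions: the function name `classify_sessions`, the five-line window and the sample lines are invented for the example, and a session with no exit message near its end is merely a candidate for an abrupt stop, not proof of a crash.

```python
def classify_sessions(lines, start_marker="client version",
                      exit_marker="exit requested by user", window=5):
    """Label every finished session by whether an exit request was
    logged within its last `window` lines."""
    starts = [i for i, line in enumerate(lines) if start_marker in line.lower()]
    labels = []
    for prev, start in zip(starts, starts[1:]):
        # Lines just before the next session start, excluding the
        # start line of the session being examined.
        tail = lines[max(prev + 1, start - window):start]
        requested = any(exit_marker in line.lower() for line in tail)
        labels.append("stop requested" if requested else "no stop request logged")
    return labels


# Hypothetical log lines, invented for illustration.
sample = [
    "01-Nov-2008 10:00:00 [---] Starting BOINC client version 6.2.19 for windows_intelx86",
    "01-Nov-2008 10:05:00 [World Community Grid] Starting task A",
    "01-Nov-2008 12:00:00 [---] Exit requested by user",
    "01-Nov-2008 13:00:00 [---] Starting BOINC client version 6.2.19 for windows_intelx86",
    "01-Nov-2008 13:01:00 [World Community Grid] Restarting task A",
    "01-Nov-2008 15:00:00 [---] Starting BOINC client version 6.2.19 for windows_intelx86",
    "01-Nov-2008 15:01:00 [World Community Grid] Restarting task A",
]

print(classify_sessions(sample))
```

The last session in the file is not labelled, since it may still be running.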

Regarding the benchmark error, I have seen it occasionally without any harmful consequence. I think it is issued when, after a benchmark has started, something else happens in Boinc that would make the benchmark worthless. I presume it is listed as an error because, if Boinc were perfect, it would not start a benchmark while waiting for the answer to a request for work, for example. I don't know whether this situation is possible in the current version, but it could theoretically happen if various tasks of Boinc are not well coordinated.

Have you been able to activate checkpoint logging as I suggested?

Cheers. Jean.
[Nov 4, 2008 12:07:50 PM]
TOMinAZ
Cruncher
United States
Joined: Feb 11, 2007
Post Count: 40
Re: The same problem Re: Strange faah4454 Wu´s behavior

OK, I found the config file. It looks like this:

<cc_config>
  <log_flags />
  <options>
    <dont_contact_ref_site>1</dont_contact_ref_site>
  </options>
</cc_config>

Where do I put that line
"<checkpoint_debug>1</checkpoint_debug>"
you gave me?

Should I do this:

<cc_config>
  <log_flags />
  <checkpoint_debug>1</checkpoint_debug>
  <options>
    <dont_contact_ref_site>1</dont_contact_ref_site>
  </options>
</cc_config>
[Nov 5, 2008 10:17:21 PM]