Total posts in this thread: 11
Gollumer
Senior Cruncher
Joined: Mar 23, 2006
Post Count: 194
Core 2 Duo computation problems?

Is anyone else having problems with BOINC and HDC non-beta workunits on Core 2 Duo processors?

New PC, new install of Windows (32 bit).

It runs games (BF2, UT2004) like a champ: no problems, no hangs. The CPU never gets above 41 degrees.

On some workunits BOINC *HANGS* the PC. After rebooting and logging back in, BOINC reports COMPUTATION ERRORS on the workunits.

I re-installed BOINC and forgot to save the logs. I'll post them under troubleshooting when it happens again.

Just curious if anyone else has had this problem, or am I too far on the bleeding edge?
[Oct 24, 2006 8:11:34 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Core 2 Duo computation problems?

Have you checked in Task Manager whether the science app (e.g. wcg_hdc_tma_5.05_windows_intelx86 or wcg_faah_autodock_5.26_windows_intelx86) is still counting up CPU time and only BOINCmgr.exe has become unresponsive? If so, kill BOINCmgr.exe and restart it. Also check whether your firewall is the cause, i.e. deactivate it temporarily.

BOINCmgr.exe hanging on the firewall may make the whole system appear stuck. Press Ctrl-Alt-Del to bring up Task Manager and check the above.

Please advise.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Oct 24, 2006 8:21:23 PM]
[Oct 24, 2006 8:17:20 PM]
Gollumer
Senior Cruncher
Joined: Mar 23, 2006
Post Count: 194
Re: Core 2 Duo computation problems?

When this happens the PC stops responding. I can't get to task manager, the mouse won't move/stops working. I let it go for 15 min once to see if I could try to shut it down or kill the task, but the only option is to reset or power off.
----------------------------------------
[Edit 1 times, last edit by Gollumer at Oct 24, 2006 8:24:27 PM]
[Oct 24, 2006 8:22:58 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Core 2 Duo computation problems?

There is no inherent problem with C2Ds and HDC; I have 5 running 24/7 myself, and many others on the XS team have many more. Because it runs both cores at 100% constantly, WCG is a great stability test. It could simply be that your system is building up heat over time. Can you give more details about your system, such as chip, mobo, RAM, FSB and RAM speed, type of cooler, number of case fans, and temps?
[Oct 25, 2006 3:32:30 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Core 2 Duo computation problems?

Would you be overclocking by any chance? I've seen this problem with OC'ing on projects like these. I refuse to OC myself.

If so, this program will test reliability and detect hardware errors:

http://www.passmark.com/products/bit.htm
[Oct 25, 2006 4:38:58 AM]
Gollumer
Senior Cruncher
Joined: Mar 23, 2006
Post Count: 194
Re: Core 2 Duo computation problems?

Hardware:

Intel Core 2 Duo E6600
ASUS P5B-E Socket T Intel P965 (Latest BIOS on it)
CORSAIR XMS2 2GB (2 x 1GB) DDR2 800
Antec Phantom 500 ATX12V 500W Power Supply - Retail
Titan Robela Case (waterblock for GPU and CPU)
GeForce 7950GT 512MB PCI Express x16
1x Maxtor SATA 120GB Drive
1x cd/dvd reader

No overclocking, all standard/default bios settings.
ASUS AI overclocking turned off in the bios.
[Oct 25, 2006 3:58:40 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Core 2 Duo computation problems?

I suggest you look in the files stderrdae.txt and stdoutdae.txt to see whether anything was logged there prior to the hang.
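For anyone who wants to check those two files quickly after a reboot, here is a minimal sketch that prints the last few lines of each. The file names come from the post above; the folder argument is an assumption (pass whatever folder your BOINC install keeps its files in), and the helper names are made up for illustration.

```python
from collections import deque
from pathlib import Path


def last_lines(path, count=20):
    """Return the last `count` lines of a text file, or [] if it is missing."""
    p = Path(path)
    if not p.is_file():
        return []
    with p.open("r", errors="replace") as handle:
        # deque with maxlen keeps only the final `count` lines in memory
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def show_boinc_logs(data_dir, count=20):
    """Print the tail of both BOINC daemon logs found in `data_dir`."""
    for name in ("stderrdae.txt", "stdoutdae.txt"):
        print(f"--- {name} ---")
        for line in last_lines(Path(data_dir) / name, count):
            print(line)
```

Calling `show_boinc_logs(r"C:\path\to\BOINC")` (placeholder path) prints whatever was logged just before the hang, if anything was.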
[Oct 25, 2006 4:26:13 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Core 2 Duo computation problems?

Hi Gollumer,
I don't have a clue. I am assuming you are running Windows XP SP2. So, maybe experimenting is in order. To reduce (though not eliminate) the possibility of a video device driver problem, switch the screen saver to (blank). Also, since dual HDC programs could take a lot of memory and produce a lot of streaming disk I/O, try reducing the BOINC threads to 1 in your preferences to see if that changes things. It shouldn't, but if it does then we have something to puzzle over.

Lawrence
[Oct 25, 2006 6:00:15 PM]
RCTCGrid
Cruncher
Joined: Apr 6, 2006
Post Count: 10
Re: Core 2 Duo computation problems?

You are not the only one seeing problems with work units failing. We are seeing similar problems on our machines with newer chipsets running the BOINC client. Our problem began around 10/5, and we are losing about 50% of our CPU time on a large number of our machines to errored results. We thought it might be an issue with the new 3D screensaver in the BOINC client, but so far we have had no luck resolving the problem. The computers run the units for anywhere from 30 minutes to 9 hours and then throw back an error.
Result Log

<core_client_version>5.4.11</core_client_version>
<message>
- exit code -529697949 (0xe06d7363)
</message>
<stderr_txt>
World Community Grid AutoDock (projects/www.worldcommunitygrid.org/wcg_faah_autodock_5.26_windows_intelx86) version Failed to get VersionInfo size: 1812

Failed to get VersionInfo size: 1812
INFO: projects/www.worldcommunitygrid.org/wcg_faah_autodock_5.26_windows_intelx86 Start AutoGrid...
INFO:[20:34:35] Start AutoGrid...

autogrid: autogrid4: Successful Completion.
wcg_checkpoint() called
Starting to checkpoint ...
Checkpoint complete
INFO:[20:36:00] End AutoGrid...
Beginning AutoDock...
INFO: Setting num_generations: 27000
Setting maxGen to 6750
autodock4: WARNING: Unrecognized keyword in docking parameter file, in line:
compute_unbound_extended # compute extended ligand energyINFO: No state to restore. Start from the beginning.
About to enter main loop...(dockings already completed: 0)
call_glss(): pop_size: 200 num_evals: 10000000 start: [20:36:13]
_maxGenSeenSoFar changed: 6750
********Start app_graphics_init********
Total used = 1515585536
Difference = 1515585536
********After gfxData********
Total used = 1515585536
Difference = 0
********After DockingSpheresInit()********
Total used = 1515585536
Difference = 0
********After opengl calls********
Total used = 1515585536
Difference = 0
********After LoadTGA********
Total used = 1523838976
Difference = 8253440
********After boinc_get_init_data********
Total used = 1523838976
Difference = 0
********After loadModel********
Total used = 1517150208
Difference = -6688768
********End of app_graphics_init********
Total used = 1517150208
Difference = 0
********Before app_graphics_resize********
Total used = 1517150208
Difference = 0
********After app_graphics_resize********
Total used = 1517150208
Difference = 0
********Start app_graphics_init********
Total used = 1477865472
Difference = -39284736
....... pages of this data cut out.....
********After app_graphics_resize********
Total used = 1728442368
Difference = 0
call_glss(): end: [04:16:50]
wcg_checkpoint() called
Starting to checkpoint ...

</stderr_txt>

It seems either never to return from the checkpoint or to return with the error code and give up. Maybe it is something with the checkpoint process or the 3D rendering, or maybe the machines are running out of memory? I don't know what the crash debug for 0xe06d7363 might indicate.
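On the exit code itself: the negative number and the hex value in the log are the same 32-bit value, and 0xE06D7363 is the code Windows reports for an unhandled C++ exception thrown by code built with Microsoft's compiler (the low three bytes spell "msc"). The sketch below only shows that conversion; it does not say which exception was thrown, so it is consistent with an out-of-memory failure but does not prove one. The function name is made up for illustration.

```python
def decode_exit_code(signed_code):
    """Return (hex string, ASCII of the low three bytes) for a 32-bit exit code."""
    unsigned = signed_code & 0xFFFFFFFF  # reinterpret the signed value as unsigned 32-bit
    low_bytes = (unsigned & 0x00FFFFFF).to_bytes(3, "big")
    return f"0x{unsigned:08x}", low_bytes.decode("ascii", errors="replace")


hex_code, tag = decode_exit_code(-529697949)
print(hex_code, tag)  # prints: 0xe06d7363 msc
```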
These machines are not overclocked; it's a standard config.
Most of our machines are the new ClientPro 375, so:
3 GHz to 3.8 GHz P4 processor, either HT or dual core
Intel® 945G chipset
Integrated Intel Graphics Accelerator 950
512 MB of dual-channel 533 DDR2 memory
All the other specs can be found on the MPC website.
Good luck with your bug, and let us know if you discover the source of the problem.
-Melchior
[Oct 25, 2006 8:57:44 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Re: Core 2 Duo computation problems?

RCTCGrid,

Your problem may be that you are running out of RAM on these machines. As your log shows, you are using about 1.5 GB of RAM on a machine with only 512 MB of RAM. This shows up because of the device drivers used by the graphics part of the program. When initially testing on my laptop I noticed only a 5 MB increase for graphics, but on another machine with an ATI graphics card it jumped to 50 MB. That machine was tested on other projects and showed the same increase for graphics.
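To make the comparison concrete, here is a rough sketch that pulls the "Total used" figures out of a result log and reports the peak in MiB. It assumes the values are in bytes, which is how the 1.5 GB reading above treats them; the log does not say whether the figure is per-process or system-wide, so take the result as indicative only. The excerpt is copied from the posted log and the function name is made up.

```python
import re

# A few lines copied from the posted result log
LOG_EXCERPT = """\
********Start app_graphics_init********
Total used = 1515585536
Difference = 1515585536
********After LoadTGA********
Total used = 1523838976
Difference = 8253440
********After app_graphics_resize********
Total used = 1728442368
Difference = 0
"""


def peak_total_used_mib(log_text):
    """Return the largest 'Total used' value in the log, converted to MiB."""
    values = [int(v) for v in re.findall(r"Total used = (\d+)", log_text)]
    return max(values) / (1024 * 1024) if values else 0.0


peak = peak_total_used_mib(LOG_EXCERPT)
print(f"peak 'Total used': {peak:.0f} MiB vs. 512 MiB installed")
```

Even the first value in the log (1515585536 bytes) is roughly three times the installed memory.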

Since your machines were running fine before graphics were enabled, I would suggest changing your screen saver on the machines to something other than BOINC. Can you test this on say 10 of your machines that are having this problem? Please note the machine ID numbers (host ID) from the client_state.xml file. Here is an example of what you will see in that file.

<project>
<master_url>http://www.worldcommunitygrid.org/</master_url>
<project_name>World Community Grid</project_name>
<user_name>uplinger</user_name>
<team_name>Austin Grid Team</team_name>
<email_hash>663ab4a907a4a436c6997fbdbd9332a6</email_hash>
<cross_project_id>d6b988571681f53f7df5536c6ac6c0a0</cross_project_id>
<user_total_credit>909630.837965</user_total_credit>
<user_expavg_credit>2231.992249</user_expavg_credit>
<user_create_time>1116875469.000000</user_create_time>
<rpc_seqno>984</rpc_seqno>
<hostid>49967</hostid>
<host_total_credit>15028.450168</host_total_credit>
<host_expavg_credit>151.775644</host_expavg_credit>


In there you should see <hostid>xxxxx</hostid>. Please post these and I can monitor them if you would like.
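If you have many machines to check, something like the following could collect the host IDs for you. It is only a sketch: it uses a plain regular expression on the file text rather than an XML parser, and the function names and the path in the usage note are placeholders, not part of BOINC.

```python
import re
from pathlib import Path


def find_host_ids(state_text):
    """Return every <hostid> value found in the given client_state.xml text."""
    return re.findall(r"<hostid>\s*(\d+)\s*</hostid>", state_text)


def host_ids_in_file(path):
    """Read a client_state.xml file and return the host IDs it contains."""
    return find_host_ids(Path(path).read_text(errors="replace"))


sample = (
    "<project>\n"
    "<project_name>World Community Grid</project_name>\n"
    "<hostid>49967</hostid>\n"
)
print(find_host_ids(sample))  # prints: ['49967']
```

On a real machine you would call `host_ids_in_file("client_state.xml")` from the folder that holds the file.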

-Uplinger
[Oct 26, 2006 2:04:03 PM]