World Community Grid Forums
Thread Status: Active Total posts in this thread: 25
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
I'm getting what I think is an unusual number of computation errors on 1 machine. The errors are coming on 3 different projects, FAAH, C4CW and C4CW betas.
C4CW error log

Result Name: c4cw_ target01_ 205222425_ 1--
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
Commandline = projects/www.worldcommunitygrid.org/wcg_c4cw_lmps_6.13_windows_intelx86 -screen none -in in.wcg.acc -var wcgsteps1 1000 -var wcgsteps2 5000 -var loop 0 -var restart 0 -var rinterval 100 -var ifile in.wcg.acc -var wcgseed 205222425
Abort called errorcode = 1
</stderr_txt>
]]>

FAAH error log

Result Name: faah16080_ ZINC00074312_ xEN_ 3rd_ md07910_ 02_ 1--
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
- exit code -1 (0xffffffff)
</message>
<stderr_txt>
Failed to get VersionInfo size: 2
INFO:[03:03:43] Start AutoGrid...
autogrid: autogrid4: Successful Completion.
INFO:[03:04:45] End AutoGrid...
Beginning AutoDock...
autodock4: wrong number of values read in. Check grid map!

The C4CW Beta error log is extremely long. I will post it if necessary. I'm running basically the same project mix on 4 machines and this is the only one producing errors.

Win7 Pro 64 bit
i7-860 running @ 3.6 GHz
4GB (2x2) PC3-8500 memory
Microsoft Security Essentials
Windows firewall
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
[Edit 2 times, last edit by nanoprobe at Aug 28, 2010 1:55:01 PM]
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
Hi nanoprobe,
Of course, when you saw multiple sciences failing, and with the highest hex code too, exit code -1 (0xffffffff), you did a soft reboot of the system, so I expect the errors will not show up again soon. Let us know.
WCG
Please help to make the Forums an enjoyable experience for All!
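The two exit codes in the logs above are each printed in decimal and in hex. The "highest hex" remark presumably refers to 0xffffffff, which is simply -1 when the 32-bit value is read as a signed two's-complement integer. A minimal sketch of that conversion follows; the function names and the regular expression are illustrative assumptions, not part of BOINC or the WCG science applications.

```python
import re

def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

# Matches fragments such as "exit code -1 (0xffffffff)" as they appear in the logs.
# The pattern is an assumption based only on the two messages quoted in this thread.
EXIT_CODE_RE = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")

def parse_exit_code(message: str):
    """Return (decimal_value, hex_value_as_signed) for the first exit code, or None."""
    match = EXIT_CODE_RE.search(message)
    if match is None:
        return None
    decimal = int(match.group(1))
    from_hex = to_signed32(int(match.group(2), 16))
    return decimal, from_hex

c4cw_message = "Incorrect function. (0x1) - exit code 1 (0x1)"
faah_message = "- exit code -1 (0xffffffff)"

print(parse_exit_code(c4cw_message))  # (1, 1)
print(parse_exit_code(faah_message))  # (-1, -1)
```

Both representations agree in each message, so the hex value adds no extra information here; -1 is a generic failure code rather than a specific diagnosis.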
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
Thanks Sekerob. I just rebooted this machine 2 days ago after some MS updates but I'll do it again and see what gives. Stay tuned.
RT
Master Cruncher USA - Texas - DFW Joined: Dec 22, 2004 Post Count: 2636 Status: Offline
Are you running additional projects on that machine (other than the ones mentioned)?
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
The machine in question runs FAAH, C4CW and DDDT2 when available. It also receives betas as they are available. I know computation errors happen, but this machine had 6 yesterday: 4 C4CW and 1 each of FAAH and C4CW beta.
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
I would not want to point a finger at MS, nor at the AV/firewall. Maybe run a little hardware diagnostic if the errors start appearing again, most certainly if they appear on FAAH, the old faithful amongst WCG sciences, and maybe do a project reset to refresh all the science app components after first running the cache dry. The Duo and Quad here are chirpingly happy with the C4CW jobs, with Linux cranking them out at 1.25 to 1.27 BOINC hours.
[Edit 1 times, last edit by Sekerob at Aug 28, 2010 4:04:34 PM]
RT
Master Cruncher USA - Texas - DFW Joined: Dec 22, 2004 Post Count: 2636 Status: Offline
I have quite a few machines running C4CW at the moment. Among them are Ubuntu, Win7, Vista and XP machines: quads, Core2 Duo, P4, Atom, so quite a variety. I have successfully processed 1,379 C4CWs and have had 15 inconclusives but no errors. All of the non-Ubuntu machines have Microsoft Security Essentials and the Windows firewall. It looks like you are overclocked, and perhaps you are a bit too close to the limit for entirely reliable operation. You might try backing off your settings, at least during the warmer periods. Just a suggestion.
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
I had 8 more computation errors this morning, all on C4CW. This box is overclocked and has been running with no errors for months. I originally had it running 24/7 at 3.8 GHz but decided to back off to 3.6 GHz to eliminate any potential stability problems. It doesn't crash or reboot. I have other boxes running at an equal or higher percentage overclock than this one without issues. I'm going to back off to 3.4 GHz and see what happens. I don't know what else to do, but I don't think the overclock is the culprit.
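To put the clock speeds mentioned in this post on the same "% overclocked" scale, here is a rough sketch. It assumes a stock clock of 2.8 GHz for the i7-860, which the thread itself does not state, and it ignores Turbo Boost; the function name is made up for illustration.

```python
# Assumption: 2.8 GHz stock clock for the i7-860 (not stated in the thread).
STOCK_GHZ = 2.8

def overclock_percent(clock_ghz: float, stock_ghz: float = STOCK_GHZ) -> float:
    """Return how far a clock speed is above stock, as a percentage of stock."""
    return (clock_ghz - stock_ghz) / stock_ghz * 100.0

# The three settings mentioned in the post: original, current, and planned.
for clock in (3.8, 3.6, 3.4):
    print(f"{clock:.1f} GHz -> {overclock_percent(clock):.1f}% over stock")
```

Under that assumption the three settings work out to roughly 36%, 29% and 21% over stock, so the planned 3.4 GHz step is still a substantial overclock.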
RT
Master Cruncher USA - Texas - DFW Joined: Dec 22, 2004 Post Count: 2636 Status: Offline
Of course it could be many more things than I know, but it sounded like that might be the cause. These things are often temperature sensitive, and I was thinking that the machine may be in a slightly warmer environment due to the weather or other factors, or may have accumulated some dust bunnies inside that could cause the temporary errors (which would cause continually increasing problems). In any case, I hope I have not caused a wild goose chase. Have a great day and thanks for crunching here at WCG.
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
I do appreciate any input. This box is water cooled, so heat is not an issue. Even at 3.8 GHz the temps hovered around 60 °C. At 3.6 GHz temps are in the mid 50s. It also has 3 other 120mm fans (2 in, 1 out on top). I'm beginning to think the culprit is the mobo. It's an ECS, which I traded for; I have never used one before this. The BIOS setup is the worst I've ever seen, but that's another topic. I backed the OC down to 3.4 GHz and we'll see what happens. I'm going to let it finish the cache of WUs it has and then switch it to Linux for a comparison test. Thanks for the help.