Thread Status: Active
Total posts in this thread: 9
This topic has been viewed 2006 times and has 8 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Unable to complete HPF2 WUs while FAAH works fine

Hello,

Yesterday I installed the latest BOINC version, 5.10.13, on my Windows XP machine. Until then I had been using UD Agent without problems, working on all available projects. Now I can't get an HPF2 WU to finish: every WU computes for a few minutes and then aborts with one of these two messages:

1.
"Result Log

<core_client_version>5.10.13</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
Failed to get VersionInfo size: 1812
ERROR:: Exit at: .\dock_structure.cc line:401

</stderr_txt>
]]>"



2.
"Result Log


<core_client_version>5.10.13</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
Failed to get VersionInfo size: 1812
sin_cos_range ERROR: 12328.990 is outside of [-1,+1] range
ERROR:: Exit at: .\utility/sin_cos_range.h line:66

</stderr_txt>
]]>"



FAAH WUs look fine:
"
...
Checkpoint complete

________________________________________________________________________________

autodock4: Successful Completion on "World Community Grid device"

________________________________________________________________________________

AutoDock finishing with return code: 0

</stderr_txt>
]]>"


For now I have set it to fetch FAAH work only, so as not to waste good CPU time.

Please help,
Best regards
[Aug 2, 2007 1:36:03 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Unable to complete HPF2 WUs while FAAH works fine

Hi Petre_Huica,

I think I'll report this to the technicians. It makes no sense, particularly as I've been running 5.10.13 on WXP Pro SP2 for quite a while without any issue on HPF2.

Can you go to the Result Status page and click on the Work Unit Name to see the list of 19 in the quorum? Can you tell us whether any other result is showing 'Error'? When reporting back, please post a work unit number like le965_xxxxxxx.

Stand by.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Aug 2, 2007 2:11:32 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Unable to complete HPF2 WUs while FAAH works fine

Hi Sekerob,

Regarding HPF2 WUs, I checked and they always end in "Error" status after a few minutes. Here's the first typical result:

WU lf123_00047:

Result Log

<core_client_version>5.10.13</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
Failed to get VersionInfo size: 1812
sin_cos_range ERROR: -34.036636 is outside of [-1,+1] range
ERROR:: Exit at: .\utility/sin_cos_range.h line:66

</stderr_txt>
]]>

The out-of-range value varies from one failure to the next (12328.990 in the earlier log, -34.036636 here).
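
To illustrate why this message points at bad arithmetic rather than bad input data, here is a minimal sketch of a bounds check of this kind. It assumes the check is a plain range test on a value that is supposed to be a sine or cosine; the function name is hypothetical and this is not the actual HPF2 source.

```python
import math


def checked_unit_value(x: float) -> float:
    """Return x if it is a plausible sine/cosine value, else raise ValueError.

    Hypothetical helper for illustration only; not taken from the HPF2 code.
    """
    if math.isnan(x) or not -1.0 <= x <= 1.0:
        raise ValueError(f"{x} is outside of [-1,+1] range")
    return x


# A sine computed correctly always lands inside the range, whatever the angle.
print(checked_unit_value(math.sin(12328.990)))

# The values reported in the result logs can never be a correct sine or cosine.
for bad in (12328.990, -34.036636):
    try:
        checked_unit_value(bad)
    except ValueError as err:
        print("rejected:", err)
```

Since no valid input can produce such a value, a failure here suggests that the machine computed something wrongly on the way to the check.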


Workunit Status

Project Name: Human Proteome Folding - Phase 2
Created: 07/31/2007 13:36:28
Name: lf123_00047
Minimum Quorum: 15
Initial Replication: 19


Result Name       Status       Sent Time            Time Due / Return Time  CPU Time (hours)  Claimed / Granted BOINC Credit
lf123_00047_19--  Valid        08/02/2007 00:09:12  08/02/2007 08:36:24     8.36              58.0 / 58.7
lf123_00047_9--   Valid        08/01/2007 13:42:51  08/01/2007 21:56:52     3.05              52.6 / 58.7
lf123_00047_8--   Valid        08/01/2007 13:42:41  08/02/2007 11:34:33     4.46              66.8 / 58.7
lf123_00047_1--   Valid        08/01/2007 13:42:37  08/02/2007 20:16:51     7.96              66.3 / 58.7
lf123_00047_5--   In Progress  08/01/2007 13:42:03  08/12/2007 13:42:03     0.00              0.0 / 0.0
lf123_00047_13--  Valid        08/01/2007 13:41:22  08/02/2007 07:38:39     6.24              46.7 / 58.7
lf123_00047_3--   Valid        08/01/2007 13:41:16  08/02/2007 00:28:03     8.72              61.3 / 58.7
lf123_00047_11--  Valid        08/01/2007 13:41:00  08/02/2007 05:12:15     5.18              40.9 / 58.7
lf123_00047_2--   Valid        08/01/2007 13:40:50  08/02/2007 17:26:37     4.03              51.7 / 58.7
lf123_00047_10--  Valid        08/01/2007 13:39:41  08/02/2007 04:53:03     9.97              43.7 / 58.7
lf123_00047_17--  In Progress  08/01/2007 13:39:33  08/12/2007 13:39:33     0.00              0.0 / 0.0
lf123_00047_6--   Valid        08/01/2007 13:38:51  08/02/2007 10:25:06     6.03              55.7 / 58.7
lf123_00047_0--   Error        08/01/2007 13:38:45  08/02/2007 00:06:47     0.15              1.4 / 0.0
lf123_00047_12--  Valid        08/01/2007 13:38:42  08/03/2007 15:52:58     9.38              76.8 / 58.7
lf123_00047_16--  In Progress  08/01/2007 13:38:35  08/12/2007 13:38:35     0.00              0.0 / 0.0
lf123_00047_7--   Valid        08/01/2007 13:38:35  08/02/2007 05:57:12     6.10              65.6 / 58.7
lf123_00047_4--   Valid        08/01/2007 13:38:08  08/02/2007 11:53:51     7.70              61.3 / 58.7
lf123_00047_15--  Valid        08/01/2007 13:37:52  08/02/2007 03:22:01     3.55              62.1 / 58.7
lf123_00047_14--  Valid        08/01/2007 13:37:29  08/01/2007 23:38:27     5.02              47.5 / 58.7
lf123_00047_18--  Valid        08/01/2007 13:37:12  08/02/2007 10:22:07     5.67              56.0 / 58.7

I'm the only one with "Error", and this is consistent with my other HPF2 results, apart from occasional "Error"s from others, which are rare (1, at most 2 per WU). But I'm always "erring", which is strange.
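
For anyone who wants to check a quorum list without reading it by eye, here is a rough Python sketch that tallies the Status column from rows pasted out of the Workunit Status page. The row layout is assumed from the lf123_00047 table; only a four-row excerpt is included, and the function name is made up for this example.

```python
import re
from collections import Counter

# Excerpt of rows copied from the lf123_00047 table; the other rows are omitted.
ROWS = """\
lf123_00047_19-- Valid 08/02/2007 00:09:12 08/02/2007 08:36:24 8.36 58.0 / 58.7
lf123_00047_5-- In Progress 08/01/2007 13:42:03 08/12/2007 13:42:03 0.00 0.0 / 0.0
lf123_00047_0-- Error 08/01/2007 13:38:45 08/02/2007 00:06:47 0.15 1.4 / 0.0
lf123_00047_9-- Valid 08/01/2007 13:42:51 08/01/2007 21:56:52 3.05 52.6 / 58.7
"""

# Status words seen in this thread; the match is case-sensitive on purpose,
# so "Valid" does not match inside "Invalid".
STATUS = re.compile(r"\b(Valid|Invalid|Inconclusive|Error|In Progress)\b")


def tally_statuses(text: str) -> Counter:
    """Count how many result rows carry each status."""
    counts = Counter()
    for line in text.splitlines():
        match = STATUS.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


print(tally_statuses(ROWS))
```

A single "Error" among many "Valid" results for the same work unit suggests the problem is on the erroring machine and not in the work unit itself.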



Here's the second typical result:

WU lf141_00023:

Result Log

<core_client_version>5.10.13</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
Failed to get VersionInfo size: 1812
ERROR:: Exit at: .\dock_structure.cc line:401

</stderr_txt>
]]>


Workunit Status


Project Name: Human Proteome Folding - Phase 2
Created: 07/31/2007 20:11:09
Name: lf141_00023
Minimum Quorum: 15
Initial Replication: 19


Result Name       Status       Sent Time            Time Due / Return Time  CPU Time (hours)  Claimed / Granted BOINC Credit
lf141_00023_19--  Valid        08/02/2007 00:38:20  08/02/2007 05:45:01     3.78              46.7 / 53.4
lf141_00023_15--  Valid        08/02/2007 00:25:50  08/02/2007 10:00:27     4.31              41.1 / 53.4
lf141_00023_0--   Valid        08/02/2007 00:25:28  08/02/2007 06:18:30     3.87              57.2 / 53.4
lf141_00023_10--  Valid        08/02/2007 00:24:37  08/03/2007 12:14:12     6.39              55.6 / 53.4
lf141_00023_13--  Valid        08/02/2007 00:24:02  08/02/2007 21:42:41     8.04              66.8 / 53.4
lf141_00023_6--   Error        08/02/2007 00:23:43  08/02/2007 00:34:08     0.06              0.5 / 0.0
lf141_00023_1--   Valid        08/02/2007 00:23:22  08/02/2007 13:28:49     4.62              42.6 / 53.4
lf141_00023_2--   Valid        08/02/2007 00:22:56  08/02/2007 08:51:43     5.06              45.2 / 53.4
lf141_00023_8--   In Progress  08/02/2007 00:21:48  08/13/2007 00:21:48     0.00              0.0 / 0.0
lf141_00023_11--  Valid        08/02/2007 00:21:43  08/03/2007 13:39:51     10.57             61.5 / 53.4
lf141_00023_16--  Valid        08/02/2007 00:21:36  08/02/2007 10:20:35     4.63              48.3 / 53.4
lf141_00023_17--  Valid        08/02/2007 00:20:58  08/02/2007 19:34:38     4.69              54.6 / 53.4
lf141_00023_12--  Valid        08/02/2007 00:20:02  08/03/2007 09:48:07     8.42              49.8 / 53.4
lf141_00023_5--   Valid        08/02/2007 00:19:48  08/02/2007 15:42:27     9.35              63.2 / 53.4
lf141_00023_3--   Valid        08/02/2007 00:18:35  08/03/2007 03:07:33     6.68              66.7 / 53.4
lf141_00023_4--   In Progress  08/02/2007 00:18:08  08/13/2007 00:18:08     0.00              0.0 / 0.0
lf141_00023_9--   Valid        08/02/2007 00:15:26  08/03/2007 12:00:16     28.59             58.2 / 53.4
lf141_00023_18--  Valid        08/02/2007 00:14:58  08/02/2007 14:17:53     4.81              41.9 / 53.4
lf141_00023_14--  Valid        08/02/2007 00:14:01  08/03/2007 05:06:17     5.06              56.8 / 53.4
lf141_00023_7--   Valid        08/02/2007 00:13:32  08/02/2007 11:15:43     4.39              54.1 / 53.4

As with the first one, I'm the only one with "Error".



I also checked the FAAH WUs in more detail and, contrary to what I said at first, they're not fine either. Here I get either "Invalid" or "Inconclusive" status, even though the result log always ends with:

"
...
Checkpoint complete

________________________________________________________________________________

autodock4: Successful Completion on "World Community Grid device"

________________________________________________________________________________

AutoDock finishing with return code: 0

</stderr_txt>
]]>"



One thing that may be worth mentioning: my device is a laptop. We had some hot weather in July, and I installed NHC (2.0 pre-release 6, 14.05.2007) in a bid to lower my CPU temperature by undervolting. I did it by the book, gradually working toward the sweet spot between stability and coolness. I tested each step extensively for stability, precisely because I was crunching and didn't want to return bad results. At the time I was running UD Agent, with no apparent errors. However, I don't know how (or whether it's possible) to see the result logs in UD, so I can't make a direct comparison with the current situation (running BOINC), but maybe this counts for something.

Petre
[Aug 3, 2007 9:24:56 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Unable to complete HPF2 WUs while FAAH works fine

This looks like a clear hardware error.

Sadly, UD does not report errors or invalid work, so this has probably been going on for some time.

I can't recommend undervolting your CPU. Start by resetting everything to the factory defaults, then you can consider alternative ways of cooling. Underclocking is safer than undervolting.
[Aug 3, 2007 9:42:13 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Unable to complete HPF2 WUs while FAAH works fine

The exact same thing happened to me.

You've lowered the voltage too far. The NHC testing doesn't put much stress on the CPU.

You can use Prime95 to test for proper voltage. If it fails with an error, HPF2 will fail too.
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 5, 2007 2:46:32 PM]
[Aug 5, 2007 2:45:15 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Unable to complete HPF2 WUs while FAAH works fine

Didactylos: thank you for the info, I will go up in voltage until I get good results.

Questar: what are your system specs, what was your default voltage, and how far were you able to go down in voltage and still be 100% stable (i.e. get good crunch results)?
[Aug 5, 2007 10:20:41 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Unable to complete HPF2 WUs while FAAH works fine

Didactylos: thank you for the info, I will go up in voltage until I get good results.

Questar: what are your system specs, what was your default voltage, and how far were you able to go down in voltage and still be 100% stable (i.e. get good crunch results)?


What is stable is going to differ from system to system.

Once I found the problem I didn't do much testing to see how low I could go; I just picked something that worked.

I'm running a 1.7 GHz Pentium M with these settings:
6x 0.732v
8x 0.812v
10x 1.084v
12x and above 1.132v
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 6, 2007 11:32:56 PM]
[Aug 6, 2007 11:31:40 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Unable to complete HPF2 WUs while FAAH works fine

I have a 1.8 GHz Pentium M and I only worry about the maximum multiplier (18x), because the CPU is always maxed out crunching. I used Prime95 and had to go from 1.052v up to 1.116v to be 100% stable (18h+ with no errors). I'm satisfied with this, and now I'm waiting for BOINC results to come in.
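
The step-up procedure described here can be sketched roughly as follows. This is only an illustration: the 16 mV step and the 1180 mV cap are assumptions and not values from this thread, and `pretend_stress_test` is a stand-in for a long Prime95 run judged by hand, not a real API.

```python
from typing import Callable, Optional


def find_stable_voltage(start_mv: int, max_mv: int, step_mv: int,
                        is_stable: Callable[[int], bool]) -> Optional[int]:
    """Raise the voltage one step at a time until is_stable accepts it.

    Returns the first passing voltage in millivolts, or None if max_mv is
    reached without a pass. Integers are used to avoid float rounding.
    """
    for mv in range(start_mv, max_mv + 1, step_mv):
        if is_stable(mv):
            return mv
    return None


def pretend_stress_test(mv: int) -> bool:
    """Stand-in for a real stress test: pretend 1116 mV and above passes."""
    return mv >= 1116


# Start at the failing 1.052v setting and step upward.
print(find_stable_voltage(1052, 1180, 16, pretend_stress_test))
```

Stepping upward from a known-bad setting finds the lowest voltage that passes the chosen test, so the result is only as trustworthy as the test itself.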
[Aug 7, 2007 8:32:42 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Unable to complete HPF2 WUs while FAAH works fine

Valid BOINC results are pouring in, HPF2 and FAAH alike :)
Problem solved, thx for your help.
[Aug 12, 2007 4:32:04 PM]