Thread Status: Active
Total posts in this thread: 16
Posts: 16   Pages: 2   [ Previous Page | 1 2 ]
This topic has been viewed 6067 times and has 15 replies
courine
Master Cruncher
Capt., Team In2My.Net Cmd. HQ: San Francisco
Joined: Apr 26, 2007
Post Count: 1794
Status: Offline
Re: Please Help! NRW Ver6.15

To localize the problem I have, I set the cache to only 2 days and run only 1 project at a time, then cycle the projects in the device profile at 1-week intervals. If you have one machine, the graphs in "My Grid" should provide enough info to judge where to crunch. If you have more machines (of various types), you need to look at them individually in the device manager and adjust accordingly.

I'm doing this because of a different problem: the number of hours on my static, grid-dedicated machines doesn't match the number of hours reported. But that is for another thread, when I have more data to present.
[May 30, 2008 9:03:37 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Please Help! NRW Ver6.15

Sample from a few machines with different OSes, under very severe load that includes a heavy math project which, if run concurrently, raises the CPU temperature by several degrees C. These systems have not been rebuilt since coming online, BUT are not used for gaming! Zero error results since version 6.15 (the project has the lowest load and error rate of all WCG projects; see the Matrix in the Start Here forum).

Result Name  Device  Status  Sent Time  Return Time  CPU Time (hours)  Claimed / Granted BOINC Credit
R12_ 410fb2965b3374164369b4f41503b086_ 22_ 0-- A Valid 05/27/2008 13:38:47 05/31/2008 10:52:13 8.00 123.0 / 146.3
R12_ 2e3c257e57e41d137f9c3231f2235619_ 06_ 5-- A Valid 05/27/2008 12:35:28 05/31/2008 09:59:01 8.01 123.1 / 138.2
R12_ 1a0281e7e4852f2b5ed8bbd80fcffc1b_ 03_ 18-- A Valid 05/27/2008 08:03:24 05/30/2008 16:28:34 8.00 122.8 / 138.6
R12_ 1452daf3af307ec3e30e7b4fd8de2b85_ 13_ 8-- A Valid 05/27/2008 08:02:11 05/30/2008 16:28:34 8.01 122.9 / 141.5
R12_ 0a4f40d1901e47d0a6eac8fd3f1ec2f7_ 14_ 2-- A Valid 05/27/2008 08:00:54 05/30/2008 13:50:31 8.01 122.8 / 136.6
R11_ f9929d14a027a37885dddeb7f9710902_ 03_ 7-- A Valid 05/27/2008 05:39:22 05/30/2008 13:44:50 8.02 123.0 / 131.9
R11_ b2690254d70b36b6a180558e533e6707_ 22_ 10-- A Valid 05/26/2008 19:27:00 05/30/2008 07:52:20 8.01 123.1 / 134.3
R11_ f9929d14a027a37885dddeb7f9710902_ 05_ 11-- A Valid 05/27/2008 05:38:10 05/30/2008 07:19:04 8.01 123.0 / 141.5
R11_ bd3e2c6aaae0104ed769c801e3f1d73e_ 06_ 18-- A Valid 05/26/2008 23:15:29 05/30/2008 05:56:07 8.01 123.3 / 130.1
R11_ d09a8a449d1420be2d6de05bc7726f8e_ 12_ 1-- A Valid 05/26/2008 23:12:17 05/30/2008 05:47:59 8.01 123.0 / 143.5
R11_ d0989c8bc3a4869435407b322a8171b9_ 12_ 14-- A Valid 05/26/2008 23:04:22 05/30/2008 05:39:29 8.02 123.4 / 133.2
R11_ af2ef1aab6cdb103efa24f176a545614_ 07_ 6-- A Valid 05/26/2008 18:16:38 05/29/2008 21:53:01 8.01 122.9 / 150.7
R11_ b3382b4bd2caeded7c440463d4857f84_ 03_ 17-- A Valid 05/26/2008 18:10:23 05/29/2008 21:43:44 8.01 122.9 / 131.9
R11_ a64f28b9efdf95a321eb4d3002ea7da6_ 01_ 7-- A Valid 05/26/2008 18:08:12 05/29/2008 21:35:02 8.00 122.8 / 146.3
R11_ a64f28b9efdf95a321eb4d3002ea7da6_ 09_ 16-- A Valid 05/26/2008 15:32:15 05/29/2008 15:11:35 8.01 122.8 / 127.1
R11_ 88b0c23afb44c053edd40f0f90cf75b2_ 18_ 4-- A Valid 05/26/2008 14:51:50 05/29/2008 13:50:20 8.01 122.9 / 132.1
R11_ 8de93d5b1e9ee865bb8b37a4e47f4d02_ 03_ 15-- A Valid 05/26/2008 14:47:38 05/29/2008 13:40:44 8.01 122.8 / 124.6
R11_ 973dfe1d1be670450c8f9b6b3a0d1b55_ 20_ 1-- A Valid 05/26/2008 14:46:28 05/29/2008 13:32:16 8.01 123.0 / 140.2
R11_ 8a25573d233af029adffd33da174ca4d_ 01_ 12-- A Valid 05/26/2008 12:53:00 05/29/2008 06:32:35 8.01 122.9 / 120.9
R11_ 8c2d9f1e17b18733cd4a0881d6468433_ 09_ 7-- A Valid 05/26/2008 12:20:44 05/29/2008 05:04:31 8.01 122.9 / 126.4
R10_ ee0452bf0cc615d7d34a6e2c119d780d_ 01_ 4-- A Valid 05/25/2008 08:05:21 05/29/2008 04:55:52 8.01 122.9 / 128.2
R11_ 80fa1d89cdb0b73926c86893b616274c_ 22_ 17-- A Valid 05/26/2008 12:18:26 05/29/2008 04:44:30 8.02 123.1 / 126.8
R10_ ec0e9ac3423077384468e289296f4b6b_ 08_ 4-- A Valid 05/25/2008 08:04:01 05/28/2008 21:19:53 8.00 122.8 / 129.2
R10_ ec0e9ac3423077384468e289296f4b6b_ 16_ 16-- A Valid 05/25/2008 08:00:42 05/28/2008 19:56:34 8.02 123.3 / 128.5
R10_ ec0e9ac3423077384468e289296f4b6b_ 14_ 4-- A Valid 05/25/2008 08:00:42 05/28/2008 19:45:01 8.03 124.1 / 125.7
R10_ 9283d1af84edb700ed27bb62c0ffa2ec_ 18_ 11-- A Valid 05/24/2008 20:49:32 05/28/2008 19:45:01 8.02 124.0 / 141.1
R10_ 95c9301a7c04980be6ab5a05e6b4a94b_ 04_ 7-- A Valid 05/24/2008 19:53:49 05/28/2008 12:21:23 8.02 123.5 / 148.4
R10_ 8f133400c7900d114e455ad90ef64cec_ 18_ 6-- A Valid 05/24/2008 18:45:13 05/28/2008 10:58:32 8.01 123.8 / 136.3
R10_ 6531350f80861ef55fa7601af963145d_ 21_ 16-- A Valid 05/24/2008 15:02:06 05/28/2008 10:52:53 8.00 123.4 / 146.1
R10_ 5ce6e56b9f57b42a681e3ec7acd86906_ 15_ 15-- A Valid 05/24/2008 12:28:47 05/28/2008 10:43:13 8.02 123.4 / 143.6

R11_ 5c2cd1e4d0bf3ab44b90f11dd2897233_ 12_ 3-- B Valid 05/26/2008 08:00:07 05/29/2008 23:00:56 8.01 82.8 / 87.6
R11_ 22b04faaf37cec3c8ed734a6d0222840_ 14_ 11-- B Valid 05/25/2008 22:40:44 05/29/2008 19:01:52 8.00 82.8 / 82.9
R02_ d49baa6e03394c16329aba662c117aed_ 31_ 18-- B Valid 05/25/2008 16:23:10 05/28/2008 21:07:42 8.02 83.0 / 91.6
R00_ 820ec1ad7253215f3a9a3ae5bbe3ca41_ 29_ 8-- B Valid 05/25/2008 11:25:00 05/28/2008 12:34:54 8.00 81.6 / 76.3
R10_ ee0452bf0cc615d7d34a6e2c119d780d_ 00_ 17-- B Valid 05/25/2008 08:03:49 05/28/2008 08:48:30 8.00 81.6 / 87.5
R10_ 82be1c2e849f205e46092f5cc3ee9b28_ 17_ 5-- B Valid 05/24/2008 17:36:22 05/26/2008 08:01:19 8.02 81.7 / 86.9
R10_ ec0e9ac3423077384468e289296f4b6b_ 18_ 16-- B Valid 05/25/2008 08:00:03 05/26/2008 08:01:19 8.04 82.0 / 87.8
R10_ 63dc0986d3931a6ea9b003575ebd3af8_ 20_ 4-- B Valid 05/24/2008 13:10:32 05/25/2008 19:32:20 8.01 81.8 / 86.1
R10_ 3d7fcdf3253dbfd674633f626de9fc16_ 19_ 16-- B Valid 05/24/2008 07:47:05 05/25/2008 13:01:58 8.00 81.6 / 92.7
R10_ 37d28bbd35edc83cf8a8befa045a425a_ 09_ 2-- B Valid 05/24/2008 07:45:56 05/25/2008 08:39:25 8.01 81.7 / 92.0
R09_ ccc8144ff19f02771e8d9ed94e6be465_ 17_ 4-- B Valid 05/23/2008 16:41:49 05/25/2008 08:03:49 8.00 81.6 / 82.9
R08_ bfc398609cb7f7e5e58f41ab401f6ab8_ 06_ 12-- B Valid 05/22/2008 08:01:19 05/24/2008 21:04:14 8.01 81.8 / 84.0
R08_ bcc0cc32d1ecb2053af90d5b8de4c690_ 22_ 7-- B Valid 05/22/2008 08:00:06 05/24/2008 13:19:35 8.00 82.0 / 86.8
R08_ 8a8c9c5822851b0b6212718c2017794a_ 05_ 10-- B Valid 05/21/2008 23:35:32 05/24/2008 09:14:55 8.02 81.9 / 99.9
R08_ 7a828cf9eac836d06ed70245563507db_ 03_ 15-- B Valid 05/21/2008 21:48:05 05/24/2008 07:47:05 8.00 81.6 / 94.4
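
Listings like the two above can be summarised per device with a short script. This is only a sketch: it assumes each line ends with CPU hours followed by claimed / granted credit, and that the token after the `--` is the device label. The names `parse_results` and `summarize` are made up for illustration and are not part of any WCG or BOINC tool.

```python
import re
from statistics import mean

# One result line: name (ends in "--"), device, status, two timestamps,
# CPU hours, then "claimed / granted" credit. Column meaning is assumed.
LINE = re.compile(
    r"^(?P<name>.+?--)\s+(?P<device>\S+)\s+(?P<status>\w+)\s+"
    r"(?P<sent>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<returned>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<hours>\d+\.\d+)\s+(?P<claimed>\d+\.\d+)\s*/\s*(?P<granted>\d+\.\d+)$"
)


def parse_results(text):
    """Return one dict per line that looks like a result row; skip the rest."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m is None:
            continue
        row = m.groupdict()
        for key in ("hours", "claimed", "granted"):
            row[key] = float(row[key])
        rows.append(row)
    return rows


def summarize(rows):
    """Group rows by device and report counts and mean claimed/granted credit."""
    by_device = {}
    for row in rows:
        by_device.setdefault(row["device"], []).append(row)
    return {
        device: {
            "results": len(items),
            "errors": sum(1 for r in items if r["status"] != "Valid"),
            "mean_claimed": mean(r["claimed"] for r in items),
            "mean_granted": mean(r["granted"] for r in items),
        }
        for device, items in by_device.items()
    }


# Four rows copied from the listings above.
SAMPLE = """\
R12_ 410fb2965b3374164369b4f41503b086_ 22_ 0-- A Valid 05/27/2008 13:38:47 05/31/2008 10:52:13 8.00 123.0 / 146.3
R12_ 2e3c257e57e41d137f9c3231f2235619_ 06_ 5-- A Valid 05/27/2008 12:35:28 05/31/2008 09:59:01 8.01 123.1 / 138.2
R11_ 5c2cd1e4d0bf3ab44b90f11dd2897233_ 12_ 3-- B Valid 05/26/2008 08:00:07 05/29/2008 23:00:56 8.01 82.8 / 87.6
R11_ 22b04faaf37cec3c8ed734a6d0222840_ 14_ 11-- B Valid 05/25/2008 22:40:44 05/29/2008 19:01:52 8.00 82.8 / 82.9
"""

if __name__ == "__main__":
    for device, stats in summarize(parse_results(SAMPLE)).items():
        print(device, stats)
```

Pasting a full listing into `SAMPLE` gives the same per-device breakdown; lines that do not match the assumed layout are silently skipped.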

Without further words
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[May 31, 2008 2:59:15 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Please Help! NRW Ver6.15

@MacDitch & techs

Had a series of NRW tasks fail today with your same error, all going down within the first few minutes (it's actually the only project running on the remote box). It was on Vista with a test BOINC, but that's not important, as it had been running since it came out some weeks ago. What I was able to reconstruct is that they all occurred during a very sluggish remote-control session. Possibly a loss of 'heartbeat' or similar, though none of the logs anywhere indicate this. More likely is a security issue during the session.

<core_client_version>6.2.2</core_client_version>
<![CDATA[
<message>
Funzione non corretta. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
wcg_seed 208279988
running time: 83.444935
wcg_seed 959677790
running time: 164.347053
wcg_seed 363867003
could not open file best1.pdb

</stderr_txt>
]]>
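
When several logs need checking, the stderr text can be scanned for the same pattern instead of being read by eye. A rough sketch, assuming the log lines look exactly like the ones above (`wcg_seed N`, `running time: T`, `could not open file NAME`); what `running time` measures is not stated in the log, so the script only counts those lines. The function name `summarize_stderr` is invented for this example.

```python
import re


def summarize_stderr(stderr_text):
    """Count seed and running-time lines and pick out a missing-file error."""
    seeds = re.findall(r"^wcg_seed (\d+)[ \t]*$", stderr_text, flags=re.MULTILINE)
    times = re.findall(r"^running time: ([\d.]+)[ \t]*$", stderr_text, flags=re.MULTILINE)
    failure = re.search(r"^could not open file (\S+)[ \t]*$", stderr_text, flags=re.MULTILINE)
    return {
        "seeds_started": len(seeds),
        "running_time_lines": len(times),
        "missing_file": failure.group(1) if failure else None,
    }


# The stderr_txt section from the log above.
STDERR_SAMPLE = """\
wcg_seed 208279988
running time: 83.444935
wcg_seed 959677790
running time: 164.347053
wcg_seed 363867003
could not open file best1.pdb
"""

if __name__ == "__main__":
    print(summarize_stderr(STDERR_SAMPLE))
```

A log with more `wcg_seed` lines than `running time` lines, plus a non-empty `missing_file`, matches the failure shown here.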

All 8 logs I looked into showed a failure within the first few seeds. Anyway, all jobs after the RC session were perfect again. Just so people know: it might be a good idea to snooze/suspend BOINC for the duration of such a session rather than cycling through jobs.
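
For anyone who prefers to script the snooze, here is a minimal sketch. It assumes the `boinccmd` command-line tool that ships with the BOINC client is on the PATH, that the installed version supports `--set_run_mode never <duration>`, and that the client accepts the local connection; none of this was tested on the test client mentioned above. The helper names `snooze_command` and `snooze` are made up.

```python
import subprocess


def snooze_command(seconds, boinccmd="boinccmd"):
    """Build the argv asking the client to stop computing for `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # With a duration, the run mode is temporary and the client later
    # returns to its previous permanent mode.
    return [boinccmd, "--set_run_mode", "never", str(int(seconds))]


def snooze(seconds, boinccmd="boinccmd"):
    """Run the command; raises CalledProcessError if boinccmd reports failure."""
    subprocess.run(snooze_command(seconds, boinccmd), check=True)


if __name__ == "__main__":
    # Only print the command here; call snooze(...) to actually run it.
    print(" ".join(snooze_command(2 * 3600)))
```

Calling `snooze(2 * 3600)` just before starting a remote session would pause crunching for two hours.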

So, I learned something not to do remotely on that Vista screamer.

ciao

PS: not all seemed to suffer the same fate. The jobs that started a wee bit before the RC session finished properly.
R00013_ 1739de49b982dcc700be979b462759b8_ 19_ 14-- 361642 Valid 05/28/2008 19:43:46 06/01/2008 14:14:32 8.00 122.8 / 145.2
R00013_ d4972088283242460d07bc05de79b246_ 10_ 13-- 361642 Error 05/29/2008 17:33:01 06/01/2008 13:13:40 0.07 1.1 / 0.0
R00013_ 961fae06805f3b3b88cd213e3b6500e2_ 14_ 4-- 361642 Error 05/29/2008 13:49:01 06/01/2008 13:09:04 0.06 0.9 / 0.0
R00013_ aafad40d0f84d1c98347fdfaf77c0918_ 06_ 15-- 361642 Error 05/29/2008 13:07:17 06/01/2008 12:02:20 0.09 1.4 / 0.0
R00013_ 1739de49b982dcc700be979b462759b8_ 16_ 13-- 361642 Valid 05/28/2008 19:27:02 06/01/2008 11:55:49 8.02 123.1 / 131.1
R00013_ 81cfb2bc3d48c02678b88077b209de62_ 12_ 18-- 361642 Error 05/29/2008 08:05:01 06/01/2008 11:45:57 0.04 0.6 / 0.0
R00013_ 7ff865a7f3531623b291b3b3698c157c_ 13_ 3-- 361642 Error 05/29/2008 06:29:15 06/01/2008 11:43:25 0.07 1.1 / 0.0
R00013_ 627ece30a0f9e2c9dcf27da92325a1e0_ 13_ 9-- 361642 Error 05/29/2008 06:13:58 06/01/2008 11:38:44 0.01 0.2 / 0.0
R00013_ 627ece30a0f9e2c9dcf27da92325a1e0_ 16_ 2-- 361642 Error 05/29/2008 06:24:09 06/01/2008 11:38:44 0.01 0.2 / 0.0
R00013_ 52f5e82827df7f327d046020092fea2e_ 24_ 11-- 361642 Error 05/28/2008 22:11:13 06/01/2008 11:38:44 0.05 0.8 / 0.0

[Jun 1, 2008 6:00:33 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Please Help! NRW Ver6.15

Sorry for the week or two of silence.

I took the NRW files to an offline cruncher and it crunched them all fine - so still don't know what is wrong with this machine, but I'll just keep it off NRW for the time being.

The new problem is that while they were being crunched, WCG marked them as in Error (for no reason I can see), and despite them all completing, uploading & reporting successfully, they remain shown as 'Error' with no information or credit awarded. This has happened to about 20 work units now, so I've swapped off NRW for the moment.

Anyone know why this is happening?
[Jun 11, 2008 4:34:40 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Please Help! NRW Ver6.15

No idea how you did this:
"I took the NRW files to an offline cruncher and it crunched them all fine"

This implies that the tasks were on 2 machines, with the original receiving machine having a brief go at them and communicating with the servers? Taking job files across is a daunting task even for experts, so usually they make a copy of the complete install, take it to an offline cruncher, keep the client on the source machine in limbo, crunch the lot on the sneaker machine, and afterwards transport the whole set back.
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jun 11, 2008 4:58:54 PM]
[Jun 11, 2008 4:57:38 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Please Help! NRW Ver6.15

No idea how you did this:
"I took the NRW files to an offline cruncher and it crunched them all fine"

This implies that the tasks were on 2 machines, with the original receiving machine having a brief go at them and communicating with the servers? Taking job files across is a daunting task even for experts, so usually they make a copy of the complete install, take it to an offline cruncher, keep the client on the source machine in limbo, crunch the lot on the sneaker machine, and afterwards transport the whole set back.

Bad wording on my part - I did indeed move the entire install (as you call it) to the offline machine.
To the best of my knowledge the 'receiving' machine didn't communicate with WCG in between, but I guess it could have. Hmm, not a problem I've seen before, and I've been using an offline cruncher for a while. Never mind, I'll just not use it for WCG but move it back to CPDN.
[Jun 12, 2008 10:51:24 AM]