Thread Status: Active
Total posts in this thread: 37
This topic has been viewed 6946 times and has 36 replies
rembertw
Senior Cruncher
Belgium
Joined: Nov 21, 2005
Post Count: 275
Re: Why did wu error?

info for the techs:

Between 2010.02.24 and today, 2010.02.25, I noticed 7 errors on different computers, and about 45-60 validated units. The same machines had both errors and validated results. Example of an errored and a validated result on the same machine:

CMD2_0349-MYH3.clustersOccur-1FNT_B.clustersOccur_36_92351_93220_1-- validated.

CMD2_0349-MYH3.clustersOccur-2JIK_A.clustersOccur_5_33112_35731_34586_35731_0-- errored.

Maybe this helps with the troubleshooting.
[Feb 25, 2010 8:42:10 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043

Who's running a 64-bit OS / client? Looking it up I see 6.2.28, suggesting these are the 32-bit clients, but on what OS?

My 32-bit duo continues to return valid results, while the 64-bit results turn to mush. It so happens that W7 had an out-of-cycle patch day which required a reboot.

RICE/HFCC/HPF2 continue to validate, but the CMD2s all go bad, either directly upon return or when the wingman's result comes in to kick off the validation.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Feb 25, 2010 8:45:27 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

Getting some validations from later WUs.

Project Name: Help Cure Muscular Dystrophy - Phase 2
Created: 21/02/10
Name: CMD2_0351-MYH3.clustersOccur-1H6V_C.clustersOccur_411
Minimum Quorum: 2
Replication: 2



Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
CMD2_0351-MYH3.clustersOccur-1H6V_C.clustersOccur_411_1-- | 614 | Valid | 25/02/10 05:55:29 | 25/02/10 15:49:42 | 4.72 | 99.9 / 88.5
CMD2_0351-MYH3.clustersOccur-1H6V_C.clustersOccur_411_0-- | 614 | Valid | 25/02/10 05:55:22 | 25/02/10 21:14:11 | 4.91 | 77.1 / 88.5


Running Windows 7 64-bit, BOINC 6.10.18
[Feb 25, 2010 9:42:41 PM]
Col323
Senior Cruncher
Joined: Nov 4, 2008
Post Count: 372

My errored WUs were from my sole Win 7 box: 64-bit BOINC on a 64-bit OS. However, of the 0349 variety I show 23 Valid and only 2 Error. Ten of the valid ones were returned after my first error appeared.
[Feb 25, 2010 9:55:17 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

Well, I may as well add to the list. These are from a computer that has been reliable in the past:

CMD2_0349-MYH3.clustersOccur-2PA2_A.clustersOccur_24_130174_130838_130495_130563_1-- Error 2/24/10 19:01:53 2/25/10 01:04:46 0.12 2.1 / 0.0

CMD2_0349-MYH3.clustersOccur-2HJH_A.clustersOccur_230_743648_744512_1-- Error 2/24/10 11:32:26 2/24/10 19:01:53 4.21 73.8 / 0.0

CMD2_0349-MYH3.clustersOccur-1GK4_D.clustersOccur_154_789013_791274_0-- Error 2/24/10 07:22:49 2/24/10 14:28:57 3.32 58.3 / 0.0

Judging from the errored WUs above, this appears to be occurring right now in the 349/350 WUs.

It also appears these were in the same boat as Sek's: they errored out when the wingman returned their unit, so the error occurred in the validation process.

The last two especially appear to have run normally.

Edited for: added text
----------------------------------------
[Edit 3 times, last edit by Former Member at Feb 25, 2010 10:57:39 PM]
[Feb 25, 2010 10:53:45 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

FYI, the above WUs were on a 32-bit client running WinXP Pro, so it isn't just happening to 64-bit clients. There was nothing in the log that indicated anything unusual.
----------------------------------------
[Edit 1 times, last edit by Former Member at Feb 25, 2010 11:18:11 PM]
[Feb 25, 2010 11:17:15 PM]
GB033533
Senior Cruncher
UK
Joined: Dec 8, 2004
Post Count: 206

I notice that for the first of my three recent 'errors', the two extra copies sent out have now been returned and validated, so it wasn't a rogue WU. They also had similar runtimes and claims to mine and my original wingman's:

CMD2_0350-MYH3.clustersOccur-1WAK_A.clustersOccur_381_456662_457253_3-- 614 Valid 2/25/10 01:43:27 2/25/10 06:56:14 3.03 55.3 / 66.4
CMD2_0350-MYH3.clustersOccur-1WAK_A.clustersOccur_381_456662_457253_2-- 614 Valid 2/25/10 01:43:26 2/25/10 23:37:48 3.06 77.6 / 66.4
CMD2_0350-MYH3.clustersOccur-1WAK_A.clustersOccur_381_456662_457253_1-- 614 Error 2/24/10 07:02:41 2/25/10 01:36:57 3.10 58.4 / 0.0
CMD2_0350-MYH3.clustersOccur-1WAK_A.clustersOccur_381_456662_457253_0-- 614 Error 2/24/10 07:01:10 2/24/10 21:19:19 3.64 56.7 / 0.0 <-- mine

I know my original 'error' WU that started this thread is a lost cause, but I'm still hoping I might receive credit for the more recent three before they drop off the results list. Otherwise that gold badge remains a distant hope.
[Feb 26, 2010 9:45:49 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043

I think there was a mishap entirely limited to the validators. There has been no word from the techs, so I can't tell why 18 CPU hours of my quad went south at validation in that time frame, somewhere between late on the 24th and early on the 25th. Glad I had some other project's work to switch to when I first observed this. Jobs I suspended halfway and released later finished properly.

From the stats, the afternoon session returned to business-as-usual numbers, so the mic is over to the people that know.
[Feb 26, 2010 9:55:23 AM]
Col323
Senior Cruncher
Joined: Nov 4, 2008
Post Count: 372

GB033533 said:
I notice that for the first of my three recent 'errors', the two extra copies sent out have now been returned and validated, so it wasn't a rogue wu. And they had similar runtimes and claim to mine and my original wingman

Yes, my CMD2_0349-MYH3.clustersOccur-2ASS_A.clustersOccur_77 unit has done exactly as yours, so at least we aren't in an infinite loop, even if you and I don't get credit for completing these.

On a related note, I see another one of my boxen has picked up a few _2 and _3 copies of WUs which match this behavior. At least something in my little arsenal is still trustworthy.
[Feb 26, 2010 1:19:15 PM]
GB033533
Senior Cruncher
UK
Joined: Dec 8, 2004
Post Count: 206

Oh well. I guess the techs are all preoccupied with DDDT2, so they aren't too interested in problems with the validator for HCMD2.

All three of my 'errored' WUs have now been successfully validated by the two extra copies, so they will have dropped off the results list by Monday. I'll just have to accept the time is lost...
[Feb 26, 2010 9:55:20 PM]