Thread Status: Active
Total posts in this thread: 90
This topic has been viewed 394028 times and has 89 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

Got one that was server aborted, not a trace in the message logs of that [why?] except for the RtR acknowledgement [because I have this log flag on], and most persistently misleading, these SA's keep being reported on the RS pages as "Error" instead of Server Aborted.
.
.
Not a second of computing time, yet a 122.6 "claim" [all the more proof that claims are not client issued... what's the computation behind this?]; that is of course another point that could use a fix [so as not to confuse the starting cruncher].

Still here for the long haul!

I noticed too that there is not a single line in the message log when a task is server aborted.
The 'Error' on your results page is caused by the incompatibility between the exit codes of BOINC client versions 6 and 7.
You will have v7 running, and the WCG server code has not yet been updated to interpret all v7 exit codes correctly.
I had a server aborted task too, and by chance it was on the only host where I still have BOINC 6 running.
On my results page that task is shown as "Server aborted", because 2 wingmen had already returned theirs and mine had not started.

BETA_ faah38222_ ZINC12553895_ xPR_ wC6_ 11_ 1ref9_ 02_ 2-- 711 Server Aborted 13/02/13 21:38:35 14/02/13 08:00:33 0.00 122.3 / 0.0
BETA_ faah38222_ ZINC12553895_ xPR_ wC6_ 11_ 1ref9_ 02_ 1-- 711 Valid 13/02/13 21:38:20 14/02/13 07:25:14 4.84 129.5 / 105.1
BETA_ faah38222_ ZINC12553895_ xPR_ wC6_ 11_ 1ref9_ 02_ 0-- 711 Valid 13/02/13 21:38:18 14/02/13 07:54:52 5.53 80.6 / 105.1

As you can see, there is also a claim without anything having been done.
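
For anyone who wants to spot such zero-time claims in rows copied from the results page, here is a minimal Python sketch. The column meanings (app version, status, sent, returned, CPU hours, claimed / granted) are my reading of the rows above, not a documented format, and `parse_row` and `zero_time_claims` are made-up helper names.

```python
import re

# Assumed column layout, inferred from the pasted rows (not an official format):
# name, app version, status, sent, returned, CPU hours, claimed / granted
ROW = re.compile(
    r"^(?P<name>.+?--)\s+(?P<app>\d+)\s+(?P<status>.+?)\s+"
    r"(?P<sent>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<returned>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<hours>\d+\.\d+)\s+(?P<claimed>\d+\.\d+)\s*/\s*(?P<granted>\d+\.\d+)$"
)


def parse_row(line):
    """Split one pasted result row into a dict; raise if it does not fit."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    row = match.groupdict()
    # The copy/paste inserted spaces after each underscore in the task name.
    row["name"] = row["name"].replace(" ", "")
    for key in ("hours", "claimed", "granted"):
        row[key] = float(row[key])
    return row


def zero_time_claims(rows):
    """Return the rows that claim credit without any CPU time."""
    return [r for r in rows if r["hours"] == 0.0 and r["claimed"] > 0.0]


ROWS = [
    "BETA_ faah38222_ ZINC12553895_ xPR_ wC6_ 11_ 1ref9_ 02_ 2-- 711 Server Aborted 13/02/13 21:38:35 14/02/13 08:00:33 0.00 122.3 / 0.0",
    "BETA_ faah38222_ ZINC12553895_ xPR_ wC6_ 11_ 1ref9_ 02_ 1-- 711 Valid 13/02/13 21:38:20 14/02/13 07:25:14 4.84 129.5 / 105.1",
    "BETA_ faah38222_ ZINC12553895_ xPR_ wC6_ 11_ 1ref9_ 02_ 0-- 711 Valid 13/02/13 21:38:18 14/02/13 07:54:52 5.53 80.6 / 105.1",
]

parsed = [parse_row(line) for line in ROWS]
flagged = zero_time_claims(parsed)
for r in flagged:
    print(r["name"], r["status"], r["claimed"])
```

With the three rows above, only the server aborted copy is flagged.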

(moi) knows that there's a patch for that, and the techs did mod something so the SA's of a client 7 do not impact the reliability rating [5 sequential valid for a science]. Suspect that the "different" exit code sent back by the v7 client is the reason there's no SA message logged in the v7 Event log. One way or the other, it's either modding the server to signal the v7 client some soothing information [removing the need to visit the RS pages], or applying the server 70x status code patch, which may break something in the past modding, that being the reason it was not applied. That may be one reason WCG is not ready to endorse client 7, but the new app version number 7.11 indicates the science is compiled to deal with all the available v7 API features. Would all sciences maybe have to pass through that re-compile cycle? AFAICS, there's no backward/forward incompatibility for any v6.xx app version with v7 clients.
[Feb 14, 2013 12:44:13 PM]
gomeyer
Senior Cruncher
USA
Joined: Jul 11, 2008
Post Count: 161
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

Total of 24 WU's received and returned. All but 3 have validated, those 3 are still pending wingperson(s).
----------------------------------------

[Feb 14, 2013 1:04:16 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1316
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

(moi) knows that there's a patch for that, and the techs did mod something so the SA's of a client 7 do not impact the reliability rating [5 sequential valid for a science]. Suspect that the "different" exit code sent back by the v7 client is the reason there's no SA message logged in the v7 Event log. One way or the other, it's either modding the server to signal the v7 client some soothing information [removing the need to visit the RS pages], or applying the server 70x status code patch, which may break something in the past modding, that being the reason it was not applied. That may be one reason WCG is not ready to endorse client 7, but the new app version number 7.11 indicates the science is compiled to deal with all the available v7 API features. Would all sciences maybe have to pass through that re-compile cycle? AFAICS, there's no backward/forward incompatibility for any v6.xx app version with v7 clients.

The exit code returned by BOINC 7 could be a different one, but it could also simply be unknown to the server.
The SIMAP admin, Richard Haselgrove and I fixed this for SIMAP 6 months ago by including a newer result.inc from trunk.
<!-- $Id: result.inc 25873 2012-07-13 22:19:26Z boincadm $ -->
After that the exit code 202 was interpreted correctly by the server as Exit status 202 (0xca) EXIT_ABORTED_BY_PROJECT
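
To show what "interpreted correctly" amounts to, here is a small Python sketch that formats an exit code the same way as the line above. It is not the code from result.inc; `KNOWN_EXIT_CODES` and `describe_exit_status` are hypothetical names, and the table holds only the one code confirmed in this thread.

```python
# Only the code confirmed in this thread; other codes are deliberately left out.
KNOWN_EXIT_CODES = {
    202: "EXIT_ABORTED_BY_PROJECT",
}


def describe_exit_status(code):
    """Format a non-negative exit code as decimal, hex and symbolic name."""
    name = KNOWN_EXIT_CODES.get(code, "unknown to this table")
    return f"Exit status {code} (0x{code:x}) {name}"


print(describe_exit_status(202))
# prints: Exit status 202 (0xca) EXIT_ABORTED_BY_PROJECT
```

A server whose table lacks the code would fall into the "unknown" branch, which is presumably where the generic "Error" comes from.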
----------------------------------------

[Feb 14, 2013 1:08:52 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

Well, we have a handshake problem in this area :D

Meanwhile, upgraded from the test client 7.0.47 to 7.0.52 alpha in the middle of an FAAH Beta run of 3 tasks, cold turkey: just ran the installer over it. Noted the Elapsed + CPU time just before and immediately after for these 3, and they resumed properly from their last checkpoints. A good sign.
[Feb 14, 2013 1:40:44 PM]
kateiacy
Veteran Cruncher
USA
Joined: Jan 23, 2010
Post Count: 1027
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

One has restarted correctly at a checkpoint after I turned the computer off and back on. Two have returned and validated.

The rest are running, and running much longer than regular FA@H WUs.
----------------------------------------

[Feb 14, 2013 1:45:06 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

Testing the maximum number of dockings that can be packed in a task?

(Just realized that the test WU numbers are higher than what is currently fed in production, where exp.41 is now around batch 38150... The current FAAH tasks are running the shortest on average in at least 1 year... http://bit.ly/WCGFAH )

edit: As noted in the exp.41 announcement thread, the stats suggest the techs are prepping for a clean cut-over, first by lowering the work availability for FAAH, which reduces the number in circulation that have to come back. We'll learn when the time comes, but those with the old app and custom AV security settings might want to receive a warning, also because it has been a long-running version.
----------------------------------------
[Edit 1 times, last edit by Former Member at Feb 14, 2013 1:55:12 PM]
[Feb 14, 2013 1:52:05 PM]
CandymanWCG
Senior Cruncher
Romania
Joined: Dec 20, 2010
Post Count: 421
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

One has restarted correctly at a checkpoint after I turned the computer off and back on. Two have returned and validated.

The rest are running, and running much longer than regular FA@H WUs.


And here I was thinking I was the only one running behind on the 1 WU I got on my dual core laptop. I even convinced myself that it must be because I am using both cores full throttle (the other WU being a GFAM). Oh, well... almost 50% done in 5h30 of real CPU time (almost 7 hours of run time). Won't be long now...

On a different topic, I did complete and validate 4 WUs on my PC. Tried suspending and resuming a couple of them with LAIM on, all worked fine. Then did a cold restart of BOINC Manager (just hit exit, picked the option to terminate all ongoing tasks, then restarted it) with no issues.

A reboot of the laptop during this other WU also didn't break anything.

Hope this helps... Oh, almost forgot: when this WU gets done and validated, I will have 2 full days of Beta completed! I can almost smell the Bronze badge... as if.
----------------------------------------
Knowledge is limited. Imagination encircles the world! - Albert Einstein



[Feb 14, 2013 2:15:10 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

We will be releasing more work units later today. We have been able to reproduce the graphics application bug for 64 bit versions. We are releasing a new version that does not include the graphics for the 64 bit applications.

This test will also help with the homogeneous_app_version settings. This will make sure that resends will still use the old version 7.11 while all new work units will run 7.14.
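
As a rough illustration of the behaviour described (a simplified model, not the actual BOINC scheduler code), the rule can be sketched in Python like this. `WorkUnit`, `pinned_version`, `pick_app_version` and the work unit names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkUnit:
    name: str
    # Assumed model: the version is fixed when the first copy goes out.
    pinned_version: Optional[int] = None


def pick_app_version(wu, latest_version):
    """Return the app version to use for the next copy of this work unit."""
    if wu.pinned_version is None:
        # New work unit: pin it to whatever is current right now.
        wu.pinned_version = latest_version
    # Resend: every further copy keeps the pinned version.
    return wu.pinned_version


old_wu = WorkUnit("beta_old", pinned_version=711)  # first copies already ran 7.11
new_wu = WorkUnit("beta_new")                      # nothing sent yet

print(pick_app_version(old_wu, 714))  # 711, the resend stays on the old version
print(pick_app_version(new_wu, 714))  # 714, new work gets the new version
```

The point of pinning is that all copies compared in one quorum come from the same application version.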

Should be another 3000+ work units sent 3 times with quorum of 2.

Thanks,
-Uplinger
[Feb 14, 2013 4:55:07 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

Testing the maximum number of dockings that can be packed in a task?

I don't think so. My lappie went into panic mode when they finished 'cos they took so long, so the estimates were way off. And when I looked at the Status, I saw that I was awarded very low points -- not just lower than was requested, but lower than the two wingpeople got. Very odd:

BETA_ faah38221_ ZINC03373719_ xPR_ wC6_ 11_ 1ref9_ 02_ 2-- 711 Valid 13/02/13 18:42:40 14/02/13 06:00:24 10.01 171.0 / 178.3
BETA_ faah38221_ ZINC03373719_ xPR_ wC6_ 11_ 1ref9_ 02_ 1-- 711 Valid 13/02/13 18:30:03 14/02/13 12:47:22 10.90 185.5 / 178.3
BETA_ faah38221_ ZINC03373719_ xPR_ wC6_ 11_ 1ref9_ 02_ 0-- 711 Valid 13/02/13 18:29:51 14/02/13 15:39:01 19.38 433.8 / 141.5

The ones that ran on my P4 ran as expected (and it's already picked up some resends).

Update: The lappie just picked up a couple of NEW beta units, quorum 2 replication 2. It will be interesting to see how these go.
----------------------------------------
[Edit 1 times, last edit by Former Member at Feb 14, 2013 5:06:00 PM]
[Feb 14, 2013 4:55:53 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: FightAIDS@Home Beta Test Feb 13, 2013 (Issues Thread)

Not odd, as it takes a while for the server to figure out the proper credit for a new science app. The initial ones got over-awarded, which is how the system works... start high, then torque it down and down and down, to the lowest common denominator.

If your lappie went into panic, it may be on client 6? On client 7 there's the <dont_use_dcf/> flag that has been set by WCG [see client_state.xml], so whatever moves up or down in TTC, the projected times remain the same [so that outlier long-running tasks don't cause cache inflation]. Been watching this on two v7 clients for 8 days now... always the same TTC for the same science... 9:25 hours, not a second up or down.
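
A toy Python model of the difference, assuming the client simply scales a task's base estimate by a per-project duration correction factor (DCF) unless the flag is set. The real client logic is more involved; `projected_hours` and the numbers are purely illustrative.

```python
def projected_hours(base_estimate_hours, dcf, dont_use_dcf):
    """Toy estimate: ignore the correction factor when the flag is set."""
    if dont_use_dcf:
        return base_estimate_hours
    return base_estimate_hours * dcf


base = 9.0  # made-up base estimate in hours
for dcf in (1.0, 2.0):  # DCF before and after a long-running outlier
    with_dcf = projected_hours(base, dcf, dont_use_dcf=False)
    without_dcf = projected_hours(base, dcf, dont_use_dcf=True)
    print(dcf, with_dcf, without_dcf)
```

With the flag set the projection stays put however far the factor drifts, which matches the constant TTC seen on the v7 clients.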

edit: This time was primed to "utilize" a trick in the v7 client to get these 7.14 betas, when technically the hosts are over-committed [running in total panic mode, and running 100% of time when 99.7% on]... all devices loaded to the brim, v7.14, all x86_64 builds :P
----------------------------------------
[Edit 3 times, last edit by Former Member at Feb 14, 2013 5:14:24 PM]
[Feb 14, 2013 5:07:19 PM]