Index  | Recent Threads  | Unanswered Threads  | Who's Active  | Guidelines  | Search
 

Quick Go »
No member browsing this thread
Thread Status: Active
Total posts in this thread: 17
Posts: 17   Pages: 2   [ Previous Page | 1 2 ]
[ Jump to Last Post ]
Post new Thread
Author
Previous Thread This topic has been viewed 4134 times and has 16 replies Next Thread
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7850
Status: Offline
Project Badges:
Reply to this Post  Reply with Quote 
Re: Playing around with stats

Ingenious anhhai. I have used this technique elsewhere but did not think of it for this application.
Thank you.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Jan 10, 2012 11:47:40 PM]   Link   Report threatening or abusive post: please login first  Go to top 
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Reply to this Post  Reply with Quote 
Re: Playing around with stats

So you guys like simple and complicated formula... SUBSTITUTE is nice compared to REPLACE combined with FIND and nested at that to change commas for dots and dots for commas and thousand signs for spaces [see Dashboard for product], setting it up so that the locale does not matter, Swiss ' or Amercan ',', I'd actually not insert ' but rather remove the offending characters and let the spreadsheet screen formatting handle the display. Then a sheet can be send to a friend in Canada too or France and not necessitating them in changing the signs in all the formulas.

Anyway, here's a few sample headspinners from the Chart stats engine:

=IF(IF(INDIRECT("U"&ROW())+INDIRECT("P"&ROW())=0;INDIRECT("C"&ROW());IF(INT(IF(INDIRECT("U"&ROW())>0;INDIRECT("U"&ROW());INDIRECT("P"&ROW())))=IF(INDIRECT("U"&ROW())>0;INDIRECT("U"&ROW());INDIRECT("P"&ROW()));IF(INDIRECT("U"&ROW())>0;INDIRECT("U"&ROW());INDIRECT("P"&ROW()));IF(INDIRECT("U"&ROW())>0;INDIRECT("U"&ROW());INDIRECT("P"&ROW()))*1000))<INDIRECT("C"&ROW());1000;1)*IF(INDIRECT("U"&ROW())+INDIRECT("P"&ROW())=0;INDIRECT("C"&ROW());IF(INT(IF(INDIRECT("U"&ROW())>0;INDIRECT("U"&ROW());INDIRECT("P"&ROW())))=IF(INDIRECT("U"&ROW())>0;INDIRECT("U"&ROW());INDIRECT("P"&ROW()));IF(INDIRECT("U"&ROW())>0;INDIRECT("U"&ROW());INDIRECT("P"&ROW()));IF(INDIRECT("U"&ROW())>0;INDIRECT("U"&ROW());INDIRECT("P"&ROW()))*1000))

This needed to handle the fact that WU totals at WCG are displayed in arbitrary format, where there is 321 results, but also 4,321 results. Which is bigger to you and to me? What if you have 4,320. On import the zero goes so would your interpreter know it's 4.32 or 4,320? "INDIRECT" is very powerful because you can compute a grid location in a matrix or make dynamic cellname references.

=IF(ROUND(SUM(INDIRECT(N$85&"T"):INDIRECT(ADDRESS(RWS-59;COLUMN(INDIRECT(N$85&"T"));;;"DPD")));1)=0;"~";REPLACE(TEXT(AVERAGE(INDIRECT(N$85&"F"):INDIRECT(ADDRESS(RWS-59;COLUMN(INDIRECT(N$85&"F"));;;"DPD")));"0,0");FIND(",";TEXT(AVERAGE(INDIRECT(N$85&"F"):INDIRECT(ADDRESS(RWS-59;COLUMN(INDIRECT(N$85&"F"));;;"DPD")));"0,0");1);1;".")&IF(AVERAGE(INDIRECT(N$85&"F"):INDIRECT(ADDRESS(RWS-59;COLUMN(INDIRECT(N$85&"F"));;;"DPD")))>AVERAGE(INDIRECT(ADDRESS(RWS-1;COLUMN(INDIRECT(N$85&"F"));;;"DPD")):INDIRECT(ADDRESS(RWS-60;COLUMN(INDIRECT(N$85&"F"));;;"DPD")));HI;LO))

This one computes the 60 day rolling average TFL for any science or overall by dipping into the DPD database. (There are about 60 columns with historic data computed from just 3 input fields per science). Obviously when dealing with starting sciences with less than 60 days of sequential results, you need to cater for that too else you get skewed averages.

=IF(INDIRECT(DL1849&"C7")=0;0;ROUND(IF(COUNTIFS(INDIRECT(EC1852&RWS-20):INDIRECT(DL1849&"T");">=0,001")<21;(INDIRECT(DL1849&"PUC")-INDIRECT(DL1852&RWS-21))/21/TFLCNV;TRIMMEAN(INDIRECT(DL1849&"F"):INDIRECT(ADDRESS(RWS-23;COLUMN(INDIRECT(DL1849&"F"))));0,1));5))

The above determines the 21 day average and kicks out any outliers [when the knreed stats machine has hickups and burps].

It may all look like goo and woosh, but when you see the charts, this is what pulls it together [with a bunch of slow VBA to spit out the performance charts [15 in 7 seconds, which used to be 24 seconds until recently, until reading the detail release notes on pre-declaring variable types]. In those release notes there was also the new IFERROR which cleanly handles the failed condition (#NA #DIV #VALUE and the like). Rather than testing if a condition fails, and if not, repeating the formula, this one passes the positive outcome and else can produce and alternate, such as just nothing or a note of warning like the below when a WU name prefix is not in the table:

=IFERROR(VLOOKUP(UPPER(TRIM(LEFT(TRIM(A93);2)));WUMonikers[#All];2;0);"WU type not known in table")

For you to figure why there is that second nested TRIM in there... I cant ad hoc remember :D

Of course, 12 codejockeys, 13 products is the standard rule, so don't ask me why it came out the way it did... it produces the needed result. The RTFMs are the most ignored part of any piece of software, no denying that.

--//--

P.S. In the second last sample, DL1849 is where the project moniker is stored, so when that field holds FAAH and F (Flops) is the variable you want to collect, INDIRECT will look in the field with the name FAAHF. Would I change DL1849 to hold DSFL, it would go out to fetch DSFLF. By running a Do-Till VBA and substituting that field with the monikers, it's easy to generate reports on different sciences in a single sheet. Flick of the wrist and the next ABCD science is created in a blond moment.... maybe next month after :P
----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 11, 2012 1:45:13 PM]
[Jan 11, 2012 1:43:42 PM]   Link   Report threatening or abusive post: please login first  Go to top 
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Reply to this Post  Reply with Quote 
Re: Playing around with stats

thanks a lot
[Jan 11, 2012 3:33:58 PM]   Link   Report threatening or abusive post: please login first  Go to top 
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Reply to this Post  Reply with Quote 
Re: Playing around with stats

For Hypernova,

Was doctoring around with your 'Substitute' to see how it could be made to function ''all purpose'', importing into Excel and dealing with the dropped ending zeros if the number does not arrive as text [values under 1 million]. The solution is I think nifty and elegant, AND fault tolerant, superior to messing with the mentioned Replace/Find functions:

=SUBSTITUTE(TEXT(H123;"0,000");",";"")*1)

In this version, the number is ''as-is'' converted to text if arriving as a number. When it's less than 3 digits behind the last comma, it is end-filled with zeros. If a number such as 123,456 arrives in number, not text format it is fine, but when 123,450 arrives as number, the zero gets dropped [not when it's 7,123,450 though which Excel sees as text anyhow when using European locale].

Substitute in Excel has a 4th parm btw where you can set which instance needs to be replaced. With e.g. an additional ;2) at end, you can tell only to remove the second comma in the 7,123,450 sample to become 7,123450 which then in a single processing action sizes down the value to millions. Again the fault tolerance is that if there is no 2nd comma in the value, it will be ignored, but then you'd end up with a number such as 123,456 as being 123,456 million. With 'len' it's verifiable... then the number could be divvied by 1000 to get 0,123456 million. Testing is 9/10th of the coding effort.

Hope this helps, for whoever hobbies with stats in spreadsheets.

--//--

P.S. Not checked if these functions work directly in open office or libre-office in the identical fashion. Found many differences when working with Calc.
[Jan 15, 2012 11:19:23 AM]   Link   Report threatening or abusive post: please login first  Go to top 
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Reply to this Post  Reply with Quote 
Re: Playing around with stats

addendum: Before I forget, somehow [don't know about OO or LO] Excel does not recognize if the intend of processing a text stored value is to get a number with which you can calculate. Simply add *1 at end of the formula, as else summing a column or row might end you up with the wrong totals.

--//--
[Jan 15, 2012 11:23:46 AM]   Link   Report threatening or abusive post: please login first  Go to top 
mecole
Advanced Cruncher
Joined: Jun 18, 2011
Post Count: 55
Status: Offline
Reply to this Post  Reply with Quote 
Re: Playing around with stats

I wrote a couple of quick VBA functions that you can add to an Excel workbook and then call just like standard functions.

1. wcgTimeInSeconds - this converts the WCG time format into seconds - pretty straight forward.

2. wcgTimeSum - enter a range of cells (like ordinary SUM) and you get the total time in seconds

3. wcgTimeFormat - this converts seconds into the standard WCG time fomat y:ddd:hh:mm:ss

You can use them together and save yourself the time of having to build lots of crazy formulas ...

=wcgTimeFormat((wcgTimeSum(B2:B1653)))
this adds the range of cells and displays total time in WCG format

you can even subtract (for things like outliers)
=wcgTimeFormat((wcgTimeSum(B2:B1652)- wcgTimeSum(B1653)))

If anyone is interested I will post up the code. I'm thinking of creating a library of functions so we can use all of the standard functions against WCG time ... things like
=MAX(wcgTimeInSeconds("B2:B1532"))

the trick here is that if you pass in a single cell to wcgTimeInSeconds then is gives you back the single value, if you pass in a range of cells it hands back a range that has converted value that other functions can act on. I have not written this last piece yet but am wondering if there is interest here also?

Maybe functions like ...

pretty much like wcgTimeInSeconds
wcgTimeInYears, wcgTimeInDays, wcgTimeInHours, wcgTimeInMinutes

wcgYears(this returns only the year ... kind of like the built-in functions) wcgDays, wcgHours, wcgMinute, wcgSecond

Ok ... I'm getting carried away ... the other issue I read in this thread is international numeric formatting ... I can build convereters if anyone would like, I may even be able to do this based on your PC settings so you don't have to pas a lot of info to the function(s).


TIA,
Steve
----------------------------------------

----------------------------------------
[Edit 5 times, last edit by mecole at Jan 15, 2012 6:05:59 PM]
[Jan 15, 2012 5:52:07 PM]   Link   Report threatening or abusive post: please login first  Go to top 
jonnieb-uk
Ace Cruncher
England
Joined: Nov 30, 2011
Post Count: 6105
Status: Offline
Project Badges:
Reply to this Post  Reply with Quote 
Re: Playing around with stats



If anyone is interested I will post up the code. I'm thinking of creating a library of functions so we can use all of the standard functions against WCG time ...
Steve


Sounds good to me, Steve, especially the idea of a library of functions. But I'm just a minnow when it comes to WCG stats so hopefully there'll be some comments from the long established statisticians.
----------------------------------------

To Join follow this link: Join the UK Team All Welcome! UK Team thread
[Jan 19, 2012 6:49:54 PM]   Link   Report threatening or abusive post: please login first  Go to top 
Posts: 17   Pages: 2   [ Previous Page | 1 2 ]
[ Jump to Last Post ]
Post new Thread