Total posts in this thread: 17
This topic has been viewed 32083 times and has 16 replies.
KLiK
Master Cruncher
Croatia
Joined: Nov 13, 2006
Post Count: 3108
Status: Offline
Re: Some general data about projects?

For "weight of the project", I would use the ratio of yesterday's stats for the project (or if you wanted to smooth it, the past 7 days).

You can get the daily history of each project via:
https://www.worldcommunitygrid.org/stat/viewP...PerPage=14&format=xml

Why would you use only 14 or 7 days stats?
Instead of 1 month stats?

What are the benefits?
What are the cons?


I said 7 because just yesterday is too small a sample size. If you choose a larger window, you will get a more stable sample, but it will take longer to reflect changes as the weight changes.

I've chosen to use:
( TotalProjectAverage + 30DaysProjectAverage + 14DaysProjectAverage + 7DaysProjectAverage ) / 4
Which will give me some influence from recent averages, but also stability over longer terms.
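
A minimal sketch of the blended average above, assuming the daily results for one project are already available as a plain list of numbers, oldest first. The function names and the sample history are made up for illustration and do not come from the WCG stats export.

```python
from statistics import fmean


def trailing_mean(daily, days):
    """Mean of the last `days` entries (or of all entries if fewer exist)."""
    return fmean(daily[-days:])


def blended_average(daily):
    """(total average + 30-day + 14-day + 7-day averages) / 4."""
    windows = [len(daily), 30, 14, 7]
    return fmean(trailing_mean(daily, n) for n in windows)


# Hypothetical history: 53 quiet days followed by a busier last week.
history = [100.0] * 53 + [200.0] * 7
print(blended_average(history))
```

For this made-up history the result is roughly 146.25: it sits well above the all-time mean because three of the four terms lean on recent days, but below the 7-day mean of 200.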
----------------------------------------
oldies:UDgrid.org & PS3 Life@home


non-profit org. Play4Life in Zagreb, Croatia
[Jun 15, 2018 1:40:20 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Some general data about projects?

The percentage complete for the projects on the research page is based roughly on the number of "batches" that we have finished on the project compared to the total number of batches that the scientists have estimated.


Kevin,
If you know the above items, can that information be published in a table format where column "a" is the project short name, column "b" is the estimated number of batches, column "c" is the number of batches we have finished, and column "d" would be "column c / column b" as a percentage?

I can only speak for myself, but I believe we understand that column "b" will go up or down as more information is acquired from the researchers and as a result column "d" will change accordingly. Additionally, add one other column that is the project start date. That should be enough information for members to "swizzle" any way they want to arrive at estimated completion date, batches per unit of time, etc. In my opinion, this would be more useful and factual than the information currently displayed on the Research Page.
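
A rough sketch of how members could "swizzle" such a table, assuming it were published with exactly the columns proposed above. The `ProjectRow` type, the helper names, the "DEMO" project and its numbers are all hypothetical; no such table exists on the website today.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from math import ceil


@dataclass
class ProjectRow:
    short_name: str          # column a
    estimated_batches: int   # column b
    finished_batches: int    # column c
    start_date: date         # the extra start-date column

    @property
    def percent_complete(self):
        """Column d: column c / column b, as a percentage."""
        if self.estimated_batches <= 0:
            return 0.0
        return 100.0 * self.finished_batches / self.estimated_batches


def batches_per_day(row, today):
    """Average batches finished per calendar day since the start date."""
    elapsed = (today - row.start_date).days
    return row.finished_batches / elapsed if elapsed > 0 else 0.0


def estimated_completion(row, today):
    """Naive projection: the remaining batches at the average rate so far."""
    rate = batches_per_day(row, today)
    if rate == 0:
        return None  # nothing finished yet, so no projection is possible
    remaining = max(0, row.estimated_batches - row.finished_batches)
    return today + timedelta(days=ceil(remaining / rate))


# Made-up row, not real project data.
row = ProjectRow("DEMO", 10000, 2500, date(2018, 1, 1))
print(row.percent_complete, estimated_completion(row, date(2018, 4, 11)))
```

If column "b" is revised later, rerunning the same functions on the updated row gives the new percentage and date; nothing else needs to change.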
[Jun 16, 2018 10:50:38 PM]
KLiK
Master Cruncher
Croatia
Joined: Nov 13, 2006
Post Count: 3108
Status: Offline
Re: Some general data about projects?

The percentage complete for the projects on the research page is based roughly on the number of "batches" that we have finished on the project compared to the total number of batches that the scientists have estimated.


Kevin,
If you know the above items, can that information be published in a table format where column "a" is the project short name, column "b" is the estimated number of batches, column "c" is the number of batches we have finished, and column "d" would be "column c / column b" as a percentage?

I can only speak for myself, but I believe we understand that column "b" will go up or down as more information is acquired from the researchers and as a result column "d" will change accordingly. Additionally, add one other column that is the project start date. That should be enough information for members to "swizzle" any way they want to arrive at estimated completion date, batches per unit of time, etc. In my opinion, this would be more useful and factual than the information currently displayed on the Research Page.

+1
[Jun 17, 2018 8:39:53 AM]
seippel
Former World Community Grid Tech
Joined: Apr 16, 2009
Post Count: 392
Status: Offline
Re: Some general data about projects?

calculate those percentages of active project on Research page.


The percentage complete for the projects on the research page is based roughly on the number of "batches" that we have finished on the project compared to the total number of batches that the scientists have estimated. There are a couple of assumptions built into this forecast:

  • That we actually know how many batches will be run
  • That the work in each batch is equally difficult
  • That the number of jobs in each batch is on average the same


Those factors affect how long a project will take, and any estimate that is computed will change over time (the researchers learn things as they go along and change what they want to run through the grid, sometimes adding things and sometimes removing things; that is the nature of research).

I noticed when I run the percentages as "past days / total amount of days" that the percentages don't add up. So that's why I asked; thanks for telling me the data & calcs.

Still, I would have liked to pick up those percentages from the web...


I spoke from memory, so I had a few things wrong. We use the data I mentioned above to compute the estimated completion dates. However, you should be able to compute % complete from:
(current date - start date) / (estimated completion date - start date)
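
As a minimal sketch, the date formula above could be written like this in Python. The function name and the example dates are invented for illustration, and clamping the result to 0..100 is an added assumption, not something stated about the research page.

```python
from datetime import date


def percent_complete(start, estimated_end, today):
    """(current date - start date) / (estimated completion date - start date), in %."""
    total_days = (estimated_end - start).days
    if total_days <= 0:
        raise ValueError("estimated completion date must be after the start date")
    elapsed_days = (today - start).days
    # Assumption: clamp so a not-yet-started or overdue project stays within 0..100.
    return max(0.0, min(100.0, 100.0 * elapsed_days / total_days))


# Made-up dates: a two-year project checked exactly one year in.
print(percent_complete(date(2017, 1, 1), date(2019, 1, 1), date(2018, 1, 1)))
```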

As stated before, there's quite a "big discrepancy" in % when I calculate using only dates; check the image here:

Check difference in rows J & K.
J is the % from Research page.
K is the formula based % from "computed days / total amount of days".


Are you trying to subtract out the "paused days"? If so, that might be the discrepancy. The percent complete on the research page doesn't do anything like that; it's just a percentage of days run from the start date to the estimated end date (it also gets cached/rounded, so I wouldn't be concerned if it's off by only a percent).
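
To see how much subtracting paused days can move the number, here is a small sketch. It assumes one possible way of doing the subtraction (removing paused days from the elapsed days only); the function name, the dates and the 73 paused days are all made up.

```python
from datetime import date


def percent_by_dates(start, estimated_end, today, paused_days=0):
    """Percent of the planned span that has elapsed, optionally minus paused days."""
    total_days = (estimated_end - start).days
    # Assumption: paused days are removed from the elapsed days only.
    elapsed_days = (today - start).days - paused_days
    return 100.0 * elapsed_days / total_days


start, end, today = date(2017, 1, 1), date(2019, 1, 1), date(2018, 1, 1)
plain = percent_by_dates(start, end, today)                   # no subtraction
active = percent_by_dates(start, end, today, paused_days=73)  # paused days removed
print(plain, active, plain - active)
```

With these invented inputs the two methods differ by about ten percentage points, so a gap of a few percent could come from the subtraction alone.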

Seippel
[Jun 18, 2018 10:26:57 PM]
seippel
Former World Community Grid Tech
Joined: Apr 16, 2009
Post Count: 392
Status: Offline
Re: Some general data about projects?



Kevin,
If you know the above items, can that information be published in a table format where column "a" is the project short name, column "b" is the estimated number of batches, column "c" is the number of batches we have finished, and column "d" would be "column c / column b" as a percentage?

I can only speak for myself, but I believe we understand that column "b" will go up or down as more information is acquired from the researchers and as a result column "d" will change accordingly. Additionally, add one other column that is the project start date. That should be enough information for members to "swizzle" any way they want to arrive at estimated completion date, batches per unit of time, etc. In my opinion, this would be more useful and factual than the information currently displayed on the Research Page.


The estimates on the research page need to be revisited. Part of the problem is that while sometimes the total estimated batches for a project can be expected to be reasonably accurate, for many projects there just isn't enough information to give an accurate estimate early on. I don't think we should propagate that data any further than it already is on the website until we have a better idea what the next revision of the research page estimates will look like.

Seippel
[Jun 18, 2018 10:33:33 PM]
KLiK
Master Cruncher
Croatia
Joined: Nov 13, 2006
Post Count: 3108
Status: Offline
Re: Some general data about projects?

Are you trying to subtract out the "paused days"? If so, that might be the discrepancy. The percent complete on the research page doesn't do anything like that; it's just a percentage of days run from the start date to the estimated end date (it also gets cached/rounded, so I wouldn't be concerned if it's off by only a percent).

Seippel

Yes, when calculating my ECD I'm subtracting "inactive days", as no calcs were done on those days. But that concerns the discrepancy in the ECDs.

The percentage is calculated from the start of the project to its end, without subtraction. So I don't know why there's such a gap for some projects. I'm not concerned about most projects, but some of them have a big deviation:
- SCC1: 3%
- FAHB: 2%
- FAAH: 9%
[Jun 19, 2018 1:07:53 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Some general data about projects?



Kevin,
If you know the above items, can that information be published in a table format where column "a" is the project short name, column "b" is the estimated number of batches, column "c" is the number of batches we have finished, and column "d" would be "column c / column b" as a percentage?

I can only speak for myself, but I believe we understand that column "b" will go up or down as more information is acquired from the researchers and as a result column "d" will change accordingly. Additionally, add one other column that is the project start date. That should be enough information for members to "swizzle" any way they want to arrive at estimated completion date, batches per unit of time, etc. In my opinion, this would be more useful and factual than the information currently displayed on the Research Page.

The estimates on the research page need to be revisited. Part of the problem is that while sometimes the total estimated batches for a project can be expected to be reasonably accurate, for many projects there just isn't enough information to give an accurate estimate early on. I don't think we should propagate that data any further than it already is on the website until we have a better idea what the next revision of the research page estimates will look like.

Seippel

I think you are focused too much on "total" accuracy. I think there is a general understanding that there will be a lot of unknowns, especially in the early stages of projects. I don't think the researchers themselves know with any kind of certainty how many batches they will run over a period of two years or more. They may start out with only 10,000 batches, but 6 to 8 months later provide 4,000 more and then later decide they don't need to run the last 500 batches (does MCM1 ring a bell?). That's OK. Just publish what you know at the time. I think we are potentially missing a lot of good information just because someone is afraid it might not be totally accurate at some point. If it is accurate at the time of publication, that's all that can be asked. When you calculate the estimated run time now to come up with the percentage on the Research Page, you don't know if they will add or delete any batches that ultimately affect the percentages. It won't be any different the other way. It would get you out of the calculation business. Just publish the basic known (at the time) information and let the members do the calculations in whatever way fits their needs.
[Jun 19, 2018 5:38:31 PM]