Thread Status: Active
Total posts in this thread: 17
This topic has been viewed 24808 times and has 16 replies
KLiK
Master Cruncher
Croatia
Joined: Nov 13, 2006
Post Count: 3108
Some general data about projects?

D&G,
unfortunately, since GDPR most of the data collection has been rendered useless, including SNURK wcgsig. From the SNURK data I had been collecting "project % completion" & "weight of projects".

I'm still able to get general project data such as ECD (estCompletionDate) or releaseDate over the API, as those channels are not blocked - it's not user data, but general project data.
I had to make some minor changes to the table I run for the ECD topic here.

But I'm still missing some critical elements for full automation of the process (& removing human error) of looking at project data & WCG in general - note that NO USER DATA WILL BE USED.
To be frank with you, I'd like to use the API to access, over JSON (or some other way), the following data:
- project % completion - as it can't be imported from the Research page
- weight of projects on the WCG grid, as we have it here on the SNURK page

Can anyone help me with that?
Admins, Techs - can you give me links for import? Or some suggestions?

Thanks
----------------------------------------
oldies:UDgrid.org & PS3 Life@home


non-profit org. Play4Life in Zagreb, Croatia
[Jun 8, 2018 1:34:17 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Some general data about projects?

KLiK,

Here are some pages that could be useful:

Project Stats in XML:
https://www.worldcommunitygrid.org/stat/viewP...tName=mip1&format=xml
(only available in XML at the moment)

Project History in XML:
https://www.worldcommunitygrid.org/stat/viewP...PerPage=60&format=xml
(only available in XML at the moment)

Project Info in JSON:
https://www.worldcommunitygrid.org/api/project?shortName=scc1
(only available in JSON)

Do these give you what you need?
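
A minimal sketch of pulling the Project Info JSON with Python's standard library. The endpoint and the shortName parameter come from the link above; the helper names project_info_url and fetch_project_info are made up for illustration, and since the layout of the returned JSON is not shown in this thread, the code simply returns whatever was parsed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the "Project Info in JSON" link above.
API_BASE = "https://www.worldcommunitygrid.org/api/project"


def project_info_url(short_name: str) -> str:
    """Build the Project Info URL for a project short name such as 'scc1'."""
    return f"{API_BASE}?{urlencode({'shortName': short_name})}"


def fetch_project_info(short_name: str, timeout: float = 30.0):
    """Download and parse the JSON document (its structure is not assumed here)."""
    with urlopen(project_info_url(short_name), timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_project_info("scc1") needs network access; project_info_url("scc1") only builds the URL and can be checked offline.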
[Jun 8, 2018 4:33:36 PM]
KLiK
Re: Some general data about projects?

Thank you knreed.

I managed to have a look now & got the weight of the project in % from StatisticsAverages.RunTimePerDay, which is part of the question I asked, so thank you for that.
Though, I've also found out that StatisticsAverages.RunTimePerDay of all projects don't add up to Global statistics. How come? Can you check that out?

Also, could not find out how you calculate those percentages of active project on Research page.
Can you elaborate more on that?
Where can I get:
- those percentage % of the project finished?
- or some Total estimated time for the project (for example 50.000 years of projected run time)?
- or some Projected timeline of the project?

Thanks in advance
[Jun 13, 2018 8:30:34 AM]
knreed
Re: Some general data about projects?

Though, I've also found out that StatisticsAverages.RunTimePerDay of all projects don't add up to Global statistics. How come? Can you check that out?


The run time per day is computed as: total run time / number of days. For global this is 1,623,478 years divided by the days since Sept 20, 2004 (the first day in the db that has a stat): 1,623,478 years / 5,016 days = 323 years 240 days per day.

For OpenZika this is 49,702.5 years divided by the days since May 18, 2016 (the first day in the db that has a stat for the project): 49,702.5 years / 757 days = 65 years 240 days per day.

Because they measure different periods of time, the sum of the projects' CPU time per day should not be expected to equal the global total of CPU time per day.
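
The arithmetic above can be reproduced with a short sketch. The totals and day counts are the ones quoted in this post; the function name run_time_per_day is only illustrative, and the result is kept in decimal years per day rather than converted to "years and days".

```python
def run_time_per_day(total_years: float, days: float) -> float:
    """Average run time per calendar day, in years of CPU time per day."""
    if days <= 0:
        raise ValueError("days must be positive")
    return total_years / days


# Figures quoted in the post above.
global_avg = run_time_per_day(1_623_478, 5016)   # whole grid, since Sept 20, 2004
project_avg = run_time_per_day(49_702.5, 757)    # single project, since May 18, 2016

print(f"{global_avg:.2f} {project_avg:.2f}")  # prints "323.66 65.66"
```

Each average divides by its own number of days, which is why per-project values added together are not comparable to the global value.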

calculate those percentages of active project on Research page.


The percentages complete for the projects on the Research page are based roughly on the number of "batches" that we have finished on the project compared to the total number of batches that the scientists have estimated. There are a couple of assumptions built into this forecast:

  • That we actually know how many batches will be run
  • That the work in each batch is equally difficult
  • That the number of jobs in each batch is on average the same


Those factors affect how long a project will take, and any estimate that is computed will change over time (the researchers learn things as they go along and change what they want to run through the grid, sometimes adding things, sometimes removing things - that is the nature of research).
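
Taking the batch description above at face value, the ratio could be sketched like this. The function name and the batch counts are hypothetical; the real counts are not published in this thread.

```python
def batch_progress(batches_finished: int, batches_estimated: int) -> float:
    """Percent complete as finished batches over the estimated total."""
    if batches_estimated <= 0:
        raise ValueError("estimated batch count must be positive")
    return 100.0 * batches_finished / batches_estimated


# Hypothetical numbers: same finished work, but the estimated total is revised upward.
before = batch_progress(300, 1200)  # 25.0
after = batch_progress(300, 1500)   # 20.0
print(before, after)
```

The second call shows how the percentage can drop without any work being lost, simply because the researchers revised the estimated total.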
[Jun 14, 2018 1:58:47 PM]
knreed
Re: Some general data about projects?

For "weight of the project", I would use the ratio of yesterday's stats for the project (or if you wanted to smooth it, the past 7 days).

You can get the daily history of each project via:
https://www.worldcommunitygrid.org/stat/viewP...PerPage=14&format=xml
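
A rough sketch of that ratio, assuming "weight" means each project's share of the run time summed over all projects for the last N days. The XML layout of the history page is not shown here, so parsing is left out and the input is a plain dict of daily values (oldest first); the project names and numbers are hypothetical.

```python
def project_weights(daily_run_time: dict[str, list[float]], days: int = 7) -> dict[str, float]:
    """Share of total run time per project over the last `days` entries."""
    if days < 1:
        raise ValueError("days must be at least 1")
    recent = {name: sum(values[-days:]) for name, values in daily_run_time.items()}
    total = sum(recent.values())
    if total == 0:
        return {name: 0.0 for name in recent}
    return {name: value / total for name, value in recent.items()}


# Hypothetical daily run times (years per day), oldest first.
history = {"proj_a": [10.0] * 7, "proj_b": [30.0] * 7}
weights = project_weights(history, days=7)
print(weights)  # {'proj_a': 0.25, 'proj_b': 0.75}
```

Setting days=1 gives the "yesterday only" variant, and days=7 gives the smoothed one.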

----------------------------------------
[Edit 1 times, last edit by knreed at Jun 14, 2018 2:01:22 PM]
[Jun 14, 2018 2:00:24 PM]
KLiK
Re: Some general data about projects?

I noticed when I calculate the percentages as "past days / total amount of days" that the percentages don't add up. So that's why I asked, thanks for telling me the data & calcs.

Still would have liked to pick up those percentages from the web...
----------------------------------------
[Edit 1 times, last edit by KLiK at Jun 14, 2018 6:59:03 PM]
[Jun 14, 2018 6:58:21 PM]
KLiK
Re: Some general data about projects?

Why would you use only 7 or 14 days of stats instead of 1 month?

What are the benefits?
What are the cons?
[Jun 14, 2018 7:45:20 PM]
knreed
Re: Some general data about projects?

I spoke from memory, so I had a few things wrong. We use the data I mentioned above to compute the estimated completion dates. However, you should be able to compute % complete from:
(current date - start date)/(estimated completion date - start date)
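
A small sketch of that date formula. The dates below are hypothetical; using releaseDate as the start date and estCompletionDate as the end date (the field names mentioned in the first post) is an assumption, not something confirmed in this thread.

```python
from datetime import date


def percent_complete_by_date(start: date, est_completion: date, today: date) -> float:
    """(current date - start date) / (estimated completion date - start date), in percent."""
    total_days = (est_completion - start).days
    if total_days <= 0:
        raise ValueError("estimated completion must be after the start date")
    elapsed_days = (today - start).days
    return 100.0 * elapsed_days / total_days


# Hypothetical project: 182 of 364 days elapsed.
pct = percent_complete_by_date(date(2018, 1, 1), date(2018, 12, 31), date(2018, 7, 2))
print(pct)  # 50.0
```

The result is not clamped, so a project running past its estimated completion date would report more than 100%.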
[Jun 14, 2018 9:53:47 PM]
knreed
Re: Some general data about projects?

I said 7 because just yesterday is too small a sample size. If you choose a larger window, you will get a more stable sample, but it will take longer to reflect changes in the weight.
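
The trade-off can be illustrated with a trailing average over made-up data. The series below is hypothetical: a project's daily share sits at 20% for 30 days and then jumps to 40% for the last 3 days.

```python
def trailing_mean(values: list[float], window: int) -> float:
    """Mean of the last `window` values (or of all values if fewer exist)."""
    if window < 1 or not values:
        raise ValueError("need a window of at least 1 and a non-empty series")
    recent = values[-window:]
    return sum(recent) / len(recent)


# Hypothetical daily shares: 20% for 30 days, then 40% for the last 3 days.
shares = [0.20] * 30 + [0.40] * 3
results = {window: round(trailing_mean(shares, window), 3) for window in (1, 7, 30)}
print(results)
```

With this series the 1-day window already shows about 0.4, the 7-day window about 0.286, and the 30-day window only about 0.22, so the longer window is steadier but slower to catch up.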
[Jun 14, 2018 9:57:00 PM]
KLiK
Re: Some general data about projects?

As stated before, there's quite a big discrepancy in the % when I calculate using only dates; check the image here:

Check the difference in rows J & K.
J is the % from the Research page.
K is the formula-based % from "computed days / total amount of days".
[Jun 15, 2018 10:54:51 AM]