World Community Grid Forums
Thread Status: Active. Total posts in this thread: 15
AntHill
Cruncher Joined: Nov 3, 2017 Post Count: 1 Status: Offline
How can I use the API to get a list of devices with statistics, like the one on https://www.worldcommunitygrid.org/ms/device/viewStatisticsByDevice.do?
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
I don't think it's possible to get the "Device History Statistics" from the API, unfortunately. But as a last resort, you could scrape the screen and collect the data. I have a small, very rudimentary script that will do that. You can find it here: https://github.com/msellan/wcg_dss/ It's a Bash version 5 (Unix shell) script.

Currently, it just processes the data on the page; it doesn't support logging in. To use it, log in to the WCG site, navigate to the Device History Statistics page, and then use your browser's File/Save As to save the HTML page. Point the script at that page and it'll strip out the data and save it into a delimited file that can easily be imported into an Excel spreadsheet or a database. At some point, I'll get around to automating the login so the whole operation can be automated and scheduled. Or better yet, perhaps someone at the WCG will just decide to expose this data via the API and I can discard this script. Best, -mark
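The extraction step mark describes can be sketched like this. This is a minimal illustration, not his actual script: it assumes the saved page keeps each statistic in its own `<td>` cell, five columns per row; adjust the number of `paste` column slots to the page's real layout.

```shell
#!/usr/bin/env bash
# Sketch: turn a saved "Device History Statistics" HTML page into a
# tab-delimited file. Hypothetical parsing; the real page layout may differ.
INPUT_FILE="${1:-devicestats.html}"
OUTPUT_FILE="${2:-devicestats.tsv}"

# Pull out each <td> cell (one per line), strip the tags,
# then glue every five cells back together with tabs.
grep -o '<td[^>]*>[^<]*</td>' "$INPUT_FILE" |
    sed -e 's/<[^>]*>//g' |
    paste - - - - - \
    > "$OUTPUT_FILE"
```

The resulting TSV imports directly into Excel or a database bulk loader.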
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Discussed in the past:
https://www.worldcommunitygrid.org/forums/wcg...ad,40823_offset,20#582183

Two ways if using IE:

    ieBrwsr.navigate WCGPath & "j_security_check?j_username=" & MemName & "&j_password=" & MemPw

Or, in the little overlay that appears when clicking Log In at top left:

    .j_username.Value = MemName
    .j_password.Value = MemPw
    .submit

Build in time-out retry loops and delays: requests in too fast succession get rejected.
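For a Bash script, the same login flow can be sketched with curl instead of IE automation. The `j_security_check` endpoint and the `j_username`/`j_password` field names come from the post above; the URL, placeholder credentials, and retry/back-off shape are illustrative, not a verified recipe.

```shell
#!/usr/bin/env bash
# Sketch: POST credentials to j_security_check with curl, saving the
# session cookie for later requests. Retries with a delay, since
# requests in too fast succession get rejected.
WCG_URL="https://www.worldcommunitygrid.org"
MemName="your_member_name"     # placeholder
MemPw="your_password"          # placeholder
COOKIE_JAR="$(mktemp)"

for attempt in 1 2 3; do
    if curl -s -c "$COOKIE_JAR" -o /dev/null \
            --data-urlencode "j_username=${MemName}" \
            --data-urlencode "j_password=${MemPw}" \
            "${WCG_URL}/j_security_check"; then
        break
    fi
    sleep $((attempt * 5))     # back off before the next attempt
done
```

Subsequent requests would then pass the saved cookies with `curl -b "$COOKIE_JAR" …`.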
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Thanks, lavaflow. I think that'll help me finish the script to automate the download of my Device History Statistics! I built a login function that looks much like that, but had the POST wrong. So hopefully you've saved me a good chunk of time.
Best, -mark
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Quote: "At some point, I'll get around to automating the login so the whole operation can be automated and scheduled."

Ok, thanks again @lavaflow for the link to the past forum thread on automating login. I've automated the login, so you can now download your 'Device History Statistics' on a schedule. The script logs in with your WCG credentials, then retrieves and parses your 'Device History Statistics' page, pulling the relevant data down and formatting it into a delimited file. It works on Mac/Linux and uses Bash v5 (it might also work on Windows with the Windows Subsystem for Linux installed, but I haven't tested that). The link to the GitHub repo is in an earlier post above. Best, -mark

[Edit 2 times, last edit by Former Member at May 20, 2019 12:21:00 AM]
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline
Hi mark,

I've tested your bash script and have some remarks. In the initial phase, where you initialize your variables, I think you should add:

    COOKIE_JAR=…

In the function get_device_history, the string ~/Downloads/devicestats.html should be replaced by:

    "$INPUT_FILE"

That said, I've done some more programming on my own script that scrapes WCG's website for several statistics: the My Contribution and Results Status pages. I'm using my own script to log the daily My Contribution data, e.g.:

    $ wcgstats | cut -f 1-7 ## COMMENT: my contribution

Download my script 'wcgstats' here.
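The two fixes suggested here can be sketched as below. The names `COOKIE_JAR`, `get_device_history`, and `$INPUT_FILE` come from the post; the function body is purely illustrative, not mark's actual code, and the cookie-jar path is an assumed example value.

```shell
# Initialize COOKIE_JAR alongside the other variables
# (example path; the real value is site/user specific).
COOKIE_JAR="${HOME}/.wcg_cookie_jar"

get_device_history() {
    # Parameterized input instead of the hard-coded
    # ~/Downloads/devicestats.html path.
    local INPUT_FILE="$1"
    # Stand-in for the real parsing logic: count table-cell lines.
    grep -c '<td' "$INPUT_FILE"
}
```

Usage would then be, e.g., `get_device_history saved_page.html`, so the same function works on any saved copy of the page.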
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Quote: "Hi mark, I've tested your bash script and got some remarks. In the initial phase where you initialize your variables, I think you should add: COOKIE_JAR=… In the function get_device_history, the string ~/Downloads/devicestats.html should be replaced by: "$INPUT_FILE""

Hi adriverhoef, thanks for the feedback! You're absolutely right about the hard-coded path; that's a mistake on my part, just being lazy. The variable initialization is a good suggestion too, but it wasn't actually missing: it's initialized in the dss_env.sh script that I'm sourcing, which holds my userid and password (and so isn't uploaded to GitHub, making it invisible to anyone reading the code). Design-wise, though, there's no reason to have it in the ENV script, so I moved it into the main script. It's cleaner and easier to understand that way, so I appreciate the nudge.

I'm a big fan of using set -euo pipefail in my scripts, as it catches most mistakes, such as uninitialized variables, among other things. It's usually the first clue that I've forgotten to initialize something, because the script immediately fails.

I looked through your script, and it's impressive how much you're handling with it! I wrote this one mostly because I hadn't written a screen scraper in a while and thought it would be fun to work out the regexes. I still prefer using the API when possible, but this historical data isn't available through the API yet, so the screen scraper will have to do. I'll probably create another table in my WCG database and start writing the Device Statistics History there, just for good measure. Have you explored putting your data in a SQL database any further?

Right now I'm using a single table to store all the workunit data that comes from the API, and I've started thinking about breaking it up into a normalized data structure: storing all the data about workunits across time, in addition to the historic (screen-scraped) data, correlated with devices, etc. I'm not sure whether it's worth all the work. It'll be a good exercise, as I don't write a lot of SQL applications. But maybe seeing the underlying patterns of how the workunits are processed could be interesting? @twilyh got me thinking about that, and I'm curious whether others have gone that route. Best, -mark

[Edit 1 times, last edit by Former Member at May 22, 2019 2:08:34 AM]
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline
Hi Mark,
I feel a bit hurried today, but will try to express some thoughts.

Quote: "I looked through your script, and it's impressive how much you're handling with it!"
It's something that grows (hopefully not out of hand) as time progresses.

Quote: "Have you explored putting your data in a SQL database any further?"
Only Workunit Results, and that works very well as far as I know. I should mention that I'm also still using the 'old' way of storing Workunit Results (tab-separated) in a plain text file, since it doesn't grow that fast (something like 50 MB per year) and I'm 'tailing' the logfile (tail -f) daily to see if there are any anomalies.

Quote: "I'm not sure whether it's worth all the work."
Well, if this isn't good for your programming skills, I don't know what is ... In any case, I found out that the 'wrong date' bug in the API is still present; I noticed it a few weeks ago:

    $ grep -C1 ZIKA_000420712_x5k6k_ZIKV_NS1_MD_model_5_s2_2601_0 `wcglog -/` | wcglog -c iwmtrN

Clarification: the server received the middle result two days before it was sent to me. Go figure.
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Quote: "Well, if this isn't good for your programming skills, I don't know what is ... In any case, I found out that the 'wrong date' bug in the API is still present; I noticed it a few weeks ago."

Haha, yes, for sure! I really could use more practice with writing SQL. And thanks for pointing out the 'wrong date' bug. I haven't been on the forums long and am still discovering hidden gems like that from the past. I guess it also means I need to start combing through my data more. Hence the need for more SQL scripting skills! Best, -mark
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline
Hi Mark,
Maybe it's funny to hear/read how my wcgstats script grew over time.

At first, all I wanted was to scrape the data from My Contribution in two ways:
* Left column and right column: My Contribution + My Team (that's one way): Total Run Time, Points Generated, and Results Returned, plus all rankings
* All the data from Statistics By Project (that's the other way)

By saving those data to a logfile, you can see the progress you're making over time (and, with respect to My Contribution and My Team, you can also see whether you're climbing or falling in the rankings), e.g.:

    2017-04-08 00:06:02 72:024:06:06:43 1985 176395553 824 641907 343

Clearly, the idea for this came up when I entered the top 2000 in runtime.

After that, I wanted to scrape the Results Status with all its options (filter by Device Name, Result Status, Project Name, page number; sort by dueTime, returnedTime, runTime). Two years later, I decided that more was needed. So recently I added an option to show the program version; furthermore, I added a way for other users to update their version of the program without having to copy their credentials with each update. (Maybe you noticed that I don't source a file containing one's credentials.) Recently I also found out that I'd better show the actual device name instead of the OS version. Then it was your idea that made me implement scraping the Device Statistics History (which you can also filter by device name and by page number). Lastly, I also added the options -N and -n, e.g.:

    $ wcgstats -n beta

Since my program doesn't have a way to scrape the My Contribution History yet, maybe that could be the next addition; anyway, I don't have an option for that … yet.
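A logged line like the sample above is easy to slice back apart for trend-watching. The field meanings below are my reading of that sample (column 4 as run-time rank, 6 as points rank, 8 as results rank); the real wcgstats layout may assign them differently, so treat this as a sketch.

```shell
# Sketch: pull the ranking columns out of a wcgstats-style logfile line
# to watch rank movement over time. Field positions are assumptions
# based on the sample line, not the documented wcgstats format.
line='2017-04-08 00:06:02 72:024:06:06:43 1985 176395553 824 641907 343'
echo "$line" |
    awk '{ printf "date=%s runtime_rank=%s points_rank=%s results_rank=%s\n",
           $1, $4, $6, $8 }'
```

Pointing the same awk one-liner at the whole logfile (instead of a single line) would chart how each ranking moves day by day.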