World Community Grid Forums
Thread Status: Active Total posts in this thread: 5
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Another Rundgren moment... "Hello It's Me" (again). As an avid scraper of statistics web pages for which there is no XML, today I ran the usual URL injection, which is equivalent to the Device Statistics filter Anytime(All)/Anytime(All):
https://www.worldcommunitygrid.org/ms/device/...dSince=0&lastResult=0
It yielded more data than before, more than I can see on the website! BATBK6810J is the last device shown at the bottom on the website, but now the table into which the fetched data is injected suddenly asked whether I wanted to allow additional data to be appended. That was not the first time, since one moment the scraped data lands in row 47 and the next in row 46, so I clicked yes, except this time all these never-never-land devices showed up, including blank lines. Although the routine extracts the Device IDs from the HTML, the entries appear to repeat, like 6 with the name ubuntu (probably failed attempts at installing BOINC on Ubuntu, duh). From recollection, the named ones, except for the ubuntu entries, were UD devices. Closing the workbook, reopening it and rerunning just pulled the same data, i.e. this is reproducible. Anyway, why do I get this additional stuff all of a sudden? P.S. The match numbers are there only because another routine sums the points/runtime/results of all items with the same name, and the profile column just connects the device statistics with the device profile table, i.e. both may be ignored for the discussion.
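For readers who want to reproduce the "sum all items with the same name" step outside a spreadsheet, here is a minimal Python sketch. It is not the routine used in the post; the function name `sum_by_name`, the row layout (name, points, runtime, results) and the sample figures are all assumptions made up for illustration.

```python
from collections import defaultdict


def sum_by_name(rows):
    """Add up points, runtime and results for rows that share a device name.

    rows is an iterable of (name, points, runtime_hours, results) tuples;
    this layout is an assumption, not the real table structure.
    """
    totals = defaultdict(lambda: [0, 0.0, 0])
    for name, points, runtime_hours, results in rows:
        entry = totals[name]
        entry[0] += points
        entry[1] += runtime_hours
        entry[2] += results
    return {name: tuple(entry) for name, entry in totals.items()}


# Hypothetical sample: two rows share the name "ubuntu".
sample_rows = [
    ("ubuntu", 10, 1.0, 2),
    ("ubuntu", 5, 0.5, 1),
    ("device-a", 7, 2.0, 3),
]
print(sum_by_name(sample_rows))
```

Grouping on the name alone merges devices that merely share a name, which is exactly why repeated entries such as several "ubuntu" rows end up as one line in the summed view.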
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline
I can't find the device BATBK6810J in our database either via the device id you show or via its name. What user is it under?
|
||
|
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
This is the URL:
https://www.worldcommunitygrid.org/ms/device/...d=149939&deviceType=U
It is under my name, without the * at the end. I had not realized it, but the U and the B are a good hook to recover the Agent Type info and compute how many points came from the legacy UD agents (the procedure removes all images from the imported HTML page). I must have run the device stats half a dozen times since writing (as it updates every 3 hours), but the extras after the BATBK6810J device keep coming back unperturbed.
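The "U and B as a hook" idea can be sketched in Python with only the standard library. This is an illustration, not the procedure used in the post: the only detail taken from the thread is the `deviceType` query parameter. The class and function names, the sample HTML, the `/hypothetical/stats` path and the `id` parameter are invented for the example, since the real URL is truncated in the post.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit


class DeviceLinkParser(HTMLParser):
    """Collect (link text, deviceType) for each <a> whose href carries deviceType."""

    def __init__(self):
        super().__init__()
        self.found = []
        self._type = None   # deviceType of the link currently being read
        self._text = []     # text fragments inside that link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        params = parse_qs(urlsplit(href).query)
        if "deviceType" in params:
            self._type = params["deviceType"][0]
            self._text = []

    def handle_data(self, data):
        if self._type is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._type is not None:
            self.found.append(("".join(self._text).strip(), self._type))
            self._type = None


def agent_types(html):
    """Return a list of (device name, deviceType) pairs found in the HTML."""
    parser = DeviceLinkParser()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical page fragment; the real markup is not shown in the thread.
SAMPLE = (
    '<table>'
    '<tr><td><a href="/hypothetical/stats?id=1&amp;deviceType=U">device-a</a></td></tr>'
    '<tr><td><a href="/hypothetical/stats?id=2&amp;deviceType=B">device-b</a></td></tr>'
    '<tr><td>Never</td></tr>'
    '</table>'
)
print(agent_types(SAMPLE))
```

Rows without a link, like the 'Never' line in the sample, yield nothing here, which matches the observation that nameless lines offer no URL to work with.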
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline
Ah - I was looking in the BOINC database for the host. Didn't find it since it was a UD device.
What devices were returned that you hadn't previously seen? |
||
|
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
None of the devices below the BAT device show up on the website pages. Most puzzling are the lines with no device name, and thus no URL from which to squeeze out a device ID.

Edit: Overcame part of the missing information problem by coupling the Device Profile listing to the Device Statistics data, getting the agent type from there with an index/match search. This leaves the blank lines with just 'Never' on them. BTW, I was watching "The Ultimate Introduction to Web Scraping and Browser Automation", which made an interesting observation, 'you will get things to see, that might not be intended to be seen', and advised talking to the webmaster if ethics calls for that. Pretty much all web browsers come with document inspector features, Firefox having its excellent Firebug add-on, so it's clearly a web deployer's problem. (It's my plan to inject tick marks into device profiles to quickly change project selections... after the summer, maybe.)
[Edit 3 times, last edit by SekeRob* at Aug 4, 2017 11:09:37 AM]
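The index/match coupling mentioned in the edit has a simple equivalent in Python: build a lookup from the profile listing and probe it for each statistics row. The sketch below is an assumption-laden illustration, not the workbook's formulas. The function name, the two-column row layouts and the sample values are hypothetical.

```python
def attach_agent_type(stats_rows, profile_rows):
    """Add an agent type to each statistics row by looking up the device name.

    stats_rows:   iterable of (name, points) tuples (assumed layout)
    profile_rows: iterable of (name, agent_type) tuples (assumed layout)
    Rows with a blank or unknown name get None as their agent type.
    """
    lookup = {}
    for name, agent_type in profile_rows:
        # Keep the first occurrence, as an exact-match MATCH would.
        lookup.setdefault(name, agent_type)
    return [
        (name, points, lookup.get(name) if name else None)
        for name, points in stats_rows
    ]


# Hypothetical data: one row has no device name at all.
profiles = [("device-a", "UD"), ("device-b", "BOINC")]
stats = [("device-a", 120), ("", 0), ("device-b", 45), ("device-c", 9)]
for row in attach_agent_type(stats, profiles):
    print(row)
```

As in the spreadsheet version, rows without a name cannot be matched to anything, so they stay without an agent type.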
||