Total posts in this thread: 89
Posts: 89   Pages: 9   [ Previous Page | 1 2 3 4 5 6 7 8 9 | Next Page ]
This topic has been viewed 463772 times and has 88 replies
OldChap
Veteran Cruncher
UK
Joined: Jun 5, 2009
Post Count: 978
Status: Offline
Re: Screen Scrapers - Please Discuss

Thanks jonnie
----------------------------------------

[Nov 10, 2013 12:47:59 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7848
Status: Offline
Re: Screen Scrapers - Please Discuss

The new website design seems to have broken WCGDAWS. It allows me to log in, but when doing an update, it stops and requires me to log in again and then just brings me to the website. WCGDAWS 3.1. Maybe I should upgrade?
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Nov 10, 2013 1:45:24 PM]
jonnieb-uk
Ace Cruncher
England
Joined: Nov 30, 2011
Post Count: 6105
Status: Offline
Re: Screen Scrapers - Please Discuss

The new website design seems to have broken WCGDAWS. It allows me to log in, but when doing an update, stops and requires me to log in again and then just brings me to the website. WCGDAWS 3.1. Maybe I should upgrade ?
Cheers

The last WCGDAWS update was 29-Jun-13 Ver 1.3.2.
I'm guessing pirogue needs to update his code to reflect the WCG log-in(?) changes on Friday.
----------------------------------------

To Join follow this link: Join the UK Team All Welcome! UK Team thread
[Nov 10, 2013 2:06:24 PM]
wplachy
Senior Cruncher
Joined: Sep 4, 2007
Post Count: 423
Status: Offline
Re: Screen Scrapers - Please Discuss

I'm parsing the HTML to pull from the "MY CONTRIBUTION", "Global Statistics" and "Results Status" pages. The data I'm pulling from these pages is everything but the headings, graphics and links. I'm also pulling data from the Member and Team Statistics pages using &xml=true.
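
A minimal sketch of the &xml=true approach, in Python and under loud assumptions: the URL, the element names and the sample payload below are all hypothetical, since the real page addresses and XML schema are not shown in this thread. It only illustrates adding the flag to a query string and reading leaf values out of whatever XML comes back.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode
import xml.etree.ElementTree as ET


def with_xml_flag(url: str) -> str:
    """Return url with xml=true set in its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["xml"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


def flatten(xml_text: str) -> dict:
    """Map each leaf element's tag to its text.

    No schema is assumed; repeated tags would overwrite each other,
    so a real feed with lists of records needs more care.
    """
    root = ET.fromstring(xml_text)
    return {el.tag: (el.text or "").strip() for el in root.iter() if len(el) == 0}


# Hypothetical URL and payload, for illustration only.
url = with_xml_flag("https://example.org/stats/member?name=someone")
sample = "<Member><Name>someone</Name><Points>12345</Points></Member>"
stats = flatten(sample)
```

The point of preferring the XML view is that a site redesign changes the HTML around the numbers but, with luck, not the element names.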

As asked by jonnieb-uk, are you planning changes to the xml option as well?
----------------------------------------
Bill P

[Nov 10, 2013 3:06:19 PM]
jonnieb-uk
Ace Cruncher
England
Joined: Nov 30, 2011
Post Count: 6105
Status: Offline
Re: Screen Scrapers - Please Discuss

As asked by jonnieb-uk, are you planning changes to the xml option as well?


And, just possibly, an extension to the range of data available in XML format? ;)
----------------------------------------

To Join follow this link: Join the UK Team All Welcome! UK Team thread
[Nov 10, 2013 3:33:57 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Screen Scrapers - Please Discuss

Since pirogue has a monitored WCGDAWS thread, a post in there would initiate a mail to him.... just saying.

Myself, I don't scrape per se... I run, quasi-autonomously, a series of two-part web queries against the project and global statistics pages: one to fetch a pre-defined set of fields from the front, then a second pass to pull X records and pick a defined set of fields from those. For example, this is the path to pull the last 91 global stats records [note the use of the secure address]:

https://secure.worldcommunitygrid.org/stat/vi...&numRecordsPerPage=91

And this is the screen where the segments to import (scrape, if you will) are picked [the sections marked with green arrows are what is pulled]. This is repeated for each project in a VBA script that runs at set hours; if a pull fails, it repeats until the timestamp in the page header changes.
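
The "repeat until the timestamp changes" step could look roughly like the sketch below. It is written in Python rather than the VBA mentioned above, and every name in it (`fetch_when_updated`, `read_stamp`, the fake page strings) is an assumption made up for illustration, not the actual script. The fetch and timestamp-extraction steps are passed in as functions so the retry logic stays independent of how the page is framed.

```python
import time
from typing import Callable, Optional


def fetch_when_updated(
    fetch: Callable[[], str],
    read_stamp: Callable[[str], str],
    last_stamp: str,
    max_attempts: int = 5,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Re-fetch until the page's timestamp differs from last_stamp.

    Returns the fresh page text, or None if every attempt was stale.
    """
    for attempt in range(max_attempts):
        page = fetch()
        if read_stamp(page) != last_stamp:
            return page
        if attempt < max_attempts - 1:
            sleep(delay_seconds)  # wait before asking the server again
    return None


# Fake pages standing in for real responses (hypothetical format).
pages = iter(["stamp=10:00 data=old", "stamp=10:00 data=old", "stamp=12:00 data=new"])
result = fetch_when_updated(
    fetch=lambda: next(pages),
    read_stamp=lambda text: text.split()[0],
    last_stamp="stamp=10:00",
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
```

Capping the attempts and spacing them out keeps a stale page from turning into a tight loop against the server.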



Don't care how the page is framed, long as the key sections don't get altered ... or the Dashboard and the other 40 or so charts will thoroughly break :O
[Nov 10, 2013 3:39:04 PM]
jonnieb-uk
Ace Cruncher
England
Joined: Nov 30, 2011
Post Count: 6105
Status: Offline
Re: Screen Scrapers - Please Discuss

Since pirogue has a monitored WCGDAWS thread, a post in there would initiate a mail to him.... just saying.


widdershins posted in the thread on Nov 8, 2013 8:28:02 PM GMT; no response as yet.
----------------------------------------

To Join follow this link: Join the UK Team All Welcome! UK Team thread
[Nov 10, 2013 4:11:25 PM]
SNURK
Veteran Cruncher
The Netherlands
Joined: Nov 26, 2007
Post Count: 1217
Status: Offline
Re: Screen Scrapers - Please Discuss

I'm downloading and using a lot of data daily for the signatures. I'm using the xml pages (&xml=true) if available. One notable place where this is not the case is the member rankings per country. If they could be made available in xml it would reduce my scraping by a fair bit.
In my ideal world, a member's ranking per country (runtime, points, results) and a member's ranking per team (runtime, points, results) would be added to the member's individual xml page. But I can understand if this is not possible.
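
For pages that have no XML view, the fallback is reading the HTML table directly. A rough sketch using only Python's standard library follows; the markup is hypothetical (the real rankings page layout is not reproduced in this thread), and the `TableRows` class is just an illustration of the idea, not the code behind the signatures.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical markup, for illustration only.
html = """
<table>
  <tr><th>Rank</th><th>Member</th><th>Points</th></tr>
  <tr><td>1</td><td><a href="#">alice</a></td><td>9,000</td></tr>
  <tr><td>2</td><td>bob</td><td>7,500</td></tr>
</table>
"""
parser = TableRows()
parser.feed(html)
header, *records = parser.rows
```

This kind of parser depends on the table staying a table, which is exactly why an XML version of the rankings would be less fragile.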
I am ready to adapt my code if anything changes, it's not a big deal, so don't worry about me if you feel like changing anything. I'll keep an eye out for any mishaps in the signatures in the coming months (and I'm sure others will do so too). So far it's looking healthy.
Keep up the good work! :D
----------------------------------------
[Edit 1 times, last edit by SNURK at Nov 10, 2013 8:16:29 PM]
[Nov 10, 2013 8:14:42 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Screen Scrapers - Please Discuss

Since pirogue has a monitored WCGDAWS thread, a post in there would initiate a mail to him.... just saying.


widdershins posted in the thread on Nov 8, 2013 8:28:02 PM GMT; no response as yet.

As you surmised, I missed that post :D and meanwhile two members also found their way to the WCGDAWS site forums run by pirogue. The thread: http://www.wcgdaws.com/forums/index.php?topic=38.0

Here's hoping this indispensable tool gets the required adaptation soon.
[Nov 11, 2013 8:55:09 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Screen Scrapers - Please Discuss

The trick is to append &language=fr_FR to the links of the pages I am using. :)

The mistake is that this parameter is misspelled &langauge when you use the language selection in the Stats section. :(

PS2: In fr_FR fr is the language and FR is the country for the formatting rules. For example Quebec users would use fr_CA.
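
As a hedged illustration of the two points quoted above (the language parameter and the language/country split), here is a small Python sketch. The base URL is a placeholder and the helper names are made up for this example; only the parameter name `language` and the `fr_FR` / `fr_CA` values come from the posts.

```python
from typing import Tuple
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_language(url: str, locale: str) -> str:
    """Return url with language=<locale> set, replacing any existing value."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "language"
    ]
    query.append(("language", locale))
    return urlunsplit(parts._replace(query=urlencode(query)))


def split_locale(locale: str) -> Tuple[str, str]:
    """Split a tag such as 'fr_FR' into (language, country)."""
    language, _, country = locale.partition("_")
    return language, country


# Placeholder URL, for illustration only.
french_url = with_language("https://example.org/stats?page=2", "fr_FR")
quebec = split_locale("fr_CA")
```

Building the query with a URL library rather than by hand avoids exactly the kind of one-letter slip (&langauge) described above.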


Fixed the issue with the misspelling of language - thanks for that.

As for fr_CA vs fr_FR in the language drop-down, that stems from the fact that we originally targeted the French translation for Canada due to some policy (legal?) requirements for internal deployment/promotion within Canada.

However, if you have fr_fr as your primary language in your browser, then when you arrive on our site, the French formatting rules should be in effect. Let me know if you see otherwise.
[Nov 11, 2013 11:56:54 PM]