Thread Status: Active
Total posts in this thread: 89
This topic has been viewed 463655 times and has 88 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Screen Scrapers - Please Discuss

OK, thanks. Maybe the deadline date issue was a fluke, or a silent fix took place (i.e. no feedback either way), but I just tried it and the report due dates now look to be picking up the right value. The top bit tells me I've got 429 tasks presently listed on the result status pages:

<ResultsStatus>
  <ResultsAvailable>429</ResultsAvailable>
  <ResultsReturned>25</ResultsReturned>
  <Offset>0</Offset>
  <Result>
    <AppName>mcm1</AppName>
    <ClaimedCredit>0.0</ClaimedCredit>
    <CpuTime>0.0</CpuTime>
    <ElapsedTime>0.0</ElapsedTime>
    <ExitStatus>0</ExitStatus>
    <GrantedCredit>0.0</GrantedCredit>
    <DeviceId>2750123</DeviceId>
    <DeviceName>2750123</DeviceName>
    <ModTime>1396126545</ModTime>
    <Name>MCM1_0003476_6316_0</Name>
    <Outcome>0</Outcome>
    <ReceivedTime>null</ReceivedTime>
    <ReportDeadline>2014-04-05T20:55:45</ReportDeadline>
    <SentTime>2014-03-29T20:55:45</SentTime>
    <ServerState>4</ServerState>
    <ValidateState>0</ValidateState>
  </Result>
</ResultsStatus>

For those using it: the curly braces shown in the help sample need removing where {member-name} and {verification-code} are shown.
----------------------------------------
[Edit 1 times, last edit by Former Member at Mar 29, 2014 9:22:22 PM]
[Mar 29, 2014 9:19:04 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Screen Scrapers - Please Discuss

Play and learn:

Limit: Defines the number of results returned. Default is 25.

The maximum number of results returned is 250, so you need to run multiple extracts with offsets to get all the active entries. In my case, at 432 now, I needed a second pass with an added &offset=250 parameter, appending &limit=250&offset=250 to the second pull, which is cumbersome. One can only guess why this restriction exists.
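None of this is official API code, but the offset arithmetic is simple enough to sketch in Python (the helper names are mine, not part of the feed):

```python
def page_offsets(total, limit=250):
    """Offsets needed to fetch `total` results when each pull is capped at `limit`."""
    return list(range(0, total, limit))

def page_params(total, limit=250):
    """Query-string fragments to append to each pull, e.g. '&limit=250&offset=250'."""
    return [f"&limit={limit}&offset={off}" for off in page_offsets(total, limit)]

# 432 active results need two pulls:
print(page_params(432))  # ['&limit=250&offset=0', '&limit=250&offset=250']
```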
[Mar 29, 2014 11:20:09 PM]
foxfire
Advanced Cruncher
United States
Joined: Sep 1, 2007
Post Count: 121
Status: Offline
Re: Screen Scrapers - Please Discuss

Why doesn't the XML data match the results page data?
According to the XML, this WU was due 2 days before it was sent.

pirogue, as lavaflow suggests, the WU appears to be a one-off oddball. I started using the XML pull about the middle of last month and haven't seen any with negative days due in the 19K WUs I've returned across all active projects.
----------------------------------------

[Mar 30, 2014 2:11:30 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7847
Status: Offline
Re: Screen Scrapers - Please Discuss

I hope pirogue is able to fix his program. He makes it so easy. I have been able to download the XML from the website and load it into a spreadsheet via a web query. However, I have not found a good way (algorithm) to eliminate the downloaded duplicates without some additional monkeying around. Does anybody have any good ways or procedures to do this? Thanks in advance.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Mar 30, 2014 5:44:53 PM]
pirogue
Veteran Cruncher
USA
Joined: Dec 8, 2008
Post Count: 685
Status: Offline
Re: Screen Scrapers - Please Discuss

Play and learn:

Limit: Defines the number of results returned. Default is 25.

The maximum number of results returned is 250, so you need to run multiple extracts with offsets to get all the active entries. In my case, at 432 now, I needed a second pass with an added &offset=250 parameter, appending &limit=250&offset=250 to the second pull, which is cumbersome. One can only guess why this restriction exists.

I would imagine it's restricted so the server isn't churning out 1000s of items on each access. The same restriction applies to the Results Status page.

Depending on what exactly you're trying to do, you can cut down on the noise by adding &ValidateState=1 to the URL. That way only valid results are returned. It would also cut down on the number of times you need to change the offset.
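For anyone post-processing on the client side instead of filtering in the URL, a rough Python equivalent using the field names from the XML sample earlier in the thread (the function name and the inline sample are mine, not part of the feed):

```python
import xml.etree.ElementTree as ET

SAMPLE = """<ResultsStatus>
  <Result><Name>MCM1_0003476_6316_0</Name><ValidateState>1</ValidateState></Result>
  <Result><Name>MCM1_0003476_6316_1</Name><ValidateState>0</ValidateState></Result>
</ResultsStatus>"""

def valid_result_names(xml_text):
    """Names of results whose ValidateState is 1 (valid), the client-side
    equivalent of adding &ValidateState=1 to the pull URL."""
    root = ET.fromstring(xml_text)
    return [r.findtext("Name") for r in root.iter("Result")
            if r.findtext("ValidateState") == "1"]

print(valid_result_names(SAMPLE))  # ['MCM1_0003476_6316_0']
```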
----------------------------------------

[Mar 30, 2014 10:44:04 PM]
pirogue
Veteran Cruncher
USA
Joined: Dec 8, 2008
Post Count: 685
Status: Offline
Re: Screen Scrapers - Please Discuss

Why doesn't the XML data match the results page data?
According to the XML, this WU was due 2 days before it was sent.

pirogue, as lavaflow suggests, the WU appears to be a one-off oddball. I started using the XML pull about the middle of last month and haven't seen any with negative days due in the 19K WUs I've returned across all active projects.

That's good to know. The bogus one was the first that was returned for me. I didn't get a warm fuzzy feeling when I saw it.
----------------------------------------

[Mar 30, 2014 10:45:25 PM]
foxfire
Advanced Cruncher
United States
Joined: Sep 1, 2007
Post Count: 121
Status: Offline
Re: Screen Scrapers - Please Discuss

However I have not been able to get a good way (algorithm) to eliminate the duplicates downloaded without some additional monkeying around. Does anybody have any good ways or procedures to do this? Thanks in advance.
Cheers

I use a few lines of code to go down the list of WUs pulled starting at the second one and compare the WU Name to the previous one. If they match I delete the previous one from the list. If you use Excel and would like the code let me know.
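For the non-Excel folks, the walk-and-compare pass foxfire describes can be sketched in Python (assuming the rows arrive sorted by name, as they would from a sorted pull; the dict layout is just for illustration):

```python
def dedupe_adjacent(rows):
    """Walk the list and, whenever a row has the same Name as the one
    before it, keep the later row and drop the earlier duplicate."""
    out = []
    for row in rows:
        if out and out[-1]["Name"] == row["Name"]:
            out[-1] = row  # later copy replaces the previous duplicate
        else:
            out.append(row)
    return out

rows = [{"Name": "A", "Outcome": 0}, {"Name": "A", "Outcome": 1}, {"Name": "B", "Outcome": 0}]
print(dedupe_adjacent(rows))  # [{'Name': 'A', 'Outcome': 1}, {'Name': 'B', 'Outcome': 0}]
```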
----------------------------------------

[Mar 31, 2014 3:36:21 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Screen Scrapers - Please Discuss

If the ModTime field is updated every time a result has a status change, for instance from pending validation to pending verification, you could sort the database descending on it. Excel has a 'Remove Duplicates' function for tables that works on multiple field parameters, which makes it easy to keep the newest update: any additional copy found is removed, leaving a unique list. That's roughly what pirogue's program does: if the status changes, update the record and show the new timestamp.
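Assuming ModTime really does bump on every status change, the sort-descending-then-dedupe idea looks like this in Python (field names from the XML sample earlier in the thread; the code itself is mine):

```python
def newest_per_name(rows):
    """Sort descending on ModTime, then keep the first (newest) row per Name,
    mirroring Excel's Remove Duplicates on a descending-sorted table."""
    newest = {}
    for row in sorted(rows, key=lambda r: r["ModTime"], reverse=True):
        newest.setdefault(row["Name"], row)  # first seen = newest, later copies ignored
    return list(newest.values())

rows = [{"Name": "A", "ModTime": 100}, {"Name": "A", "ModTime": 200}]
print(newest_per_name(rows))  # [{'Name': 'A', 'ModTime': 200}]
```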

The tip to use ValidateState as a query filter was a useful hint. Not wanting to receive anything still in progress, I also added &ServerState=5, i.e. only tasks that have a returned result.

Corrected ServerStatus as it should be ServerState.
----------------------------------------
[Edit 1 times, last edit by Former Member at Apr 1, 2014 9:53:53 AM]
[Mar 31, 2014 3:54:24 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7847
Status: Offline
Re: Screen Scrapers - Please Discuss

However I have not been able to get a good way (algorithm) to eliminate the duplicates downloaded without some additional monkeying around. Does anybody have any good ways or procedures to do this? Thanks in advance.
Cheers

I use a few lines of code to go down the list of WUs pulled starting at the second one and compare the WU Name to the previous one. If they match I delete the previous one from the list. If you use Excel and would like the code let me know.

Yes, I would love the code. Please send it to ten.thousand.lakes using gmail.com. Thank you for your help.
With excel there's a 'remove duplicates' function for tables. It works on multiple field parameters, which facilitates to keep the newest update. Any additional copy found is then removed, retaining a unique list.

Thanks for the tip. I have been using the "unique" attribute in the filter to remove duplicates and then copying the remaining list to another page. Crude, but it works; I was looking for a more elegant solution. I will be adding the server state filter to only get valid results.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Mar 31, 2014 8:54:51 PM]
foxfire
Advanced Cruncher
United States
Joined: Sep 1, 2007
Post Count: 121
Status: Offline
Re: Screen Scrapers - Please Discuss

Yes, I would love the code. Please send it to ten.thousand.lakes using gmail.com. Thank you for your help.


Sent...
----------------------------------------

[Apr 1, 2014 2:59:05 AM]