Wednesday, 15 May 2013

Mashup Parkrun like you mean it

Parkrun

Parkrun is an awesome idea: not only does it get a lazy fatty like me out of the house, it's also free! All I need to do is give up some time every so often to watch others do the running, and that's that.

So what has this got to do with a Technocr@p post? Well, it just so happens that their site is nice but doesn't offer something that I am after. I want to compare the stats of different runners, because I am a competitive so-and-so at heart, but there is no way to do this on the normal site.

Overview

The information the site does give is quite extensive; like I said, considering it's totally free and run by volunteers, it's amazing.
For example, going to the link
http://www.parkrun.org.uk/<<event name>>/results/latestresults/
you get overall position, name, age, time, age category, age grade, gender, gender position, club, note (this is normally your PB and whether you broke it) and total runs.
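To make the URL pattern concrete, here is a minimal Groovy sketch that fills in the event name. The helper name latestResultsUrl is my own invention for illustration, and I'm assuming the event name is a simple lowercase slug such as 'tilgate'.

```groovy
// Hypothetical helper (not part of the parkrun site): builds the
// latest-results URL for a given event slug.
String latestResultsUrl(String eventName) {
    // Normalise and encode the slug so odd characters cannot break the URL.
    def slug = URLEncoder.encode(eventName.trim().toLowerCase(), 'UTF-8')
    "http://www.parkrun.org.uk/${slug}/results/latestresults/"
}

println latestResultsUrl('tilgate')
```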

If you click on a runner then you get that individual's history. Here's mine, for example: by clicking through my results you eventually get to an age grade chart, and it's this that I want to compare between runners.
http://www.parkrun.org.uk/results/athleteeventresultschart/?athleteNumber=318738&eventNumber=377

Where to start?

So the first thing to notice is that the user is identified by a number rather than their name, so this should probably be the first thing to tackle. The second is to find out what exactly the eventNumber is, and the last is to find out what format the graph information is in. If the graph is simply an image then I don't think I can go any further, so I am hoping that it's some XML or something, fingers crossed.
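Pulling the two identifiers out of the chart URL is easy enough. The following is only a sketch: queryParams is a made-up helper name, and it assumes a plain name=value query string with no repeated parameters.

```groovy
// Hypothetical helper: splits a URL's query string into a name -> value map.
Map<String, String> queryParams(String url) {
    def query = new URI(url).query ?: ''
    query.split('&').findAll { it.contains('=') }.collectEntries { pair ->
        // Split on the first '=' only, so values may themselves contain '='.
        def (name, value) = pair.split('=', 2)
        [(name): value]
    }
}

def params = queryParams('http://www.parkrun.org.uk/results/athleteeventresultschart/?athleteNumber=318738&eventNumber=377')
println params.athleteNumber   // 318738
println params.eventNumber     // 377
```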

Sadly, upon further investigation, it is just an image that is populated via JavaScript. But never fear: the same data is available in tabular form in the specific athlete stats, so I shall have to use those instead!

Back to the table: http://www.parkrun.org.uk/tilgate/results/athletehistory/?athleteNumber=318738
The first thing to realise is that the table of results is identified by
table id="results" and the column we're interested in is the
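Here is a rough sketch of how one column could be read out of a table with id="results". Everything in the sample is made up for illustration: the rows, the header names (including 'Age Grade') and the helper name columnValues are assumptions, not the real page layout. It uses the JDK's XML parser, so it only copes with well-formed markup; a real page would most likely need a more forgiving HTML parser.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.XPathConstants
import javax.xml.xpath.XPathFactory

// Made-up, well-formed sample; NOT the real parkrun markup or data.
def html = '''<html><body>
<table id="results">
  <thead><tr><th>Run Date</th><th>Time</th><th>Age Grade</th></tr></thead>
  <tbody>
    <tr><td>01/01/2013</td><td>30:00</td><td>45.00%</td></tr>
    <tr><td>08/01/2013</td><td>28:30</td><td>47.50%</td></tr>
  </tbody>
</table>
</body></html>'''

// Hypothetical helper: returns the cells under the given header text
// in the table whose id is "results".
List<String> columnValues(String html, String header) {
    def doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(html.getBytes('UTF-8')))
    def xpath = XPathFactory.newInstance().newXPath()
    def ths = xpath.evaluate('//table[@id="results"]//th', doc, XPathConstants.NODESET)
    def names = (0..<ths.length).collect { ths.item(it).textContent.trim() }
    def index = names.indexOf(header)
    if (index < 0) return []
    // XPath positions are 1-based, hence index + 1.
    def cellPath = "//table[@id='results']//tr/td[${index + 1}]".toString()
    def cells = xpath.evaluate(cellPath, doc, XPathConstants.NODESET)
    (0..<cells.length).collect { cells.item(it).textContent.trim() }
}

println columnValues(html, 'Age Grade')
```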

Damn it!

I couldn't get anything when I tried a simple URL.getContent(), so I stuck some extra stuff in a Groovy script and was faced with:
Please don't scrape. See <a href='http://www.parkrun.org.uk/scraper.html'>http://www.parkrun.org.uk/scraper.html</a> for details.
Sadly that URL is non-existent, so I am going to give this up as a bad idea.

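FYI, to get this message I ran a three-liner:
// Open a connection to the site
def conn = new URL('http://www.parkrun.org.uk').openConnection()
conn.connect()
// Print the status message that came back with the response
println conn.responseMessage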

Not so Damn it after all...

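Following on from a helpful comment made by Nick, a simple change allowed me to get the content, so now the code is:
def conn = new URL('http://www.parkrun.org.uk').openConnection()
// Follow redirects for this connection only (followRedirects is a JVM-wide static setting)
conn.instanceFollowRedirects = true
// Send a browser-like User-Agent header with the request
conn.setRequestProperty('User-Agent', 'mozilla')
println conn.content.text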

In fact...

I discovered that there is a file (robots.txt) that is used to tell bots how they should handle pages, so I checked for it on parkrun.org et voilà, there it is: the robots text file.
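For the curious, here is a minimal sketch of how a script might honour such a file before fetching anything. The sample rules are invented for illustration (they are NOT the real parkrun rules), the helper names disallowedFor and allowed are my own, and the matching is deliberately simplified: plain prefix checks, one User-agent line per group, and the '*' rules are merged with any agent-specific ones rather than overridden by them.

```groovy
// Invented sample rules; NOT the real parkrun robots file.
def robotsTxt = '''\
User-agent: *
Disallow: /results/
Disallow: /private/

User-agent: examplebot
Disallow: /
'''

// Hypothetical helper: collects the Disallow prefixes that apply to an agent.
List<String> disallowedFor(String robots, String agent) {
    def rules = []
    def applies = false
    robots.eachLine { line ->
        // Drop comments and surrounding whitespace.
        def text = line.replaceAll(/#.*/, '').trim()
        if (!text.contains(':')) return
        def (field, value) = text.split(':', 2)*.trim()
        switch (field.toLowerCase()) {
            case 'user-agent':
                applies = (value == '*' || value.equalsIgnoreCase(agent))
                break
            case 'disallow':
                // An empty Disallow value means nothing is blocked.
                if (applies && value) rules << value
                break
        }
    }
    rules
}

// Hypothetical helper: true when no Disallow prefix matches the path.
boolean allowed(String robots, String agent, String path) {
    !disallowedFor(robots, agent).any { path.startsWith(it) }
}

println allowed(robotsTxt, 'mozilla', '/tilgate/')
println allowed(robotsTxt, 'mozilla', '/results/latest')
```

The point of checking first is simply good manners: if the file says a path is off limits, the script should leave it alone.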