Sat, 05 Nov 2011
A Prompt Piston Primer
At work, we use django-piston to easily provide some RESTful APIs. For those who don't know, Piston is a reusable Django app that allows you to give RESTful access to models by writing simple handlers. These handlers specify the accessible fields, the appropriate access methods and so forth.
In this post, I'll give a quick introduction to Piston (limiting myself to GET APIs only) which I will expand upon in future posts.
Imagine a simple library scenario with the following models:
from django.db import models

class Shelf(models.Model):
    location = models.CharField(...)

class Book(models.Model):
    title = models.CharField(...)
    shelf = models.ForeignKey(Shelf, related_name="books")
For some reason (it's an enterprise library, perhaps) we need a REST API onto our book data, so that we can easily fetch each book's title and shelf via HTTP. We'll start by writing a basic handler for books:
from piston.handler import BaseHandler
from models import Book

class BookHandler(BaseHandler):
    model = Book
That's it. Except it isn't quite, as we need to hook it up in our urls.py:
from django.conf.urls.defaults import patterns, include, url
from piston.resource import Resource
from handlers import BookHandler

book_resource = Resource(BookHandler)

urlpatterns = patterns('',
    # The "+" lets the id span more than one character, so /book/12 works too
    (r'^book/(?P<id>[^/.]+)', book_resource),
)
Here we wrap up our handler in a resource (which handles all of the view-like stuff for us) and set /book/<id> to point to it. If you run your server, you should now be able to point at /book/1 and receive a 404. If you create a Book (and a Shelf for it to live on), you should get it back in JSON:
{
    "shelf": {
        "_state": "<django.db.models.base.ModelState object at 0x2be6090>",
        "id": 1,
        "location": "Shelf Location 1"
    },
    "title": "Book 1"
}
Voila! You have the book's title and, even better, you have all of the shelf information included as well. Problem solved! Well, not quite. We don't actually want all of our shelf information included. _state is an internal Django thing [0]. id is an internal identifier which we don't want any consumer of our API to be able to access.
The most obvious option is to specify which fields we include in the API output:
class BookHandler(BaseHandler):
    model = Book
    fields = ('title',
              ('shelf', ('location',)))
This syntax is, I hope, relatively obvious. fields contains a list of all the fields we want our API output to include. Where we have a related model (shelf in this example) we can use a two-tuple specifying the field name and the list of fields on the related model to include in the output (we only want location). This gives us the following JSON output:
{
    "shelf": {
        "location": "Shelf Location 1"
    },
    "title": "Book 1"
}
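To make the nested-tuple syntax concrete, here is a minimal plain-Python sketch of the idea. It is not Piston's implementation: select_fields is a hypothetical helper of my own, and it works on dictionaries rather than on model instances.

```python
def select_fields(obj, fields):
    """Pick the named fields from a (possibly nested) dictionary."""
    result = {}
    for field in fields:
        if isinstance(field, tuple):
            # A (name, sub_fields) two-tuple: recurse into the related object
            name, sub_fields = field
            result[name] = select_fields(obj[name], sub_fields)
        else:
            result[field] = obj[field]
    return result


# Illustrative data shaped like the unfiltered output shown earlier
book = {
    "id": 1,
    "title": "Book 1",
    "shelf": {"id": 1, "location": "Shelf Location 1"},
}
fields = ("title", ("shelf", ("location",)))
selected = select_fields(book, fields)
```

Here selected keeps only the title and the shelf's location, which is the same shape as the JSON above.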
This looks exactly as we want. Problem solved! Well, not quite. Let's say that we need to add a field to Book (e.g. page_count). Because we are hard-coding the fields that the API produces, we will have to add it to the fields attribute of our BookHandler. The same is true if we want to add another field to Shelf; we'll have to modify that shelf tuple to include it. There must be a better way.
There is. If we think back to our BookHandler, Piston worked out what fields there were on the Book model all by itself. Also, the Book model has an id field (and probably a _state attribute) and Piston didn't expose those through the API. This tells us two things about Piston handlers:
- They automatically find fields on the model which they handle, and
- They automatically filter out certain attributes which we probably don't want to reveal to the world at large.
These are exactly the properties that we are looking for when exposing a Book's Shelf through the API, so let's remove the fields attribute on our BookHandler and write a ShelfHandler:
class ShelfHandler(BaseHandler):
    model = Shelf

class BookHandler(BaseHandler):
    model = Book
If you hit our API now, you will see that the JSON returned is exactly the same as before we made the change. However, if you experiment with modifying the models, you will see that those changes are reflected in the API output with no further modifications by us.
This reflects the last lesson that I want to offer in this post: Piston keeps a registry of handlers for models. If a related model is referenced without any guidelines as to how to display it (as Shelf is implicitly referenced in our most recent BookHandler), Piston will use a handler defined for that model if it is available [1].
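If the idea of a handler registry seems abstract, the following is a rough sketch of how one could work. It is an assumption-laden illustration, not Piston's code: handler_registry, HandlerMeta and handler_for are hypothetical names, and the BaseHandler here is a stand-in rather than piston.handler.BaseHandler.

```python
handler_registry = {}  # hypothetical stand-in for the real registry


class HandlerMeta(type):
    """Record each handler class against the model it declares."""

    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        model = attrs.get("model")
        if model is not None:
            # A later handler for the same model silently replaces the earlier one
            handler_registry[model] = cls


class BaseHandler(metaclass=HandlerMeta):
    model = None


class Shelf:
    """Placeholder for the Shelf model."""


class ShelfHandler(BaseHandler):
    model = Shelf


class OtherShelfHandler(BaseHandler):
    model = Shelf


def handler_for(model):
    """Return the registered handler for a model, or None if there is none."""
    return handler_registry.get(model)
```

With one handler per model, looking up how to represent a related Shelf is a single dictionary access. The sketch also shows why a second handler for the same model is troublesome: handler_for(Shelf) now returns OtherShelfHandler, with no warning that ShelfHandler was displaced.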
Let's quickly review what we've learned:
- Piston is a Django app for providing REST interfaces,
- How to write basic handlers for models,
- How to specify which fields on a model should be included in the API, both for the model itself and any related models, and
- How Piston uses handlers by default when working out how to represent related models in API output.
Whilst writing this post, I've been recording my work in git; you can find the repository at https://github.com/OddBloke/Books-Sample-API.
In a future blog post, I will cover how we handle circular dependencies. I may also write posts about how to get different output formats and, if I do some research, how to use Piston to provide more than a read-only API.
[0] I am using the most recent release of Piston. It would not surprise me if this particular foible had already been fixed in development.
[1] One limitation of this functionality is that the registry only supports a single handler per model, and it will silently overwrite the entry in the registry if you define a second one, which can lead to very confusing behaviour. I will briefly touch on this when I write about circular dependencies.
Posted: Sat, 05 Nov 2011 09:51 | | Comments: 2 |
Fri, 04 Nov 2011
Reducing the Vertical Height of Google Reader
If you're like me, then you're a big fan of the new Google Reader interface, except when using it on a small screen, where the vertical height it takes up is just too much. The good news is that I've discovered a simple way of fixing this if you're running Google Chrome:
Install Minimalist for Google Reader.
Open the options dialog.
It should open in the "General" tab. If not, click that in the sidebar.
Check the "Use custom CSS" option.
Paste the following in to the text box:
#top-bar { height: auto; }
#search { padding: 5px 0; }
#logo { margin-top: -13px; }
#lhn-add-subscription-section { height: 35px; }
#viewer-header { height: 35px; }
#viewer-top-controls-container, #lhn-add-subscription { margin-top: -17px; }
#title-and-status-holder { padding: .5ex 0 0 .5em; }
#entries-status { margin-top: 3px; }
Reload Google Reader.
Voila!
Posted: Fri, 04 Nov 2011 20:59 | | Comments: 1 |
Thu, 03 Nov 2011
reStructuredText Preview for PyBlosxom

This blog runs on blogging software called PyBlosxom, running on my VPS. I have never really used it as much as I could have, partly because my blog writing is sporadic at the best of times, and partly because I find the software itself somewhat difficult to work with. My main qualm is that, out of the box, you have to write plain HTML for your posts. This is moderately painful when it comes to writing free-form text, but truly terrible if you want to write technical posts including code.
So I started looking around at various platforms, most of which provided a rich-text editor. I've never found rich-text editors particularly useful, partly because I love me some semantic markup [1] but mostly because they tend to be designed, again, for free-form writing with occasional formatting.
At work, we've started using Sphinx for all of our documentation, which uses reStructuredText, of which I am a great fan. So I was thinking to myself, wouldn't it be great if I could use a blogging platform which allowed me to write in reST.
It transpires that PyBlosxom is just such a platform [2], so I'm sticking with it.
One of the advantages of writing my posts in HTML is that I could easily see what they were going to look like whilst writing them, by pointing a browser at the file on my local machine. With reST, this wasn't immediately possible, which posed something of a problem.
To solve this, I've written a little Python script which takes a file as its first argument, processes it, writes the HTML output (with the title included) on stdout and the metadata for the post on stderr. I've been using it to preview this post like so:
python preview_rest.py pyblosxom-rest-preview.rst > preview.html
You'll need to have docutils installed (python-docutils on Debian) for it to work. Enjoy!
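For anyone wanting to roll their own, here is a minimal sketch of what such a preview script might look like. It is not the script I use, and it makes assumptions: that the first line of an entry is its title, that the lines straight after it starting with '#' are "key value" metadata, and that the rest is the reST body. split_entry, render_html and main are illustrative names; publish_parts is the docutils function that does the rendering.

```python
import sys


def split_entry(text):
    """Split an entry into (title, metadata dict, body).

    Assumes the first line is the title and that any lines directly
    after it starting with '#' are 'key value' metadata.
    """
    lines = text.splitlines()
    title = lines[0].strip() if lines else ""
    metadata = {}
    index = 1
    while index < len(lines) and lines[index].startswith("#"):
        key, _, value = lines[index][1:].partition(" ")
        metadata[key] = value.strip()
        index += 1
    body = "\n".join(lines[index:])
    return title, metadata, body


def render_html(title, body):
    """Render the reST body to an HTML fragment with the title on top."""
    from html import escape
    from docutils.core import publish_parts  # third-party: docutils

    parts = publish_parts(source=body, writer_name="html")
    return "<h1>%s</h1>\n%s" % (escape(title), parts["html_body"])


def main(path):
    """Write the HTML preview to stdout and the metadata to stderr."""
    with open(path) as entry_file:
        title, metadata, body = split_entry(entry_file.read())
    sys.stdout.write(render_html(title, body))
    for key, value in sorted(metadata.items()):
        sys.stderr.write("%s: %s\n" % (key, value))
```

Calling main("pyblosxom-rest-preview.rst") with stdout redirected to a file gives a page to point a browser at, while the metadata stays visible in the terminal.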
[1] If I could easily write my blog posts in LaTeX, I probably would.
[2] Come back, PyBlosxom, all is forgiven. See http://pyblosxom.bluesock.org/registry/text/rst.html.
Posted: Thu, 03 Nov 2011 21:11 | | Comments: 1 |
Mon, 04 Apr 2011
The Red Sox Aren't Doomed
The baseball season has started, and the Red Sox are off to an 0-3 start. A lot of Red Sox fans will be feeling down about this, so I'm going to cheer them up the only way I know how: statistics.
In this post, I'm going to look at how well teams' records through the first 3 games of their seasons predict their record on the season. I'm going to examine this by looking at the correlation between the number of wins in the first three games of each team's season and their final win percentage for that season. I've used every team-season since 1962, the year in which the NL adopted the 162-game schedule that the AL started using a year earlier. This includes strike years, as they didn't occur to me until after I'd finished making all the graphs.
Let's start out by looking at the distribution of wins in the first three games:
The numbers break down as follows:
| 0 Wins | 1 Win | 2 Wins | 3 Wins | Total |
|---|---|---|---|---|
| 148 | 473 | 475 | 152 | 1248 |
Unsurprisingly, there are a lot more 2-1 and 1-2 records, with a large tail-off for 3-0 and 0-3. It should be noted, however, that the 2011 Red Sox are part of the most exclusive club as far as records in the first 3 games go.
Now let's look at the distributions of full-season records against the number of wins in the first 3 games:
Looks pretty flat, doesn't it. Let's add in a linear trend line to see just how flat it really is:
That confirms that it is actually pretty flat. There is a tiny amount of correlation between the first 3 games and the full-season record, but only at r=0.17, with r squared at a minuscule 0.03.
For those not in the know, the r of two variables ranges between -1 and 1, where 1 means that there is a perfect positive correlation (i.e. when one variable increases, the other always increases) and -1 means that there is a perfect negative correlation (i.e. when one variable increases, the other always decreases). 0 means that there is no linear relationship between the values.
r squared tells us how much of the variation in full-season records is accounted for by the trend line (the black line in our graph). If the black line represented the actual data (all the blue points) well, r squared would be near 1. As it does not represent our data well, it is very near 0.
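For the curious, r can be computed in a few lines. The sketch below is a plain-Python version of the standard Pearson formula; pearson_r is my own helper name and the sample data is made up for illustration, not taken from the game logs.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Made-up data: wins in the first three games against win percentage
early_wins = [0, 1, 2, 3]
perfect_up = [0.40, 0.45, 0.50, 0.55]
perfect_down = [0.55, 0.50, 0.45, 0.40]

r_up = pearson_r(early_wins, perfect_up)      # perfect positive correlation
r_down = pearson_r(early_wins, perfect_down)  # perfect negative correlation

# Squaring the r reported above gives the r squared quoted alongside it
r_squared_reported = 0.17 ** 2
```

Squaring 0.17 gives roughly 0.03, which is where the r squared figure above comes from.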
So, to summarise, we have looked at how good a predictor the first 3 games of each team's season from 1962 to 2009 were for their overall record in that season, and have concluded that there is a relationship between the two, but that it is of no great significance.
Red Sox fans, take heart!
Sources and Scripts
All of these conclusions were drawn using game logs freely available from Retrosheet.
The following shell script was used to generate summaries of games in each year:
for n in $(seq 1871 2009);
do
    cut -d, -f6,9,4,7,10,11 < GL$n.TXT > summary$n.txt
done
The following Python script was used to process the summarised game logs into win-loss records for the first 3 games and the whole of each season:
#!/usr/bin/env python
from csv import DictReader

first_three_wins = {}
first_three_losses = {}
wins = {}
losses = {}

for n in range(1962, 2010):
    d = DictReader(open('summary%d.txt' % (n,)),
                   fieldnames=['away', 'away_no', 'home', 'home_no',
                               'away_score', 'home_score'])
    for r in d:
        # Key each team-season by year plus team code
        home_name = "%d%s" % (n, r['home'])
        away_name = "%d%s" % (n, r['away'])
        first_three_wins.setdefault(home_name, 0)
        first_three_losses.setdefault(home_name, 0)
        wins.setdefault(home_name, 0)
        losses.setdefault(home_name, 0)
        first_three_wins.setdefault(away_name, 0)
        first_three_losses.setdefault(away_name, 0)
        wins.setdefault(away_name, 0)
        losses.setdefault(away_name, 0)
        # Compare the scores as integers; as strings, "10" sorts before "9"
        away_win = int(r['away_score']) > int(r['home_score'])
        if int(r['home_no']) <= 3:
            if away_win:
                first_three_losses[home_name] += 1
            else:
                first_three_wins[home_name] += 1
        if int(r['away_no']) <= 3:
            if away_win:
                first_three_wins[away_name] += 1
            else:
                first_three_losses[away_name] += 1
        if away_win:
            losses[home_name] += 1
            wins[away_name] += 1
        else:
            wins[home_name] += 1
            losses[away_name] += 1

# One line per team-season: name, wins in first three games, win percentage
for team in wins:
    w = float(wins[team])
    l = float(losses[team])
    w3 = float(first_three_wins[team])
    print("%s,%f,%f" % (team, w3, w / (w + l)))
Insofar as it matters for such a short snippet, this script should be considered to be in the public domain.
Posted: Mon, 04 Apr 2011 23:03 | | Comments: 17 |
Wed, 15 Dec 2010
Cycling Websites
Being something of a geek, I couldn't start cycling without looking around at what online aids there are available. Below I give details of three websites that I've found useful: Ride With GPS, Bike Route Toaster, and CycleStreets.
Ride With GPS
Ride With GPS is the site that I've been using most often. It allows you to create routes and then log trips along those routes. From that information, it will give you a breakdown of how long you've spent on your bike, how much distance you've covered and how much elevation you've gained.
The creation of routes is done using Google Maps data. You add waypoints along the route and use either straight lines or automated routing to join them up. As you create your route, the elevation profile is displayed, as well as the route distance and the total elevation gained and lost. Excitingly, when you move your mouse over the elevation profile it shows where on the map that elevation is, so you can correlate what you were feeling and the actual facts.
The automated routing is performed by Google Maps' routing algorithms (either car or walking). This can cause problems, as the car algorithm is likely to choose a poor/dangerous route for a cyclist, and the walking algorithm (unhelpfully labelled 'Cycling' by Ride With GPS) will do things like sending you the wrong way up a one-way street. As such, you have to switch between the two to ensure that you are getting a good, cycleable route.
As the name suggests, Ride With GPS is designed for use with a GPS device. None of the functionality above requires a GPS, but there is other functionality that I haven't described as I've never used it. The most obvious example is that you can upload a trip from your GPS and it can/will be used as a route. This obviates the need for the route planner, so I do wonder how much improvement/work it will receive from the developers.
A few examples from my Ride With GPS profile:
Bike Route Toaster
Bike Route Toaster is a bike route planning website. It doesn't include any of the non-routing features of Ride With GPS, but it does include integration with OpenStreetMap, which has considerably more detailed information about cycle-only routes (and won't send you the wrong way down one-way streets). This makes it a good choice for planning a route through somewhat unknown areas, where Google may well screw you.
One thing that Bike Route Toaster does lack (as far as I've been able to tell) is a location-finder. So I have to find my house every time I'm planning a route from my house, and I have to look up the exact location of places I'm not familiar with on Google Maps or OpenStreetMap to be able to route to them.
With both Bike Route Toaster and CycleStreets, you can use their routes in Ride With GPS. This is done by exporting to GPX (the GPS eXchange format) and then importing that into Ride With GPS. There are some complications, so this isn't a good solution for every route, but if you're planning a particularly tricky or regularly-used route, it is worth it.
CycleStreets
As with Bike Route Toaster, CycleStreets is solely designed to plan routes from Point A to Point B. Also like Bike Route Toaster, it uses OpenStreetMap's data, which gives you better routing than Ride With GPS. However, unlike the other two websites, it doesn't allow you to specify points C, D or E in between your Points A and B. You give it start and end points, it does the rest.
After spending some time deliberating, CycleStreets generates not one, not two, but three (potentially) different routes to get you to your destination. These consist of a Fastest route, which ignores any niceties to give you the shortest distance A to B; a Quietest route, which prefers a longer route with minimal traffic; and a Balanced route, which takes a little from column A, a little from column B. You get a map, an elevation profile (though it's not as shiny as the Ride With GPS one), and step-by-step instructions with mini-maps (which is better than either of the other sites).
Of the three projects, CycleStreets is probably the one that I'm most excited about. It's run on a not-for-profit basis, and they plan on releasing the code as open source. They're also looking to expand and allow your points C, D and E, as well as to improve their use of the OSM data.
Here's the CycleStreets routing for my house to Leam and Coventry to Peterborough.
Posted: Wed, 15 Dec 2010 20:00 | | Comments: 0 |
Cycling
I recently bought a new bike (which deserves a separate blog post, regardless of whether or not it will ever receive it) and, as such, have been doing and planning quite a bit of cycling.
My current aims are:
- In the short term, do 3 cycle rides of at least 5 miles per week, in addition to my regular cycling,
- In the medium term, cycle to or from work once (Coventry -> Rugby, around 15 miles),
- In the long term, cycle to work every day,
- In the very long term, cycle from my house to my parents' house (about 70 miles) (probably over two days, with an overnight somewhere in betwixt the two).
Posted: Wed, 15 Dec 2010 12:20 | | Comments: 0 |
Tue, 02 Nov 2010
Re-encoding an Audio File as Video
As part of my current MythTV project (about which I may blog more in future), I wanted to re-encode an audio file with a black video track.
To do this, I used the following commands:
$ convert -size 320x240 xc:black /tmp/black.jpg
$ ffmpeg -loop_input -i /tmp/black.jpg -i in.mp3 -qscale 1 -shortest -acodec copy out.avi
I hope this helps.
For those who are wondering why I wanted to do this, it's because MythTV won't play audio-only recordings, and I wanted to include BBC iPlayer radio programmes in my recordings screen. So the simplest way to fix this was to just make it so that they weren't audio-only.
Posted: Tue, 02 Nov 2010 22:36 | | Comments: 0 |
Wed, 11 Mar 2009
Review: Lowboy
Lowboy, by John Wray, is a novel about a schizophrenic teenager who calls himself Lowboy. There are two main threads to the story: that of Lowboy, who escapes his handlers in a subway station into the tunnels themselves, and that of his mother and the police officer assigned to find him.
I enjoyed the novel a great deal. The Lowboy thread helped me to understand what having Lowboy's condition might feel like, the constant shifting of attentions and the extrasensory feelings he was having, without alienating me from him. The mother thread both grounded the novel and provided a background of normality against which the Lowboy thread was juxtaposed.
The characters in the book are vivid. Lowboy himself is neurotic but never alien. His mother, seen through the police officer's eyes, is alien but not unwelcome. The police officer, seen through the mother's eyes, is predictable but not boring. There are other, peripheral, characters. They are well-drawn and never feel like they exist to expose some facet of a main character or to move the plot along.
The only disappointments I had with the novel came within the last 5 to 10 pages. The ending is not quite as clear-cut as I was hoping, and the 'twist' doesn't have a mind-blowing effect. On the other hand, the ending fits with the rest of the book perfectly, and the lack of a 'twist' means that rereading the novel in future will remain interesting, so the disappointment was not too great.
I recommend reading this wholeheartedly. As with The Sound of Building Coffins, anyone I know should feel free to ask to borrow it.
Posted: Wed, 11 Mar 2009 12:00 | | Comments: 2 |
Tue, 10 Mar 2009
Automated Blog Posting With Pyblosxom
As some naysayers in #wuglug this (Sunday) evening suggested after three blog posts in a two hour span, I'm going to start posting to my blog once a day automatically, unless I really, really want to get around this. This should be posted Tuesday morning at 6am (famous last words).
I'm using PyBlosxom as my blogging engine, which is essentially based around dropping files into a directory. As such, it's almost trivial to write a script that will do the moving into place which can then be automated by cron. As it's only almost trivial, here's my script:
#!/bin/sh
set -e
set -u

FROMDIR=/home/daniel/pending-blogposts
TODIR=/home/daniel/test

# ls -t lists newest first, so the last entry is the oldest pending post
next_post=`ls -t "$FROMDIR" | tail -n1`
if [ z != "z$next_post" ]; then
    echo "$TODIR/$next_post"
    if [ -e "$TODIR/$next_post" ]; then
        echo "Already a post with that name."
        exit 1
    else
        echo "Posting '$next_post'"
        mv "$FROMDIR/$next_post" "$TODIR"
        # Update the timestamp so the post is dated when it goes live
        touch "$TODIR/$next_post"
    fi
fi
Just change FROMDIR and TODIR to match your needs, drop the resulting script somewhere on your server and add something like
0 6 * * * /home/daniel/bin/blog-post
to your crontab and you should be away.
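If you would rather not do this in shell, the same logic can be sketched in Python. This is an illustrative equivalent rather than anything I run myself: post_next is a hypothetical function name, and the temporary directories at the bottom only exist to demonstrate the behaviour.

```python
import os
import shutil
import tempfile


def post_next(from_dir, to_dir):
    """Move the oldest pending post into to_dir; return its name, or None."""
    pending = os.listdir(from_dir)
    if not pending:
        return None
    # Oldest modification time first, mirroring `ls -t | tail -n1`
    next_post = min(
        pending,
        key=lambda name: os.path.getmtime(os.path.join(from_dir, name)),
    )
    target = os.path.join(to_dir, next_post)
    if os.path.exists(target):
        raise FileExistsError("Already a post with that name: %s" % next_post)
    shutil.move(os.path.join(from_dir, next_post), target)
    os.utime(target, None)  # like `touch`: stamp the post with the current time
    return next_post


# Demonstration with throwaway directories and explicit modification times
with tempfile.TemporaryDirectory() as pending_dir, \
        tempfile.TemporaryDirectory() as blog_dir:
    for name, mtime in (("older.txt", 100), ("newer.txt", 200)):
        path = os.path.join(pending_dir, name)
        open(path, "w").close()
        os.utime(path, (mtime, mtime))
    first = post_next(pending_dir, blog_dir)
    second = post_next(pending_dir, blog_dir)
    third = post_next(pending_dir, blog_dir)
```

Each call publishes one post, oldest first, and returns None once the pending directory is empty, so running it once a day from cron drains the queue at one post per day.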
Posted: Tue, 10 Mar 2009 12:21 | Comments: 0 |
Mon, 09 Mar 2009
PyRoom
I'm writing this blog post using a text editor called PyRoom, which bills itself as allowing 'distraction free writing'. I've also written all of my recent blog posts using it.
It's a really stripped down editor, which runs fullscreen and contains just a text area within the middle of the screen, with configurable themes to define background and text colour. It's designed for creative writing rather than text editing per se, and I really like it.
It's packaged for Debian (and so probably Ubuntu), or can be downloaded from http://pyroom.org/download.html.
Posted: Mon, 09 Mar 2009 02:22 | | Comments: 0 |