Friday, 8 February 2013

Using inotifywait To Run Your Tests When Your Code Changes

Recently, I tweeted about my desire for a change-aware test runner.
I haven't quite come up with that (though I do now have a half-finished blog post prognosticating on it), but I do have a solution which covers some of the bases: I determine what tests I want to run, and the test harness runs them whenever my code changes.

This "test harness" is actually a bash snippet which relies on inotifywait, a command-line program which blocks until it detects an inotify event on the files it is watching, and is incredibly simple:
while inotifywait -e close_write -r $CODE_DIRS --exclude=".*sw[px]"; do
    $TEST_COMMAND
done
The while loop is straightforward. Whenever inotifywait stops blocking and exits with status 0 (which it will unless something unusual has happened to the files you're watching), we run $TEST_COMMAND. You can put whatever you want there, so you can limit which tests are run by choosing the command accordingly.

Now let's break that inotifywait call down. $CODE_DIRS can be any number of directories or files that you would like inotifywait to watch. -e narrows down the events that we should unblock on; we don't want to run the tests every time we open a new file. close_write triggers on (to quote the man page):
file or directory closed, after being opened in writable mode
vim triggers this when I save, so it works for me. You might also want to listen for modify (file or directory contents were written), move (file or directory moved to or from watched directory) and create (file or directory created within watched directory). To listen for several events, pass them to -e as a comma-separated list, or repeat the -e option.

-r, as with many commands, tells inotifywait to watch directories recursively. This will mean that all directories under those that you specify will trigger your tests. I normally just run this command pointing at the top-level directory of my project. It's worth noting that -r applies to all directories that you pass.

Finally, --exclude takes a regular expression of files to exclude from your watch. As a vim user who hasn't configured swap files to be stored out of my tree, I want to ignore them (otherwise my tests run every time I open a file because vim writes out a swap file).
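Pulling the options above together, here is a minimal sketch of the same loop wrapped in a function. The function name watch_and_test, the EVENTS variable and the default values are all illustrative assumptions rather than anything standard, and the exclude pattern is a slightly tighter variant of the one above that only matches names ending in .swp or .swx.

```shell
#!/bin/bash
# Hypothetical defaults; override them in the environment to suit your project.
CODE_DIRS=${CODE_DIRS:-.}
TEST_COMMAND=${TEST_COMMAND:-"python -m unittest discover"}
EVENTS=${EVENTS:-close_write,move,create}

watch_and_test() {
    # $CODE_DIRS and $TEST_COMMAND are deliberately unquoted so that they
    # word-split into several directories and a command plus its arguments.
    # -q suppresses inotifywait's start-up chatter.
    while inotifywait -q -r -e "$EVENTS" --exclude '\.sw[px]$' $CODE_DIRS; do
        $TEST_COMMAND
    done
}
```

Source the file and call watch_and_test to start watching; setting TEST_COMMAND to something narrower (a single test module, say) is one way to limit what gets run.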

A quick disclaimer: as this uses inotify, this will only work on platforms that support it (which, I think, is only Linux). Mac users might want to examine this StackOverflow question. Windows users might find this answer useful.

UPDATE: +Murali Suriar has helpfully pointed me to kqwait on IRC, which will help out any Mac/BSD users.

Tuesday, 5 February 2013

Removing .pyc Files: A Coda

A few days ago, I wrote a blog post detailing a git hook that would automatically remove .pyc files when checking out a different branch. I received a variety of feedback, which I will outline here.

Firstly, +Jeff Mahoney provided a more efficient implementation of the git hook in a comment on the original post. I haven't tested it, but it is worth a look if the original is too slow for you.

Secondly, there are a number of helpful comments on the reddit post, including a fine-tuned git/xargs command.

Finally, and most importantly, a number of people (both in the blog comments and on reddit) pointed out the PYTHONDONTWRITEBYTECODE environment variable. Setting this to any non-empty value means that Python doesn't generate .pyc or .pyo files, completely circumventing the problem that I was trying to solve.
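
As a minimal sketch, setting the variable in your shell looks like this. The value 1 is just a conventional choice; where you put the line (for example in your shell start-up file) is up to you.

```shell
# Any non-empty value works; 1 is just a conventional choice.
export PYTHONDONTWRITEBYTECODE=1

# One-off alternative: the interpreter's -B flag has the same effect
# for a single run, e.g.  python -B manage.py test
```

Note that this only stops new bytecode files from being written; any .pyc files already in your tree still need to be removed once.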

Thursday, 24 January 2013

Git Tip: Get A Warning When A File Has Changed

Every project has one[1]. The file. If that file changes, you need to know, because if you don't then your life is going to be very unpleasant in very short order.

At Hogarth, that file is called development.conf. It is, roughly speaking, a dump of the way our various elements and components are wired together[2], and as it's completely unmergeable only one person can be working on it at any one time (something which we're working to fix). If you want to work on it, you need to make sure that you have the latest version in the database, so that when you dump out your new version, it includes all previous changes.

At various times, we've all been caught out by missing an update, and had to completely redo (often complex) changes just to incorporate an (often minor) previous change. One of my colleagues (Patrick), having just been caught out by this for (I think) the first time, suggested that it would be useful to get a warning when this file changes. He reasoned that as we use git to manage all changes to our codebase, it would be natural to write a git hook which did this for us.

We had a look together at the list of git hooks, and couldn't really see anything appropriate. post-merge seemed like what we wanted, but we couldn't work out how we could determine what had actually changed. So Patrick went off to do some real work, and I turned to the ever-reliable #git IRC channel on Freenode. ojacobson suggested a solution, which works beautifully.

To understand how this works, you'll need to know about the git reflog. +Alex Blewitt has a good introduction here. If you do one thing as a result of this article, educate yourself about the reflog. It's incredible!

The important take-away from that article is that:
git diff "HEAD@{1}"
will show you the diff between what you currently have in your tree and the commit before the last action that changed your history. Importantly, it treats merges as a single entry, so if you are immediately post merge, running the above command will show you a diff containing everything that changed in that merge, regardless of the number of commits the merge contained.
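
To see this behaviour for yourself, here is a small self-contained sketch that builds a throwaway repository, merges a two-commit branch, and then asks which files changed. The repository, branch and file names are all made up for the demonstration.

```shell
# Throwaway repository, purely for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "base"
main=$(git rev-parse --abbrev-ref HEAD)   # whatever the default branch is called

# Two separate commits on a branch.
git checkout -q -b feature
echo one > one.txt; git add one.txt; git commit -q -m "one"
echo two > two.txt; git add two.txt; git commit -q -m "two"

# Merge them back in a single step.
git checkout -q "$main"
git merge -q --ff-only feature

# HEAD@{1} is where HEAD pointed before the merge, so both files show up.
changed=$(git diff --name-only "HEAD@{1}")
echo "$changed"
```

Running git reflog in the same directory afterwards shows the merge recorded as a single entry, which is why one diff covers both commits.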

Pulling these various bits together, with some shell wizardry, gives us:
#!/bin/bash
set -eu

# List the paths that changed in the merge and look for the config file.
# Using the pipeline as the if condition stops set -e from aborting the
# hook when grep finds no match.
if git diff --name-only "HEAD@{1}" | grep -q "config/development.conf"; then
    echo
    echo -e "\e[41m!!! development.conf HAS CHANGED !!!\e[0m"
    echo
    echo "You should reload config using (something like):"
    echo "  ./manage.py loadconfig config/development.conf"
fi
We check whether development.conf has changed, and if it has then we print out a BRIGHT RED WARNING MESSAGE. Drop that in .git/hooks/post-merge, make it executable, and you're good to go.
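
If you would rather script the "drop it in and make it executable" step, something like the following would do. The install_hook function is a hypothetical helper of my own naming, not a git command.

```shell
install_hook() {
    # $1: path to your hook script, $2: hook name (e.g. post-merge)
    hooks_dir="$(git rev-parse --git-dir)/hooks"
    mkdir -p "$hooks_dir"
    cp "$1" "$hooks_dir/$2"
    chmod +x "$hooks_dir/$2"
}
```

From inside a repository you would then run, for example, install_hook warn-on-config-change.sh post-merge (the script name is again just an example). Hooks are not transferred by clone or push, so each person needs to install their own copy.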


[1] Well, maybe not every project. If you don't, I'm jealous. But a lot of "enterprise" or otherwise unloved codebases will have one. And, more pertinently, ours does.

[2] You can tell that ours is not a Java project because this is not an XML file.

Wednesday, 23 January 2013

Git Tip: Remove .pyc Files Automatically

N.B. I've published a follow-up to this here, which includes a way to completely avoid this problem in the first place.

Recently, I've found myself increasingly caught out by stale .pyc files in our project. When I change from our mainline branch to a story branch (or vice-versa), I often find myself with inexplicable test failures because Python is using the .pyc files for no-longer-current code.

Luckily, it's pretty easy to fix this in git, using hooks, specifically the post-checkout hook. To do that, add the following to .git/hooks/post-checkout, and make the file executable:

#!/bin/bash
find "$(git rev-parse --show-toplevel)" -name "*.pyc" -delete

Now, every time you check out a different branch, all the .pyc files will be cleared out of your working tree.
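
Before installing the hook, you may want to preview what it would delete. A minimal sketch, using a helper function name (list_pyc) that I have made up for the purpose:

```shell
# Print the .pyc files under a directory (default: the current one)
# without deleting anything.
list_pyc() {
    find "${1:-.}" -name "*.pyc" -print
}
```

Calling list_pyc from the top of your project lists the files; swapping -print for -delete gives the behaviour of the hook.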

Friday, 27 July 2012

Django, Meet Nose

At Glasses Direct, we use nose to run our tests, as it gives us all sorts of nice things like test functions and XUnit-compatible output (which Jenkins loves).  As the majority of our projects use Django, we use django-nose to integrate nose into our Django projects.  This gives us all of that nose loveliness when using manage.py test.
The process for setting this up is simple:
  1. Install django-nose and nose however you normally do (I would use pip install django_nose nose).
  2. Add 'django_nose' to INSTALLED_APPS.
  3. Add TEST_RUNNER = 'django_nose.NoseTestSuiteRunner' to your settings.
  4. Run manage.py test -s to check that nose is being used (the standard manage.py test doesn't have this option).
  5. Read manage.py test -h and nose docs to learn about the exciting things you can do with nose.
I've added a few more nose-related things to my blogging backlog, so I'll get to those eventually.

Wednesday, 18 July 2012

Deploying Sentry on Heroku

For a recent personal project, I've been using Sentry to monitor errors and suchlike. I was hosting this on my VPS, but the app it is monitoring is hosted on Heroku, so I thought that (a) it would make more sense to have the monitoring In The Cloud (TM), and (b) it would be an interesting exercise with Heroku, which I have been enjoying using a great deal.

It turns out to be pretty simple, and I've made a git repo available to make it even easier.  Just follow these steps:
  1. Clone my GitHub repository:
    git clone https://github.com/OddBloke/heroku-sentry.git
  2. Generate a secret key and set it as SENTRY_KEY in sentry.conf.py. See point 11 here for a good example of how to generate a good key.
  3. Install the Heroku Toolbelt.
  4. Sign up for a Heroku account.
  5. Log in to your Heroku account from the CLI:
    heroku login
  6. Navigate to your clone of the git repo and create a new Heroku app:
    heroku apps:create
    This command creates a heroku remote in the git repo, which we will use in a minute.
  7. Take the URL it spits out (which will be something like http://floating-earth-1234.herokuapp.com/) and set that as SENTRY_URL_PREFIX in sentry.conf.py.
  8. Commit your changes:
    git commit --all -m "Personalise."
  9. Push your repository up to Heroku:
    git push heroku master
    If all goes according to plan, this will cause Heroku to deploy, and you will see the output as it installs Sentry and all of its requirements. Finally, it will tell you that your app is deployed to Heroku, but we're not quite done.
  10. We need to configure a database for Sentry to use. We're going to use the new Heroku Postgres offering, which is currently in public beta: 
    heroku addons:add heroku-postgresql:dev
  11. The previous command will have given you a database name, something like HEROKU_POSTGRESQL_CHARCOAL. Tell Heroku to use this database:
    heroku pg:promote HEROKU_POSTGRESQL_CHARCOAL
  12. Finally, we need to run the Sentry setup script, create a default user and tell Sentry to use it: 
    heroku run sentry --config=sentry.conf.py upgrade
    heroku run sentry --config=sentry.conf.py createsuperuser
    heroku run sentry --config=sentry.conf.py repair --owner=JustCreatedSuperuser
    You should use the name of the super-user you create with the second command as the --owner argument to the third command.
Voila!  You should now be able to see Sentry by pointing your browser at the URL you used for SENTRY_URL_PREFIX earlier.

If you have any problems, you can check out your logs using heroku logs.

One final note: Heroku doesn't provide SMTP for you, so you'll also have to modify the SMTP settings to point at your own mail server (or play around with the Heroku add-ons that do provide it).
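
For step 2 above, if you just want some random value to use as SENTRY_KEY, one possible approach (an assumption on my part, and not necessarily the one the linked instructions describe) is to base64-encode some bytes from /dev/urandom. The generate_key name is purely illustrative.

```shell
generate_key() {
    # 32 random bytes, base64-encoded, giving a 44-character string.
    head -c 32 /dev/urandom | base64
}

SENTRY_KEY=$(generate_key)
echo "$SENTRY_KEY"
```

Paste the printed value into sentry.conf.py as SENTRY_KEY before committing in step 8.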

Friday, 13 July 2012

User Fixtures in Django

At Glasses Direct, we are setting up an internal system which needs a simple web UI.  As we use Django for all of our HTTP needs, the admin interface was the obvious, quick solution.  All we need to do is write the requisite model, tie it in to the admin interface and we're done!  Right?  Wrong.

The above description is missing an important part of the puzzle: authentication. Django's admin interface (sensibly) requires authentication by default. However, this piece of our system will only be exposed internally, and we don't want to have to manage credentials for all of our internal users (as we are sadly lacking when it comes to internal single sign-on).

The obvious course of action is to remove the authentication.  However, this seems to be easier said than done.  Firstly, there is no simple switch to disable authentication. Secondly, even were there, we wouldn't want everyone to have access to the full admin interface, largely because it would be confusing for the target users.  So we can't get rid of authentication.  What we really want is a default user.

A default user is easy enough: you can add one using a User fixture that looks something like this (the easiest way to do this is to create a User object and use the dumpdata management command):
[{"pk": 2,
  "model": "auth.user",
  "fields": {
    "username": "default",
    "is_staff": true,
    ...
  }}]
Voila! After running a syncdb, you'll have a user who can access the admin interface.  Unfortunately, they won't be able to do anything, because they don't have any permissions.  Let's fix that by adding some (again, easiest to do this using dumpdata):
[{"pk": 2,
  "model": "auth.user",
  "fields": {
    "username": "default",
    "is_staff": true,
    "user_permissions": [12, 13, 14],
    ...
  }}]
You can see here that we've granted this user three permissions.  The relevant entries will show up in the admin interface.  We're done!  Right?  Wrong.

Everything will seem to be proceeding happily, possibly for quite some time.  Then, in a few weeks or months, you'll add another model, or app or something and suddenly your default user will have permission to do really weird things.  The problem here is that Django will occasionally regenerate the primary keys of permissions (and other internal objects).  So what are we to do?  After a fair amount of swearing this afternoon, my colleague Ondrej pointed me in the direction of natural keys. With these, you can future-proof yourself against primary key oddities:
[{"pk": 2,
  "model": "auth.user",
  "fields": {
    "username": "default",
    "is_staff": true,
    "user_permissions": [
      ["add_mymodel", "myapp", "mymodel"],
      ["change_mymodel", "myapp", "mymodel"],
      ["delete_mymodel", "myapp", "mymodel"]
    ],
    ...
  }}]
As with the above examples, you should generate this output with dumpdata, passing in the --natural flag on the command line.
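
As a rough sketch, the dumpdata invocation might look like the following. The dump_default_user wrapper is a name I have invented, and the command assumes you run it from the directory containing manage.py.

```shell
dump_default_user() {
    # --natural serialises foreign keys (such as permissions) as natural keys;
    # --indent 2 just makes the JSON easier to read and edit.
    python manage.py dumpdata --natural --indent 2 auth.User
}
```

Redirect the output into a fixture file in your app and trim it down to the default user, since dumpdata emits every User row. In later Django releases the --natural flag was replaced by --natural-foreign, so check manage.py dumpdata -h for the version you are running.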

To conclude, we've looked at how we can use Django fixtures to give us a default user with a known username and password, with reliable, known permissions. Perfect!


N.B. One option for "auto-authentication" would be to use a middleware class that sets the user on the request to our default user, something along the lines of this.  This component is only meant to be a quick procedure fix, so we haven't taken the time to do that.