Add support for running a subset of tests (aka "sharding").

This patch adds two new command-line arguments, --shard-index
and --total-shards. Together they select a fraction of the tests
to run: every `total_shards`-th test in the list of tests,
starting at offset `shard_index`.
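
The selection described above can be sketched roughly as follows.
This is an illustration only, not the code in this patch: the
function name `select_shard` is hypothetical, and it assumes the
shard index is zero-based and smaller than the total shard count.

```python
def select_shard(tests, shard_index, total_shards):
    """Return every total_shards-th test, starting at offset shard_index.

    Hypothetical helper for illustration; assumes a zero-based shard_index.
    """
    if total_shards < 1:
        raise ValueError("total_shards must be at least 1")
    if not 0 <= shard_index < total_shards:
        raise ValueError("shard_index must be in [0, total_shards)")
    # An extended slice takes the items at positions shard_index,
    # shard_index + total_shards, shard_index + 2 * total_shards, ...
    return tests[shard_index::total_shards]


tests = ["a", "b", "c", "d", "e"]
print(select_shard(tests, 1, 3))  # prints ['b', 'e']
```

Under these assumptions, running each shard index from 0 to
total_shards - 1 covers every test exactly once.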

Also, bump the version to 0.9.5.
