Installing openaddress-machine on a new EC2 system using Chef

No need to install stuff manually; Mike already wrapped up scripts to set up an EC2 system with Chef for us. Here’s how to use them on a brand new EC2 micro server:

sudo bash
apt-get update
apt-get upgrade
apt-get install git
git clone https://github.com/openaddresses/machine.git
cd machine/chef
./run.sh

Done! The shell command openaddr-process-one is now installed and ready to run.

In brief, run.sh:

  1. Installs Chef and Ruby via apt.
  2. Runs a Python setup recipe. That installs a few Ubuntu Python packages with apt (including GDAL and Cairo), then does a “pip install” in the OpenAddresses machine directory, which pulls in the other Python packages we depend on.
  3. Runs a recipe for OpenAddresses. This uses git to put the source JSON data files in /var/opt.

Update

But really, that’s so manual. If you just pip install openaddr-machine, it creates a /usr/local/bin/openaddr-ec2-run script that will do the work for you. That in turn invokes a run.py script, which runs on your local machine. Among other things, it fills in a shell script template; the resulting script sets up an EC2 instance and runs the job on it.

The shell script that is run on EC2 is pretty basic. It:

  1. Updates apt (but does not upgrade)
  2. Installs git and apache2
  3. Clones the openaddress-machine repo into /tmp/machine
  4. Runs scripts to set up swap on the machine, then invokes Chef to set up the machine
  5. Runs openaddr-process to do the job
  6. Shuts the machine down.
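
As a rough illustration of how the six steps above could be expressed as a template that run.py fills in, here is a minimal sketch using Python’s string.Template. The template text, the function name render_user_data, and the placeholder names are all assumptions made for illustration; the real script in the repository will differ, and the actual swap, Chef, and openaddr-process invocations are deliberately left as placeholders.

```python
from string import Template

# Hypothetical template, not the one shipped in the repository.
# $repo_url, $setup_commands and $job_command are illustrative placeholders.
SETUP_TEMPLATE = Template("""#!/bin/sh -ex
apt-get update
apt-get install -y git apache2
git clone $repo_url /tmp/machine
# Swap and Chef setup scripts go here (actual script names not shown).
$setup_commands
# The openaddr-process invocation goes here (actual arguments not shown).
$job_command
shutdown -h now
""")


def render_user_data(repo_url, setup_commands, job_command):
    """Fill in the template and return the shell script as a string."""
    return SETUP_TEMPLATE.substitute(
        repo_url=repo_url,
        setup_commands=setup_commands,
        job_command=job_command,
    )
```

Template.substitute raises KeyError if a placeholder is left unfilled, which is a handy guard against sending a half-rendered script to a machine that will shut itself down at the end.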

The run.py script you use on your own machine is mostly about getting an EC2 instance. It:

  1. De-templates the shell script and puts it in user_data.
  2. Uses boto.ec2 to bid on a spot instance.
  3. Waits up to 12 hours for the instance to be granted.
The details of how the EC2 instance is bid for, created, and waited on are a bit funky but seem well contained.