Python packaging in March 2016

Hooray! I finally finished porting my lolslacktest code over to be a proper Python module, installable with pip, and with a virtual environment. What a pain! Some notes on what I learned. But first, some reflection.

This project has been one of the least fun things I’ve done in a long time. I can tell by the way I’ve been foot-dragging. It’s confusing and hard to debug packages. The docs are inconsistent and suffer from years of accretion. Some things are genuinely confusing, like Python’s surprise lack of support for circular imports. And the end result is my product works just like it did before. No user-visible changes. Behind the scenes things are better; the install is cleaner, I have a virtual environment for proper external dependencies, etc. But nothing fun, just slightly less crappy ops.

The state of the art for Python Packaging

As of March 2016 the Python 3 state of the art for package installation is pip, pyvenv, and setuptools. pyvenv and pip come with Python 3.4, but setuptools is still extra. (On Ubuntu you have to install python3.4-venv with apt.)

The state of the art for managing Python packages keeps changing. This Stack Overflow answer explains the landscape as of September 2014. This packaging user guide is updated as of September 2015 and is mostly very good, but even it references things that don’t work, such as “python setup.py bdist_wheel”. I also found this guide useful, but it was mostly written in 2012 (although occasionally updated).

The pip installed by Python 3.4 (Ubuntu) is version 1.5.4; you want version 8.0 or later if you want to do things like install gevent from a precompiled wheel file. setuptools is old too. So the first thing you should do after setting up a pyvenv is “pip install -U pip setuptools”.

The setuptools docs are not very good. They assume you know how distutils works. Also they spend a lot of time talking about easy_install and none at all talking about pip.

Honestly all this Python packaging stuff is a big mess, I feel like you have to understand the history to be able to use the current state of the art. It’s a shame Distutils2 didn’t work out for Python 3.3 and rationalize it all. To be fair, packaging in almost every system sucks: npm, rvm, homebrew, they’re all a mess. All but my beloved apt, and that’s because the Ubuntu package management team works super hard.

Some hacks and code things

I had to do a few hacks and other things to make my project work.

The biggest code change was just wholesale replacing absolute imports like “import lib” with relative imports “from . import lib”. I think that’s actually correct, but I was doing it kind of blind for every single file.
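
For concreteness, here’s the flavor of the change, assuming a package directory containing main.py and lib.py (the layout is illustrative):

# Before: an implicit absolute import, which only works when the package
# directory itself happens to be on sys.path.
import lib

# After: an explicit relative import, which works once the code is installed
# as a proper package.
from . import lib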

I had to modify my code because Python really doesn’t support circular import dependencies. You don’t really notice until you are using relative imports. The accepted solution for circular imports is to break the circularity, refactor code. Screw that. My workaround was to move some of the import statements inside functions, so they execute at run time and not import time. That’s bad and inefficient but expedient.
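
A minimal sketch of the workaround, with module and function names invented for illustration:

def deliver(game):
    # Imported here at call time rather than at module load time, so the
    # circular chain "games imports messages imports games" never happens
    # while the modules are being imported.
    from . import messages
    messages.send(game)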

There are two similar-looking ways for a Python package to specify dependencies. You can add an install_requires stanza to your setuptools setup() call, or you can use pip to install a bunch of stuff listed in a requirements.txt file. The setup() dependencies are installed automatically when a package is installed, but nothing installs requirements.txt automatically. OTOH the requirements.txt option is more powerful; for instance, pip can install things from GitHub URLs whereas setuptools can’t. (The setuptools docs say you can do this with dependency_links, but I couldn’t make it work.) I’ve ended up using a mix of both, preferring setuptools where I can.
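
A sketch of the two mechanisms side by side; the package name, version, and dependencies here are just illustrative:

# setup.py — install_requires is resolved automatically by pip when the
# package itself is installed.
from setuptools import setup, find_packages

setup(
    name='lolslackbot',
    version='0.1',
    packages=find_packages(),
    install_requires=['requests>=2.9'],
)

# requirements.txt — installed only when you explicitly run
# "pip install -r requirements.txt", but it can express things setup() can't,
# like a dependency pulled straight from a git URL (URL is made up):
#
#   requests>=2.9
#   git+https://github.com/example/somelib.git#egg=somelib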

I have some shell scripts named in the “scripts” section of my setup.py, so they are installed in the virtualenv bin directory. But I want them to work even if the virtualenv isn’t activated, so those scripts have to source their own virtualenv. Mostly for execution in cron; getting cron to activate a virtualenv is not easy. The hack I did for this was this shim of code at the top of each script:

VENVDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
source "$VENVDIR/activate"

The bash magic in the first line sets VENVDIR to be the directory where the bash script itself is. Conveniently, that’s the same directory that has the activate script.

I have no idea how to put version numbers on my program. It’s a private thing only I’m installing, so for now I’m going with the dbschema version. Part of me wants to just put the git commit nonce there.

My deploy script used to be rsync from dev to prod. Now it’s sshing into prod and having it do a “git pull” followed by “pip install -U”. I took this approach at migurski’s suggestion. It means I’m not using any of the fancy distribution builds and versioning stuff that setuptools/pip enables. But I don’t really need those right now; they make more sense for public code hosted at PyPI.

Note to self: you can’t move a pyvenv environment once it’s created. They have paths hard-coded.



Python modules frustration

I’m trying to turn my lolslackbot code into a proper set of Python modules, so I can manage the code with pip and venv. That means turning my module-less code into a bunch of packaged modules. And writing a setuptools setup script.

And oh god it’s awful. Python’s handling of module namespaces and imports is confusing, particularly when circular dependencies and/or command-line scripts (as opposed to Python libraries) are involved. I can’t even tell if this caveat applies to me or not:

Note that relative imports are based on the name of the current module. Since the name of the main module is always “__main__”, modules intended for use as the main module of a Python application must always use absolute imports.

The other problem is years of cruft that’s accumulated in Python packaging tools: setuptools, distutils, easy_install, pip, wheel, egg, virtualenv, pyvenv… To some extent this has been rationalized in modern Python: use pip and setuptools and call it a day. But the docs that are out there are hard to follow because they tend to describe the history, not the current practice.

The two best documents I’ve found so far:

Right now I’m hung up on some code where “from . import lib” works in some contexts but not others. I think it may be related to circular dependencies but am not sure.

Update: turns out Python doesn’t actually support circular imports, despite looking like it does in many circumstances. So now I’m having to refactor code that worked just fine so that module A no longer imports module B which imports module A. I feel like I’m programming in C again, only without #ifndef macros. One partial solution is to only do the imports inside the running function code. That avoids the circular imports at import time, at the expense of running the import machinery in your actual execution path every time. Maybe I’m misunderstanding how to really make this work. (There’s a bunch of arrogant advice which says “if your code has circular imports you should refactor it.” No, imports are how you manage namespaces in Python, and Python shouldn’t have this limitation.)

Turning hacky Python into a real app

I’m about to go on vacation. I just finished a bunch of improvements to lolslackbot. Also I suffered my first real outage since it started as a simple cron job six months ago. So I’m reflective on what I need to do to graduate this service into something more serious and reliable. Some projects towards making the system production quality…

Locking and timeouts. The outage was because I tried grafting lockrun into my cron job to ensure only one update runs at a time. But if the update script hangs (say, because the remote API is down) then it holds the lock forever, or at least until I wake up in the morning. I’ve removed the locking for now, running the update twice at once isn’t too bad. But I’m considering adding back the locking along with some sort of global timeout to the scripts, or at least the network requests, so they can’t hang forever. But that’s a hack on top of a hack. “Run this once and only once, reliably” is a hard problem.
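
One possible shape for that, sketched in Python with made-up paths and timeout values: take a non-blocking exclusive flock so a second run exits immediately instead of queuing, and set a global socket timeout so a hung API call can’t hold the lock forever.

import fcntl
import socket
import sys

socket.setdefaulttimeout(30)   # no single network request may hang longer than 30s

lockfile = open('/tmp/lolslackbot-update.lock', 'w')
try:
    fcntl.flock(lockfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    sys.exit('another update run still holds the lock')

# ... do the update; the lock is released when the process exits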

Error recovery. I’m a big believer in throwing exceptions when things go wrong, then fixing the code to handle the error. But at some high level I’d rather the script kept going. I’m processing hundreds of game objects a day; it’d be better to skip a game that’s weird and breaks my code than to have the whole system fail until I can fix it.
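
The pattern I have in mind is roughly this, with the loop and function names invented for illustration:

import logging
log = logging.getLogger(__name__)

for game in fetch_new_games():      # hypothetical source of game objects
    try:
        process_game(game)          # hypothetical per-game worker
    except Exception:
        # Record the traceback and move on rather than killing the whole run.
        log.exception('skipping game that broke processing: %r', game)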

Logging. I’ve been using Python’s logging module but I’m not logging to a file anywhere, just stderr. That makes it hard to look back at what happened, or even notice if I skipped a weird game. I might also benefit from some contextual logging, but that may be overkill.
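
A minimal sketch of what that could look like, with the log path made up:

import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s %(name)s %(levelname)s %(message)s',
    handlers=[
        logging.FileHandler('/home/lol/lolslackbot.log'),  # persistent record
        logging.StreamHandler(),                           # keep stderr for cron mail
    ],
)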

monitoring. I don’t exactly need a pager for 3am, but I would like some better way of knowing if stuff is working than “did cron email me?” and “have any new messages shown up in a while?”

virtualenv. I’m naughty and don’t use venv. I should.

make code a module. I’m also naughty and don’t have my own code in a proper module. I should. Would be nice to separate the Python scripts from the precious data files, too.

test suite. A lot of my tests run against some live data I captured a month or two ago, which has baked into it things like “there are exactly 110 game objects”. That makes it hard to add new test cases. I think I’m stuck here and need to just suck it up and hand-edit new test cases into the JSON blobs I use in my mock framework. And then update all the tests. I also had a hilarious problem yesterday where one of my test cases had enshrined the wrong expected output, basically hiding a bug I sort of knew about. That’s the hazard of starting a test suite with “I think this code is working; let’s assume its current output is correct”. Still, that test suite is the single best investment I’ve made in the project, so much value.

indices / SQL performance. I haven’t done any work towards creating indices for common queries, other than the implicit indices that primary keys and unique constraints give you. So far my whole database is only 20MB, hardly worth worrying about. But at some point table scans are going to stop being fast. I haven’t seen any evidence of a slow query log in sqlite; I suppose I could build my own wrapper in Python.
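
A rough sketch of the kind of wrapper I mean; the 10ms threshold is arbitrary, and this only times the execute() step, not fetching rows:

import sqlite3
import time

def timed_execute(conn, sql, params=()):
    # Run a query, printing it if execute() took longer than 10ms.
    start = time.time()
    result = conn.execute(sql, params)
    elapsed = (time.time() - start) * 1000
    if elapsed > 10:
        print('slow query (%.1f ms): %s' % (elapsed, sql))
    return result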

Distributed transactions. Closely related to locking: I have a sort of distributed transaction problem when I’m delivering messages to Discord. I want to mark the message delivered in my database only if the message is successfully delivered to Discord, so I write the “message processed” bit after Discord accepts the message. But sometimes Discord throws an error even though the message got delivered anyway, so the message doesn’t get marked processed and later gets delivered a second time. I don’t quite know how to solve that.

Web view. Big project, I think I’ve graduated to needing a web view for seeing and editing the objects in my database. Particularly for stuff like “add a player to a team”. I really don’t want to roll this all ad-hoc, way too much work. Currently thinking of doing this as a whole new codebase not reusing much (if any) existing code, to try out Flask and SQLAlchemy. With a plugin to generate the editing UI, maybe WTForms or Flask-Admin.

sqlite3 vs postgres. I continue to have the feeling that I should switch to postgres, but I’m now pretty in love with sqlite’s simplicity. The planned web view will be my first time with simultaneous writers, which is where I’d previously assumed sqlite might be too limiting. OTOH sqlite does have database-level locking; it’s not going to blow up. As long as the locks are only held for “a few milliseconds” I can live with that. Python’s sqlite3 driver waits up to 5 seconds for a lock by default. The sqlite3 command tool by default just throws an error if the database is locked, but you can specify a timeout in a somewhat awkward way. (If sqlite3 is compiled wrong, locks can last seconds.)
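
For reference, the two knobs I’m talking about, with illustrative values:

import sqlite3

# Python driver: wait up to 5 seconds for a competing lock before raising
# "database is locked" (5.0 is also the default).
conn = sqlite3.connect('games.sqlite', timeout=5.0)

# The command-line tool's awkward equivalent is the .timeout dot-command
# (in milliseconds): run ".timeout 5000" inside the shell, or start it with
#   sqlite3 -cmd ".timeout 5000" games.sqlite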

So that’s a big list of projects. I’m not thrilled to do any of them; they’re not features, and they’re a significant investment for a project that has very few users. OTOH they will make my life better as I work on this.


Home-rolled database migrations

Confession time: I’ve never personally managed a whole project that involved a complex, changing database. I’ve certainly worked on projects with them, like Google AdWords, but someone else was in charge of the database management. With this LoL Slackbot I’m working on I’m the guy. Which means I’m the one who has to handle migrating a database when the code changes the schema.

Here’s a sample migration script I’ll discuss below.

Turns out changing database schemas on deploys is complicated. Who knew? It was a blessing in disguise that sqlite’s alter table is so limited, it forced me to take migrations seriously. And since I’m not using a tool like SQLAlchemy I’m writing migration scripts by hand. Here’s what I do:

  • The database has a table “globals” with a value “dbSchemaVersion” that I manually increment on changes.
  • My app has a single connect-to-DB function everything uses. It checks the schema version and refuses to run if it’s wrong (a sketch of that check follows this list).
  • When I write code that changes the schema somehow, I manually increment the version. This step is a bit error prone in that I could forget. I could automate it with a checksum on the output of “.schema”, but I won’t bother until the first time I forget.
  • I use sqldiff to summarize the schema changes.
  • When I write code with a new schema change, I keep manual notes on the migration I’ll eventually need. Then I just develop and test my code that uses the new schema in my dev environment ignoring the need for migration. My test scripts start with an empty database.
  • Once the feature development work is done I then write migration scripts. See a simple example in this gist.
  • The migration script itself validates that the schema version is what’s expected, then runs arbitrary shell code. In this case all it does is run 3to4.sql. Sometimes I also have it run custom Python programs that do more complex data filling.
  • 3to4.sql is the sqlite code for the migration. I do as much in SQL as possible. The “.bail on” directive at the start is essential; see here for why. This example script adds a column and then fills it with a static value, very simple. Often the migration script has to copy some table over to a new table and then rename it, since alter table in sqlite3 is so limited.
  • My test script for the migration runs on a copy of the live production database (taken out of backup). It runs the migration, then diffs the schema against a dev database. Then it runs some ad hoc queries to validate the new database is as expected, which is totally hokey. I should make that a proper pass/fail test.
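
Here’s a sketch of that connect-and-check function, assuming the globals table uses a simple key/value layout (the expected version constant is illustrative):

import sqlite3

EXPECTED_SCHEMA_VERSION = 4   # manually incremented alongside schema changes

def connect(path):
    # Open the database, refusing to run against an unexpected schema version.
    conn = sqlite3.connect(path)
    row = conn.execute(
        "select value from globals where key = 'dbSchemaVersion'").fetchone()
    if row is None or int(row[0]) != EXPECTED_SCHEMA_VERSION:
        raise RuntimeError('want schema version %d, database has %r'
                           % (EXPECTED_SCHEMA_VERSION, row))
    return conn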

One observation I’ve had from this is “schema changes” are more than just changes to the shape of tables. They’re also changes to global values stored in the database, and sometimes changes to the meaning of the data in some column. This single incrementing version number captures all of that just fine, at least with a single developer.

Having now done this work I’m less excited than I used to be about autogenerated schema migrations. I mean, I should try one, but there’s something to be said for hand-crafting a migration when you have to change column values, rerun programs to fill empty data, etc. The automation around testing and version numbers is really valuable though.

I’m terrified by how ad hoc this is. There’s this temptation doing ops work to half-ass stuff, to just type some code in by hand and throw it out, or test it by visual inspection, or… Part of the devops mystique is treating the ops part as real development too, with real testing and repeatability. I’m about 3/4 of the way there with the migration stuff but not fully.

Timestamp precision

I’m in the habit of recording timestamps as integer seconds, i.e. inserting into my database the result of int(time.time()) from Python. It’s a habit left over from the bad old days when Unix machines didn’t have sub-second clocks. Any modern system has access to at least a millisecond timer, if not a microsecond timer or better, and there’s no reason not to use it.
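
To make the difference concrete (the values are just what a clock might return):

import time

time.time()        # e.g. 1457468400.123456 — a float with sub-millisecond precision
int(time.time())   # e.g. 1457468400 — truncated to whole seconds, what I store today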

The only drawback for me with my project is I declared all my timestamp fields as “integer”, and sqlite3 does in fact round them off on storage. It’s harmless enough, at least if you’re not relying on exact timestamp storage, but I should probably do a schema migration at some point for it.

It’s a bit confounding that Javascript’s time precision is integer milliseconds, although it seems like a reasonable practical choice.


Success! lolslackbot

Big Success! My League of Legends Slackbot has had two major upgrades in the past two weeks. And both have gone quite well, with no serious bugs. I’m so, so glad I invested in a serious testing regimen. I’ve had automated tests for about a month and it is like night and day for development speed. Even on a small project, even working alone.

First, a quick description of the project. It’s a social monitor for League of Legends players. It watches the Riot API for games played by a select group of people. If someone plays a game, it reports the game result to Slack or Discord. A form of social presence. I built it first for a group of about 6 friends playing together and chatting via Slack. I’m now expanding it to a group of about 500 people on another Slack, which is what necessitated me evolving this from a quick hack to a real thing.

The first big change was 10 days ago. I added the notion of “groups”. Previously the list of players I was watching was hard-coded and I assumed everyone was interested in everyone. Now there are Person and Group database tables, and the game reporting is able to collect stats for all the people in a specific group. I still have a design flaw here I need to fix: it currently doesn’t work if people from two different groups play the same game.

The second big change was today. I added message routing and delivery. Previously I had hardcoded a single Slack and Discord channel. Now I have Destinations, Subscriptions, and Messages tables so that the game results from various groups can be delivered to various different Slack and Discord channels. A lot of complexity, made doubly hard to test because the end result is actually writing messages people see on a live service. I ended up using unittest.mock for real to mock out the last bit of message delivery in various places.
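
The mocking looks roughly like this; the module and function names are invented for illustration, since the real delivery code is mine:

import unittest
from unittest import mock

import delivery   # hypothetical module that talks to Slack/Discord

class TestDelivery(unittest.TestCase):
    @mock.patch('delivery.post_to_slack')        # hypothetical outbound call
    def test_pending_message_is_sent_once(self, post_to_slack):
        # Assumes the test database holds exactly one pending message.
        delivery.deliver_pending_messages()      # hypothetical entry point
        self.assertEqual(post_to_slack.call_count, 1)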

I wish I’d thought to abstract out a Messages table earlier. Before, I was just generating text messages from game objects, but then the game processing code got too deeply entwined with the message delivery code. Now I create messages in the DB with one program and deliver them with a separate program. Much simpler.

Next major product feature I need to add is some sort of web interface for managing Groups, People, Subscriptions, etc. That sounds really tedious, but necessary. I think this is the step where I’ll graduate and use some higher level Python frameworks, maybe Flask for the webapp and SQLAlchemy to mediate database access. I don’t want to port all my old code over to SQLAlchemy just yet, but I would like to get my feet wet.

I should also productionize the python code. I’m not using virtualenv, my code isn’t even packaged in a proper module. Time to clean that up and add some logging, monitoring, etc.

May also be time to invest more in the test infrastructure. My automated tests are great but they rely on a test dataset I can’t really augment; I generated the test data once and have no way to do it again. Also I’ve been running a separate test system for database migration scripts, much more manual, with inspection by eye. I think that may be fine actually, but it’s worth examination.


Python is fast opening sqlite3

I’ve got a lot of code calling sqlite3.connect() in my program now, particularly in my test suites. In MySQL and Postgres opening connections can be slow and you end up needing connection pooling. I had a feeling it shouldn’t be a problem in sqlite3, given its simplicity, but I wasn’t sure.

A bit of work with timeit later, I can report it’s taking about 18 usec to open a database. My real connection function is more complex, doing a select on a “globals” table to test the schema version. Even with all that work it’s about 170 usec, or under 0.2 ms. That’s not super fast, but it’s not a problem if my test suite does it 100 times.

$ python3 -m timeit -s 'import sqlite3' "sqlite3.connect('games.sqlite')"
100000 loops, best of 3: 17.6 usec per loop