On JSON and nothingness

Python’s JSON module serializes the Python value None as the JSON value null. This is correct. However it causes odd bugs if you are not aware of it.

Specifically, I had code like this:

db.execute('insert into table values(?)', json.dumps(foo.get("bar"))

I was trying to insert a database NULL if foo didn’t have the key “bar”. But foo.get() returns None, which gets turned into the string “null”, which results in the string ‘null’ being inserted into the column. Derp.

>>> json.dumps(None)
>>> json.loads('null') == None
>>> json.dumps('null')
>>> json.loads('"null"')

>>> json.loads(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/json/__init__.py", line 312, in loads
TypeError: the JSON object must be str, not 'NoneType'

Need a database abstraction

My little lolslackbot project has grown big. 11 different tables! I’m hand-writing ORM code now, silly things like “class Person” which wraps a row for the people table. The meat of the work is still hand-crafted SQL with proper relational queries, but I’m increasingly doing some really basic “get me the row with property X” and writing it all by hand is tedious. And oh boy I added some caching using functools.lru_cache, which I know from experience at Google is the beginning of the end for understandable correctness. And I continue to write increasingly elaborate schema migration scripts as I upgrade stuff.

It’s all kind of a mess. I wish now I’d started using some Python ORM tool. I still think ORMs are kind of evil, but they are a useful evil.

I’m out of date on where Python is but my impression is everyone likes SQLAlchemy. The basic tools are just a really nice upgrade from the basic Python database driver stuff. And the ORM looks like it’s made the right choices, at least based on what I read.

But do I want to port all this code over? Hell no. Hrm… I’m accumulating technical debt. I also need to wrap all my Python code in a proper module structure, and set up a real virtualenv, and…

At least I have a test suite. Although Guido help me if I ever have to regenerate the test dataset.


Discord API first use

My gaming group is switching from Slack to Discord. It’s a nice product. It’s sort of a Slack clone with voice channels as the big added feature. It’s also a product aimed at gamers, not corporate use, so there’s some hope the pricing and features will make more sense for gamers. I like the product, it’s worth a look.

So now I need to make my Slack bot post to Discord, too. Discord currently has no formal API. There is a fantastic group of hackers doing an Unofficial API project, though, and the company seems OK with it. Compared to most hacking projects this one is particularly well documented and organized. I dove in this morning and figured out how to write my own bot that works with the Discord API.

First, some resources:

The Python client is very nicely documented and organized. If you’re starting afresh you want to get version 0.10.0a, the alpha, directly from GitHub. PyPI currently has 0.9. 0.10 has reorganized the API, no reason to use the old one now.

The client itself is entirely based on Python 3.4’s asyncio (with preference for the new 3.5 syntax if available). That means you basically have to do everything through an event loop, which is a bit of a nuisance for simple things but absolutely necessary for building real software that’s going to interact with a Discord server. it also means discord.py only works with Python 3.4 and later. I think that’s all the right choice.

Here is my demo program. It took me a couple of hours in total, half of which was me reminding myself how asyncio works. All in all pretty straightforward, once you figure out the basic paradigm of how it fits together. There’s one choice embedded in my code here, which is I subclassed discord.Client instead of using the function decorator syntax for declaring event handlers. Either works fine.

The unofficial API also makes me understand the Discord product a little better. Unlike Slack, Discord has a notion of a single User that is logged into Discord with an email/password. That user can then be a member of various Servers, each of which has its own Channels. Slack doesn’t really have a cross-Slack notion of User, every Slack is an independent domain. The Discord API’s handling of public/private/DM channels is also a bit more straightforward than Slack’s.

quicky python continuous tests

pytest-watch is a fine quick “continuously run my tests” thing for Python in a tty. It’s nothing fancy, and the default UI is not perfect. But with -c and –no-beep and using –runner to run a custom command for your tests, it’s fine. Nice to not have to switch from my editor window to a console just to run the tests.

I’m envious of the NodeJS folks and their hipster test runners, they are very nice. Alternately I wish there were a good way to turn SublimeText into a bit more of an IDE with debugging and test output parsing. But after years of trying IDEs and going back to plain text editors, I think I’m not going to try to board that ship again.


sqldiff for sqlite3

Learned about a nice tool today: sqldiff. It shows differences between two sqlite databases. Particularly nice with the “–schema” flag, so you can just compare schema differences. I’m in the middle of database versioning hell as I keep iterating my lolslackbot project. Beginning to regret rolling all the database stuff by hand instead of using some ORM.

sqlite3 vs postgres: uneasy feeling

I continue to write my LoL Slack bot with sqlite3. I can live with the primary drawback of sqlite3, which is that writes require a database level lock (ie: no table or row locking). In normal use I should only ever have one writer to the database anyway. But I still have a nagging feeling I should use “a real database”. Ie: Postgres

The nervousness is I’m writing more complex code that’s wedded to sqlite. I’m using sqlite’s upsert equivalent, for which there is no standard syntax. I’m also using sqlite’s Python convenience feature of not using cursor objects. And just now I’m starting to pay attention to Python’s weird transactional semantics, which I suspect might be different in Postgres. In short my code has been dating sqlite for awhile but now things are getting serious and I’m getting commitment anxiety. Also Postgres has some alluring features I could use, like JSON and array types.

The underlying question is whether Postgres is really “better” for my kind of application. I suspect not, and part of why I stick with sqlite is to test that assumption, but the cost of being wrong is getting higher. The flip side is managing a Postgres daemon is still a PITA. Particularly on Ubuntu where upgrading is scary.

Python HTTP mocking

I need to do a better job mocking out HTTP requests to allow me to test all this web API code I’ve written. Right now I rolled my own bad mocking around a wrapper for all the REST calls, but I think it makes more sense just to mock at the HTTP level.

A quick Google search reveals three modern libraries: requests-mock, responses, and httpretty. After literally tens of minutes I’m left with the opinion any would work. responses-mock looks like the most mature and complete product. I like responses for its simplicity. The main differences between the three are the mechanism they use to intercept HTTP calls. I believe httpretty is the only one that works with an HTTP library other than Requests. OTOH why would you use any other HTTP library in Python?

Update: since doing this exercise I started using requests-mock. I’m quite happy with it.

Some detailed notes below, then my test code.

requests-mock: mocks out Requests by installing a Transport Adapter. (I’m endeared, because I came up with a similar technique on my own for Java back in 2000 or so.) Has detailed support for matching full URLs or parts of URLs or specific query strings or URL regexps. Supports all HTTP methods. Responses can be text, JSON, byte strings, or callbacks. Has a way to inspect which URLs were called. Has great docs.

Responses: mocks out Requests using the unittest.mock library, patching requests’ send function. Supports matching on regex, but no special support for query strings in URLs. Responses can be text or JSON or callbacks. Supports all HTTP methods. Optionally throws an error if not all URLs were accessed. No fancy docs, but the code is easy to read. Motivation in this blog post.

httpretty: works by monkey patching sockets. that’s terrifying, but kinda neat because it can mock out HTTP from sources other than Requests. Oddly they don’t have official Python 3 support (it seems to work though). I didn’t look at all its features in detail.

(BTW, unittest.mock was new to me, a very nice standard library for monkey patching any Python library for unit test mocking. That could come in handy in lots of places.)

Here’s my test / sample code.

#!/usr/bin/env python3

import requests, unittest
import requests_mock, responses, httpretty

def fetch():
    &quot;The method we'll be mocking&quot;
    r = requests.get('https://www.somebits.com/robots.txt')
    return r.text

class TestStuff(unittest.TestCase):
    def test_real(self):
        self.assertEqual(107, len(fetch()))

    def test_requests_mock(self):
        with requests_mock.Mocker() as m:
            m.get('https://www.somebits.com/robots.txt', text='nelson-requests-mock')
            self.assertEqual('nelson-requests-mock', fetch())

    def test_responses(self):
        with responses.RequestsMock() as r:
            r.add(responses.GET, 'https://www.somebits.com/robots.txt', 'nelson-responses-mock')
            self.assertEqual('nelson-responses-mock', fetch())

    def test_httpretty(self):
         httpretty.register_uri(httpretty.GET, &quot;https://www.somebits.com/robots.txt&quot;, 'nelson-response-httpretty')
         self.assertEqual('nelson-response-httpretty', fetch())

if __name__ == '__main__':

    # Verify the environment is left clean
    assert 107 == len(fetch())
    print(&quot;Real requests are working again.&quot;)