Wanted: SQLite for JSON

I like SQLite. It’s such a nice simple data store. It’s fast for small data (< 100 MB), it requires no separate daemon process, it has great support in every language I work in.

I like JSON and the idea of schemaless databases. I want to be able to download JSON blobs from an API or create some new JSON blobs and store them away with a minimum of effort. Typeless. Then I want to be able to query them somehow.

Now I want to mix the two together. I want to take a bunch of JSON objects and store them in somewhere simple, then query them.

MongoDB was the first big “store a bunch of JSON”. It sort of fits the bill for what I want, but it’s way overengineered for my needs. I don’t want a separate database process. I definitely don’t want a separate cluster of database processes. Also MongoDB is fundamentally untrustworthy and should not be used by anyone for anything serious. It might be OK for my casual hacking, particularly without clustering, but then you run into the way the query language is so awkward.

Postgres has decent JSON support with some queries. I will probably try that for my next hack. Unfortunately it requires a separate database process and I prefer the simplicity of sqlite’s “just open the file” design. But I have a Postgres instance lying around already so what’s one more database?

3 thoughts on “Wanted: SQLite for JSON

  1. jq (http://stedolan.github.io/jq/) seems to be heading in that direction, it now has some sort of library interface and it appears someone has made python binding for it – https://pypi.python.org/pypi/jq

    But I wonder whether you’ll ever see a particularly non-awkward query language given that the difficulty in sanely and non-redundantly representing data hierarchically, let alone ad-hoc querying it was one of the driving forces behind the relational model, normalization and all that jazz to begin with.

    Do you ever find yourself slurping a bunch of JSON into python and saying ‘I wish I had something like XPath/CSS selectors/Jquery to deal with this because object notation and loops/comprehensions are so unwieldy and hard to read two months later?’. Actually asking, not passive-aggressive nerdguing .

  2. Thanks, I had jq in mind too. I use it about once every two months and have to relearn the query syntax every time. I’m not very good with it. But yeah, I do find myself wanting something like XPath/selectors for JSON, quite frequently. I feel like I’ve seen some! But none I use regularly.

    On my todo list is learn what query functions Postgres supports on JSON. http://www.postgresql.org/docs/9.3/static/functions-json.html

  3. I’m less optimistic about the query language than you are, not just for high-minded set-theoretical reasons but it seems to me it would take something quite well-designed, ergonomic and clever to overcome the easy and often good-enough impulse of ‘just use the host language’. Although that jq lib/binding work (if it becomes a production thing) would be super-useful anyway just for the performance benefits of something that is streaming and does all of its dirty work in C.

    The Postgres stuff looks interesting – after all these years and work put into it, is it going to turn out that there *is* some practical benefit to teaching your rdbms to stuff non-opaque structured types into columns? Hopefully you get to try it and write about it for the benefit of a grateful lazykind!

Comments are closed.