I continue to look for things that make working with databases simpler in Python. I keep doing projects where I end up dropping down to raw SQL, with awkward syntax for querying rows and bespoke one-off code for updating data in the database. The Python is messy but I like being able to use SQL in its full glory.
I’ve tried using SQLAlchemy a couple of times and have a hard time loving it. ORMs are a bad idea in general and SQLAlchemy is very complicated and magic. For a schema with 20+ tables and a lot of complex interactions it makes sense, but for a small project where you just want something more sane than low level DBAPI…
For my current project I’m trying to use plain psycopg2 and its NamedTupleCursor. It’s not awesome, but at least it gives you named access to the rows from the DB. So it makes reading from the database a little nicer, but does nothing for writing. It’s Postgres only.
Along these lines is the records library. It also gives sane access to results from queries; either as dicts or named tuples. Result sets also have nice export functions like “write to CSV” or “make into a Pandas DataFrame”. It doesn’t really do much for writing data, just querying, so it’s simple and small.
dataset is an effort to do even more. I love their tag line: “Because managing databases in Python should be as simple as reading and writing JSON files.” Amen! There’s a lot more going on in dataset than records. It’s more like an ORM-lite and contains support for creating schema, automatic upserts, complex queries without Python, … It doesn’t have the export functions records has, but it does have a lot of thought for simple queries and updates. I’d have to actually use it for something to see if it’s sensible or if ORM-lite is an even worse idea than a full ORM.
Both records and dataset depend on SQLAlchemy, that turned me off at first. But they’re both mostly using it as a way to support multiple database types. In theory the Python DBAPI standard should be enough, but in practice it sucks enough I understand why they’d build on SQLAlchemy instead. With records you can mostly ignore that SQLAlchemy is there, but dataset exposes it more in the query system.
I wonder if some other programming environment has a much nicer database interface. The Microsoft world has LINQ, but that seems to have all the problems of other ORMs. I wonder if the R folks have a really nice way to work with databases, or maybe even Pandas.