Node.js sqlite3: very slow bulk inserts

I’m running into a baffling problem where the straightforward way to use prepared statements in the MapBox sqlite3 Node.js code is super slow. Like inserting 100 rows in a table takes 16 seconds!

Update: found the problem, I wasn’t using transactions right. See bottom. Simple solution for faster bulk inserts in sqlite3: do them with a transaction.

The API docs have a strange note for Statement.finalize() which says “if you experience long delays before the next query is executed, explicitly finalizing your statement might be necessary”. I’m seeing exactly that. But after you’ve finalized it you can’t reuse the prepared statement; what’s the point of using statements at all then?

There’s notes online that naive bulk inserts in sqlite3 are slow, nothing to do with Node. See StackOverflow and Blog::Quibb. The quick fix I learned from both of those is to wrap the insert block in a transaction, because without that sqlite3 treats each insert as a separate transaction and is understandably slow. OK, so we’ll use a transaction.

But even with transactions using prepared statements is still slow in the Node version. I don’t know if this is a bug, me not understanding sqlite3 right, or me not understanding Node right. I am using serialized mode and trying to do simple nonconcurrent programming. The delay only occurs when the database is closed, btw, sqlite3 is accumulating the changes in a journal before committing them.

Three approaches:

  1. Call db.run(‘INSERT…’) a bunch of times. This is fast, but only if I wrap the inserts in a transaction.
  2. Prepare a statement once, then call stmt.run(‘INSERT…’) a bunch of times. This is slow whether I use a transaction or not.
  3. Prepare a separate statement for each row, then call stmt.run(‘INSERT…’) on it and then stmt.finalize(). This is fast.

Sample code is in this gist. “db” and “finalize” both take about 400ms on my Linux box. “reuse” takes 16 seconds to insert the 100 rows.

Update: the problem was I was creating the prepared statement before I did the “begin transaction”. Apparently that means the prepared statement is executed without the transaction context. Beginning the transaction and then preparing the statement, the three methods all run fast. I suspect the call to finalize has nothing to do with it, but I’m not sure.

Here’s the revision view: version one had the begin transaction too late.