Debugging vector tiles performance

My vector river tiles app is unacceptably slow when drawing maps way out, like at zoom=6. I know what the problem is; there’s just too many rivers and geometry. But I’d hoped that it would magically work anyway. Some notes on where the time is going.

First, a testing setup. This is remarkably awkward; there have never been good off the shelf tools for measuring database slowness :-( After a failed attempt to make TileStache use a LoggingCursor what I came up with is:

  • gunicorn configured for a single synchronous worker. It’s unrealistic, but much easier to follow.
  • TileStache configured to debug level INFO. It emits nice lines like
    INFO:root:TileStache.getTile() riverst/6/13/23.json via layer.render() in 2.090
  • Basic client test; load a map and when it gets slow, look in the TileStache logs for specific tiles that are taking a long time.
  • Reproducible client test: load a single tile
    /bin/time curl -s http://localhost:8000/riverst/6/13/23.json | wc -c
    Ensure that TileStache isn’t loading from cache.
  • Postgres configured with log_duration=on and log_statement=all. This gives you helpful log lines like
    LOG: statement: SELECT ….
    LOG: duration: 621.962 ms

By syncing these all up I know that the tile z=6 13/23 is taking 2.09s for TileStache to create, 620ms of which is the database. And the GeoJSON output is a megabyte. That’s already very useful data; I had no idea TileStache itself was doing so much work.

Now I can take that query and look at it. For instance, that one tile returns 6911 rows! This is the NHD schema biting me in the ass; rivers are broken up into a lot of little points. The Cheyenne River on that tile, for instance, is returning 448 different rows with little LineStrings of 2-30 points each. Way too much detail for such a broad overview tile.

Some basic timings for the slowest query I could find at various zoom levels.

Tile     Time    DBtime   Rows    GeoJSON size
7/25/49   510ms   157ms    1715     264,410
6/13/23  2090ms   620ms    6911   1,073,684
5/ 7/12  5375ms  1598ms   17781   2,723,314
4/ 3/ 6  9461ms  2940ms   32439   4,972,224

Looking at this data, I’m thinking the bigger problem may really be the GeoJSON size. The tile generation time is awful but then again there’s only a few thousand tiles at zoom 7-0, can just cache them all. But there’s no love in sending all those tiles to a web browser client! The chopped up geometry is a problem for the GeoJSON too; each little LineString gets encoded as a separate feature with its own properties, so the string “Cheyenne River” is in the GeoJSON tile 448 times. Really need to fix that.

Here’s the query TileStache issues for 4/3/6 as well as the “explain analyze” output:

	                     ST_AsBinary(ST_Transform(ST_Intersection(ST_Simplify(q.geometry, 9783.94), ST_SetSRID(ST_MakeBox2D(ST_MakePoint(-12523442.71, 2504688.54), ST_MakePoint(-10018754.17, 5009377.09)), 900913)), 4326)) AS geometry
	              FROM (
	                select geometry, name as n, strahler as s from rivers where strahler >= 6
	                ) AS q
	              WHERE ST_IsValid(q.geometry)
	                AND q.geometry && ST_SetSRID(ST_MakeBox2D(ST_MakePoint(-12523442.71, 2504688.54), ST_MakePoint(-10018754.17, 5009377.09)), 900913)
	                AND ST_Intersects(q.geometry, ST_SetSRID(ST_MakeBox2D(ST_MakePoint(-12523442.71, 2504688.54), ST_MakePoint(-10018754.17, 5009377.09)), 900913))

 Bitmap Heap Scan on rivers  (cost=9362.71..41858.70 rows=943 width=478) (actual time=176.078..2661.831 rows=32439 loops=1)
   Recheck Cond: ((strahler >= 6) AND (geometry && '010300002031BF0D000100000005000000EC51B856F6E267C152B81E45F81B4341EC51B856F6E267C15C8FC245F81B5341D7A37045F81B63C15C8FC245F81B5341D7A37045F81B63C152B81E45F81B4341EC51B856F6E267C152B81E45F81B4341'::geometry))
   Rows Removed by Index Recheck: 65107
   Filter: (st_isvalid(geometry) AND _st_intersects(geometry, '010300002031BF0D000100000005000000EC51B856F6E267C152B81E45F81B4341EC51B856F6E267C15C8FC245F81B5341D7A37045F81B63C15C8FC245F81B5341D7A37045F81B63C152B81E45F81B4341EC51B856F6E267C152B81E45F81B4341'::geometry))
   ->  BitmapAnd  (cost=9362.71..9362.71 rows=8487 width=0) (actual time=175.357..175.357 rows=0 loops=1)
         ->  Bitmap Index Scan on rivers_strahler_idx  (cost=0.00..1904.82 rows=102971 width=0) (actual time=10.056..10.056 rows=102825 loops=1)
               Index Cond: (strahler >= 6)
         ->  Bitmap Index Scan on rivers_geometry_gist  (cost=0.00..7457.17 rows=217628 width=0) (actual time=163.464..163.464 rows=754699 loops=1)
               Index Cond: ((geometry && '010300002031BF0D000100000005000000EC51B856F6E267C152B81E45F81B4341EC51B856F6E267C15C8FC245F81B5341D7A37045F81B63C15C8FC245F81B5341D7A37045F81B63C152B81E45F81B4341EC51B856F6E267C152B81E45F81B4341'::geometry) AND (geometry && '010300002031BF0D000100000005000000EC51B856F6E267C152B81E45F81B4341EC51B856F6E267C15C8FC245F81B5341D7A37045F81B63C15C8FC245F81B5341D7A37045F81B63C152B81E45F81B4341EC51B856F6E267C152B81E45F81B4341'::geometry))
 Total runtime: 2666.323 ms

Update: I played around with merging geometries in the database by grouping on gnis_id. It’s got some problems, I think I broke the gist (every query is taking 35 seconds at any zoom level!) But the approach has some promise. At least it’s making for smaller outputs; 28k vs 264k for 7/25/49, 88k vs 1073k for 6/13/23.

explain analyze  SELECT q.*,
                     ST_AsBinary(ST_Transform(ST_Intersection(ST_Simplify(q.geometry, 2445.98), ST_SetSRID(ST_MakeBox2D(ST_MakePoint(-11897270.58, 5009377.09), ST_MakePoint(-11271098.44, 5635549.22)), 900913)), 4326)) AS geometry
              FROM (
                select ST_LineMerge(ST_Union(geometry)) as geometry, max(name) as n, max(strahler) as s from r where strahler >= 5 group by (gnis_id, strahler)
                ) AS q
              WHERE ST_IsValid(q.geometry)
                AND q.geometry && ST_SetSRID(ST_MakeBox2D(ST_MakePoint(-11897270.58, 5009377.09), ST_MakePoint(-11271098.44, 5635549.22)), 900913)
                AND ST_Intersects(q.geometry, ST_SetSRID(ST_MakeBox2D(ST_MakePoint(-11897270.58, 5009377.09), ST_MakePoint(-11271098.44, 5635549.22)), 900913))

 Subquery Scan on q  (cost=354441.41..380026.21 rows=10258 width=72) (actual time=2563.823..36912.769 rows=93 loops=1)
   ->  GroupAggregate  (cost=354441.41..377282.19 rows=10258 width=497) (actual time=2562.281..36041.899 rows=93 loops=1)
         Filter: (st_isvalid(st_linemerge(st_union(r.geometry))) AND (st_linemerge(st_union(r.geometry)) && '010300002031BF0D000100000005000000295C8FD236B166C15C8FC245F81B5341295C8FD236B166C1E17A144E777F5541E17A144E777F65C1E17A144E777F5541E17A144E777F65C15C8FC245F81B5341295C8FD236B166C15C8FC245F81B5341'::geometry) AND (st_linemerge(st_union(r.geometry)) && '010300002031BF0D000100000005000000295C8FD236B166C15C8FC245F81B5341295C8FD236B166C1E17A144E777F5541E17A144E777F65C1E17A144E777F5541E17A144E777F65C15C8FC245F81B5341295C8FD236B166C15C8FC245F81B5341'::geometry) AND _st_intersects(st_linemerge(st_union(r.geometry)), '010300002031BF0D000100000005000000295C8FD236B166C15C8FC245F81B5341295C8FD236B166C1E17A144E777F5541E17A144E777F65C1E17A144E777F5541E17A144E777F65C15C8FC245F81B5341295C8FD236B166C15C8FC245F81B5341'::geometry))
         Rows Removed by Filter: 3599
         ->  Sort  (cost=354441.41..354953.32 rows=204764 width=497) (actual time=2009.868..2244.670 rows=204904 loops=1)
               Sort Key: (ROW(r.gnis_id, r.strahler))
               Sort Method: external merge  Disk: 73896kB
               ->  Bitmap Heap Scan on r  (cost=3835.46..197798.53 rows=204764 width=497) (actual time=31.229..642.727 rows=204904 loops=1)
                     Recheck Cond: (strahler >= 5)
                     Rows Removed by Index Recheck: 1371179