TopoJSON for rivers

Mike Bostock took a crack at using TopoJSON to encode the NHDFlowline dataset: just the river geometry in 2 dimensions, with no properties. He tested it on California only. All sizes are in megabytes of one million bytes.

  • Source shapefile: 132M, 72M gzipped.
  • Naive GeoJSON conversion: 184M, 56M gzipped.
    ogr2ogr -dim 2 -f GeoJSON -select '' new.geojson NHDFlowline.shp
  • GeoJSON rounded to 5 digits: 95M, 21M gzipped.
    liljson -p 5 < new.geojson
  • GeoJSON rounded to 3 digits: 80M, 9M gzipped.
    liljson -p 3 < new.geojson
  • TopoJSON: 20M, 3.3M gzipped.
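
To make the rounding steps concrete, here is a minimal Python sketch of what rounding GeoJSON coordinates does to the encoded size. It is not liljson's actual implementation; the `round_coords` and `sizes` helpers and the sample river segment are made up for illustration.

```python
import gzip
import json


def round_coords(node, digits):
    """Recursively round every float in a nested coordinate list."""
    if isinstance(node, float):
        return round(node, digits)
    if isinstance(node, list):
        return [round_coords(child, digits) for child in node]
    return node


def sizes(obj):
    """Return (raw_bytes, gzipped_bytes) for the compact JSON encoding of obj."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Hypothetical river segment, not taken from NHDFlowline.
feature = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "type": "LineString",
        "coordinates": [
            [-121.49440012345678, 38.58157098765432],
            [-121.49451123456789, 38.58169187654321],
            [-121.49463234567891, 38.58180276543211],
        ],
    },
}

# Copy the feature and round its coordinates to 3 digits.
rounded = dict(feature)
rounded["geometry"] = dict(feature["geometry"])
rounded["geometry"]["coordinates"] = round_coords(
    feature["geometry"]["coordinates"], 3
)

print("original:", sizes(feature))
print("rounded: ", sizes(rounded))
```

On a three-point sample the gzip numbers mean little; the effect only shows up at the scale of the full dataset.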

So for this data TopoJSON is about 1/4 to 1/5 the size of the equivalent GeoJSON, and the advantage persists through gzip. And that’s for a pathologically bad case: rivers are lines, so there’s no shared topology of the kind you get along polygon boundaries. Pretty much all the savings here must be coming from the delta encoding. Neat!
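
Here is a rough Python sketch of the delta encoding idea, assuming the coordinates have already been quantized to integers. The first point of an arc is stored as-is and every later point as an offset from the one before it. The function names and the sample arc are hypothetical; this is not TopoJSON's own code.

```python
import json


def delta_encode(points):
    """Store the first point absolutely and each later point as an offset."""
    encoded = []
    prev_x, prev_y = 0, 0
    for x, y in points:
        encoded.append([x - prev_x, y - prev_y])
        prev_x, prev_y = x, y
    return encoded


def delta_decode(deltas):
    """Rebuild absolute points by summing the offsets."""
    points = []
    x, y = 0, 0
    for dx, dy in deltas:
        x += dx
        y += dy
        points.append([x, y])
    return points


# Hypothetical quantized arc: neighboring river vertices sit close together.
arc = [[4512, 7301], [4513, 7303], [4515, 7304], [4516, 7306]]
deltas = delta_encode(arc)

print(json.dumps(arc))     # absolute coordinates
print(json.dumps(deltas))  # first point, then small offsets
```

Because consecutive vertices along a river are close together, most offsets are one or two digits long, which is where the size reduction comes from.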

Update: Mike converted the whole US. 2.5G of .shp file input, 327M of TopoJSON output.

Note that TopoJSON quantizes. Mike used the default TopoJSON settings, which I think work out to about 10,000 x 10,000 resolution; that makes the comparison to GeoJSON rounded to 3 digits about fair. Here’s a snapshot of a render of the TopoJSON that Mike gave me. It looks right.
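
A quick way to check the "about fair" claim is to work out the grid step for California. The sketch below assumes a 10,000 x 10,000 grid and an approximate California bounding box that I typed in by hand; the helper names are made up, and the scale/translate form is my understanding of how a quantizing transform is usually expressed, not a copy of TopoJSON's code.

```python
def make_transform(bbox, q=10_000):
    """Return (scale, translate) mapping a q x q integer grid onto bbox."""
    x0, y0, x1, y1 = bbox
    scale = ((x1 - x0) / (q - 1), (y1 - y0) / (q - 1))
    translate = (x0, y0)
    return scale, translate


def quantize(point, scale, translate):
    """Snap a (lon, lat) pair to the nearest integer grid cell."""
    return (
        round((point[0] - translate[0]) / scale[0]),
        round((point[1] - translate[1]) / scale[1]),
    )


def dequantize(qpoint, scale, translate):
    """Map integer grid coordinates back to (lon, lat)."""
    return (
        qpoint[0] * scale[0] + translate[0],
        qpoint[1] * scale[1] + translate[1],
    )


# Approximate California bounds (assumption): west, south, east, north.
CALIFORNIA_BBOX = (-124.41, 32.53, -114.13, 42.01)

scale, translate = make_transform(CALIFORNIA_BBOX)
point = (-121.4944, 38.5816)  # hypothetical vertex
qpoint = quantize(point, scale, translate)
restored = dequantize(qpoint, scale, translate)

print("grid step in degrees:", scale)
print("quantized:", qpoint, "restored:", restored)
```

With these bounds the grid step comes out close to 0.001 degrees in both directions, which is the same order as rounding to 3 decimal digits.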


3 thoughts on “TopoJSON for rivers”

  1. Now we just need to convince MongoDB to adopt it (they just announced they were supporting GeoJSON…) ;-)
    Stupid question: should one expect a delay in encoding/decoding vs. GeoJSON?

    1. Converting geodata (say, from PostGIS) to TopoJSON will be slower than converting to GeoJSON, because TopoJSON does global calculations on the whole dataset. I think for single tiles like my river map it’s nothing. OTOH doing TopoJSON on all 2.5G of river geometry must take 10+ minutes. Haven’t tried it!
