Compare two OpenAddress runs

I’m re-running OpenAddress many times to check that the output stays roughly consistent. It’s never the same twice: servers are unreliable and the data keeps changing. Some code changes also introduce small cosmetic differences, such as rounding error.

What works best for me is comparing the line counts in the output files:

wc -l out/*/out.csv > wc.txt

diff --suppress-common-lines -y oa-full-790/wc.txt oa-new-esri/wc.txt

This only highlights sources that output a different number of lines; if the contents of the columns are garbled you won’t see that. But it’s a good way to get an overview of what changed between two runs.
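If the side-by-side diff gets hard to read, the same comparison can be sketched with awk so that each changed source shows its old count, new count, and the difference. This is only a rough sketch: compare_counts is a hypothetical helper name, and it assumes both wc.txt files were produced by the wc -l command above from inside each run directory, so the paths match across runs and contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenAddress): list sources whose line
# counts differ between two wc.txt files produced by `wc -l out/*/out.csv`.
# Assumes paths contain no spaces and the first file is not empty.
compare_counts() {
    awk '
        # Skip the summary line that wc prints at the end.
        $2 == "total" { next }
        # First file: remember the line count for each path.
        NR == FNR { old[$2] = $1; next }
        # Second file: report new paths and paths whose count changed.
        {
            seen[$2] = 1
            if (!($2 in old))
                printf "%s\tnew\t%d\n", $2, $1
            else if (old[$2] + 0 != $1 + 0)
                printf "%s\t%d\t%d\t%+d\n", $2, old[$2], $1, $1 - old[$2]
        }
        # Paths present in the first run but absent from the second.
        END {
            for (p in old)
                if (!(p in seen))
                    printf "%s\tmissing\t%d\n", p, old[p]
        }
    ' "$1" "$2"
}
```

Calling it as compare_counts oa-full-790/wc.txt oa-new-esri/wc.txt prints one tab-separated line per source that changed, appeared, or disappeared. Like the diff, it only looks at line counts, so garbled column contents still go unnoticed.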