For the last couple of months I’ve been reading a lot about redistricting, hoping to find a way I can use my programming skills to have some political influence. There’s a remarkable story in recent politics, summarized in the book Ratf**ked: the Republican Party engineered a takeover of the House of Representatives in 2010. Long story short, they worked hard to win state legislature elections in 2008 and 2010. This gave them control over the redistricting process in 2010/2011, drawing the lines for House of Representatives elections. That line-drawing worked; the cartography alone is responsible for something like 10% of the Congressional seats that the GOP holds. You get crazy things like North Carolina having 10 of its 13 representatives be Republican, despite the popular vote for those seats going about 50/50. The GOP is quite explicit about what they did, too:
The state representative who drew that [North Carolina] map said he had engineered 10 safely Republican seats only “because I do not believe it’s possible to draw a map with 11 Republicans and two Democrats.”
I’m interested in the data and cartography part of this, the technical question. So I’ve been reading up. Here’s some of what I learned. See also Mike Migurski’s notes.
Maptitude: ArcGIS for politics
The key software tool used in 2010 is Maptitude for Redistricting. It’s a GIS tool specialized for redistricting. It comes preloaded with demographic data, election data, and political boundary shapes. It lets you draw district plans and see what the result is. Think of it as ArcGIS but for political scientists. There are many demo videos online.
An individual license for Maptitude costs $700 on Amazon. Mostly they negotiate sales contracts for group use; Ratf**ked suggested it was $5000-$10,000 for a state. Compare that to $500,000 for previous software from the 1990s.
One nice thing about Maptitude is it comes with political data that you buy from them. But it’s all proprietary and expensive. What about open source / open data hacking? Wanted, for every state, for every election every 2 years:
- Shapefiles for each voter precinct. The voting precinct is the smallest building block of election data. You can’t find out how an individual voted in the United States, but you can see how a group of ~25-300 people in one precinct voted.
- Per-precinct vote tallies for every major national election. President, Senate, House of Representatives. State legislature data would be nice too.
- Shapefiles for every House of Representatives voting district. These are typically, but not always, the union of precincts. State election districts would be nice too.
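To make that wish list concrete, here is a minimal sketch of how per-precinct tallies could roll up into district totals once every precinct has a district assignment. The record layout and the names `PrecinctResult` and `district_totals` are hypothetical, and the vote counts are made up for illustration. It also assumes each precinct sits entirely inside one district, which is typical but not always true.

```python
from collections import defaultdict
from typing import NamedTuple


class PrecinctResult(NamedTuple):
    """One row of a hypothetical per-precinct results table."""
    precinct_id: str
    district: str      # district the precinct is assigned to (assumed unique)
    dem_votes: int
    rep_votes: int


def district_totals(results):
    """Sum precinct tallies into per-district two-party totals."""
    totals = defaultdict(lambda: {"dem": 0, "rep": 0})
    for r in results:
        totals[r.district]["dem"] += r.dem_votes
        totals[r.district]["rep"] += r.rep_votes
    return dict(totals)


# Made-up sample data: three precincts in two districts.
sample = [
    PrecinctResult("P1", "D1", 120, 80),
    PrecinctResult("P2", "D1", 60, 90),
    PrecinctResult("P3", "D2", 40, 150),
]
totals = district_totals(sample)
```

With the shapefiles in hand, the `district` column would come from a spatial join rather than being typed in by hand.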
America has no centralized election system. There is no simple database of election results, political districts, etc. Particularly not for data as detailed as per-precinct results. Every state maintains its own data. Some states publish nice clean CSVs and shapefiles. Some states will send you a scanned handwritten ledger if you call and pay a $4.95 document fee. It’s a mess. Here’s what’s available for open use.
- OpenElections collects election returns. It came out of a data journalism product and they are doing good work towards a 2016 set but it is a long process.
- election-geodata is a project headed up by Nathaniel Kelso which collects precinct and district shapefiles. It’s not complete but has gotten a whole lot of data, particularly for 2016.
Between those two projects I think we have ~75% of 2016’s election in easy-to-use format.
Several folks have published detailed maps on the Web but have not published formal data exports. Decision Desk HQ published a per-precinct map of the Presidential election in 2016. ESRI published a per-precinct map for the 2008 presidential election. I could have sworn I’ve seen a 2012 national per-precinct map too, but I can’t put my hands on it. The LA Times Data Desk has also done good work but it may be California-only.
Also worth reading: Mike Migurski’s work on North Carolina elections, where he goes through the exercise of collecting and analyzing the 2016 election for that state. It’s a model of what I’d like to be able to do easily for every state and every election.
I haven’t researched this much yet, but the other half of redistricting is understanding the demographics of the people you’ve put into the districts.
Census data is the basic data standard here, free and public and easy to work with. It comes broken down by census blocks, groups of 0-500 people. Those blocks do not line up with voting precincts, so some slicing and joining is required to produce per-precinct views of census data. Migurski did this for North Carolina.
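One common way to do that slicing and joining is area-weighted allocation: each block's population is split among the precincts it overlaps, in proportion to the overlapping area. The sketch below assumes the overlap fractions have already been computed by a GIS tool. The function name, IDs, and numbers are all illustrative, and this is not necessarily the method Migurski used.

```python
from collections import defaultdict


def allocate_blocks_to_precincts(block_population, overlaps):
    """Estimate precinct populations from census blocks.

    block_population: dict of block_id -> population count
    overlaps: list of (block_id, precinct_id, fraction), where fraction is
              the share of the block's area that falls inside the precinct
              (assumed precomputed; fractions for one block should sum to 1).
    """
    precinct_pop = defaultdict(float)
    for block_id, precinct_id, fraction in overlaps:
        precinct_pop[precinct_id] += block_population[block_id] * fraction
    return dict(precinct_pop)


# Made-up example: block B2 straddles two precincts, B3 is unpopulated.
block_population = {"B1": 100, "B2": 40, "B3": 0}
overlaps = [
    ("B1", "P1", 1.0),
    ("B2", "P1", 0.25),
    ("B2", "P2", 0.75),
    ("B3", "P2", 1.0),
]
precinct_population = allocate_blocks_to_precincts(block_population, overlaps)
```

Area weighting assumes people are spread evenly across a block, which is only an approximation. Since blocks are small the error is usually modest, but it is still an estimate rather than a count.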
Marketing data is the other interesting option here, and a total black box to me. But all of the tracking tools that enable direct sales and Internet advertising are producing data that’s a political goldmine. I suspect little to none of this data is available for free open hackery. News articles about the 2016 campaign are full of stories about how various political groups used this data with varying levels of effectiveness.
Edit: see this amazing article about Cambridge Analytica, Trump, and Brexit.
Areas of work for redistricting
When I started this research I had no idea something like Maptitude existed. I thought maybe I could help build a GIS-for-elections and revolutionize politics. Ha! I’m at least 10 years too late on that. The state of the art in 2011 was a tool that let political experts draw district plans and then understand in great detail how the people in each district voted in recent elections and, therefore, how they would likely vote in the next election. What about 2020?
- Demographic prediction. District plans last ten years; you can design a plan that looks great for your party in 2022 only to find it fails in 2028, what is called a “dummymander”. Predicting demographic trends sounds like a good data problem. There’s a lot of expertise on this already.
- Automated districting. I get the impression 2010 districts were still mostly hand drawn. But computers can easily produce zillions of plans. It’s a problem ripe for optimization algorithms.
- Smarter measures of gerrymanders. This topic is hot politically now, particularly a new measure called the efficiency gap which quantifies the partisan bias in a district plan. Communicating these measures to the voting public seems really valuable.
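As a rough illustration of the efficiency gap, here is a sketch based on my understanding of the measure. A vote counts as "wasted" if it was cast for the losing candidate, or for the winner beyond the half of the votes needed to win. The gap is the difference between the two parties' wasted votes divided by all votes cast. The function name, the two-party simplification, the tie handling, and the sample plan are my own assumptions.

```python
def efficiency_gap(districts):
    """Two-party efficiency gap for a plan.

    districts: list of (dem_votes, rep_votes) per district.
    Returns (wasted_dem - wasted_rep) / total_votes, so a positive value
    means Democratic votes were wasted at a higher rate, i.e. the plan
    favors Republicans.
    """
    wasted_dem = 0.0
    wasted_rep = 0.0
    total = 0
    for dem, rep in districts:
        votes = dem + rep
        needed = votes / 2  # votes beyond half are surplus for the winner
        if dem > rep:
            wasted_dem += dem - needed
            wasted_rep += rep
        elif rep > dem:
            wasted_rep += rep - needed
            wasted_dem += dem
        # Assumption: a tied district has no winner and adds no wasted votes.
        total += votes
    return (wasted_dem - wasted_rep) / total


# Made-up plan: the statewide vote is split 250 to 250, but Democrats are
# packed into one district and spread thin across the other four.
plan = [(90, 10), (40, 60), (40, 60), (40, 60), (40, 60)]
gap = efficiency_gap(plan)
```

In this toy plan Republicans win four of five seats on half the vote, and the gap comes out to 0.3: Democrats waste 200 votes to the Republicans' 50, out of 500 cast.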
The hard part for me with all this is I’ve rapidly learned that redistricting is not a technical problem, not something I can program a solution to. It is a political problem. People with political expertise and power are going to set the agenda. But software is a tool for that politics, maybe a tool I can help craft.