iPhone consolidated.db location tracking notes

I’m trying to understand the consolidated.db iPhone location database, which means understanding how iOS stores and manages data. Here are some notes.

Resources

  1. @aallan’s original announcement
  2. iPhone Tracker (Objective C code)
  3. Technical instructions from howto.wired
  4. nphonetrack (.NET code)
  5. iPhoneLs.py
  6. iphone-tracker.py
  7. Quick start for sqlite3
Notes
  • When you make a backup of an iPhone, it creates a big directory full of files with random-looking 160 bit filenames. Each one corresponds to a named file on the iPhone, and iPhoneLs.py can display those names. My 3 biggest files are named
    HomeDomain::Library/Caches/com.apple.WebAppCache/ApplicationCache.db
    AppDomain-com.foreflight.ForeFlightMobile::Documents/Procedures.zip
    RootDomain::Library/Caches/locationd/consolidated.db
  • consolidated.db is the file with the location data in it. It’s a sqlite3 database.
  • The juicy tables are (apparently) CellLocation and WifiLocation. The *Counts tables seem to contain just a row count. Presumably the Boxes tables are some sort of spatial index. I can find no record of how often I was at a specific location other than what’s implicit in the timestamps and density of cell/wifi locations.
  • I have 32,013 rows in CellLocation, 171,040 rows in WifiLocation
  • Timestamps are like 309803342: I believe this is NSDate seconds since 1 Jan 2001. Add 978307200 to get it in seconds since Unix 1970 epoch.
  • HorizontalAccuracy varies from 50-300 for WifiLocation, and up to 80,000 for CellLocation.
  • Confidence is 0, 50, 60, 65, 68, or 70
  • Speed and Course are always -1 (no surprise).
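The timestamp arithmetic from the notes can be sketched in JavaScript. This is only a sketch: it assumes the Timestamp column really is NSDate-style seconds since 1 Jan 2001 UTC, and the function names are mine, not from any of the tools listed above.

```javascript
// Seconds between the NSDate reference date (2001-01-01 UTC) and the Unix
// epoch (1970-01-01 UTC). This is the 978307200 constant from the notes.
const NSDATE_EPOCH_OFFSET = 978307200;

// Convert a consolidated.db Timestamp to Unix seconds.
// Assumption: the value is seconds since 2001-01-01 UTC.
function nsdateToUnix(timestamp) {
  return timestamp + NSDATE_EPOCH_OFFSET;
}

// Convert a consolidated.db Timestamp to a JavaScript Date,
// which is constructed from milliseconds rather than seconds.
function nsdateToDate(timestamp) {
  return new Date(nsdateToUnix(timestamp) * 1000);
}

console.log(nsdateToUnix(309803342));               // 1288110542
console.log(nsdateToDate(309803342).toISOString()); // a date in October 2010
```

The example timestamp landing in late 2010 is at least consistent with the NSDate guess.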
Some quick sqlite3 shell code
.mode csv
.output wifilocation.csv
select Timestamp, Latitude, Longitude from WifiLocation order by Timestamp;
.output CellLocation.csv
select Timestamp, Latitude, Longitude from CellLocation order by Timestamp;
Some pasted info from sqlite3

sqlite> .tables
CdmaCellLocation                   CellLocationCounts
CdmaCellLocationBoxes              CellLocationHarvest
CdmaCellLocationBoxes_node         CellLocationHarvestCounts
CdmaCellLocationBoxes_parent       CellLocationLocal
CdmaCellLocationBoxes_rowid        CellLocationLocalBoxes
CdmaCellLocationCounts             CellLocationLocalBoxes_node
CdmaCellLocationHarvest            CellLocationLocalBoxes_parent
CdmaCellLocationHarvestCounts      CellLocationLocalBoxes_rowid
CdmaCellLocationLocal              CellLocationLocalCounts
CdmaCellLocationLocalBoxes         CompassCalibration
CdmaCellLocationLocalBoxes_node    Fences
CdmaCellLocationLocalBoxes_parent  Location
CdmaCellLocationLocalBoxes_rowid   LocationHarvest
CdmaCellLocationLocalCounts        LocationHarvestCounts
Cell                               TableInfo
CellLocation                       Wifi
CellLocationBoxes                  WifiLocation
CellLocationBoxes_node             WifiLocationCounts
CellLocationBoxes_parent           WifiLocationHarvest
CellLocationBoxes_rowid            WifiLocationHarvestCounts

CREATE TABLE CellLocation (
  MCC INTEGER,
  MNC INTEGER,
  LAC INTEGER,
  CI INTEGER,
  Timestamp FLOAT,
  Latitude FLOAT,
  Longitude FLOAT,
  HorizontalAccuracy FLOAT,
  Altitude FLOAT,
  VerticalAccuracy FLOAT,
  Speed FLOAT,
  Course FLOAT,
  Confidence INTEGER,
PRIMARY KEY (MCC, MNC, LAC, CI));

CREATE TABLE WifiLocation (
  MAC TEXT,
  Timestamp FLOAT,
  Latitude FLOAT,
  Longitude FLOAT,
  HorizontalAccuracy FLOAT,
  Altitude FLOAT,
  VerticalAccuracy FLOAT,
  Speed FLOAT,
  Course FLOAT,
  Confidence INTEGER,
  PRIMARY KEY (MAC));

subverting iframe same-origin policy

I want to embed a part of a web page from a third party onto my web page. Easy, right, that’s what iframes are for! Only I don’t want to embed the whole foreign page, just a 200×200 pixel chunk in the middle of the page. You can crop 200×200 with the width and height parameters, but there’s no clean way to scroll to the middle of the embedded page. The simple Javascript scrollTo fails because the third party page in the iframe does not have the same origin as the main page.

There’s a workaround: doubling up on iframes. Your main page embeds an iframe to a shim page on your own server. The shim page in turn just embeds an iframe of the foreign page. The master page can scroll the shim page since they’re in the same origin. Ugly, but it works.
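Here is a rough sketch of the two pieces in JavaScript. The function names, the shim markup, and the sizes are all made up for illustration, and foreignUrl is assumed to be a trusted URL since it is pasted into the HTML unescaped.

```javascript
// Build the HTML for the shim page, which you serve from your own origin.
// The shim holds one big iframe showing the whole foreign page; width and
// height should be large enough that the foreign page never needs to scroll.
function shimPageHtml(foreignUrl, width, height) {
  return '<!DOCTYPE html>\n' +
    '<html><body style="margin:0">\n' +
    '<iframe src="' + foreignUrl + '" width="' + width +
    '" height="' + height + '" frameborder="0"></iframe>\n' +
    '</body></html>';
}

// Called from the master page. shimFrame is the iframe element that points
// at the shim page. The shim is same-origin, so the master page is allowed
// to reach into contentWindow and scroll it.
function scrollShimTo(shimFrame, x, y) {
  shimFrame.contentWindow.scrollTo(x, y);
}
```

On the master page the shim iframe would be the 200×200 one, and scrollShimTo(document.getElementById("shim"), x, y) would run once the shim has loaded, where "shim" is whatever id you gave that iframe.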

aprs.fi as a source of GPS tracks

I have some friends who track their positions via ham radio on APRS. Cool stuff. The site http://aprs.fi/ has a really nice map viewer of the data. I’m looking at extracting that data for my flight tracker.

The easy way is to have the user use the data export tool to download a file. It’s a bit awkward and manual but easy to understand. There are multiple format options, CSV seems sufficient. Fields are:

  • time: full text UTC, “when the target first reported this (current) position”
  • lasttime: identical to time in my data. “the time when the target last reported this (current) position”
  • lat: decimal degrees
  • lng: decimal degrees
  • speed: kilometers / hour
  • course: degrees, probably true?
  • altitude: meters
  • comment: empty in my examples. “APRS comment”
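Reading that export into track points might look like the sketch below. It assumes the first line of the file is a header naming the fields listed above and that no field contains a quoted comma; I have not checked either against a real export. The sample row is made up.

```javascript
// Parse an aprs.fi CSV export into an array of track points.
// Assumptions: line 1 is a header with the field names, and a plain
// split on "," is good enough (no quoted commas inside fields).
function parseAprsCsv(text) {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(",");
  return lines.slice(1).map(function (line) {
    const cells = line.split(",");
    const row = {};
    header.forEach(function (name, i) { row[name] = cells[i]; });
    return {
      time: row.time,                        // UTC text, left unparsed
      lat: parseFloat(row.lat),              // decimal degrees
      lng: parseFloat(row.lng),              // decimal degrees
      speedKmh: parseFloat(row.speed),       // kilometers / hour
      courseDeg: parseFloat(row.course),     // degrees
      altitudeM: parseFloat(row.altitude)    // meters
    };
  });
}

// Made-up sample data, not a real export.
const sampleCsv =
  "time,lasttime,lat,lng,speed,course,altitude,comment\n" +
  "2011-04-20 18:05:32,2011-04-20 18:05:32,37.5,-122.25,185,270,900,\n";
console.log(parseAprsCsv(sampleCsv));
```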

aprs.fi also has a simple REST API. Unfortunately it only seems to report the last fix, not a history of the track. A fetch of the last location works like http://api.aprs.fi/api/get?name=CALLSIGN&what=loc&format=json&apikey=KEY.
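Building that request URL in JavaScript is a one-liner with URLSearchParams. The function name is mine, and the sketch assumes the format parameter is spelled format=json; the other parameter names come straight from the URL above.

```javascript
// Build the "last known location" request URL for one callsign.
// Assumption: the query parameters are name, what, format and apikey,
// with format=json selecting JSON output.
function aprsLocUrl(callsign, apiKey) {
  const params = new URLSearchParams({
    name: callsign,
    what: "loc",
    format: "json",
    apikey: apiKey
  });
  return "http://api.aprs.fi/api/get?" + params.toString();
}

console.log(aprsLocUrl("CALLSIGN", "KEY"));
// http://api.aprs.fi/api/get?name=CALLSIGN&what=loc&format=json&apikey=KEY
```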

The granularity of track data is up to the APRS sender. I think one fix every 5 seconds is considered fast. My friend’s tracker is smart enough to send more rapid updates when turning, etc. 66 points for a 90 minute flight.

There are other sources of this data. aprs.fi notes you can also get the data directly from APRS-IS via a library like Ham-APRS-FAP. There’s also aprsworld.com; Adam Fast has some nice API code that relies on his having database access to aprsworld.

Ubuntu 11.04 notes

Time for my yearly experiment running Linux as a Desktop OS. I installed Ubuntu 11.04 beta1 (“Natty Narwhal”). And immediately hit graphics corruption and system crashes. To be fair I have a weird graphics card that Windows hates too, a Radeon HD 2400 PRO AGP.

So first order of business was switching from the ATI binary driver to the open source VESA driver. There are no docs on this I could find anywhere; there’s no longer an xorg.conf to edit. I finally just used apt-get to uninstall xserver-xorg-video-ati; happily, Ubuntu is smart enough to drop back to the VESA driver if the ATI driver is not around.

The second problem is that Unity, the fancy new GUI, doesn’t run on the VESA driver. Too much GL or something. The old UI is still there, but if you want Unity the solution is to install unity-2d, a simpler UI.

With that all done, things sorta work. As with every time I try to use Linux as a desktop, it was a disaster. I can’t really blame Ubuntu for not handling my weird graphics card well, but I can blame it for there being no clear path to diagnose and fix the mistake. Also I’d say unity-2d ought to be a default option, not some weird thing to install.

 

D3 scales and interpolation

D3 has a notion of “scales”, transformations of data from a domain to a range. Say your data is percentages (0% to 100%) and you want to draw them as bars of length 10-20. You can easily construct a linear scale to map your domain [0,100] to a range [10,20]:

var s = d3.scale.linear().domain([0,100]).range([10,20])
d3.selectAll("rect").data(data)
  .enter().append("svg:rect")
    .attr("height", s)

If that use of s seems magic, equivalents would be

    .attr("height", function(n) { return s(n); })

    .attr("height", function(n) { return 10 + 10 * (n/100) });

D3 provides various useful scales. There are numeric scales like linear, log, and pow, and also discrete scales like quantize and ordinal.

One thing you can set on a scale is the interpolate function. It’s invoked when mapping the domain to the range and lets you make the range be something other than the usual numeric range.

s = d3.scale.linear()
  .domain([0,100])
  .interpolate(d3.interpolateRgb)
  .range(["#ff0000", "#0000ff"])
s(0) == "rgb(255,0,0)"
s(100) == "rgb(0,0,255)"
s(50) == "rgb(128,0,128)"

There are pre-defined interpolation functions for numbers, colours, strings, objects, etc. I believe they’re also used by D3’s transition animations, specifically to automatically warp complex things like SVG paths from one shape to another.

The application of interpolation in d3.scale.linear.interpolate is pretty simple to follow:

  interpolate = d3.interpolate,
  i = interpolate(y0, y1);

  function scale(x) {
    return i((x - x0) * kx);
  }

In the linear.js code [x0,x1] is the domain and kx is 1/(x1-x0); [y0,y1] is the range. So the above code takes the input x, maps it to the interval [0,1], then applies the cached interpolation function i(). By default that function is a linear interpolation in [y0,y1], but the developer can override it.
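To convince myself I understand the mechanism, here is a standalone re-creation that runs without D3. It is my own cut-down sketch, not D3’s code: linearScale, interpolateNumber and this interpolateRgb are simplified stand-ins that only handle a two-element domain and "#rrggbb" colours.

```javascript
// Linear interpolation between two numbers; t runs from 0 to 1.
function interpolateNumber(a, b) {
  return function (t) { return a + (b - a) * t; };
}

// Interpolate between two "#rrggbb" colours, one channel at a time.
function interpolateRgb(a, b) {
  const parse = function (hex) {
    return [1, 3, 5].map(function (i) { return parseInt(hex.slice(i, i + 2), 16); });
  };
  const ca = parse(a), cb = parse(b);
  return function (t) {
    const c = ca.map(function (v, i) { return Math.round(v + (cb[i] - v) * t); });
    return "rgb(" + c.join(",") + ")";
  };
}

// A cut-down linear scale: normalise x into [0,1] using x0 and kx,
// then hand the result to the cached interpolator i.
function linearScale(domain, range, interpolate) {
  const x0 = domain[0], kx = 1 / (domain[1] - domain[0]);
  const i = (interpolate || interpolateNumber)(range[0], range[1]);
  return function (x) { return i((x - x0) * kx); };
}

const barScale = linearScale([0, 100], [10, 20]);
const colourScale = linearScale([0, 100], ["#ff0000", "#0000ff"], interpolateRgb);
console.log(barScale(50));     // 15
console.log(colourScale(50));  // rgb(128,0,128)
```

The scale itself never looks at what the range values are. Everything type-specific lives in the interpolator, which is why swapping it is enough to get colours out.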

Building node without SSL

I’m trying to build node.js from source on Debian/wheezy. There’s a packaged version but it’s two major versions behind. I got an error about SSL methods not being declared when compiling. No idea why Debian’s dev environment makes node’s configure think it can build SSL when it can’t (I have libssl-dev installed), but a quick workaround is building Node without crypto, via ./configure --without-ssl.


[74/75] cxx: src/node_crypto.cc -> build/default/src/node_crypto_4.o
/usr/bin/g++ -pthread -m32 -g -O3 -DHAVE_OPENSSL=1 -DHAVE_MONOTONIC_CLOCK=1 -DEV_FORK_ENABLE=0 -DEV_EMBED_ENABLE=0 -DEV_MULTIPLICITY=0 -DX_STACKSIZE=65536 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DEV_MULTIPLICITY=0 -DHAVE_FDATASYNC=1 -DPLATFORM="linux" -D__POSIX__=1 -Wno-unused-parameter -D_FORTIFY_SOURCE=2 -DNDEBUG -Idefault/src -I../src -Idefault/deps/libeio -I../deps/libeio -Idefault/deps/http_parser -I../deps/http_parser -Idefault/deps/v8/include -I../deps/v8/include -Idefault/deps/libev -I../deps/libev -Idefault/deps/c-ares -I../deps/c-ares -Idefault/deps/c-ares/linux-ia32 -I../deps/c-ares/linux-ia32 -Ideps/v8/include ../src/node_crypto.cc -c -o default/src/node_crypto_4.o
../src/node_crypto.cc: In static member function ‘static v8::Handle<v8::Value> node::crypto::SecureContext::Init(const v8::Arguments&)’:
../src/node_crypto.cc:101:29: error: ‘SSLv2_method’ was not declared in this scope
../src/node_crypto.cc:103:36: error: ‘SSLv2_server_method’ was not declared in this scope
../src/node_crypto.cc:105:36: error: ‘SSLv2_client_method’ was not declared in this scope
Waf: Leaving directory `/usr/local/src/node/build'
Build failed:  -> task failed (err #1):
        {task: cxx node_crypto.cc -> node_crypto_4.o}
make: *** [program] Error 1

mobile safari and viewport

Apple’s mobile browser is great, but the viewport meta tag is kind of a hash. There seem to be two conflicting ways to hint to the iPhone or iPad how big to draw things.

  1. width and height. Tells the browser this content is going to render best at a certain width, so render it to that size. width=device-width or width=320 is one way to tell the iPhone not to scale at all. width=600 or the like often solves the problem of stuff looking too small on the iPhone, while still letting it scale. If you set a small width the iPad will scale content up, making it bigger.
  2. initial-scale, minimum-scale, maximum-scale. Gives an explicit scaling factor. The user can override this (unless you disable it).

What’s confusing is if you set both a width and scaling behaviour: which takes precedence?

Mike suggested what seems to work well for maps is setting initial-scale to 1.0 and maximum-scale to 1.0. This keeps Safari from doing its own scaling on the page at all. Even better, when you rotate the device it doesn’t try to scale the width to match the old view; instead it just loads some stuff that was previously off-screen. The only drawback is that this effectively sets the width to 320px on an iPhone, so you no longer get any in-browser scaling.

What I really want is no scaling at all on the iPad, and 1.5 scaling on the iPhone. I can’t figure out a way to do that without changing the tags depending on what device loaded the page.
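Changing the tag per device could be done from script, something like the sketch below. It is untested on real devices and rests on a couple of assumptions: that sniffing the user agent string for "iPad" and "iPhone" is reliable enough, and that Safari honours a viewport meta tag added from script. The function name and the fallback value are mine.

```javascript
// Pick a viewport "content" string from the user agent.
// Assumption: the UA string contains "iPad" or "iPhone"/"iPod" on those devices.
function viewportContentFor(userAgent) {
  if (/iPad/.test(userAgent)) {
    return "initial-scale=1.0, maximum-scale=1.0";  // no scaling on the iPad
  }
  if (/iPhone|iPod/.test(userAgent)) {
    return "initial-scale=1.5";                     // 1.5 scaling on the iPhone
  }
  return "width=device-width";                      // fallback for everything else
}

// In a browser, add the tag to the head. Skipped when there is no document,
// so the function above can also be exercised outside a browser.
if (typeof document !== "undefined") {
  const meta = document.createElement("meta");
  meta.name = "viewport";
  meta.content = viewportContentFor(navigator.userAgent);
  document.head.appendChild(meta);
}
```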

Map tile licenses

I’m trying to launch a map project that uses another site’s map tiles in Polymaps. Here are links to licensing details for various tile options, along with my very cursory understanding based on my non-expert quick read of their licenses.

Google Maps API: various limits, the most important being you are required to use their Javascript library. Or so I’m told, I didn’t read the license to find the specific language.

Microsoft Bing Maps: you can use their web services with your own code, although the way the metadata service works means you may end up having to display some Microsoft branding, links etc. Education and non-profits seem to have pretty generous usage terms. Commercial use can also proceed without a license agreement, but limited to “500,000 transactions” per year. Beyond that you email them for pricing.

OpenStreetMap: It’s Creative Commons, Attribution ShareAlike. No requirements on access code or amount of traffic beyond their tile usage policy. What “ShareAlike” means for a map that uses OSM tiles as a layer is complex: see common license interpretations. The key language I noted was

If you overlay OSM data with your own data created from other sources (for example you going out there with a GPS receiver) and the layers are kept separate and independent, and the OSM layer is unchanged, then you may have created a collective work. … If you have created a collective work, then only the OSM component of the work must be subject to the OSM licence.

CloudMade: Their map tiles are derived from OSM so are similarly CC-BY-SA. They may have further restrictions in their ToS.

D3 selections

The core datatype in the D3 Javascript library is the selection. But what exactly is a selection? What is its type and how do we work with it? It’s possible to use D3 without really understanding what selections are, but it makes a lot more sense if you understand the core type.

D3 selection type

A D3 selection is an array of arrays of DOM elements. That top level array object also has extra methods defined on it, the D3 selection methods like append(), attr(), etc. Both select() and selectAll() return a 2d array; the difference is that select() only returns the first of the matching nodes. Most of the time your Javascript code won’t ever iterate inside these arrays; D3’s methods themselves implicitly loop over the elements.
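The shape is easy to imitate without D3. The sketch below is my own toy version, not D3’s implementation: makeSelection is a made-up name and plain objects stand in for DOM elements, but it shows how an ordinary array of arrays can carry methods that loop implicitly.

```javascript
// Build something shaped like a D3 selection: an array of groups (each an
// array of nodes) with a couple of methods bolted onto the outer array.
function makeSelection(groups) {
  const selection = groups.slice();

  // Visit every node in every group. Inside the callback, `this` is the node.
  selection.each = function (callback) {
    this.forEach(function (group) {
      group.forEach(function (node, i) { callback.call(node, node.__data__, i); });
    });
    return this;
  };

  // attr() here is just a loop built on each(); it sets a plain property.
  selection.attr = function (name, value) {
    return this.each(function () { this[name] = value; });
  };

  return selection;
}

// Plain objects standing in for two td elements.
const fakeCells = [{ id: "td1-1-1" }, { id: "td1-1-2" }];
const toySelection = makeSelection([fakeCells]);
toySelection.attr("class", "bold");

console.log(Array.isArray(toySelection));                // true
console.log(toySelection.length, toySelection[0].length); // 1 2
console.log(toySelection[0][1].class);                   // bold
```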

Sample document

The rest of these notes will work on a simple sample HTML document that contains two tables each of size 2×2 for a total of 8 td nodes.

<table id="table1">
  <tr id="tr1-1">
    <td id="td1-1-1">1-1-1</td>
    <td id="td1-1-2">1-1-2</td>
  </tr>
  <tr id="tr1-2">
    <td id="td1-2-1">1-2-1</td>
    <td id="td1-2-2">1-2-2</td>
  </tr>
</table>
<table id="table2">
  <tr id="tr2-1">
    <td id="td2-1-1">2-1-1</td>
    <td id="td2-1-2">2-1-2</td>
  </tr>
  <tr id="tr2-2">
    <td id="td2-2-1">2-2-1</td>
    <td id="td2-2-2">2-2-2</td>
  </tr>
</table>

A note on developer tools

D3 selections are basically just arrays and if you look at a D3 selection in a developer tool like the Google Chrome console it only shows the array. See line 2 below. But don’t be confused! The selection doesn’t show up as having a special type or representation, but it is special. It has extra methods assigned on it and you can inspect and call them:

>>> d3.selectAll("table")
[ Array[2] ]
>>> typeof d3.select("table").attr
"function"
>>> d3.select("table").attr("id")
"table1"

Getting a D3 selection

Where does a D3 selection object come from? The select() and selectAll() methods. Those methods return a selection matching the input parameter. Often the developer selects for strings, using the selectors we know and love from the CSS specification. Selectors are a powerful mini-language worth reviewing to get the most out of D3.

>>> d3.selectAll("table")
[ Array[2] ]
>>> d3.selectAll("td")
[ Array[8] ]
>>> d3.selectAll("#td1-1-1")
[ Array[1] ]
>>> d3.selectAll("#broken")
[ Array[0] ]

In addition to strings, select() and selectAll() also work with ordinary DOM elements returned by other Javascript APIs. This capability can also be useful inside D3: inside an iterator function like each() you can use it to elevate the DOM element passed in the this parameter back into a selection object so you can then modify it with D3 methods.

>>> document.getElementById("td1-1-1")
<td id="td1-1-1">1-1-1</td>
>>> d3.select(document.getElementById("td1-1-1"))
[ Array[1] ]

>>> d3.selectAll("td").each(function() { this.style("font-weight", "bold") })
TypeError: Property 'style' of object is not a function

>>> d3.selectAll("td").each(function() { d3.select(this).style("font-weight", "bold") })
[ Array[8] ]

Using a D3 selection

Most of your work with D3 is getting a selection and then invoking methods on it. All selection methods except call are implicitly loops: they operate once per element in the selection. The methods attr(), style(), property(), text(), and html() allow modification of the selected elements. each() allows you to call an arbitrary function for each element; methods like attr() are implemented by calling each(). append() and insert() add elements to the DOM, and remove() takes them out. on() allows events to be bound to the selected elements.

Subselections

D3 allows you to create a subselection within a selection, to work on a subset of elements from an initial selection. Subselections allow like elements to be grouped so you can process your data in a structured fashion. (This also explains why selections are 2d arrays; they support the nesting). I haven’t needed to use subselections yet: the scatterplot matrix example shows their use.

>>> d3.selectAll("td")
[ Array[8] ]
>>> d3.selectAll("td")[0][0].id
"td1-1-1"
>>> d3.selectAll("table").selectAll("td")
[ Array[4], Array[4] ]
>>> d3.selectAll("table").selectAll("td")[0][0].id
"td1-1-1"
>>> d3.selectAll("table").selectAll("td")[1][0].id
"td2-1-1"

Data

The data() method allows data to be stored for each element in the selection. Data is stored in a __data__ property right on the DOM node. You should normally not access this property directly, but it’s helpful to understand how D3 keeps data associated with your document elements.

>>> d3.selectAll("td").data([1, 2, 3, 4, 5, 6, 7, 8])
>>> d3.selectAll("td").each(function(d) { console.log(d); })
1 2 3 4 5 6 7 8
>>> d3.selectAll("#td2-1-1").each(function(d) { console.log(d); })
5
>>> document.getElementById("td2-1-1").__data__
5

data() returns the updating selection; you can immediately use attr(), each(), etc after calling data().

>>> d3.selectAll("td").data([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
>>>    .each(function(d) { console.log(d); })
1 2 3 4 5 6 7 8
[ Array[10] ]

Note how the array returned by each() is length 10? We bound 10 data items to the selection, but we only had 8 matching td elements. The extra two new elements implied by the data are accessible as the enter selection. This enter selection only has append() and insert() methods defined on it; you cannot call attr(), etc until you create DOM nodes. There is a symmetric exit selection that has the full D3 selection methods, typically used for removing nodes from the DOM.
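The bookkeeping behind that can be sketched without D3. This is my own simplified model, assuming data is matched to nodes purely by index; joinByIndex is a made-up name and the returned object is not how D3 actually exposes the enter and exit selections.

```javascript
// Join an array of nodes to an array of data by index.
// Returns three groups: nodes that got data (update), data with no node
// yet (enter), and nodes with no data left for them (exit).
function joinByIndex(nodes, data) {
  const n = Math.min(nodes.length, data.length);
  const update = [], enter = [], exit = [];
  for (let i = 0; i < n; i++) {
    nodes[i].__data__ = data[i];          // same property name D3 uses
    update.push(nodes[i]);
  }
  for (let i = n; i < data.length; i++) {
    enter.push({ __data__: data[i] });    // placeholder for a node to append
  }
  for (let i = n; i < nodes.length; i++) {
    exit.push(nodes[i]);                  // leftover node, candidate for removal
  }
  return { update: update, enter: enter, exit: exit };
}

// Eight stand-in nodes (plain objects) joined to ten data items.
const standInCells = [];
for (let i = 0; i < 8; i++) standInCells.push({ id: "cell" + i });
const joined = joinByIndex(standInCells, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);

console.log(joined.update.length, joined.enter.length, joined.exit.length); // 8 2 0
```

Binding fewer data items than nodes flips it around: the leftovers land in exit instead of enter.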

this in javascript

Trying to understand the this magic variable in Javascript. You’d think it’d be like self or the like in most OO languages. But Javascript is barely OO and everything is so dynamic, it’s confusing. Some useful references: Quirks Mode, MDC this.

My takeaway from reading stuff is that this is generally set to the object the function is called on. If you’re saying foo.go(), then inside go() this will be set to foo. If there’s no object in the call, say because you have a naked go() in your code, then this is probably bound to the global window object.

The usual this behaviour can be overridden in various ways. Both call and apply let you invoke a function and explicitly provide a value for this. D3’s each() function does this so that inside each, this is bound to the DOM node for the specific element being manipulated. There’s also a newer ECMAScript 5 function, bind, that lets you create a function with a fixed value automatically used for this.

I’m still a bit fuzzy on the mechanics of how this is assigned, in particular when you copy a function from object A to object B. In general this seems to be set dynamically at the last moment before the function is invoked, so it generally picks up the value A or B depending on how the function was invoked.
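A small experiment helps with the copy-from-A-to-B case. The objects and names below are made up; the point is only to see which object this ends up being for each style of call.

```javascript
// One function, shared between objects. It reports whatever `this` is.
function go() { return this.name; }

const objA = { name: "A", go: go };
const objB = { name: "B" };
objB.go = objA.go;  // copy the function from A to B

console.log(objA.go());            // "A"
console.log(objB.go());            // "B": this follows the call, not where go came from
console.log(objA.go.call(objB));   // "B": call() supplies this explicitly
console.log(objA.go.apply(objB, [])); // "B": apply() does the same with an argument array

// bind() pins this permanently, even after the function is copied elsewhere.
const pinnedToA = objA.go.bind(objA);
objB.goPinned = pinnedToA;
console.log(objB.goPinned());      // "A"
```

So copying a function copies nothing about this. It gets decided at each call, unless bind has pinned it beforehand.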