Mapbox GL Native telemetry

I finally closed the loop on a long-standing problem we have had with our Wanderings location tracker. We wrote our code to be very low impact; it uses the Significant Location Changes API on iOS to track your movements in a very passive way. But our app was causing the phone to use as much as 100 MBytes/day when running in the background! All that bandwidth was accounted to “Time & Location” in iOS itself, not our app.

Much sleuthing later we narrowed the problem down to Mapbox Metrics, their telemetry service. You can see some discussion in this bug report, but the behavior is expected, not a bug. And somewhat unique to our app. If we turn off telemetry our usage goes down to < 1MB / day of data.

By default telemetry is turned on if you use Mapbox native in your app. And it listens to the location updates your app gets. If your app gets a notification that the person has moved, Mapbox then requests fine-grained location updates for a short period of time. (5 meter movements for 5 minutes, I believe.)

There’s nothing nefarious or dumb here. The location telemetry is a disclosed part of the product and the code is open source. They have appropriate privacy protections. And it’s a reasonable form of telemetry for most apps. I imagine most Mapbox apps have the map front and center as part of the UI with location tracking already enabled, so Mapbox’s telemetry requests don’t really generate any more traffic. Unfortunately our app is just unusual in wanting low impact background updates only, not a good match for the telemetry behavior. Fortunately our users can turn off the telemetry tracking if they want. (In fact, Mapbox requires that apps provide a telemetry opt-out option for users.)

The other confusing piece of this is why location tracking uses so much network bandwidth, and why it’s attributed to iOS itself and not our app. Only Apple really knows, but I have some guesses. Phones almost never use an honest GPS signal to track location; mostly location is tracked by reference to nearby Wifi and cellular signals. My guess is every time the phone wants to know its position, the iOS location daemon itself makes a TCP/IP request to some Apple server to do a lookup. (I’ve definitely noticed that my location tracking gets way worse when I have cellular data turned off, say when traveling in Europe.) That data usage is significant if you are tracking location in a fine-grained way. And since the request comes from iOS and not the app, the bandwidth is accounted to iOS itself.

While I’m guessing, I think in practice “significant location change” means “moved into range of a new cell tower”. My guess is those location updates come “for free” because the cellular chip already has to do a bunch of work for tower handoffs. Based on observation, moving in range of a new Wifi network is not enough to trigger the location update.

Learning about containers

I decided it was time I learned about containers. My immediate goal is to figure out whether to run the Pi-Hole DNS server in a container via docker-pi-hole or to install it manually under LXC. It’s a toy problem, but it’s the one in front of me at the moment. Really it’s an excuse to learn more about containers. I’m a newbie here and there’s a lot of depth I don’t have, but maybe my notes are helpful to someone.

What’s a container? It’s a way to run a program or collection of processes so that the service they implement is relatively isolated from the host operating system and other services running on the host. The isolation is mostly for system administration convenience, a little like we use virtualenv in Python or node_modules in Node to keep libraries separate. Only a lot more complicated than just a bundle of files. Containers package up files, running processes, network access, I/O, device access, etc.

A key reason people like containers is they provide a way to isolate a complex software package in a repeatable lightweight environment. You can hand a new employee a Docker image for their development environment without laborious instructions on how to get set up to work. Or a developer can spin up an environment on their dev machine that’s just like the one in production, run some tests, tear it down, and rebuild it frequently and fast. See this AWS Lambda emulator as an example.

Another key piece of containers is that they are lightweight. A VM takes 10-60 seconds to start up (or more); a Docker container can take less than a second. A container doesn’t need much more memory than the same processes would use running natively. And there’s a lot of tooling that makes it easy to build container images and manage them in a space-efficient manner.
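
You can get a feel for the startup cost yourself (a sketch assuming Docker is installed and the stock alpine image; actual timings vary):

$ time docker run --rm alpine /bin/true   # create a container, run a trivial command, tear it down
$ # typically well under a second once the image is cached locally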

Some overview reading:

Not containers

To understand a thing, first understand what is not that thing.

Virtual machines are not containers. Well, a VM is a container in some sense, but it’s more heavyweight than what people usually call a container. In a container there’s not a whole kernel and guest OS running; the processes run directly on the host kernel. But those processes are isolated in important ways. A container is more like the traditional Unix security model of running a service as a user with access limited to just some small set of files. Or like the venerable chroot, which dates back to v7 Unix in 1979. But the old Unix methods of isolating a service’s processes don’t work very well, so modern containers use a newer set of kernel interfaces to enforce isolation.

AppArmor is also not containers. AppArmor uses Linux Security Modules to grant certain programs a very narrow set of permissions. These are configured with files in /etc/apparmor.d; you can browse Ubuntu’s here. For instance chronyd is limited to things like setting the time, binding a network socket, and writing a few files like logs in specific chronyd directories. If chronyd starts trying to do anything else, AppArmor should stop it. AppArmor has some overlap with container tech (LSM is a bit like seccomp), and some containers (like Docker) use AppArmor as extra security, but AppArmor is intended as a security measure, not a devops tool. (Same goes for SELinux; it can be used to help secure a host running containers, but it’s not a core component of containers.)
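
To poke at this on an Ubuntu box (a sketch assuming the apparmor-utils package is installed and that chrony ships its profile at the usual path):

$ sudo aa-status                            # which profiles are loaded and which processes are confined
$ cat /etc/apparmor.d/usr.sbin.chronyd      # the profile that confines chronyd
$ docker run --security-opt apparmor=my-profile ubuntu sh   # Docker applies a default profile by itself; this swaps in a custom one ("my-profile" is hypothetical)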

Container technologies in the Linux kernel

Containers implement isolation with help from the kernel. Unix has always had a modest concept of processes being isolated from each other: memory isn’t shared by default, and user permissions enforce some isolation on who can access files and other resources. But it’s not very secure and it’s not very flexible, so modern containers rely on newer APIs developed in the last 10-15 years. Many of these are fairly Linux specific, although other Unixes were influential in the 90s and 00s with isolation technologies like BSD jails and Solaris Zones.

Namespaces are a Linux kernel feature for completely isolating one set of processes from another. In normal Unix you can see another user’s processes even if you can’t do anything to them. Inside a PID namespace you can’t even see others’ PIDs to try to access them. Same goes for filesystem namespaces, network namespaces, etc. Namespaces are created with the unshare(1) command which uses the unshare(2) system call. See also namespaces(7). You can create separate namespaces for mounted filesystems (a bit like chroot), hostname, IPC, IP network, PIDs, cgroups, and users.
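
A minimal way to play with namespaces directly (a sketch assuming util-linux’s unshare; root or user namespaces are needed for most of these):

$ sudo unshare --pid --fork --mount-proc /bin/bash   # new PID and mount namespaces, with /proc remounted to match
# ps aux                                             # inside: only bash and ps are visible, not the rest of the system
# exit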

cgroups (“control groups”) are a way to manage resource usage for a group of processes. nice(1) is an example of a very simple form of CPU resource control. cgroups let you control not only priority (like nice) but also the absolute amount of CPU used in a period of time, memory usage, I/O, process creation, device access, etc. The API is quite complicated, I think mostly because it works on groups of processes (and threads). See also the cgroup v2 design doc.
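
A sketch of the raw cgroup v2 interface, assuming it’s mounted at /sys/fs/cgroup (systemd and container runtimes normally drive this for you, and you may first need to enable the cpu and memory controllers in the parent’s cgroup.subtree_control):

$ sudo mkdir /sys/fs/cgroup/demo
$ echo 100M | sudo tee /sys/fs/cgroup/demo/memory.max          # hard memory cap of 100 MB
$ echo "50000 100000" | sudo tee /sys/fs/cgroup/demo/cpu.max   # at most 50ms of CPU per 100ms period
$ echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs          # this shell and its children now live in the group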

Seccomp-bpf is a way to control which system calls a process can make. System calls are the primary way a process affects the rest of the world; limiting system call access can greatly isolate a set of processes.
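
In Docker terms, the daemon ships a default seccomp profile that blocks a few dozen of the more obscure or dangerous syscalls, and you can swap in your own or turn it off (profile.json here is a hypothetical file in Docker’s documented JSON profile format):

$ docker run --security-opt seccomp=./profile.json alpine sh   # use a custom syscall allowlist
$ docker run --security-opt seccomp=unconfined alpine sh       # disable seccomp filtering entirely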

capabilities are a different way to restrict what a process can do, based on abstract capability concepts instead of system calls. I mostly think of it as a way to avoid giving a process full root access just to do one simple thing. Instead you can give a process just the ability to bind privileged ports or set the system clock or whatever specific capability it needs.
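
For example, file capabilities let a daemon bind port 53 without ever being root (a sketch assuming the libcap tools setcap/getcap/capsh are installed; /usr/local/bin/mydaemon is made up):

$ sudo setcap 'cap_net_bind_service=+ep' /usr/local/bin/mydaemon   # grant only the low-port-binding capability
$ getcap /usr/local/bin/mydaemon                                   # verify the file capability took
$ capsh --print                                                    # show the capability sets of the current process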

A word on security

I can’t figure out how secure containers are supposed to be. How hard should it be for a rogue process to break out of a container? The best answer I’ve gotten is “it depends on how secure you made your container”. (Which in practice seems to be “not very”.) FWIW I asked on Twitter and pretty much no one considers a Docker container to be a significant security measure.

I looked a bit closer into this with Docker in particular. Docker is so complicated and uses so many different Linux security mechanisms that exploitable bugs are inevitable. And Docker is designed to not be fully isolated or virtualized; most Docker containers deliberately have at least some access to the host system, a network port or part of the filesystem at least. The Docker folks do take security seriously, and when security holes are found they get the full CVE treatment. Still it seems no one considers Docker a full solution for running untrusted code safely.

One big thing I learned is if a process has root privileges inside Docker, then if it escapes the container it very likely will have root outside the container too. There is a way to map root-in-the-container to a non-privileged user outside the container, but it’s complicated and I’m not sure how common it is. Pretty sure it’s not the default.
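
The mapping is Docker’s userns-remap setting, configured on the daemon (a sketch based on Docker’s docs; I gather it also disables a few features, like sharing the host’s network namespace):

$ cat /etc/docker/daemon.json
{
  "userns-remap": "default"
}
$ sudo systemctl restart docker
$ # root inside containers is now mapped to an unprivileged subordinate UID range (see /etc/subuid)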

Bottom line: don’t assume a malicious program inside a container is really trapped inside the container. Too bad, I’d been hoping Docker would be a good way to mitigate the risks of daemons that run as root. It might yet be such a tool but it ain’t simple.

Container implementations and tools

So there’s a bunch of low level kernel support for isolating processes in Linux. How do you use them conveniently? A container implementation bundles all those mechanisms together into one easy-to-use tool. And miraculously, a tool that might even work on something other than Linux (perhaps with a VM to help things along). There are a bunch of container implementations out there, let’s look at a few of them.

Docker is the big player for containers and whole books are written about it. Docker wraps up a bunch of those isolation technologies and makes them easy for an end user to use. Docker is cross platform: you can run other OSes both as Docker guests and as Docker hosts, although often when you look there’s a VM involved in making that possible.

To use Docker you start by getting a Docker image from a repository like Docker Hub, or else create your own image. Docker images are created from a specification in a Dockerfile; the image is a collection of files and expected behavior. You then launch the image via Docker’s runC / containerd system, which creates a container. Inside that container environment is whatever the image was built to run: a simple command line tool, a persistent daemon, or a full-fledged operating system image with interactive shells. Docker takes care of managing file persistence so you can shut a container down and restart it again with the same files. Docker’s use of overlay filesystems is particularly nice here: you can assemble a container out of a stock Ubuntu filesystem, an overlay for (say) a database server, and then an extra overlay for your actual database state that’s changing all the time. It’s all quite efficient.
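
A minimal sketch of that workflow, with a made-up image name (the Dockerfile keywords and the commands are standard Docker):

$ cat Dockerfile
FROM ubuntu:18.04
RUN apt-get update && apt-get install -y nginx
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
$ docker build -t my-nginx .                       # build an image from the Dockerfile
$ docker run -d --name web -p 8080:80 my-nginx     # start a container, host port 8080 -> container port 80
$ docker stop web && docker start web              # files written in the container's writable layer survive the restart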

The key thing about Docker is that it’s a pretty solid tool, friendly for developer-users. Also Docker Hub is amazing: it’s a community-managed library of operating system images with all sorts of complex and useful things preinstalled. My devops friends also like the way a developer can run a Docker container on their dev machine to test their product, then run basically that same container on production machines to serve live traffic.

Docker’s not the only game in town though. Linux Containers (aka LXC/LXD) are a nice smaller alternative. (Docker started life as an LXC wrapper although it has since evolved.) I liked this tutorial on running Pi-Hole in LXC because it’s so straightforward and simple compared to all the stuff surrounding Docker. OTOH it lacks all the conveniences the Docker ecosystem has built up.
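
The LXD flow from that tutorial boils down to something like this (a sketch assuming LXD is installed and initialized; the container name is mine):

$ lxc launch ubuntu:18.04 pihole    # create and start an Ubuntu container
$ lxc exec pihole -- bash           # get a shell inside it and install Pi-Hole the ordinary way
$ lxc list                          # find the container's IP address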

Snap also has my attention because it’s built in to Ubuntu. It’s Canonical’s answer to Docker. I haven’t heard of many people really using Snap but there’s a few hundred Snap-packaged apps in the store (including some that baffle me; why package the command line tool jq as a Snap?) Most of what I’ve read about it is trying to use Snap to solve the “how do I distribute user applications on Linux?” problem. Canonical wants Snap to work across Linux distros. Also I’ve read a claim that Snap is somehow better suited for GUI apps than Docker would be. It’s interesting that Firefox and Slack and the like are available as snaps.

gVisor is an interesting alternative I mention briefly because it takes a different approach. Instead of a bunch of Linux isolation technologies, it works by emulating most of the Linux kernel in a user space process. That makes it more like a VM, and also a little like the Windows Subsystem for Linux. I have no idea if anyone uses it, but it’s nice that it drops in to Docker as an alternative to runC.

So there’s a bunch of ways to run a single container. In addition there are complex orchestration software layers out there to help you manage a machine cluster running lots of services in containers. Kubernetes is the one that gets most of the attention right now; there’s also Docker Swarm and Apache Mesos and a bunch of others. I haven’t looked into using any of these, but I can say back in the day Google Borg was The Shit, a key part of how that company could operate at the scale they do.

So what about Pi-Hole in a container?

So now that I’ve read several books on yak shaving… should I run Pi-Hole in a container at home? I think no. The main problem is that, at least with Docker, I still have to make changes on my host machine in order to run Pi-Hole in a container. And there’s no real reason I can’t just run Pi-Hole directly.

Pi-Hole distributes a nice Docker image I got up and running. But it’s a bit tricky. First, Pi-Hole wants to own port 53/TCP and 53/UDP on your host, to provide DNS services. Fair enough, but by default in Ubuntu systemd-resolved has already laid claim to port 53 and gets in the way, because of the strange way Ubuntu machines do DNS. So you have to modify the host network config to make the port available, which sort of obviates the whole “I’m running Docker so I don’t have to make changes on my host machine” argument. Pi-Hole also wants 80/TCP and 443/TCP, not just for the admin page but also to redirect ad HTTP requests to something that returns an empty page. Again same problem: I already have a web server.
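
One way through it (a sketch; the systemd-resolved tweak is the commonly suggested one, and the pihole/pihole image options come from its docs and may have changed):

$ # stop systemd-resolved's stub listener from squatting on port 53
$ sudo sed -i 's/#DNSStubListener=yes/DNSStubListener=no/' /etc/systemd/resolved.conf
$ sudo systemctl restart systemd-resolved
$ # (and point /etc/resolv.conf at a resolver that still works)
$ # then hand 53, 80, and 443 to the container
$ docker run -d --name pihole \
    -p 53:53/tcp -p 53:53/udp -p 80:80/tcp -p 443:443/tcp \
    -v pihole-config:/etc/pihole \
    pihole/pihole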

Bottom line: the Pi-Hole service wants to do stuff in the host’s network namespace, and that’s all a bit awkward since the whole point of Docker is to contain things. I’m out of my depth here, but it’s a shame there’s not some way to just publish the services on a second IP address for the container, like the docker0 network device listening on 172.17.0.1. Actually that may be entirely possible, but then I have to convince my whole home network where it can find 172.17.0.1, and that’s more routing infrastructure than I have.
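
For what it’s worth, Docker’s macvlan network driver looks like it might do exactly that, giving the container its own address directly on the LAN so nothing has to learn about 172.17.0.1. An untested sketch with a made-up subnet and addresses (one known catch: the host itself can’t reach a macvlan container without extra setup):

$ docker network create -d macvlan \
    --subnet=192.168.1.0/24 --gateway=192.168.1.1 \
    -o parent=eth0 lan
$ docker run -d --name pihole --network lan --ip 192.168.1.53 pihole/pihole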

On top of it being awkward, Docker does introduce a fair amount of overhead. There’s the docker daemon itself (1 gigabyte of RAM mapped! Although only 100MB used.) Also containerd. Those are more daemons to fail, or have security holes, or just to have to know about.

Still, it all more or less works under Docker. And the overhead wouldn’t be such a big deal if I were using Docker for other things; installing various versions of PHP, say. And if I didn’t quite trust Pi-Hole to be well behaved, Docker would be a more appealing choice. I kind of like the idea of an OS where every user process runs in something like a container. We’re not quite there yet.

I’m glad I did all this learning about containers! They’re neat! But they mostly solve a complexity problem I don’t really have in my little world. I still get by OK hand-managing a single Linux server and making sure my stuff is all installed in separate directories as systemd services or whatever. I definitely see where Docker is a huge help though when coordinating with people or reusing stuff in a more composed way.

Ubuntu, DNS, and sudo

I’m confused about something in Ubuntu 18.10. Why does sudo require DNS to be working? And is that appropriate in the face of Ubuntu’s default network configuration?

First, the fun part. If DNS is broken on an Ubuntu host then you can’t sudo.

$ sudo bash
sudo: unable to resolve host gvl: Resource temporarily unavailable
[sudo] password for nelson:
hostname: Temporary failure in name resolution

I’ve seen discussions of this problem going back to 2006 and vaguely recall it being even older than that. It’s not clear if it’s even a bug; /etc/sudoers does have network-based restrictions. But it sure makes it hard to fix a system with a broken network if you can’t become root in the first place.

The usual advice / workaround for this problem is to add your system’s hostname to /etc/hosts. That way you can always resolve the name via the file even if networking / DNS is down. The dumb advice suggests adding it as an alias for 127.0.0.1; the smarter advice has you use the machine’s real IP address. However Ubuntu 18.10 doesn’t seem to have any way of adding that hostname for you, neither at install time nor at boot time via netplan. I’ve added this manually; is it a bad idea? I do worry it could get out of sync if I renumber my network some day.
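
The fix amounts to one line in /etc/hosts (the hostname is the one from the error above; 192.168.1.10 is a made-up stand-in for the machine's real address):

$ grep gvl /etc/hosts
192.168.1.10   gvl    # this machine's real LAN address; keep it in sync if the address ever changes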

Confounding this problem is that DNS is kind of complicated on Ubuntu 18.10. My 18.10 system has an /etc/resolv.conf that lists “nameserver 127.0.0.53”. (Which is sort of equivalent to localhost, 127.0.0.1, but a different number.) systemd has grabbed port 53 on that address and is providing DNS service via systemd-resolved. That’s a name service daemon that mostly just forwards requests to an upstream name server. The configuration for that is confusing because a lot of files in /etc get rewritten at boot time by Netplan, but in my case it boils down to “we get our DNS servers from DHCP”. All well and good, and I think systemd-resolved may even be answering the host’s own name internally, not forwarding it to the DNS server.
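
The way to see what’s actually going on (on 18.10’s systemd 239; systemd-resolve is the older name for what’s now resolvectl):

$ cat /etc/resolv.conf         # just "nameserver 127.0.0.53", the local stub resolver
$ systemd-resolve --status     # the real upstream DNS servers, listed per network interface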

The problem is I stopped systemd-resolved via systemctl (while I was experimenting with Pi-Hole DNS). Then I had no way to sudo to start it again. Oops.


DNS ad blocking, Pi-Hole

I’m fooling around with Pi-Hole, the DNS server that also blocks ads by refusing to answer DNS requests for ad servers on various blocklists. It was originally designed to run on a standalone Raspberry Pi server, but it runs fine on any Ubuntu box. (Beware, the install script is a bit aggressive and will reconfigure your network if you let it.) Unfortunately, running Pi-Hole on a hacker-friendly router is not easy.

Anyway, Pi-Hole seems to work OK for me. I’m using it as the DNS server for my Windows desktop and for the Linux box itself. The funny thing is I have enough ad blockers installed in clients on Windows (uBlock Origin, Privacy Badger) that very few ad DNS requests get through to Pi-Hole.
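
A quick way to check it’s doing anything, from any machine on the network (the server address is a made-up example and doubleclick.net is just a domain likely to be on the default blocklists):

$ dig +short @192.168.1.53 doubleclick.net    # blocked: returns 0.0.0.0 or the Pi-Hole's own address, depending on blocking mode
$ dig +short @192.168.1.53 example.com        # not blocked: returns the real address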

The real reason I’m interested in using this is for various devices that can’t easily block ads. Like my iPad; Apple added some support for ad blockers but it’s not very good. Or my Roku, or my PS4, or… I like the idea my whole home network is a black hole for advertising.

The huge drawback here is that DNS is really the wrong place to be injecting policy like this in a network. DNS should be a simple caching service that does its one thing very directly. I’ve never been a fan of policy DNS systems like OpenDNS because they add so much complication. I’m dreading the first thing in my house that fails in some confusing way because it can’t connect to something Pi-Hole thinks is an ad server. I’ll never figure it out.

The other drawback is redundancy; what happens if my Pi-Hole server goes down? You can configure secondary DNS for most operating systems as a backup (including via DHCP) but it turns out most systems treat that second server as an alternative server, not just a backup, so it’ll get half the requests. I guess I could run two Pi-Hole servers? Lol, no.

Pi-Hole as a software package also feels a bit like overkill. At its core is pihole-FTL, a daemon based on dnsmasq plus extra code. dnsmasq is fine, but that extra code implements a bunch of weird stuff. Maintaining the blocklists, OK. Logging every DNS query to a world-readable file? Not a great idea (albeit configurable). There’s also a fancy web interface implemented in PHP, which is nice, but more surface area. It also listens on localhost:4711 for its own API. It’s just doing a lot of things on top of being a DNS server. I think it’s reasonable, but it is a lot.

(To be fair, Pi-Hole is very well documented. Also the default install seems to be smart, like running as its own user, etc. Whoever built this distribution made an effort to do a good job. It also seems to autoupdate its own binaries, which sure offers an exciting new attack vector to my Linux box.)

One limitation of Pi-Hole is it only works if you can configure the DNS servers of your device. A lot of Internet-of-Crap devices apparently just hardcode a DNS server; see Paul Vixie’s recent rant against Google Chromecast. One solution is to capture and redirect DNS requests back inside. Yes, you can act like your own hostile wifi network! I’m anxious about the complexity this introduces.
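
The capture trick is a NAT rule on the router that rewrites any outbound port-53 traffic to point at the Pi-Hole instead. A sketch in iptables terms (192.168.1.53 is a made-up Pi-Hole address and br-lan is the usual OpenWrt LAN interface name; yours will differ):

$ iptables -t nat -A PREROUTING -i br-lan -p udp --dport 53 -j DNAT --to-destination 192.168.1.53
$ iptables -t nat -A PREROUTING -i br-lan -p tcp --dport 53 -j DNAT --to-destination 192.168.1.53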

I imagine this is only going to get worse with DNS-over-TLS or DNS-over-HTTPS. The Pi-Hole can make its own upstream DNS requests via Cloudflare’s DNS-over-HTTPS, but I’m not sure there’s a way to have the Pi-Hole work as a secured DNS server itself. Not an issue now, there are precious few secure DNS clients, but maybe in a year or so.
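
The upstream half is already doable by running Cloudflare’s cloudflared as a local DNS-over-HTTPS proxy, which Pi-Hole’s docs describe (a sketch; the port number is just a convention):

$ cloudflared proxy-dns --port 5053 --upstream https://1.1.1.1/dns-query
$ # then set Pi-Hole's custom upstream DNS server to 127.0.0.1#5053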

Another drawback of Pi-Hole is there’s no good way to temporarily disable it. I mean you can via the admin interface, but it disables ad blocking for everyone. Also it requires a separate web page. I like the way uBlock has a simple UI in the browser for me to temporarily turn it off.

Plan 9 rides again; WSL file access

There’s a handy new feature planned for WSL in Windows 10 v1903: “Accessing Linux files from Windows”. Previously there was no way to get to your WSL Linux files from Windows, the host Windows OS didn’t know how to interact safely with the LXFS files in the WSL system. Now you’ll be able to mount the Linux files as a network share.
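
In practice, once you’re on 1903 the Linux files just show up as a network path (a sketch; the distro name and username here are examples):

$ # from inside WSL, open the current Linux directory in Windows Explorer
$ explorer.exe .
$ # from the Windows side the same files appear as a UNC share, e.g. \\wsl$\Ubuntu\home\nelson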

The cool part is how it works: “a 9P protocol file server facilitates file related requests, with Windows acting as the client”. Before they edited the announcement they called this “a Plan 9 file server”, which is accurate if a bit overstated. 9P (aka Styx) is a Plan 9 network filesystem technology.

It makes sense to use a network filesystem to bridge Windows and Linux. I’ve been doing something similar, network-mounting the Linux files via SFTP. You could also imagine running a Samba server inside WSL that the host Windows system accesses. Either way the Linux process gets to mediate the file access, at the expense of a network file system.

Plan 9 is the wacky part. There’s been a little discussion about why; hopefully Microsoft will write a blog post explaining the tech. My guess is they chose 9P because it is very simple to implement. IIRC part of the Plan 9 design is that many applications implement a filesystem as their API, so they built a simple embeddable filesystem any program could control. Wikipedia’s summary of the protocol certainly looks straightforward, roughly what a FUSE filesystem implementation has to implement. Still, I’m a tiny bit surprised they didn’t go with their own SMB, or NFS, or something new and proprietary and super simple.

There’s obscure precedent for using Plan 9 this way, see this discussion. QEMU’s VirtFS uses 9P to share files between guest and host operating systems. (Which led to at least one security issue in a QEMU user.) And there’s some 9P code knocking around: the old u9fs code presumably hasn’t entirely rotted, there’s v9fs (looks abandoned?), someone took a crack at 9P in GoLang, and there’s probably useful 9P code in this Plan 9 port. Also Wikipedia mentions Styx on a Brick which looks like a way to control a Lego Mindstorms robot via something that looks like a filesystem.

Bad ethernet cable

Just had a real life experience of a bad ethernet cable hurting my network. My Linux server (or rather, the 4 port switch it is connected to) sometimes gets gigabit ethernet, sometimes 100 MBit. I couldn’t figure out why, but today I was frustrated it was stuck at 100 MBit so I found the actual cable in the patch panel and replaced it. Boom, gigabit ethernet.
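
For reference, the link-speed check on the Linux side (assuming the interface is called eno1; yours will differ):

$ sudo ethtool eno1 | grep -E 'Speed|Duplex'   # reports Speed: 1000Mb/s when gigabit negotiates, 100Mb/s when it doesn't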

I’ve got fairly elaborate-but-janky ethernet wiring in my house. Some 20 cables in the walls punched down into a cheap Leviton patch panel. Then short patch cables from the panel to the switch. In retrospect the patch panel is probably overkill; I’m never gonna repatch it. And it adds two more connectors, some circuit board traces, and a patch cable to every cable run. Still, it mostly seems to do gigabit ethernet fine; I just wish we hadn’t used cheap patch cables. They don’t have any rating printed on the jacket at all, just some numbers that boil down to basic Underwriter Lab type building codes. They’re flat, and in my experience flat cables tend to be crap.

Time to replace all the patch cables. Happily I found some gay pride Cat6.

Update: success!