I took a quick look at sshuttle, a poor man’s VPN. It creates an ssh tunnel to any Linux host you can ssh to, then munges your OS to route most, but not all, of your traffic through the tunnel.

I’m confused about how it works. On my Mac I couldn’t find any evidence of it in ifconfig or the routing tables. It does set some ipfw rules, and more if I enable DNS as well. Can ipfw manipulate traffic in this way? There’s some more commentary in this four-year-old Hacker News discussion.

The big drawback is that it only handles TCP, with an option to also handle DNS traffic; it doesn’t do arbitrary UDP or all of IP. I think for many practical purposes that doesn’t matter, but it is a drawback.

The nice thing is it’s super easy to set up and doesn’t require any server infrastructure. It’s a clever hack. Clients work on MacOS and Linux, at least.

13936 Go games

Inspired by a visualization of which chess pieces die in aggregate chess games (seen on Reddit today), I looked at 13,936 Go games and made a simple map of where people place stones. The HSL color scale is linearly proportional to the number of moves; the text is the percentage of games where someone placed a stone on that spot. (A bit fudged; I didn’t account for games where people place a stone more than once on the same spot.)

[Screenshot: Go board heat map of stone placements]

No big surprises, but it’s nice to do little projects like this.

Data storage for Logs of Lag

My little Logs of Lag project is doing pretty well, and I need to build some sort of more serious datastore for it. It’s a small service, at most 1000 logfiles uploaded a day (each about 20k per file), but over time that adds up. Right now my data store is “put a file with a random name in a directory”, and I have no database or analysis of all the files.

The high-end solution would be to start putting data files in AWS and create a local Postgres database for aggregate statistics. I know how to build this. But that adds two big moving parts, one of which I have to maintain myself. And it’s kind of overkill for the relatively small amount of traffic I have (and am likely to ever have).

So I’m thinking now I’ll just keep storing the files themselves on the filesystem, but spread them out over a bunch of directories so I don’t have a single giant directory. The Linux kernel no longer chokes on directories with lots of files, but shell tools are still a PITA. File names are base64-encoded strings, so maybe hash on the first two characters, for 64 × 64 = 4096 directories. That’ll carry me to about 16M log files (at 4096 files per directory), good enough. The file system only has 60M inodes available anyway. (Unfortunately “-” is a valid leading character in the filenames. Oops.)
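A minimal sketch of that sharding scheme (the function names and layout here are my own invention, not the actual Logs of Lag code):

```python
import os

def shard_path(root, filename):
    # Bucket files by their first two base64 characters:
    # 64 * 64 = 4096 directories, and at ~4096 files per
    # directory that's room for about 16M logs.
    prefix = filename[:2]
    return os.path.join(root, prefix, filename)

def store(root, filename, data):
    # Hypothetical helper: create the bucket on demand, write the log.
    # Note os.path.join copes fine with a leading "-"; it's only the
    # shell tools that get confused by it.
    path = shard_path(root, filename)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
```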

I definitely want a database; I’m curious about aggregate stats. I’m thinking of keeping this all asynchronous: when a user uploads a log it doesn’t go straight to the database, it’s just written to a file. Then a cron job picks up the files and does the postprocessing. That works fine as long as the database isn’t needed by the user-facing application. It isn’t right now; in fact the server isn’t needed at all, the user gets 95% of the value solely from client-side Javascript.

What database? I should probably bite the bullet and just use Postgres, but I hate having to manage daemons. I wonder if SQLite is sufficient? It supports concurrent read access but only one writer at a time, and the writer blocks readers “for a few milliseconds”. I think that constraint isn’t a problem for me. Right now I’m tempted to go for that, just for the fun of doing something new.
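A sketch of what the SQLite side could look like; the schema and column names here are hypothetical, just to illustrate the single-writer cron job:

```python
import sqlite3

def init_db(path):
    # One connection per cron run; SQLite handles its own locking.
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS logs (
        filename  TEXT PRIMARY KEY,
        uploaded  TEXT,
        avg_ping  REAL)""")
    return conn

def ingest(conn, filename, uploaded, avg_ping):
    # Called by the cron job for each new file found on disk.
    # The write lock blocks readers only briefly, which is fine
    # at ~1000 uploads a day.
    with conn:  # wraps the statement in a transaction
        conn.execute("INSERT OR REPLACE INTO logs VALUES (?, ?, ?)",
                     (filename, uploaded, avg_ping))
```

Since the cron job is the only writer, the one-writer-at-a-time constraint never even comes into play.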

Another problem to solve is the log parsing code. Right now logs are parsed by the client browser, in Javascript, and the client sends both the raw log text file and the parsed data (as JSON) to my server. (The server is a Python CGI which just writes the files to disk.) I’d like to retain that client-only capability, but also start parsing logs on my server. I don’t really want to maintain a second parser in Python.

So maybe I write the server parser / database stuff in Node to reuse the Javascript parser. Here’s MapBox’s node-sqlite. There’s zero need for me to make these scripts asynchronous, so the Node paradigm is not a great match, but I can certainly make it work.

(Naively I’d thought Node startup times would be bad, like Java’s, but that’s not true. From Node v0.10 on Ubuntu: “NODE_PATH=/tmp time node -e 1” takes about 27ms, compared to Python’s 15ms. Not enough difference to matter. strace shows Node makes about 230 system calls for an empty script (it’s nondeterministic!), compared to Python’s 883.)

Ubuntu 14.04, udev, ethernet

After 10 years of fiddling with remote servers, I finally killed one so that I couldn’t access it on reboot. The system came up fine but not the network. The nice remote admin logged in and found that the Ethernet device was now named p4p1 while the config files were expecting eth0. He updated the config and it works, for now. I’m scared to reboot again since I don’t understand what caused the name to change in the first place.

tl;dr: installing biosdevname causes devices to be renamed at boot.

Summary: I installed biosdevname, and I think that caused my Ethernet device to get a different name after reboot. A fresh Ubuntu 14.04 system uses biosdevname to give Ethernet devices names like p4p1, based on the physical location of the adapter on the PCI bus. An old Ubuntu system upgraded to 14.04 may not have biosdevname installed, and instead uses older names like eth0 that are configured by MAC address in /etc/udev/rules.d/70-persistent-net.rules.

Some detailed notes on what may have broken it:

  • The system started as Ubuntu 10.10; I did two do-release-upgrades to bring it to 14.04. After installing 14.04 it booted correctly at least once, with ethernet configured as eth0. Now it wants p4p1.
  • After the working boot I installed a few packages that I regret installing. Specifically: biosdevname, libsystemd-daemon0, systemd-shim, systemd-services, discover, and linux-image-3.13.0-36-generic.
  • I removed the 3.13.0-36 kernel and then had to do some awkward stuff to fix grub to use the linux-image-3.13.0-37-generic I already had. I think I got that right.
  • I don’t think the systemd packages would cause any problems, but I don’t know for sure. I don’t think I’m running systemd as a whole.
  • I’m not sure what discover does. My hope is it’s an install-time thing only.
  • biosdevname’s whole job is to give names to devices, so it seems a likely troublemaker. One theory is that the mere presence of the package alters Ubuntu’s boot-time device naming in some way. But that’s a total guess. This forum post lends the theory some credence.
  • /proc/cmdline says
    BOOT_IMAGE=/vmlinuz-3.13.0-37-generic root=UUID=e7833d2a-17a4-4598-95e2-65d354589db9 ro quiet splash

Some notes on how it’s supposed to work:

  • /etc/udev/rules.d/70-persistent-net.rules is supposed to give a persistent name to the device. Mine seems to be saying “call this eth0”. That may be a legacy from the 10.10 or 12.04 history of the system, and it was apparently working until yesterday.
  • I have another system that began life as Ubuntu 14.04. Its ethernet is p2p1. It has no files in /etc/udev/rules.d.
  • My system appears to be booting with Upstart, not systemd. (PID 1 is init, not systemd). Details on systemd and Ubuntu
  • These Arch Linux notes on the upgrade to the new persistent naming scheme are interesting. Also these systemd docs. But since I’m not using systemd, they are probably not relevant.
  • This Gentoo doc on udev upgrades is also useful.
  • Here’s an Ubuntu bug saying the 14.04 docs have not been updated to reflect that 70-persistent-net.rules is no longer used. Also here.
  • biosdevname installs /lib/udev/rules.d/71-biosdevname.rules which may override that 70-persistent-net.rules thing.
  • Apparently the kernel has a command line parameter biosdevname=0 to disable renaming. (net.ifnames=0 in Fedora). 71-biosdevname.rules also can be edited to do that.
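For the record, disabling the renaming via the kernel command line would look something like this in /etc/default/grub, followed by running update-grub. I haven’t tried it myself, so treat this as a sketch:

```
# /etc/default/grub -- add biosdevname=0 to keep eth0-style names
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash biosdevname=0"
```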

After 2+ hours of reading I convinced myself it was safe to reboot. It was! I removed the /etc/udev/rules.d/70-persistent-net.rules file, which was effectively being ignored once biosdevname was installed.

I filed a bug:

Related: bless Arch Linux. I’ve never run it, but their incredibly detailed wiki documentation on how Linux systems work is so useful. Sometimes you have to go to the people who build the low-level tools to understand how something works.

Notes on setting up a new Ubuntu 14.04 box

I’ve now recovered from my server hardware failure on Saturday. I can’t say enough good things about my remote rsnapshot setup. I’d kind of forgotten that backup was even running, and it totally saved my ass. The hard drive had failed and the system wouldn’t reboot; it would have been a nightmare getting it all back otherwise. And while I don’t have any essential data on the remote server, I do have a bunch of Logs of Lag logs I’d hate to have lost. Not to mention the server setup itself.

I ordered a new machine and set up Ubuntu 14.04 on it. I have no automation for building these servers; the overhead of setting that up is hard to justify. I couldn’t even easily import crap over from my backup of the Ubuntu 12.04 box, too much has changed, like the shift to Apache 2.4.

Here’s a dump of the journal I kept as I set up the new server, the steps I took. Nowhere near complete.

Change root password and reboot
Set hostname, reboot
Edit /etc/network/interfaces to use Google DNS. “ifdown p2p1; ifup p2p1” and hold breath.
Partition the extra disk, make file system, add to fstab
Create user nelson, add nelson to sudoers, also to group sudo.
Restore /home files from backup, put /home/nelson in order
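The interfaces edit above looked roughly like this; the addresses are placeholder TEST-NET values, not my real config, and the dns-nameservers line needs the resolvconf package that 14.04 ships by default:

```
# /etc/network/interfaces (sketch)
auto p2p1
iface p2p1 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    gateway 192.0.2.1
    dns-nameservers 8.8.8.8 8.8.4.4
```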

Install unattended-upgrades
Install mg git gcc make munin nmap rtorrent screen mtr traceroute zip unzip curl optipng graphicsmagick gawk libimage-size-perl ntp
Install munin munin-node sensors
Install postfix bsd-mailx

Set up remote rsnapshot

Install apache2
enable a lot of mods
reconfigure deflate to not be stupid

apache2 2.4 auth changes
set up backups

Set up web sites

Restore /var/www/* from backups
Reinstall sites in /etc/apache2/sites-available/* and enable them

Restore user crontabs from /var/spool/cron

Some notes from old setups:

Apache 2.4 security stuff

My new server setup has mostly gone well, but I got tripped up by an error
authz_core client denied by server configuration

Turns out this is a change made between Apache 2.2 and 2.4. It’s reasonably well documented by Apache itself and this Stack Overflow post goes into more detail.

Long story short: for directories outside /var/www where I had some stuff stashed to load (namely CGI scripts), I had to add Require all granted to the Directory blocks for those oddball directories. Apache moved to a “denied by default” model, which makes a lot of sense from a safety point of view.
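Concretely, the fix is a stanza like this in the site config (the path here is a made-up example, not my actual layout):

```apache
# Apache 2.4: access is denied by default outside configured roots.
# Old 2.2 style "Order allow,deny / Allow from all" no longer works.
<Directory /home/nelson/cgi-bin>
    Require all granted
</Directory>
```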

Note that Munin in Ubuntu 14.04 has Apache 2.2 style config files, despite Apache 2.4 now being the default Ubuntu server. They have a bug tracking this.

Apache optimization still too hard

Setting up my new server, I’m struck by a few things.

  • Ubuntu doesn’t install mod_deflate at all by default
  • The default mod_deflate configuration is very conservative, effectively only compressing HTML. I replace it with a config which says “compress everything but GIF, JPG, MP3, that kind of thing”. Apparently compressing Javascript or JSON breaks IE6 or something. Who gives a shit?
  • I have no idea what caching, if any, Ubuntu’s Apache does by default
  • I don’t even know how to test it any more. Chrome’s developer tool audit suggests stuff is being marked cacheable for 3600 seconds, which it complains is “too short”, but as default policies go that actually isn’t terrible
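My replacement deflate config is along these lines; a sketch, and the exact extension list is illustrative rather than my literal file:

```apache
# Compress everything except formats that are already compressed.
<IfModule mod_deflate.c>
    SetOutputFilter DEFLATE
    SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png|mp3|ogg|zip|gz|bz2)$ no-gzip
</IfModule>
```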

Hitting reload or Cmd-R causes Chrome to reload everything. Hitting return in the location bar or pasting the URL in causes Chrome to load most things from cache, re-fetching only the page itself and the JSON blob in the AJAX request. I don’t know if Chrome is being strictly standards-compliant or is a bit extra aggressive in trying to load things from cache.

I feel like I used to understand HTTP caching a lot better, but it’s been a couple of years since I looked closely at what was being cached and how. I also seem to remember there were more tools online that would load a page and give you advice on how to better optimize performance. Like there used to be a server hosted variant of YSlow, right? All I know to use now is Chrome’s developer tools in the browser.