Did MacOS 10.9.5 break SMB again?

I had Samba working more or less right between a Linux server and a Mac client. But recently all the files are showing up as permission 700 on the Mac client, creating symlinks makes bogus files on the server, etc.

Unfortunately I don’t know what broke it. I upgraded Ubuntu 12.04 to 14.04 and kept my old smb.conf file. I also updated to MacOS 10.9.5. Either could have broken it.

On the Linux box, I started with a fresh default config file from 14.04 and just added some minimal stuff for myself. The old config file had some cruft like “map archive” that may or may not have been good. I mean, it used to work, but who knows. Anyway the new config file looks correct, but with Samba you can never be sure.
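For the record, the minimal stuff is just a share definition, something along these lines (share name and path invented for illustration, not my real config):

[stuff]
   path = /srv/stuff
   read only = no
   guest ok = no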

On the Mac, 10.9.5 includes an ominous patch note: “improves the reliability of accessing files located on an SMB server.” Given how badly Apple has broken their Samba client in the past, I don’t trust any changes.

My workaround for now is to mount the fileshare as “cifs://ub/” on the Mac instead of “smb://ub/”. Apparently this forces MacOS to fall back to the older SMB1 protocol and implementation, which isn’t as broken in Mavericks.
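(In Finder that’s Go > Connect to Server, or Cmd-K, typing cifs://ub/ where you’d normally put smb://ub/.)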

Is my host secure in IPv6?

I want to verify that my Ubuntu 14.04 box is secure in IPv6 space. netstat shows various things are listening on IPv6 ports. Do I have a publicly reachable IPv6 address? I think not. ifconfig shows this IPv6 address:

inet6 addr: fe80::52e5:49ff:feae:3a13/64 Scope:Link

ServerFault has a useful explanation. Scope:Link means a link-local address, something explicitly not routed and only reachable on the local Ethernet segment. That means these servers are reachable by anything else on the same Ethernet segment, which might include other servers in the rack at my colo. I believe all IPv6 addresses that start with fe80:: are link scope.
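One quick way to double-check an address’s scope is the ipaddress module in Python 3.3+:

import ipaddress

addr = ipaddress.ip_address("fe80::52e5:49ff:feae:3a13")
print(addr.is_link_local)  # True: fe80::/10 is link-local, never routed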

There are a bunch of online services that will test ports or run nmap for you over IPv6.

Windows is worse than MacOS

I bitch about MacOS a lot, complaining about the rough edges in what is mostly a good product. But MacOS really is bad at one thing: desktop gaming. Sometimes I’m tempted to switch back to Windows just so I can play games. This week I was desperate to play Civilization: Beyond Earth before the Mac port is finished. So I booted my Bootcamp partition into Windows for the first time in two years.

Wow, what a terrible experience. Pretty much all the badness centers on software management, installs and upgrades. I was prepared for the initial update pass to take forever; it was a two-year-old Windows 7 image, after all. Some 60+ updates and a few reboots later, all was set up. That wasn’t so bad, really.

What was bad was installing software with Steam. In particular, the way you have to “install VC redist” over and over. Steam makes the wise choice to require a separate install for every game, basically the 2014 solution to the 1995 problem of DLL hell. But the installers are fucking ridiculous. Each install takes 30+ seconds. What the hell is it doing? Copy a few files, maybe update a registry key or two, and you should be done. And for some reason I had to install like 15 of these for Steam the first time, then another 4 for Civ:BE.

The Bootcamp experience is not good either. The display drivers don’t work perfectly; for instance, sleep mode doesn’t shut the backlight off. And I’m scared to try updating the three-year-old driver. You can’t use the vendor’s display driver, you have to use Apple’s, and all Apple ships is one giant install disc every once in a while. Two, actually: you have to pick the release based on knowing what model iMac you own. Only iMacs don’t have model numbers, so you have to remember the year you bought the thing instead.

Windows itself is still mostly, well, Windows. It’s funny how much I’m used to the Mac way of doing things now. I miss my menu bar. I can never tell if Chrome is running and just has no windows, or what. I even hacked the registry to enable reverse mouse wheel scrolling. And the stupid system is still prompting me for urgent reboots by switching me out of my full-screen game, interrupting me with a popup dialog.

sshuttle

I took a quick look at sshuttle, a poor man’s VPN. It creates an ssh tunnel to any Linux host you can ssh to, then munges your OS to route most, but not all, traffic through the tunnel.

I’m confused about how it works. On my Mac I couldn’t find any evidence of it in ifconfig or the routing tables. It does set some ipfw rules, and more if I enable DNS as well. Can ipfw manipulate traffic in this way? There’s some more commentary in this four-year-old Hacker News discussion.
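For reference, a typical invocation is just

sshuttle --dns -r user@host 0/0

which claims all TCP traffic (the 0/0) plus DNS lookups. My best guess at the mechanism, and it is only a guess: sshuttle runs a local transparent proxy and uses ipfw fwd rules (iptables REDIRECT on Linux) to divert outbound TCP connections into it, which would explain why no new interface or route ever shows up.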

The big drawback is it only handles TCP, with an option to also handle DNS traffic. But not arbitrary UDP or all IP. I think for many practical things that doesn’t matter, but it is a drawback.

The nice thing is it’s super easy to set up and doesn’t require any server infrastructure. It’s a clever hack. Clients work on MacOS and Linux, at least.

13,936 Go games

Inspired by a visualization of which chess pieces die in aggregate games (seen on Reddit today), I looked at 13,936 Go games and made a simple map of where people place stones. The HSL color scale is linearly proportional to the number of moves; the text is the percentage of games where someone placed a stone on that spot. (A bit fudged; I didn’t account for games where people place a stone more than once on the same spot.)

[Screenshot: heatmap of stone placements on the Go board]

No big surprises, but it’s nice to do little projects like this.
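For the curious, the counting is only a few lines. Here’s a sketch of the approach in Python (the games/*.sgf path and the regex-based move extraction are inventions for illustration, not a real SGF parser; unlike my quick version, this one dedupes repeated plays on the same point within a game):

import collections
import glob
import re

# matches SGF moves like ";B[pd]"; 'a' through 's' covers a 19x19 board
MOVE_RE = re.compile(r";[BW]\[([a-s])([a-s])\]")

move_counts = collections.Counter()   # total stones ever placed on each point
game_counts = collections.Counter()   # games with at least one stone on each point
num_games = 0

for path in glob.glob("games/*.sgf"):
    num_games += 1
    seen = set()
    for m in MOVE_RE.finditer(open(path).read()):
        point = (ord(m.group(1)) - ord("a"), ord(m.group(2)) - ord("a"))
        move_counts[point] += 1
        seen.add(point)
    for point in seen:
        game_counts[point] += 1

for point, games in sorted(game_counts.items()):
    print(point, move_counts[point], 100.0 * games / num_games)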

Data storage for Logs of Lag

My little Logs of Lag project is doing pretty well, so I need to create some sort of more serious datastore for it. It’s a small service, at most 1000 logfiles uploaded a day (each about 20k), but over time that adds up. Right now my data store is “put a file with a random name in a directory”. And I have no database or analysis of all the files.

The high-end solution would be to start putting data files in AWS and create a local Postgres database for aggregate statistics. I know how to build this. But that adds two big moving parts, one of which I have to maintain myself. And it’s kind of overkill for the relatively small amount of traffic I have (and am likely to ever have).

So I’m thinking now I’ll just keep storing the files themselves on the filesystem, but spread out over a bunch of directories so I don’t have a single giant one. The Linux kernel no longer chokes on lots of files in a directory, but shell tools are still a PITA. File names are base64-encoded strings, so maybe hash on the first two characters, for 4096 directories. That’ll carry me to about 16M log files (4096 files per directory), good enough. The file system only has 60M inodes available anyway. (Unfortunately “-” is a valid leading character in the filenames. Oops.)
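A sketch of what I have in mind, with a made-up root path; prefixing each bucket directory with a constant letter at least keeps the directories themselves from starting with a hyphen:

import os

def shard_path(root, name):
    # First two base64 characters pick one of 64 * 64 = 4096 buckets.
    # A constant prefix on the directory name keeps a leading '-' from
    # looking like a flag to shell tools.
    bucket = "d" + name[:2]
    directory = os.path.join(root, bucket)
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, name)

# shard_path("/var/logsoflag", "Qx3k...") -> "/var/logsoflag/dQx/Qx3k..."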

I definitely want a database; I’m curious about aggregate stats. I’m thinking of keeping this all asynchronous: when a user uploads a log it doesn’t go straight to the database, it’s just written to a file. Then a cron job picks up the files and does the postprocessing. That works fine as long as the database isn’t needed by the user-facing application. It’s not right now; in fact the server isn’t needed at all, the user gets 95% of the value solely from client-side JavaScript.

What database? I should probably bite the bullet and just use Postgres, but I hate having to manage daemons. I wonder if SQLite is sufficient? It supports concurrent read access but only one writer at a time, and the writer blocks readers “for a few milliseconds”. I think that constraint isn’t a problem for me. Right now I’m tempted to go for that, just for the fun of doing something new.
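The cron job could stay dead simple. A sketch, with an invented schema and paths rather than the real thing:

import glob
import json
import os
import sqlite3

db = sqlite3.connect("/var/logsoflag/stats.db")
db.execute("""CREATE TABLE IF NOT EXISTS logs (
                name TEXT PRIMARY KEY,
                players INTEGER,
                avg_ping REAL)""")

for path in glob.glob("/var/logsoflag/incoming/*.json"):
    with open(path) as f:
        data = json.load(f)
    # INSERT OR IGNORE makes the job safe to re-run over the same files
    db.execute("INSERT OR IGNORE INTO logs VALUES (?, ?, ?)",
               (os.path.basename(path), data.get("players"), data.get("avgPing")))
db.commit()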

Another problem to solve is the log parsing code. Right now logs are parsed by the client browser, in JavaScript, and the client sends both the raw log text file and the parsed data (as JSON) to my server. (The server is a Python CGI which just writes the files to disk.) I’d like to retain that client-only capability, but also start parsing logs on my server. I don’t really want to maintain a second parser in Python.
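That CGI really is just a few lines. A sketch, with the form field name and paths invented:

#!/usr/bin/env python3
import base64
import cgi
import os

form = cgi.FieldStorage()
# random 12-character base64 name; '-' really can land in the first position
name = base64.urlsafe_b64encode(os.urandom(9)).decode()
with open(os.path.join("/var/logsoflag/incoming", name + ".log"), "wb") as f:
    f.write(form["rawlog"].value)   # "rawlog" is a made-up upload field name
# the parsed JSON would be a second field written out the same way
print("Content-Type: text/plain\n")
print(name)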

So maybe I write the server parser / database stuff in Node to reuse the Javascript parser. Here’s MapBox’s node-sqlite. There’s zero need for me to make these scripts asynchronous, so the Node paradigm is not a great match, but I can certainly make it work.

(Naively I’d thought Node startup times would be bad, like Java, but that’s not true. From Node v0.10 on Ubuntu, “NODE_PATH=/tmp time node -e 1” takes about 27ms, compared to Python’s 15ms. Not enough difference to matter. strace shows Node makes about 230 system calls for an empty script (it’s nondeterministic!), compared to Python’s 883.)