More Logs of Lag aggregate stats

Following up on my analysis of the average LoL game, here are some scatterplots and histograms of various per-game statistics. Specifically, for each game I’m looking at match length (0-3600 seconds is typical), best ping (10-150ms is typical), median ping (10-150ms is typical), and packets lost per minute (0-1 is typical, but with a long tail).

Huge caveat for this data: this is not a valid statistical sample of all LoL players. It’s a random sample of Logs of Lag users, almost certainly biased towards people with network problems. I don’t intend to publish this data widely because it’s not a fair sample.

Honestly, the scatter plot is not the best treatment here; too many dots, not enough correlation. The histograms at top and right examining single variables are probably more interesting. Some notes:

  • Median ping and best ping are pretty tightly correlated. No big surprise there; they are nearly identical on a connection with low jitter and no packet loss. Not sure about the diagonal lines above the x=y line, maybe an artifact of how the game synthesizes a fake ping number in the presence of packet loss?
  • People’s pings form an interesting bimodal distribution, with peaks at roughly 50ms and 110ms. I can’t account for the relative unpopularity of 80ms. If this were all North American data I’d say it was west coast vs. east coast, but I get a fair number of European uploads too.
  • Packet loss / minute has a lovely long-tail distribution. FWIW, I call anything over 5 a “bad link”; that’s ~30% of my logs, so perhaps I should move that number up.
  • I was hoping to see evidence of a correlation between packet loss and best ping, on the theory that longer connections would be worse. I don’t see any evidence of that in the picture; it’s just a mess of dots. I need a better analysis tool (see the sketch after this list). Same problem with packet loss vs. match length.
  • Similarly, I can’t draw any conclusion from median ping vs. match length. The previous graph I did suggests that players’ average ping does not go up during the game.
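For what it’s worth, a minimal sketch of the better analysis I have in mind: compute a plain Pearson correlation over the per-game numbers instead of squinting at dots. The games array and its field names (bestPing, lossPerMin) are invented here purely for illustration; the real logs aren’t in this shape.

// Pearson correlation between two equal-length arrays of numbers
function pearson(xs, ys) {
  var n = xs.length;
  var mx = 0, my = 0;
  for (var i = 0; i < n; i++) { mx += xs[i]; my += ys[i]; }
  mx /= n; my /= n;
  var num = 0, dx = 0, dy = 0;
  for (var j = 0; j < n; j++) {
    num += (xs[j] - mx) * (ys[j] - my);
    dx += (xs[j] - mx) * (xs[j] - mx);
    dy += (ys[j] - my) * (ys[j] - my);
  }
  return num / Math.sqrt(dx * dy);
}

// games: hypothetical per-game stats, values made up for illustration
var games = [
  { bestPing: 48, lossPerMin: 0.4 },
  { bestPing: 110, lossPerMin: 1.2 },
  { bestPing: 75, lossPerMin: 0.1 }
];
var r = pearson(games.map(function (g) { return g.bestPing; }),
                games.map(function (g) { return g.lossPerMin; }));
console.log('best ping vs. packet loss/min correlation:', r);

Given how long-tailed the packet loss numbers are, a rank correlation (Spearman) would honestly be more appropriate than Pearson, but this shows the idea.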

MacOS: sshfs and FUSE

When I first started using Macs the situation with FUSE, the user-space filesystem shim, was a mess. But it’s stabilized now and OSXFUSE appears to be the winner. Their binary packages for OSXFUSE and sshfs work just great. Here are some very n00b-friendly instructions.

sshfs -C ub:/home/nelson /Volumes/ub
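One n00b detail I glossed over: the mount point has to exist before you mount, and when you’re done you unmount it like any other filesystem. (Whether the mkdir under /Volumes needs sudo seems to depend on your setup; the paths are just mine.)

mkdir -p /Volumes/ub
umount /Volumes/ub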

I’m not sure if it’s possible to automount. sshfs needs my ssh credentials, which aren’t available at login. Probably not worth the hassle.

Unfortunately the git prompt does not work well with sshfs; if you’re in a git-controlled directory it tries to do a whole bunch of file stats over sshfs, which is slow even though sshfs has some caching. The blunt fix is to turn the prompt off entirely:

unset PROMPT_COMMAND

Even better, I use liquidprompt, which lets you disable git in certain paths via LP_DISABLED_VCS_PATH, for just this reason.
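If I’m remembering the configuration right, it’s a colon-separated list of paths; something like this in ~/.liquidpromptrc (the path is just my sshfs mount point, adjust to taste):

# disable VCS (git) prompt features under slow sshfs mounts
LP_DISABLED_VCS_PATH="/Volumes/ub"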

The other drawback with sshfs is that Finder is very slow. MacOS seems to do a lot of work on every single file, even when all it ends up displaying is the filename.

Companies selling “make your network better” products

Some interesting companies I ran across, all selling products to make home internet better. I can’t tell if any of these really work, but it’s an interesting market category.

Netduma is going to sell a router optimized for gamers. There’s very little info on the website, but so far it seems like better firmware, along the lines of what you get if you know how to make Tomato or DD-WRT work. I.e., “Anti-Flood” sounds like QoS scheduling. This product seems smart, in that it’s very hard right now to buy a good consumer router with good firmware. They could also get deeper into tweaking TCP/IP (messing with window sizes, Nagle’s algorithm, etc.), but even just doing decent QoS and buffering is a big help.

Speedify is a service that offers channel bonding for higher throughput by using multiple links at once. It sounds a little like a VPN: their client sends your traffic through their “speed servers”. I’m skeptical this will really work in practice, but it’s interesting.

WTFast promises faster gaming, “cut your latency in half or better!”. The site is very marketing-heavy and sounds like snake oil to me, but maybe it could work. They have a “Gamer Private Network” which they claim is faster than the public Internet. There’s a $250k Kickstarter for router hardware, selling a 300Mbps wireless router for the very good price of $50. They have a working software client, and these Redditors report it helps them. I suppose a VPN service optimized for gamers makes some sense.

Bandwidth usage of LoL games over time

I had a theory that packet loss often gets worse after about 20 minutes of a League of Legends game, because people tend to group up and there are more team fights. As a quick test of that theory, here’s a graph of the average characteristics over game time, from thousands of LoL netlogs for games that last 30-60 minutes.

[Graph: average bandwidth, ping, and packets lost over game time]

Not a lot of trend here. Bandwidth (grey line) does go up over time, but not radically. That could be as much because cooldowns tend to go down as folks gear up (more spells being cast) as any effect of people grouping up. No useful visible trend in ping or lost packets, but that may be because the analysis is not good enough.

Lots of errors in this. The regular drops every minute are probably an aliasing problem having to do with the way the logs sample every 10 seconds (usually). The population of netlogs is a mix of various game types, although by limiting it to 30+ minute games it’s mostly the main Summoner’s Rift games. (ARAM would significantly distort my hypothesis.)

Averages are a terrible way to calculate aggregate stats about packet loss and ping. But I don’t think they’d hide a trend of average ping or packet loss going up as a game progresses; I’d expect to still see it. Maybe not with packet loss, though; the aggregate loss is so low that perhaps it’s just not visible.
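For the record, the kind of aggregation I’m describing is nothing fancy: bucket every netlog sample by minute of game time and average. A rough sketch of the idea (not my actual analysis code; the field names are made up and the real netlog format differs):

// samples: one entry per ~10-second netlog report, hypothetical shape
// {t: seconds into the game, kbps: bandwidth, ping: ms, lost: packets lost}
function perMinuteAverages(samples) {
  var buckets = {};
  samples.forEach(function (s) {
    var m = Math.floor(s.t / 60);
    var b = buckets[m] || (buckets[m] = { n: 0, kbps: 0, ping: 0, lost: 0 });
    b.n++; b.kbps += s.kbps; b.ping += s.ping; b.lost += s.lost;
  });
  return Object.keys(buckets).sort(function (a, b) { return a - b; }).map(function (m) {
    var b = buckets[m];
    return { minute: +m, kbps: b.kbps / b.n, ping: b.ping / b.n, lost: b.lost / b.n };
  });
}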

Testing DNS without caching

Helping a friend figure out a DNS problem, I passed on some ancient wisdom about how to test whether your DNS is right. The key thing is to avoid all the caches and make direct queries to the authoritative servers.

$ whois somebits.com

Ask the Internet name authority about the domain somebits.com. Lots of crap is returned; what you care about is the name servers.

$ whois somebits.com | grep 'Name Server'

Name Server: NS1.HOVER.COM
Name Server: NS2.HOVER.COM
Name Server: NS3.HOVER.COM

So now we know that there are three authoritative DNS servers for the domain. Let’s ask one of them for information (and strip out the comments, the lines starting with a semicolon).

$ dig @NS1.HOVER.COM somebits.com any | grep -v '^;'

somebits.com. 900 IN NS ns1.hover.com.
somebits.com. 900 IN A 107.150.51.74
somebits.com. 900 IN SOA ns1.hover.com. dnsmaster.hover.com. 1394118405 10800 3600 604800 900
somebits.com. 900 IN NS ns2.hover.com.

ns1.hover.com. 900 IN A 216.40.47.26
ns2.hover.com. 900 IN A 64.98.148.13

The important thing here is learning the A record for somebits.com, pointing to 107.150.51.74. That’s the DNS record for my server.

If you’re debugging a DNS problem, start with this type of query. Go straight to the authoritative name server. If those records are wrong then it’s hopeless. If they look right then something else can still have gone wrong, particularly cached data, whether cached in a DNS resolver or by client software. Google Chrome seems overaggressive about caching DNS info, for instance. Java used to be awful.
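Two more dig tricks that fit this workflow (standard dig features, nothing specific to this domain): +trace follows the delegation from the root servers down, and comparing the authoritative answer against a public recursive resolver like 8.8.8.8 shows whether you’re looking at stale cached data.

$ dig +trace somebits.com
$ dig @ns1.hover.com somebits.com A +short
$ dig @8.8.8.8 somebits.com A +short

If the last two disagree, a cache somewhere hasn’t caught up yet.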

Node.js sqlite3: very slow bulk inserts

I’m running into a baffling problem where the straightforward way to use prepared statements in the Mapbox sqlite3 Node.js module is super slow. Like inserting 100 rows into a table takes 16 seconds!

Update: found the problem; I wasn’t using transactions right. See the bottom. The simple solution for faster bulk inserts in sqlite3: do them inside a transaction.

The API docs have a strange note for Statement.finalize() which says “if you experience long delays before the next query is executed, explicitly finalizing your statement might be necessary”. I’m seeing exactly that. But after you’ve finalized it you can’t reuse the prepared statement; what’s the point of using statements at all then?

There are notes online that naive bulk inserts in sqlite3 are slow, and it’s nothing to do with Node. See StackOverflow and Blog::Quibb. The quick fix I learned from both of those is to wrap the insert block in a transaction, because without that sqlite3 treats each insert as a separate transaction and is understandably slow. OK, so we’ll use a transaction.

But even with transactions, using prepared statements is still slow in the Node version. I don’t know if this is a bug, me not understanding sqlite3 right, or me not understanding Node right. I am using serialized mode and trying to do simple non-concurrent programming. The delay only occurs when the database is closed, BTW; sqlite3 is accumulating the changes in a journal before committing them.

Three approaches:

  1. Call db.run(‘INSERT…’) a bunch of times. This is fast, but only if I wrap the inserts in a transaction.
  2. Prepare a statement once, then call stmt.run(‘INSERT…’) a bunch of times. This is slow whether I use a transaction or not.
  3. Prepare a separate statement for each row, then call stmt.run(‘INSERT…’) on it and then stmt.finalize(). This is fast.

Sample code is in this gist. “db” and “finalize” both take about 400ms on my Linux box. “reuse” takes 16 seconds to insert the 100 rows.

Update: the problem was that I was creating the prepared statement before I did the “begin transaction”. Apparently that means the prepared statement is executed outside the transaction context. If I begin the transaction and then prepare the statement, all three methods run fast. I suspect the call to finalize has nothing to do with it, but I’m not sure.

Here’s the revision view: version one had the begin transaction too late.
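For posterity, here’s roughly what the fixed ordering looks like: a minimal sketch using the node-sqlite3 API, with made-up table and file names rather than the exact code from the gist.

var sqlite3 = require('sqlite3').verbose();
var db = new sqlite3.Database('test.db');

db.serialize(function () {
  db.run('CREATE TABLE IF NOT EXISTS stuff (id INTEGER, name TEXT)');
  db.run('BEGIN TRANSACTION');
  // prepare *after* BEGIN so the statement executes inside the transaction
  var stmt = db.prepare('INSERT INTO stuff VALUES (?, ?)');
  for (var i = 0; i < 100; i++) {
    stmt.run(i, 'row ' + i);
  }
  stmt.finalize();
  db.run('COMMIT');
});

db.close();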