Battlefield 4 graphs

Some quick data views, looking at player statistics from the game Battlefield 4. I’m scraping stats from a site that lets me get a bunch of numbers for players: statistics like “Kill/Death Ratio” and “Score per Minute” and the like. I collected a bucket of stats for the top 1000 players (by rounds completed) for all 5 platforms, then graphed various variables. Honestly I haven’t found a very interesting story in this data yet. The main thing I’ve learned is that time played is not really correlated with any measure of skill, i.e. no one learns to play better. But there’s more to consider.
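One way to put a number on "not really correlated" is a Pearson correlation coefficient between time played and a skill stat. This is only a sketch of how that check could look, not the code behind the graphs; the field names passed to `correlateFields` (like `timePlayed` and `kdRatio`) are assumptions about how the scraped records might be shaped.

```javascript
// Pearson correlation coefficient between two equal-length numeric arrays.
// Result is in [-1, 1]; it is NaN if either array has zero variance.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length === 0) {
    throw new Error("arrays must be non-empty and the same length");
  }
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Correlate two named fields across an array of player records.
// The field names are hypothetical; use whatever the scraped data has.
function correlateFields(players, fieldX, fieldY) {
  return pearson(
    players.map((p) => p[fieldX]),
    players.map((p) => p[fieldY])
  );
}
```

Something like `correlateFields(players, "timePlayed", "kdRatio")` coming out near zero would back up what the scatter plots suggest. Pearson only catches linear relationships, so it's a sanity check rather than proof.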

Each scatter plot shows two variables (labeled on the axes); the histogram below it shows the distribution of the x-axis variable.

[Figures: Kill/Death Ratio vs. Time Played; Win/Loss Ratio vs. Kill/Death Ratio]

Interactive data exploration

I’m exploring a multivariate data set. Battlefield 4 data, to be exact: I have a dataset for thousands of players with such statistics as “time played”, “kill/death ratio”, “win/loss ratio”, etc. Fun stuff. Now I’m exploring it looking for interesting patterns, clusters, etc. And I’m doing it by hand-writing Javascript code with D3.

I wish there were a solid consensus tool for exploring datasets and showing graphics. Excel and R seem to be the two most popular. But Excel is too primitive and R is too much like programming already.

 

Wanted: browser based page scraping

Doing yet another HTML scraping project, and contemplating my options: put up with the slowness and desolation that is BeautifulSoup, or spend hours learning scrapy. Surely there’s something better by now?

There is: my web browser, with DOM CSS queries. Just load the page and do querySelector and you’re done. Most modern HTML is quite nicely scrapable in browser Javascript. The problem is you can’t effectively script a browser to process thousands of pages. I’d hoped node.js would offer a solution, but it doesn’t have a battle-hardened HTML parser like a browser has. There are some options; I wonder if any are worth the time to learn about.
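For the in-browser approach, here’s a minimal sketch of what "do querySelector and you’re done" can look like. The helper name `extractText` and the selector in the usage note are made up for illustration; the only real API used is the standard DOM `querySelectorAll`.

```javascript
// Collect the trimmed text of every element under `root` matching a CSS
// selector. `root` is anything with querySelectorAll, e.g. `document`.
function extractText(root, selector) {
  return Array.from(root.querySelectorAll(selector), (el) =>
    el.textContent.trim()
  );
}
```

Pasted into the developer console on a loaded page, `extractText(document, "table.stats td")` would return the cell text as an array of strings (the `table.stats td` selector is hypothetical and depends entirely on the page’s markup).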

 

Chrome extension gripes

Working along on my Chrome extension, generally impressed with how well thought out and thoroughly documented the extension support is. That being said, some wrinkles…

There’s no support for reloading extension code when it’s changed during development. There are some hacks; I’m using Extensions Reloader, which gives me a button to press to reload the Javascript code. But it won’t reload the manifest, so it’s not a complete solution. And even then I have to hit the extension reload button and then refresh the page to debug stuff, which is awkward.

Chrome has adopted an awkward convention for signaling errors in the extension API: the variable chrome.runtime.lastError is set. This makes checking for errors a huge nuisance, but maybe it’s the only way to do this given the weird lifecycle of extensions? Good thing Javascript is single threaded :-P. It’s a shame Javascript’s built-in exceptions are not very useful. I like the D3 pattern of passing an error object to the callback function; at least it makes the error variable a bit more explicit.
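One way to get the D3-style error-first callback on top of the lastError convention is a small adapter. This is a sketch, not anything Chrome provides: `errorFirst` is a made-up helper name, and the `chromeApi` object is passed in as a parameter (an assumption so the code can be tried outside an extension; inside one you’d pass the global `chrome`).

```javascript
// Adapt a Chrome-style callback into an error-first callback.
// Chrome sets runtime.lastError only while the API callback runs, so the
// check has to happen inside the wrapper, not after the API call returns.
function errorFirst(chromeApi, callback) {
  return function (...results) {
    const err = chromeApi.runtime.lastError;
    if (err) {
      // lastError is an object with a `message` string.
      callback(new Error(err.message));
    } else {
      callback(null, ...results);
    }
  };
}
```

In an extension this would be used something like `chrome.storage.sync.get("enabled", errorFirst(chrome, (err, items) => { ... }))`, where the key `"enabled"` is just an example. The error is then an explicit first argument instead of a global you have to remember to look at.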

The Chrome extension storage API is fairly capable. I particularly like that it lets you store data in a place that Chrome synchronizes between the user’s browsers. But the API is just different enough from DOM localStorage to be a bit obnoxious. Also the get() method is incredibly slow, like 500ms to retrieve my 10 bytes of state. That means I can’t just store stuff there and fetch it from my content script; it’s too slow for something whose purpose is to modify a page’s presentation. So I have to create a background page and move the config fetching there, an unwelcome complication. Update: it’s not always so slow. Sometimes it’s only 40ms. Sometimes it’s 200ms. I’m running a bunch of other extensions, so I should test it more cleanly.
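For testing the get() latency more cleanly, a tiny timing wrapper is enough. This is a sketch under assumptions: `timedGet` is a made-up helper, and it takes the storage area as a parameter so it can be pointed at `chrome.storage.sync` or `chrome.storage.local`. Date.now() only has millisecond resolution, which is fine for delays in the tens to hundreds of milliseconds.

```javascript
// Time a single storage get() call. `storageArea` is expected to expose
// get(keys, callback), as chrome.storage.sync and chrome.storage.local do.
// Calls done(elapsedMs, items) once the storage callback fires.
function timedGet(storageArea, keys, done) {
  const start = Date.now();
  storageArea.get(keys, (items) => {
    done(Date.now() - start, items);
  });
}
```

From a content script, `timedGet(chrome.storage.sync, ["enabled"], (ms, items) => console.log(ms + "ms", items))` would log each fetch’s latency (again, `"enabled"` is a hypothetical key). Running that a few times with other extensions disabled should show whether the 500ms case is the storage API or just contention.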

 

My first Chrome Extension

I wrote a Chrome Extension today. Or the start of one. I want to modify the content of some pages on Metafilter in order to try out a UI idea. I started with the diediedead userscript, which runs just fine by itself if you manually load it. But I want a proper Chrome extension so I can make it configurable (and easily usable), so I read up on how to build Chrome extensions. It’s easy, at least for the simple thing I’ve done so far, although there’s a lot more to deal with for a more complex extension.

The one hassle is Chrome doesn’t auto-reload extensions under development. It does reload userscripts though, weird. Anyway the Chrome Extensions Reloader Extension helps, at least there’s a single button to press.

Automatic software updates considered harmful

Another day, another bandwidth management problem. Being on a slow 1Mbit link really makes you aware of how much crap Apple products download, with no user control.

Today’s scenario: trying to watch LoL tournament games on Twitch. That’s an 800kbps stream, so anything else running on my 1000kbps link is instantly noticeable. Like, say, my laptop. The one I’d like to use for very light web browsing while watching. Only the moment I open it up a bunch of Apple shit decides now’s the time it has to download updates. All invisibly, with no user feedback or control. Right now it’s softwareupdated downloading who knows what or how much; I’m guessing it’s the 120MB command line developer tools. Yesterday storeagent decided it had to download something, I still don’t know what, but that one started downloading with the lid closed and the machine nominally asleep.

Background downloads that happen quietly so your machine is ready to go are great. Background downloads that run in the middle of the day and consume all bandwidth are bad. If only they had some network saturation detection, or if there were a way to specify low priority network usage, or something. But this is TCP/IP and there’s not, so Apple’s a bit stuck. Still, it’s awful and annoying.

Oh yeah, you can’t even kill storeagent nicely. It ignores SIGINT. I finally did a SIGKILL. It seems to have recovered intelligently from that.