Just finished a small longitudinal data project. Not real happy with how it turned out, but figured I should document it anyway. Frustrated it doesn’t look better, also that it didn’t get more attention.
Here’s the deliverable: a Reddit post. It’s a visualization of rank inflation in League of Legends over the course of the last year, Season 6. I collected the data by hand out of curiosity. The resulting visualization more or less confirms what people who pay attention already knew; there is inflation. Only 15 upvotes which is crappy engagement for something I spent several hours on.
I tinkered with the visualization for a while. The colors are meaningful: they represent Bronze / Silver / Gold / Platinum to folks who play the game. (There’s also White for Diamond+.) There are 5 divisions within each tier; those are the black lines. And then some ugly caption text because I wanted to condense the whole thing down to a single shareable image.
I produced it using IPython, Pandas, and matplotlib. I really like Pandas’ abstraction of a DataFrame; it’s a very powerful tool for working with a matrix of numbers. Even so, I have a lot of data massaging code to get from my original spreadsheet to a clean list of numbers: deleting extra rows, transposing rows/columns, converting “93%” to 0.93, and so on.
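For a sense of what that massaging looks like, here is a minimal sketch of the three steps. It is not my actual cleanup code: the `raw` sheet, its labels, the `scratch` row, and the `clean` helper are all made up for illustration.

```python
import pandas as pd

# Hypothetical raw sheet: one row per division, one column per sample date,
# plus a stray non-data row of the kind a hand-kept spreadsheet picks up.
raw = pd.DataFrame(
    {
        "2016-02-01": ["93%", "61%", "notes"],
        "2016-06-01": ["91%", "58%", None],
    },
    index=["Bronze I", "Silver I", "scratch"],
)


def clean(sheet: pd.DataFrame, junk_rows=("scratch",)) -> pd.DataFrame:
    """Drop non-data rows, put dates on the rows, turn '93%' into 0.93."""
    data = sheet.drop(index=list(junk_rows))  # delete the extra rows
    data = data.T                             # transpose: dates become rows
    data.index = pd.to_datetime(data.index)   # parse the date labels
    # Strip the percent sign and rescale to a fraction, column by column.
    return data.apply(lambda col: col.str.rstrip("%").astype(float) / 100)


cleaned = clean(raw)
print(cleaned)
```

The result has one row per date and one float column per division, which is the shape the plotting calls want.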
I really don’t like matplotlib. I’ll paste my graph generation code below; it’s super ugly. There are something like seven ways to set options: flags passed to the Pandas plot() function, functions called on the global matplotlib object, functions called on the global pyplot object, functions called on the axis objects, two different ways to modify RC parameters, and so on. Just a mess. Maybe this is my ignorance, though, and there’s a way to simplify / rationalize it all.

    # Imports the snippet relies on. tdf (tier boundaries) and df (division
    # boundaries) are the cleaned DataFrames built earlier.
    import numpy
    import matplotlib
    import matplotlib.ticker
    import matplotlib.pyplot as plt

    # Plot
    # Note: on matplotlib 3.6 and later this style is named 'seaborn-v0_8-whitegrid'
    plt.style.use('seaborn-whitegrid')
    matplotlib.rc('axes', grid=True)
    matplotlib.rc('grid', color='w')
    matplotlib.rcParams['font.family'] = ['Liberation Sans']

    # Filled area for the tiers
    ax = tdf.plot(
        kind='area', stacked=False, figsize=(10, 6), legend=False,
        linewidth=0, alpha=1.0,
        color=('#ffffff', '#87fffd', '#ffdb57', '#dce2f2', '#b2a07e'))
    ax.set_yticks(numpy.arange(25, 100.01, 25))
    ax.yaxis.set_major_formatter(matplotlib.ticker.FormatStrFormatter('%.0f%%'))

    # Lines for every division
    df.plot(ax=ax, legend=False, color='k', linewidth=0.5)

    # Configure the chart a bit more
    ax.set_axisbelow(False)
    plt.title('Season 6 distribution of ranks in NA')
    plt.text(0, -35, '''There is a general trend of rank inflation...''',
             linespacing=1.4, size=12)
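One possible way to rationalize this, offered as a sketch rather than the canonical answer: create the figure and axes once, then set everything through methods on those two objects instead of mixing global `matplotlib`, `pyplot`, and RC calls. The `tdf` and `df` frames here are tiny made-up stand-ins for the real data, and the numbers mean nothing. The style sheet, font, and caption are left out to keep the example short.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np
import pandas as pd

# Made-up stand-ins: cumulative percent of players at or below each tier / division.
tdf = pd.DataFrame(
    {
        "Diamond+": [100, 100, 100, 100],
        "Platinum": [98, 97, 96, 95],
        "Gold": [90, 89, 88, 87],
        "Silver": [70, 68, 67, 66],
        "Bronze": [35, 33, 31, 30],
    },
    index=range(4),
)
df = pd.DataFrame({"Silver III": [52, 50, 49, 48], "Gold III": [80, 79, 78, 77]},
                  index=range(4))

# One figure, one axes; every later setting goes through these two objects.
fig, ax = plt.subplots(figsize=(10, 6))

# Filled area for the tiers, drawn on the axes we already own.
tdf.plot(kind="area", stacked=False, ax=ax, legend=False, linewidth=0, alpha=1.0,
         color=["#ffffff", "#87fffd", "#ffdb57", "#dce2f2", "#b2a07e"])

# Lines for every division.
df.plot(ax=ax, legend=False, color="k", linewidth=0.5)

# Grid, ticks, and title as Axes methods instead of RC parameters and pyplot calls.
ax.grid(True, color="w")
ax.set_axisbelow(False)  # draw the grid on top of the filled areas
ax.set_yticks(np.arange(25, 100.01, 25))
ax.yaxis.set_major_formatter(mticker.FormatStrFormatter("%.0f%%"))
ax.set_title("Season 6 distribution of ranks in NA")
```

Pandas' `plot()` flags are still there, but everything after that lives on `ax`, which cuts the seven ways of setting options down to roughly two.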