Identify Reddit deplorables

Interesting new Reddit tool: Masstagger. You install it and it pops up little red warnings next to users’ posts: “the_donald user”, “kotakuinaction user”, or the like. A quick way to get some insight into a Redditor’s history and reputation. Makes it easy to identify the Nazi-wannabes at least.

More about it in this Reddit discussion. I particularly like the author’s responses to the kind of crap these projects always attract. “Why not open source? … Because I don’t want to”. “This is just like giving Jews yellow stars! … No, not really.” “Can I add my own subreddits to tag to meet my own personal desires? … No, this identifies Nazis. My old tool was editable and people used it to stalk porn posters.”

Behind the scenes the way it works is they have a list of deplorable subreddits (104 right now) that they monitor. The server on the backend is constantly downloading posts to those subreddits and keeping statistics on which users post there. There’s a second service that lets you look up the scores for a list of usernames. That’s used by the browser addon; when you load a Reddit page it gets the scores and annotates accordingly.
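Here's a rough sketch of those two pieces as I understand them: something that counts posts per user per monitored subreddit, and a lookup that takes a list of usernames. This is my guess at the shape, not the author's actual code (which isn't public); the names `MONITORED`, `record_post`, and `lookup` and the placeholder subreddit names are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical placeholder names; the real list has 104 entries.
MONITORED = {"example_sub_a", "example_sub_b"}

# username -> Counter mapping subreddit -> number of posts seen there
stats = defaultdict(Counter)


def record_post(username, subreddit):
    """Called for every post the backend downloads (assumed ingest hook)."""
    if subreddit in MONITORED:
        stats[username][subreddit] += 1


def lookup(usernames):
    """The second service: return counts for the usernames we know about."""
    # The membership check avoids creating empty entries in the defaultdict.
    return {u: dict(stats[u]) for u in usernames if u in stats}
```

The addon side would then call something like `lookup` once per page with every username it sees, and annotate only the users that come back.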

They had some scaling problems today; unfortunately the service is dynamically generating the statistics data when users ask. I was thinking they could just do things statically, generating a statistics file once an hour for the addon to download. But tracking 100,000 users over 100 subreddits is 10M records, or maybe 200MB of static data. That’s a lot to serve in a single file.
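The back-of-envelope math, spelled out. The 20 bytes per record is my assumption; it's just what the 200M figure works out to when divided by 10M records.

```python
USERS = 100_000
SUBREDDITS = 100
BYTES_PER_RECORD = 20  # assumption: implied by ~200M of data / 10M records

# One record per (user, subreddit) pair, i.e. the worst case.
records = USERS * SUBREDDITS
total_bytes = records * BYTES_PER_RECORD

print(f"{records:,} records, about {total_bytes / 1e6:.0f} MB")
# prints "10,000,000 records, about 200 MB"
```

This counts a record for every user in every subreddit. If most users only post in a handful of the tracked subreddits, a file that only stores nonzero counts could be a lot smaller, but I don't have real numbers for that.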

There’s a variety of existing “profile Reddit users” sites; see SnoopSnoo, Reddit User Analyser, and Reddit Investigator. I wonder if any of them have a backend suitable for this use? Reddit User Analyser works by fetching comments from Reddit directly in the browser page; there’s no server, so it’s probably too heavyweight for this addon. SnoopSnoo appears to have a database on the backend; the report pages come back with little bits of data injected as scripts in the HTML source. Reddit Investigator is down right now.

Anyway it’d be pretty simple to build a custom service for this. Less clear how hard it’d be to make it scale. Static files are clearly the best choice, but it’s a lot of data. Maybe one static file per profiled user? That would require the addon to fetch like 40 static files with each page load. That’s not great, but it’s not awful.
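A minimal sketch of the file-per-user idea, assuming JSON files named after the username and served from some static host. The function names, the `.json` layout, and the base URL are all hypothetical; nothing here is how Masstagger actually works.

```python
import json
from pathlib import Path


def write_user_files(stats, out_dir):
    """Write one small JSON file per profiled user; return how many were written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for username, counts in stats.items():
        # Assumes usernames are safe to use as filenames
        # (letters, digits, underscores, hyphens).
        path = out / f"{username}.json"
        path.write_text(json.dumps(counts, sort_keys=True))
    return len(stats)


def urls_for_page(usernames, base_url):
    """URLs the addon would fetch for the distinct usernames on one page."""
    return [f"{base_url}/{u}.json" for u in sorted(set(usernames))]
```

A batch job would run `write_user_files` every hour or so, and the addon would call something like `urls_for_page` with the usernames on the page. Users with no file would just come back as a 404, which the addon would treat as "nothing to tag". Deduplicating usernames first keeps the fetch count down on threads where the same few people post a lot.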