Nelson's log

Butt-based machine learning

I’ve been doing some work with scikit-learn, but running code on your own computers is for chumps. Why not do our machine learning in the butt in the cloud? Here are the offerings from a few providers; there are lots of others, too. This cross-service developer test is interesting data on prediction and running times.

Google Prediction API. $0.50 / 1000 predictions, $0.002/MB of training data. Focussed on training prediction models, both numeric (regression) and categorical. Didn’t see any documentation of what kind of algorithms they implement.

Amazon Machine Learning. $0.10 / 1000 predictions, $0.42 / hour for training. The FAQ says it does logistic regression, but the docs say it does both categorical and numeric predictions. They also have some data exploration tools and quality metrics.

Microsoft Azure Machine Learning. $0.50 / 1000 “API transactions”, $2 / hour. Also a per-seat charge which is completely “lol Microsoft”, I mean seriously? Offers classification but also other stuff: text analytics, vision, speech. The algorithm cheatsheet makes it look like you have a lot of choices for basic prediction algorithms, too.

BigML: $0.10 / 1000 predictions, $0.01/MB of training data. Didn’t review which algorithms it supports, but clearly they do predictions. They’re a startup up in Portland and in Spain, with just $1.6M in funding in 4 years.
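
To get a feel for how those price lists stack up, here's a rough back-of-the-envelope sketch in Python. The per-prediction, per-MB and per-hour rates are the ones quoted above; everything else is my assumption: the workload (100,000 predictions, 500 MB of training data, 2 billed hours) is made up, I'm treating Azure's “API transactions” as one per prediction and its $2 / hour like Amazon's hourly charge, and Azure's per-seat fee is left out because I haven't listed an amount for it. The names `PRICES`, `estimate_cost` and `compare` are just for illustration.

```python
# Rates as quoted in this post (USD). Zero means the post lists no such charge.
PRICES = {
    "Google Prediction API": {"per_1k": 0.50, "per_mb": 0.002, "per_hour": 0.0},
    "Amazon Machine Learning": {"per_1k": 0.10, "per_mb": 0.0, "per_hour": 0.42},
    # Assumption: one "API transaction" per prediction; per-seat fee not included.
    "Microsoft Azure ML": {"per_1k": 0.50, "per_mb": 0.0, "per_hour": 2.00},
    "BigML": {"per_1k": 0.10, "per_mb": 0.01, "per_hour": 0.0},
}


def estimate_cost(price, predictions, training_mb, hours):
    """Sum the prediction, training-data and hourly charges for one service."""
    return (
        price["per_1k"] * predictions / 1000
        + price["per_mb"] * training_mb
        + price["per_hour"] * hours
    )


def compare(predictions=100_000, training_mb=500, hours=2):
    """Estimated cost per service for a hypothetical workload."""
    return {
        name: round(estimate_cost(price, predictions, training_mb, hours), 2)
        for name, price in PRICES.items()
    }


if __name__ == "__main__":
    # Print cheapest first.
    for name, cost in sorted(compare().items(), key=lambda item: item[1]):
        print(f"{name}: ${cost:.2f}")
```

For that made-up workload the per-prediction rate dominates, so the two $0.10 services come out well ahead of the two $0.50 ones. A job with lots of training data and few predictions would shift things around.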

I’m not qualified to evaluate these yet. Google’s the oldest and simplest, also the most limited. Amazon’s product looks pretty simple and pretty. Microsoft’s has the most bells and whistles but you have to wade through Microsoft’s sales cluelessness. No idea on BigML but I like the idea of a startup making headway here.