I was recently confronted at work with a classification problem: given a set of explanatory variables, which category will a player most likely end up in? (I work in the video game industry.)
To be frank, my statistical knowledge was a little rusty since I had been doing web dev for a year (unfortunately, stats are not like riding a bike: you do forget after a while). So I ended up doing a quick literature review to list the tools that could help me with this task.
I began with a logistic regression, but I wasn’t that happy with the accuracy of the results, and the implementation was not that easy because of the high volume of data I was dealing with.
Through my reading, I came across various machine learning techniques and was eager to try them out. I had heard about machine learning in the past, but it seemed like a mystical, out-of-reach corner of computer and information sciences.
And I was wrong. It’s accessible, it works and it’s a lot of fun (well, data-scientist-kinda-fun). What follows is my naive attempt at solving a problem that had puzzled me for a while: automatic shape recognition. I mean, how long did it take us as kids to be able to put those damn educational toys into the right-shaped slot? Well, quite a long time after all…