At bitly we study behaviour on the internet by capturing clicks on shortened URLs. This link traffic comes in many forms, yet when studying human behaviour we’re only interested in ‘organic’ traffic: the traffic patterns caused by actual humans clicking on links that have been shared on the social web. To extract these patterns, we use Python/NumPy, streaming Hadoop and some machine learning to build a model of organic traffic patterns from bitly’s click logs. This model lets us separate the traffic we’re interested in from the variety of patterns generated by inorganic entities following bitly links.
8th–9th November 2011