At bitly we study behaviour on the internet by capturing clicks on shortened URLs. This link traffic comes in many forms, yet when studying human behaviour we’re only interested in ‘organic’ traffic: the traffic patterns caused by actual humans clicking on links that have been shared on the social web. To extract these patterns, we use Python/NumPy, Hadoop Streaming and some machine learning to build a model of organic traffic patterns from bitly’s click logs. This model lets us extract the traffic we’re interested in from the variety of patterns generated by inorganic entities following bitly links.
by Yanpei Chen and Todd Lipcon
You can never have too much performance, but performance is a nebulous concept in Hadoop. Unlike the database world, Hadoop has no equivalent to TPC, and different use cases experience performance differently. This talk will discuss advances in how Hadoop performance is measured, as well as recent and future performance improvements in different areas of the Hadoop stack.
8th–9th November 2011