Monday 15th June, 2015
1:30pm to 2:00pm
Most of Taobao's production graph datasets are very large. An increasingly important challenge in network analysis is the efficient detection of communities in dynamic networks. The model needs to update communities dynamically from real-time data streams so that we can better predict users' behavior. In our work, we propose Hybrid Community Detection, a hybrid process model that takes full advantage of Spark by combining online incremental community detection using Spark Streaming with offline community detection using Spark GraphX. Results on real-world network data demonstrate that Hybrid Community Detection can continuously provide stable, high-quality results. Hybrid Community Detection is also much faster than other offline algorithms and shows great potential in many areas, such as fraud detection and marketing strategy.
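The abstract does not give the algorithm itself, so the following is only a minimal single-machine sketch of the general hybrid idea, written in plain Python rather than Spark. It assumes simple label propagation as a stand-in for the community detection method, integer node ids, and a hypothetical class name and `rebuild_every` parameter; none of these come from the talk. Each incoming edge triggers a cheap local relabel of its two endpoints (the online path), and a full recomputation over the whole graph runs periodically (the offline path).

```python
from collections import Counter, defaultdict


class HybridCommunityDetector:
    """Illustrative sketch only: hypothetical names, not the speakers' model."""

    def __init__(self, rebuild_every=1000):
        self.adj = defaultdict(set)        # undirected adjacency sets
        self.label = {}                    # node id -> community label
        self.rebuild_every = rebuild_every # assumed knob: edges between full rebuilds
        self.edges_since_rebuild = 0

    def _majority_label(self, node):
        # Most common label among neighbours; ties go to the smallest label.
        counts = Counter(self.label[n] for n in self.adj[node])
        if not counts:
            return self.label[node]
        best = max(counts.values())
        return min(lab for lab, c in counts.items() if c == best)

    def add_edge(self, u, v):
        """Online path: update the graph and relabel only the two endpoints."""
        self.adj[u].add(v)
        self.adj[v].add(u)
        for node in (u, v):
            self.label.setdefault(node, node)  # new nodes start in their own community
        for node in (u, v):
            self.label[node] = self._majority_label(node)
        self.edges_since_rebuild += 1
        if self.edges_since_rebuild >= self.rebuild_every:
            self.rebuild()

    def rebuild(self, max_iters=20):
        """Offline path: rerun label propagation from scratch on the full graph."""
        self.label = {n: n for n in self.adj}
        for _ in range(max_iters):
            changed = False
            for node in sorted(self.adj):
                new = self._majority_label(node)
                if new != self.label[node]:
                    self.label[node] = new
                    changed = True
            if not changed:
                break
        self.edges_since_rebuild = 0
```

In this sketch the online path touches only the endpoints of each new edge, so its cost per update depends on their degree rather than on the graph size, while the periodic rebuild corrects any drift the local updates accumulate. The smallest-label tie-break keeps the sketch deterministic but tends to merge communities joined by even a few edges, so a real system would need a stronger quality criterion.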