Visualizing AutoTrader traffic in near real-time with Spark Streaming

A session at Spark Summit 2015

Tuesday 16th June, 2015

2:00pm to 2:30pm PST

The Hadoop team at AutoTrader was tasked with moving the website's core metric logic off Netezza, where it ran as an hourly process. Two solutions were proposed: one on Hive and one on Spark. The Spark solution produced the hourly results in 1.5 minutes, compared to 18 minutes for Hive, and it is the one now running in production. A surprising additional benefit was how much faster development is with Spark: the team finished the Spark solution with a month to spare. With the hourly Spark results already validated, the team copied the code into a Spark Streaming job, then used d3.js to visualize the results in real time. In the three months since installing Spark on their cluster, AutoTrader went from an hourly Netezza process to a Spark Streaming visualization with a 30-second lag that delivers near real-time insights into site activity.
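The step of copying validated hourly logic into a streaming job can be illustrated with a minimal sketch. It is plain Python, not Spark and not the team's code: the event shape `(timestamp, page)`, the metric `page_view_counts`, and the helper `micro_batches` are all hypothetical names chosen for illustration. The idea shown is only that a metric function written once can be applied unchanged to an hourly batch or to 30-second windows.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Hypothetical log record: (timestamp in seconds, page type).
Event = Tuple[float, str]


def page_view_counts(events: Iterable[Event]) -> Counter:
    """Metric logic written once: count views per page type."""
    return Counter(page for _, page in events)


def micro_batches(
    events: Iterable[Event], interval: float
) -> Iterator[Tuple[float, List[Event]]]:
    """Group time-ordered events into consecutive fixed-width windows.

    Yields (window_start, events_in_window); windows with no events
    are yielded as empty lists so downstream charts see every interval.
    """
    batch: List[Event] = []
    window_start = None
    for ts, page in events:
        if window_start is None:
            # Align the first window to a multiple of the interval.
            window_start = ts - (ts % interval)
        while ts >= window_start + interval:
            yield window_start, batch
            batch = []
            window_start += interval
        batch.append((ts, page))
    if window_start is not None:
        yield window_start, batch


if __name__ == "__main__":
    sample = [(0.0, "search"), (10.0, "detail"), (31.0, "search")]
    # Batch view: one result for the whole period.
    print(page_view_counts(sample))
    # Streaming view: the same function applied per 30-second window.
    for start, window in micro_batches(sample, 30.0):
        print(start, page_view_counts(window))
```

Because the per-window results come from the same function as the batch result, summing them over a period should reproduce the already validated batch numbers, which is one way to sanity-check a streaming port.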

About the speaker

Jon Gregg

Sr. Analytics Engineer at AutoTrader.com
