Machine Learning and Social Media

A session at SXSW Interactive 2011

Monday 14th March, 2011

9:30am to 10:30am (CST)

Social media applications encounter messy user-generated data in blog posts, status updates, tweets, user profiles, etc. These documents contain free-form text that obeys no particular rules of grammar, punctuation or spelling.

If the data is so messy, how can a computer program recognize adult content or hate speech or spam? How can a computer program tell the difference between an advertisement and a product review? How can a computer program distinguish between a positive and a negative product review?

Machine learning offers some solutions. For example, given sample tweets labeled (by people) as spam or non-spam, machine learning tools can generate a program (or model) that attempts to duplicate the human judgments. You could use this kind of model in your application to filter out tweet spam.
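As a rough illustration of the idea above, here is a minimal sketch of learning a spam filter from labeled examples. It assumes a naive Bayes classifier with add-one smoothing, which is one common choice and not necessarily the technique covered in the session. The sample tweets, the "spam"/"ham" labels, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical hand-labeled tweets; a real system would need far more data.
TRAINING = [
    ("win a free ipad click here now", "spam"),
    ("free followers click this link", "spam"),
    ("cheap pills free shipping click now", "spam"),
    ("heading to the sxsw panel on machine learning", "ham"),
    ("great coffee with friends this morning", "ham"),
    ("reading a paper on text classification", "ham"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(classify("click here for free followers", model))
print(classify("machine learning panel this morning", model))
```

With only six training examples this is a toy; the quality of a real filter depends mostly on how much labeled data it is trained on and how representative that data is.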

In this talk we will describe:
• Some common machine learning algorithms
• Machine learning tools – free and commercial
• Acquiring and managing training data
• Extracting useful features from your documents
• Choosing the right technique for a problem
• Measuring quality and improving your model over time
• Integrating a machine learned model with your application

Coming out of this session, you will know where you might use machine learning in your applications, and you will know how to get started.

LEVEL: Intermediate

About the speaker

Bruce Smith

I'm a researcher who wonders what you're all twittering about, when I'm not exercising with Russian kettlebells or training chickens.


29 attendees

  • blanu
  • Bruce Smith
  • Sandeep Parikh
  • Devon Smith
  • Dustin Mihalik
  • Don Cruse
  • Chris Clarke
  • Héctor Ramos
  • Iskander Smit
  • Jesper Andersen
  • José Padilla
  • Rowan
  • Matt Biddulph
  • Morgan Craft
  • micah craig
  • Michiel Berger
  • ɹǝɯoɹʞ (dılɟ) dılıɥd
  • Rick Benavidez
  • Ricky Yean
  • Rob Myers
  • Yesenia Sotelo
  • Benjamin Bykowski
  • Lisa van Gelder
  • the daniel
  • Zach Pousman
  • Tom Whitwell
  • Ted Leung
  • vikramtank
  • Gavin Bell

29 trackers

  • Anna Manasova
  • Brad Dougherty
  • Beau
  • Sean Patrick Henry
  • Carrie Stalder
  • Dan Weingrod
  • Eamonn O'Brien-Strain
  • erwin blom
  • Jeremy Van Fleet
  • Rogelio H. Umaña
  • Dan W
  • Jamie Unwin
  • Jason Wehmhoener
  • Justin Shreve
  • Orson Ka'ili
  • kellan
  • Mo
  • Tim Holden
  • Bradley Heilbrun
  • Brett Camper
  • Markus Wegscheider
  • scldn
  • Kipp Jones
  • Tan Lam
  • TechSmith
  • Adam Keys
  • Tibet Sprague
  • Martijn Verver
  • Andy Baio



Time 9:30am to 10:30am CST

Date Mon 14th March 2011


Salon J, Hilton Austin Downtown
