Building A Python-Based Search Engine

A session at PyCon US 2012

Sunday 11th March, 2012

1:30pm to 2:10pm (PST)

Search is an increasingly common request in all types of applications as the amount of data all of us deal with continues to grow. The technology and architecture behind search engines are wildly different from what many developers expect. This talk will give a solid grounding in the fundamentals of providing search, using Python to flesh out these concepts in a simple library.

  • Core concepts
  • Terminology
  • Document-based
  • Show basic starting code for a document
  • Inverted Index
  • Show a simple inverted index class
  • Stemming
  • N-gram
  • Show a tokenizer/n-gram processor
  • Fields
  • Show a document handler which ties it all together
  • Searching
  • Show a simple searcher (& the whole thing working together)
  • Faceting (likely no demo)
  • Boost (likely no demo)
  • More Like This
  • Wrap up
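
To make the outline above more concrete, here is a minimal sketch of how a tokenizer, an edge n-gram processor, an inverted index and a simple searcher might fit together. It is not the speaker's code or library: the names `tokenize`, `edge_ngrams`, `InvertedIndex`, the stop-word list and the gram sizes are all assumptions chosen for illustration, and edge n-grams stand in here for proper stemming.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is"})


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def edge_ngrams(token, min_gram=3, max_gram=6):
    """Return the prefixes of a token that are min_gram..max_gram long."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


class InvertedIndex:
    """Maps each term (here, an edge n-gram) to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.documents = {}

    def add(self, doc_id, document):
        """Index every field of a document, given as a dict of field name -> text."""
        self.documents[doc_id] = document
        for text in document.values():
            for token in tokenize(text):
                for gram in edge_ngrams(token):
                    self.postings[gram].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents that match every query term."""
        result = None
        for token in tokenize(query):
            grams = edge_ngrams(token)
            if not grams:
                continue  # tokens shorter than min_gram are never indexed
            # The longest gram of the query token is the most specific term.
            matches = self.postings.get(grams[-1], set())
            result = matches if result is None else result & matches
        return sorted(result or [])


index = InvertedIndex()
index.add(1, {"title": "Building a search engine", "body": "Inverted indexes in Python"})
index.add(2, {"title": "Parsing sentences", "body": "Natural language tools"})
index.add(3, {"title": "Searching with Python", "body": "Tokenizers and n-grams"})

print(index.search("python search"))  # documents 1 and 3 contain both terms
print(index.search("pars"))           # a prefix query matches "parsing" in document 2
```

Because only prefixes are stored, a query for "search" also finds "searching", which is a rough approximation of what stemming provides. Ranking, faceting, boost and "more like this" are left out of this sketch.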

About the speaker

Daniel Lindsley

I run Toast Driven (@toastdriven) & do all sorts of programming stuff. Don't expect too much seriousness here because I'm allergic to it. (Bio from Twitter.)
