by Nate Solas
Starting with examples of in-house search technologies from many of the leaders in the field (Indianapolis Museum of Art, Powerhouse Museum, and the Victoria and Albert Museum), this workshop will be a hands-on lab to build and tune a from-scratch Solr search engine using a broad sample dataset. Participants will learn how to configure their search engines and examine many useful tips and tricks: autocomplete while typing, faceting, spelling suggestions, "more like this", and more. The focus will be Solr, an open-source enterprise search engine, but we will also touch on Sphinx as an alternative.
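As a rough illustration of the features listed above, the sketch below builds a request URL for Solr's standard select handler with faceting and spelling suggestions switched on. The host, the core name `sandbox1`, and the field names `artist` and `medium` are placeholders rather than details of the workshop setup, and spelling suggestions only come back if a spellcheck component is configured on the request handler.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, facet_fields=(), spellcheck=False, rows=10):
    """Return a Solr /select URL for the given core and query.

    base_url, core and the facet field names are illustrative assumptions;
    substitute the values from your own Solr installation.
    """
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_fields:
        # facet=true turns faceting on; facet.field may be repeated per field.
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
    if spellcheck:
        # Only has an effect if the handler includes a spellcheck component.
        params.append(("spellcheck", "true"))
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url(
        "http://localhost:8983/solr",  # assumed local Solr instance
        "sandbox1",                    # hypothetical core name
        "monet",
        facet_fields=["artist", "medium"],
        spellcheck=True,
    ))
```

Fetching the resulting URL in a browser is usually enough to see how a configuration change alters the facet counts or suggestions that come back.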
Attendees are asked to bring their own laptops and be prepared to work with a virtual server I will set up. To ease compatibility and level the playing field, I will provide a web-based sandbox so everyone can take part without needing shell-level access or experience beyond editing a configuration file. Using Solr "cores", we will be able to quickly clone a test environment so that everyone who wants to can configure and test a search engine using shared sample data. In this interactive environment we will be able to examine the subtle differences that configuration changes make in fundamental areas such as spelling suggestions and proximity searches.
We will finish by exploring search beyond our own sites: how can we best optimize online content to make it as broadly "findable" as possible? As our content spreads beyond our walls, we need to help ground it through the use of microformats, inline RDFa, and URL structure as it applies to crawlers, robots, and scripts. We will also briefly look at the resurgence of HTML meta tags as described by the Open Graph protocol, and how these are being used by other search engines.
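To make the Open Graph meta tags mentioned above concrete, here is a minimal sketch of how a script or crawler might read them from a page, using only the Python standard library. The class and function names are made up for illustration, and a real crawler would also need to handle fetching, character encodings, and malformed markup.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs from HTML."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


if __name__ == "__main__":
    # Hypothetical object page; the URLs are placeholders.
    sample = """
    <html><head>
      <meta property="og:title" content="Water Lilies" />
      <meta property="og:type" content="article" />
      <meta property="og:url" content="http://example.org/objects/123" />
    </head><body></body></html>
    """
    for name, value in extract_open_graph(sample).items():
        print(name, "=", value)
```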
16th–19th November 2011