Saturday 26th March, 2011
2:10pm to 2:40pm
Akara (akara.info) is a platform for developing data services, and especially XML data services, available on the Web, using REST architecture. It is open source software written in Python and C. An important concept in Akara is information pipelining, where discrete services can be combined and chained together, including services hosted remotely. There is strong support for pipeline stages for XML processing, as Akara includes a port of the well-known 4Suite and Amara XML processing components for Python. The version of Amara in Akara provides optimized XML processing using common XML standards as well as fresh ideas for expressing XML pattern processing, based on long experience in standards-based XML applications. Some of these features include XPath and XSLT, a lightweight, dynamic data binding mechanism, XML modeling and processing constraints by example (using Examplotron), Schematron assertions, XPath-driven streamable processing and overall low-level support for lazy iterator processing, and thus the map/reduce style. Akara does not enforce a built-in mechanism for persistence of XML, but is designed to complete a low-level persistence engine with overall characteristics of an XML DBMS.
Akara, despite its deliberately low profile to date, has played a crucial role in several marquee projects, including The Library of Congress's Recollection project and The Reference Extract project, a collaboration of The MacArthur Foundation, OCLC, and Zepheira. In Recollection Akara runs the data pipeline for user views, and is used to process XML MODS files with catalog records. In RefExtract Akara processes information about topics and related Web pages to provide measures of page credibility. Other users include Cleveland Clinic, Elsevier and Sun Microsystems.
Sign in to add slides, notes or videos to this session