by Norman Walsh
Norm will give an overview of the HTML/XML Task Force and the progress of its work. The task force was created by the W3C Technical Architecture Group to consider the question of the divergence of the HTML and XML technologies. Norm will attempt to summarize the state of the issues and their potential solutions as conceived by the task force at the time of his talk.
by Aleksejs Goremikins and Henry Thompson
One key gap in the integration of XML into the global Web infrastructure is validation. DTD validation is supported natively to different extents by different browsers, and some Web protocols, notably SOAP, explicitly rule it out. Support for more recent schema languages is virtually non-existent. With the growth of interest in rich client-based applications in general, and the XRX methodology in particular, with its emphasis on XML as the cornerstone of the client-server interaction architecture, this gap has become more significant and its negative impact more troublesome.
Designing forms with XForms can benefit from JSON support. This becomes possible once a mapping from any JSON object to an XML document is defined. The proposed conversion is designed specifically to allow an intuitive use of XPath. This is demonstrated by integrating an external JSON API into an XForms page.
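As a rough illustration of the idea, the Python sketch below maps a parsed JSON value to an XML tree so that path expressions follow the JSON keys. The mapping rules here (a `type` attribute, repeating an element for each array member, the `json` root name) are assumptions made for this example, not the conversion proposed in the talk, and the sketch assumes every key is a valid XML name.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(value, name="json"):
    """Map a parsed JSON value to an element (hypothetical mapping rules)."""
    elem = ET.Element(name)
    if isinstance(value, dict):
        elem.set("type", "object")
        for key, child in value.items():
            if isinstance(child, list):
                # An array member repeats the key-named element once per item,
                # so a path such as user/tags selects every item.
                for item in child:
                    elem.append(json_to_xml(item, key))
            else:
                elem.append(json_to_xml(child, key))
    elif isinstance(value, list):
        # Arrays that are not direct object members use generic item elements.
        elem.set("type", "array")
        for item in value:
            elem.append(json_to_xml(item, "item"))
    elif isinstance(value, bool):
        # bool is tested before numbers because bool is a subclass of int.
        elem.set("type", "boolean")
        elem.text = "true" if value else "false"
    elif value is None:
        elem.set("type", "null")
    elif isinstance(value, (int, float)):
        elem.set("type", "number")
        elem.text = str(value)
    else:
        elem.set("type", "string")
        elem.text = value
    return elem


doc = json_to_xml(json.loads('{"user": {"name": "Ann", "tags": ["a", "b"]}}'))
print(doc.findtext("user/name"))                 # Ann
print([t.text for t in doc.findall("user/tags")])  # ['a', 'b']
```

One limitation of this simplified mapping is that an empty array inside an object leaves no trace in the XML, so it cannot be converted back to JSON without loss.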
What would happen if you could put a facade around MarkLogic Server to have it act as a JSON store? This talk explores our discoveries doing just that.
A new open source project, MLJSON, provides a set of libraries and REST endpoints to enable the MarkLogic Server to become an advanced JSON store. Internally the JSON is represented as XML, and the JSON-centric queries are resolved using XML-centric indexes. In this talk we'll present the design of the project, discuss its pros and cons, and talk about the interesting uses for a fully-queryable, highly-scalable JSON store.
by Florent Georges
This article describes the EXPath Webapp Module, as well as one implementation: Servlex. It uses the CXAN website as a case study.
Akara (akara.info) is a platform for developing data services, and especially XML data services, available on the Web, using REST architecture. It is open source software written in Python and C. An important concept in Akara is information pipelining, where discrete services can be combined and chained together, including services hosted remotely. There is strong support for pipeline stages for XML processing, as Akara includes a port of the well-known 4Suite and Amara XML processing components for Python. The version of Amara in Akara provides optimized XML processing using common XML standards as well as fresh ideas for expressing XML pattern processing, based on long experience in standards-based XML applications. Some of these features include XPath and XSLT, a lightweight, dynamic data binding mechanism, XML modeling and processing constraints by example (using Examplotron), Schematron assertions, XPath-driven streamable processing and overall low-level support for lazy iterator processing, and thus the map/reduce style. Akara does not enforce a built-in mechanism for persistence of XML, but is designed to complement a low-level persistence engine with overall characteristics of an XML DBMS.
Akara, despite its deliberately low profile to date, has played a crucial role in several marquee projects, including The Library of Congress's Recollection project and The Reference Extract project, a collaboration of The MacArthur Foundation, OCLC, and Zepheira. In Recollection Akara runs the data pipeline for user views, and is used to process XML MODS files with catalog records. In RefExtract Akara processes information about topics and related Web pages to provide measures of page credibility. Other users include Cleveland Clinic, Elsevier and Sun Microsystems.
In our community there are three main models for representing and processing data: Relations, XML and RDF. Each of these models has its “sweet spot” for applications and its own query language; very few implementations cater for more than one of these. We describe a uniform platform which provides interfaces for different query languages to query and modify the same information or combine it with other data sources. This paper presents methods for completely and correctly translating SQL and SPARQL into XQuery since XQuery provides the most expressive foundation. Early results with our current prototype show that the translation from SPARQL to XQuery already achieves very competitive performance whereas there is still a significant performance gap compared to SQL.
This paper gives an overview of the open standards for the NETCONF protocol and associated tools for configuration data modelling. NETCONF is an XML-based communication protocol that allows for secure management of network devices from remote manager applications. The YANG language for configuration data modelling is described and its features compared to those offered by the existing XML schema languages. Finally, the standardized mapping of YANG data models to the DSDL schema languages (RELAX NG, Schematron and DSRL) is discussed in some detail and the procedure for instance document validation is outlined.
Sharon C. Adler
Sharon Adler was a Senior Manager at IBM Research in New York specializing in XML standards, Web Services, and other areas for the past eleven years. She recently relinquished her management role to a long-time colleague so she can focus her efforts on technical work. Before rejoining IBM in 1999, she was a Director of Product Management for Publishing Tools for Inso Corporation in Providence, Rhode Island. From 1985 to 1992, Sharon held key positions with IBM where she led the development of standards-based authoring and document management tools. Sharon has been instrumental in the development of international computer standards for more than 30 years. She has served as Vice Chair/Editor of multiple ANSI/ISO standards committees and has been Chair of the W3C XSLT Working Group since its inception in 1997.
The speaker, Dr Michael Kay, is founder of Saxonica Limited which develops the popular Saxon XSLT, XQuery, and XML Schema engine. He is a member of the W3C working groups for all three languages, and author of XSLT 2.0 Programmer's Reference, the definitive Wrox guide to the language, recently republished in a fourth edition.
by Michael Kay
Some say that XML "on the web" (meaning "on the browser") has failed. For documents, web servers generally deliver HTML+CSS, using a wide variety of server-side tool chains to create it (often from XML). For AJAX-style data exchange, JSON has become the popular choice.
The XMLHttpRequest (XHR) interface has been available in various browsers since as early as 1999. While the name is prefixed with "XML", this interface has been widely used to allow browser-based applications to interact with a variety of web services and content, much of which is now in other formats, including JSON. At the time this interface originated, the design pattern of building and unmarshalling whole XML documents into a DOM probably made sense, but there are now many use cases where processing a document to build a whole DOM representation is undesired, if not infeasible.
This paper describes enhancements and new interfaces for the browser to enable event-oriented parsing of XML. The enhancements enable the ability to bind XML efficiently to local data structures and to process large amounts of XML content with very little memory. The new interfaces provide a number of new possibilities including better interoperability between XML and JSON.
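To make the contrast with whole-document DOM building concrete, here is a minimal sketch of event-oriented parsing that binds XML to local data structures. It uses Python's standard `xml.etree.ElementTree.iterparse` rather than the browser interfaces the paper proposes, and the `items`/`item`/`price` vocabulary is invented for the example.

```python
import io
import xml.etree.ElementTree as ET


def iter_items(stream):
    """Yield one dict per <item>, discarding each subtree once it is bound."""
    # "end" events fire when an element's closing tag has been parsed,
    # so the element's attributes and children are complete at that point.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "item":
            yield {
                "sku": elem.get("sku"),
                "price": float(elem.findtext("price")),
            }
            # Drop the element's children and text so they can be reclaimed.
            elem.clear()


xml_bytes = (
    b'<items>'
    b'<item sku="a"><price>1.5</price></item>'
    b'<item sku="b"><price>2</price></item>'
    b'</items>'
)
for record in iter_items(io.BytesIO(xml_bytes)):
    print(record)
```

Because each `item` subtree is cleared as soon as it has been converted, memory use depends mainly on the size of one record rather than on the size of the document, although the emptied elements stay attached to the root until parsing finishes.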
Murata is a member of the IDPF EPUB WG, the coordinator of its Enhanced Global Language Support Subgroup, and the technical lead of an EPUB project funded by the Japanese government. He gives an overview of EPUB3 and then focuses on global language support and comics in EPUB3.
村田 真 (MURATA Makoto [FAMILY Given])
Murata has contributed to standardization, research, evangelization, education, and practical applications of XML. In particular, he is internationally recognized as an expert on XML schema languages. He has participated in several standardization committees including the W3C XML WG and OASIS RELAX NG technical committee. He is now the convenor of ISO/IEC JTC1/SC34/WG4 (OOXML). He graduated from the Science Department of Kyoto University, and holds a Ph.D. from Tsukuba University. Since 2008, he has been on the board of directors of JSSST (Japan Society for Software Science and Technology). He received the survey paper award from JSSST in 2007, the achievement award and the international standardization development award from the Information Processing Society of Japan in 2006, and the best paper award of Internet Conference in Japan in 1998.
by Mark Howe and Tony Graham
The link between the Bible and publishing technology is at least as old as Gutenberg's press. 400 years after the publication of the King James Bible, we were asked to convert five modern French Bible translations from a widely-used ad hoc TROFF-like markup scheme used to produce printed Bibles to a standard XML vocabulary, and then to EPUB. We opted to use XSLT 2.0 and ant to perform all stages of the conversion process. Along the way we discovered previously unimagined creativity in the original markup, even within a single translation. We cursed the medieval scholars and the modern editors who have colluded to produce several mutually incompatible document hierarchies. We struggled to map various typesetting features to EPUB. E-Reader compatibility made us nostalgic for browser wars of the 90s. The result is osisbyxsl, a soon-to-be open source solution for Bible EPUB origination.
by George Bina
DITA, DocBook and TEI are among the most important frameworks for XML documents. While the latest versions of DocBook and TEI use Relax NG as the schema language, DITA is still using DTDs. There have been some fragile attempts to get DITA working with Relax NG, but it takes more than writing a Relax NG schema to make this work. DITA NG is an open source project that aims to provide a fully functional framework for a Relax NG based implementation of DITA.
DITA NG provides the Relax NG schemas for DITA 1.2 and also support for default attribute values based on Relax NG a:defaultValue annotations - this is the critical part that makes DITA work.
The presentation covers an overview of the Relax NG schemas, how DITA specializations can be done using Relax NG (a lot simpler than with DTDs), the support for default attribute values for Relax NG and includes a demo of the complete workflow of working with DITA based on Relax NG.
by Eric Van der Vlist
We all know (and worry) about SQL injection. Should we also worry about XQuery injection?
With the power of extension functions and the implementation of XQuery update features, the answer is clearly yes and I will start by showing how an attacker can send information to an external site or erase a collection through XQuery injection on a naive and unprotected application using the eXist REST API.
This was the bad news.
The good news is that it's quite easy to protect your application against XQuery injection. After this word of warning, I'll discuss a number of simple techniques for doing so (literal string escaping, wrapping values into elements, or moving them out of queries into HTTP parameters) and show how to implement them in different environments, covering traditional programming languages, XSLT, XForms and pipeline languages.
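As one possible illustration of the first technique, literal string escaping, the Python sketch below turns an untrusted value into an XQuery string literal before it is concatenated into a query. It relies on the XQuery 1.0 rules for double-quoted string literals, where a quote is escaped by doubling it and an ampersand must be written as `&amp;`. The function names and the `/db/users` collection path are hypothetical, and this is not necessarily the implementation shown in the talk.

```python
def xquery_string_literal(value: str) -> str:
    """Return value as a double-quoted XQuery string literal."""
    # Escape the ampersand first: inside a string literal "&" always starts
    # an entity or character reference, so a raw "&" must become "&amp;".
    escaped = value.replace("&", "&amp;")
    # A double quote inside a double-quoted literal is escaped by doubling it.
    escaped = escaped.replace('"', '""')
    return '"' + escaped + '"'


def build_user_query(username: str) -> str:
    """Build a query for a hypothetical /db/users collection."""
    return (
        "collection('/db/users')//user[name = "
        + xquery_string_literal(username)
        + "]"
    )


# An injection attempt stays inside the literal instead of altering the query.
print(build_user_query('" or "1"="1'))
```

Escaping of this kind only protects values placed in string-literal position. Moving the value out of the query text altogether, for example by passing it as an external parameter, avoids the problem at its root where the environment supports it.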
This presentation will be fairly technical and practical, the goal being that after the presentation, attendees are able to implement what has been presented.
Although the details and demonstrations will be based on Orbeon Forms and eXist, they will be generic enough to be easily transposable to other environments.
Even though our current version is still considered to be at an alpha stage, we were able to deploy it successfully on most major desktop and mobile browsers. The size of the JS code is about 700 KB. With compression activated on the web server (reducing the transferred data to less than 200 KB) as well as caching on the client, using the XQuery engine does not cause noticeable overhead after the initial loading.
In addition, we are already reaching a high level of completeness and compliance, passing more than 95 percent of the tests in the XQuery Test Suite 1.0.2. We have not yet done formal testing on Update and Full Text, but plan to do so in the near future.
by John Snelson
One of the big challenges for any emerging database product is the maturity of its query optimizer. This is even more of a problem with XQuery, which unlike SQL hasn't yet had the benefit of forty years of optimization research. Any efforts to advance the state of the art in optimizing XQuery are therefore important as steps towards fulfilling its promise as a new database paradigm.
This paper introduces a novel meta language for efficiently specifying rewrites to the expression tree of an XQuery program. The applications of this language are wide ranging, including: use by XQuery implementers to efficiently communicate and execute optimizations, use by XQuery library writers to extend the optimization semantics of the host implementation with a deep understanding of the library functionality, and use by XQuery end-users to provide optimization hints and greater insight into the program written or data operated on.
This paper also discusses the use of this language to replace and extend the optimization layer in XQilla, an open source implementation of XQuery.
Michael Sperberg-McQueen (Black Mesa Technologies)
26th–27th March 2011