Beyond the XML mirage

For various reasons, developers seem to expect that they can solve a wide variety of problems simply by putting their information into XML formats and manipulating it with XML tools. They expect XML itself to solve the problem, when in fact their problems need much more attention than a common syntax can provide.

Developers who focus on creating and manipulating XML structures directly, rather than using XML to represent the information they actually need to work with, are often disappointed by the amount of work they create for themselves. XML can be an elegant syntax for representing information, but labeled structures are not themselves data models.

Making XML useful requires an understanding of the information, and a separation between the understanding of the information and the expectations for processing it. Treating the XML as the information and processing it directly makes it much harder for different people and organizations to share and process the same documents.

Unfortunately, an enormous amount of effort has gone into blurring that separation. The Document Object Model (DOM) and similar APIs are widely used by developers who just want to manipulate the XML without importing it into their own structures. Tools which treat XML purely as a serialization of objects or database tables often focus on the information as it appears in the objects or tables, but pay little attention to how that information might best be represented in XML. In both cases, the separation between XML and the information is barely respected, and it's not surprising that these approaches are both limiting and frustrating.

Making XML work - using XML syntax to share information - requires more than generic tools and frameworks. Building ever more abstract models which represent XML contents is useless without direct connections to ever more specific applications. Generic tools are powerful and useful, but only to the extent that they can solve specific problems.

Developers need to take responsibility for building conduits between their information and XML representations of it which take into account how XML works. This project requires a combination of generic tools and specific expertise, combining tools like XML parsers and XSLT processors with an understanding of how the information in the XML document connects to the information expectations of the application. Simple pathways connecting the two are possible but probably unusual, at least if documents are widely shared between applications with even slightly different information expectations.
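As a rough illustration of such a conduit, the sketch below uses a generic parser (Python's standard xml.etree.ElementTree) only at the boundary, then maps the markup into a structure that reflects what one particular application expects. The order vocabulary, the OrderLine class, the order_lines_from_xml function, and the rule that a missing quantity means 1 are all hypothetical, invented for this example; a different application reading the same document could reasonably apply different rules.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass
class OrderLine:
    """Hypothetical application-side model; not part of any XML standard."""
    sku: str
    quantity: int
    unit_price: Decimal


def order_lines_from_xml(text: str) -> List[OrderLine]:
    """Map a (hypothetical) <order> document onto this application's model.

    The parser handles syntax; the decisions below (required sku,
    default quantity, decimal prices) are this application's expectations.
    """
    root = ET.fromstring(text)
    lines = []
    for item in root.iter("item"):
        sku = item.get("sku")
        if not sku:
            raise ValueError("item is missing the required sku attribute")

        # Assumption for this sketch: an absent <quantity> means one unit.
        quantity = int(item.findtext("quantity", default="1"))

        price_text = item.findtext("price")
        if price_text is None:
            raise ValueError("item %s has no <price> element" % sku)

        lines.append(OrderLine(sku, quantity, Decimal(price_text.strip())))
    return lines


SAMPLE = """<order>
  <item sku="A-100"><quantity>2</quantity><price>9.50</price></item>
  <item sku="B-200"><price>4.00</price></item>
</order>"""

if __name__ == "__main__":
    for line in order_lines_from_xml(SAMPLE):
        print(line.sku, line.quantity, line.unit_price)
```

The point of the design is that everything after the mapping function works with OrderLine values rather than with elements and attributes, so the application's expectations are stated in one place instead of being scattered through tree-walking code.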

All this mapping may seem like extra work to developers who just want to ship information from point A to point B, but it's at the heart of making markup useful. XML itself doesn't solve any of these problems - developers do.

Monastic XML Copyright 2002 Simon St.Laurent.