One Input - Many Outputs: Walsh: JoDI

XML: One Input - Many Outputs: a response to Hillesund

Norman Walsh
Email: ndw@nwalsh.com

Abstract

Hillesund (2002) argues that XML does not and cannot deliver the often-touted benefit of letting authors and publishers create documents that can be effectively presented in a variety of formats; that the "doctrine of 'one input - many outputs' ... is basically wrong." This Letter defends the position that XML is an effective technology, in fact the most effective technology in widespread use, for producing multiple output formats from a single input document.

1 Introduction

Hillesund employs several arguments in his attempt to refute the effectiveness of XML for creating multiple outputs from a single input. Although it would be unfair to assert that all of his arguments are without merit, many of them appear to miss the point.

In brief, he argues that:

Meaning and presentation are inextricably linked. Any change to the presentation of a document changes its essential meaning. "... structure and display are closely interwoven and in many cases indistinguishable."

Separation of content and presentation is logically impossible. "In the context of publications there really is no way of separating content structures and presentation ... content and format."

Separation of content and presentation is practically impossible. Authors and editors are incapable of distinguishing between content and presentation. "The concept [of a clear distinction between content and presentation] is ... contradicted by both the insight of semiotics and the experience of writers and editors."

Reuse of information is rare. "... in a publishing house, most texts are made for a special use in a specific context for a limited group of readers."

It is impossible to break information into useful pieces independent of the context of the entire document in which they were originally written. "Give a piece of text a new wrapping ... and that text no longer has the same meaning."

Before considering the flaws in each of these arguments, it is interesting to note that Hillesund's paper, somewhat incongruously, includes no fewer than four examples of the successful use of XML precisely for the publication of multiple output formats from a single input document.

If those examples are not sufficient to demonstrate that in practice XML is often used for precisely the purpose that Hillesund is attempting to show "is basically wrong", I will add one more.

DocBook (Walsh 2002) is a popular XML schema for writing computer hardware and software documentation. Using freely available tools, it is possible to transform any valid DocBook document into a paper document, an HTML document, a collection of linked HTML documents, an HTML Help document, or a JavaHelp document. This allows hundreds, perhaps thousands, of authors and publishers to produce tens-of-thousands, perhaps millions, of pages of documentation in any of at least five formats all from a single input. They are doing so today.

2 Meaning and Presentation are Inextricably Linked

Hillesund argues that typographic distinctions are often "not primarily aesthetic or typographic, but essentially semantic." He uses a concrete example to make this point by displaying the following text without the benefit of typographic distinction:

Å seile inn i fremtiden Livet i en seilbåt eller robåt gir folk anledning til å gjenerobre den sakte tiden, den som i våre dager i våre dager er i ferd med å bli en mangelvare. Forbundet KYSTEN har formulert som vesentlige målsetninger å gi vern til kystkulturen, ta vare på det som var i ferd med å gå tapt, i tillegg til å styrke vår identitet som kystfolk. Denne fortidsorienteringen har sine kritikere. Både blant ekstrem-urbanistene som Erling Fossen og blant samfunnsforskere har man sett tradisjonsorienteringen som nostalgiske klynk etter en svunnen tid.

He goes on to display the same text with some typographic distinctions and draws the conclusion that "everybody will know what parts of the text are what" even if they can't read Norwegian. In particular, he expects all readers to conclude that this document consists of a title, "some kind of introduction", and a couple of paragraphs:

Å seile inn i fremtiden

Livet i en seilbåt eller robåt gir folk anledning til å gjenerobre den sakte tiden, den som i våre dager i våre dager er i ferd med å bli en mangelvare.

Forbundet KYSTEN har formulert som vesentlige målsetninger å gi vern til kystkulturen, ta vare på det som var i ferd med å gå tapt, i tillegg til å styrke vår identitet som kystfolk.

Denne fortidsorienteringen har sine kritikere. Både blant ekstrem-urbanistene som Erling Fossen og blant samfunnsforskere har man sett tradisjonsorienteringen som nostalgiske klynk etter en svunnen tid.

I think Hillesund's assertion is factually incorrect. Readers forced to draw inferences from typographic distinctions will apply their own varied experience to the process and will draw different conclusions. Personally, I would have guessed that the initial italics indicated some kind of quotation rather than an abstract or resume.

The structure of this document can be described unambiguously in XML:

<title>Å seile inn i fremtiden</title>
<resume>Livet i en seilbåt
eller robåt gir folk anledning til å gjenerobre den sakte tiden,
den som i våre dager i våre dager er i ferd med å bli
en mangelvare.</resume>
<para>Forbundet KYSTEN har formulert som vesentlige målsetninger
å gi vern til kystkulturen, ta vare på det som var i ferd med
å gå tapt, i tillegg til å styrke vår identitet
som kystfolk.
</para>
<para>Denne fortidsorienteringen har sine kritikere. Både
blant ekstrem-urbanistene som Erling Fossen og blant samfunnsforskere har
man sett tradisjonsorienteringen som nostalgiske klynk etter en svunnen
tid.</para>

What's more, not only does this remove ambiguity, it also provides semantics that can be processed automatically. Hillesund's assertion that "the example shows that structure and display are closely interwoven and in many cases indistinguishable" is false. The XML document provides unambiguous structure with no presentation whatsoever, except perhaps relative ordering of the elements.

It is clearly possible to separate meaning and presentation.
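As a rough illustration of what "processed automatically" can mean here, the following minimal Python sketch renders one marked-up input in two ways. It is not the tooling used by Hillesund or by DocBook; the `<article>` wrapper element, the abbreviated text, the tag mapping, and the function names are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Abbreviated copy of the example above. The <article> wrapper is an
# assumption: a parser needs a single root element.
SOURCE = """<article>
<title>Å seile inn i fremtiden</title>
<resume>Livet i en seilbåt eller robåt gir folk anledning til
å gjenerobre den sakte tiden.</resume>
<para>Forbundet KYSTEN har formulert som vesentlige målsetninger
å gi vern til kystkulturen.</para>
<para>Denne fortidsorienteringen har sine kritikere.</para>
</article>"""

# Illustrative mapping from element names to HTML tags (an assumption).
HTML_TAGS = {"title": "h1", "resume": "blockquote", "para": "p"}


def normalized_text(element):
    """Return the element's text with line breaks and extra spaces collapsed."""
    return " ".join("".join(element.itertext()).split())


def to_html(root):
    """Render each child element as one HTML element chosen by its name."""
    parts = []
    for child in root:
        tag = HTML_TAGS.get(child.tag, "div")
        text = escape(normalized_text(child))
        parts.append(f'<{tag} class="{child.tag}">{text}</{tag}>')
    return "\n".join(parts)


def to_plain_text(root):
    """Render the same elements for a text-only medium."""
    blocks = []
    for child in root:
        text = normalized_text(child)
        if child.tag == "title":
            blocks.append(text.upper() + "\n" + "=" * len(text))
        elif child.tag == "resume":
            blocks.append("Summary: " + text)
        else:
            blocks.append(text)
    return "\n\n".join(blocks)


root = ET.fromstring(SOURCE)
html_output = to_html(root)
text_output = to_plain_text(root)
```

Neither renderer inspects fonts or layout in the input; each decides how a `title`, `resume`, or `para` should look in its own medium.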

3 Separation of Content and Presentation

Hillesund makes two related arguments about the separation of content and presentation. First, he asserts that they are logically inseparable, that "typography organises and structures content in such a fundamental way that one cannot differentiate between content structure and appearance...especially not when marking up the text." He goes on to assert that even if it were logically possible to distinguish between them, it is practically impossible: "for the author, content structure and text appearance are mutually dependent qualities."

3.1 Logically Impossible

I will not argue that a perfect separation of content and presentation is always possible in all cases (nor do I believe I have ever made such an argument). But it is often possible to come very close.

The extent to which separation is possible is dependent in part on how much of the essential information is really semantic. If typographic distinctions are required where there is no semantic difference (that quotation has to be in italics, but that one has to be in small caps, and that one has to be red, because that's the way I want them) then the markup will have to provide some presentational information.

It is important to note, however, that in most document designs, typographic changes are the result of some semantic import. A common typographic treatment of emphasized text is italics, but the fact that an author wants to emphasise some piece of text can be viewed as semantic. And it can be entirely separated from how that semantic information is to be conveyed to the reader.
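The emphasis case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample sentence is invented, `render_inline` is a hypothetical helper, and `<emphasis>` is used simply as an element name meaning "the author wants this stressed".

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample sentence; <emphasis> records intent, not appearance.
PARA = ET.fromstring(
    "<para>Separation is <emphasis>often</emphasis> possible.</para>"
)


def render_inline(para, open_mark, close_mark, escape_text=lambda s: s):
    """Walk the paragraph's mixed content, wrapping <emphasis> in markers."""
    out = [escape_text(para.text or "")]
    for child in para:
        inner = escape_text("".join(child.itertext()))
        if child.tag == "emphasis":
            out.append(open_mark + inner + close_mark)
        else:
            out.append(inner)
        # Text that follows the child element belongs to the paragraph.
        out.append(escape_text(child.tail or ""))
    return "".join(out)


# The same semantic markup, conveyed differently in two media.
as_html = render_inline(PARA, "<em>", "</em>", escape)
as_text = render_inline(PARA, "*", "*")
```

Whether emphasis becomes italics, asterisks, or a change of voice is decided by the caller, not by the document.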

Hillesund's other motivating example for his assertion is table markup. In this case, I think he entirely misses the point.

It's true that the most popular table models, CALS (Bingham 1995) and HTML (Pemberton 2002), use a row-and-column structure that parallels the visual appearance of the table, but it does not follow that "the meanings of the tables, their contents, bear on visual representation and cannot fully be captured by the structural logic of XML".

In fact it is possible to write software (I have done it more than once) that extracts meaningful information from a table without ever rendering it. And, in fact, more logical table models that are entirely independent of presentation have been discussed.

Formatting tables for visual display, which is certainly one of the things that authors and publishers want to be able to do, is a non-trivial exercise. Until recently, commonly available tools that were powerful enough to transform a logical table model into a presentation did not exist.

The point of XML markup, in this case, is to provide a framework in which information is accessible. The tabular format of an unstructured word processor does not provide any standard way of accessing the individual cells so that, for example, a list-based presentation can be produced for pure-text presentation environments.
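A minimal sketch of this kind of access, in Python, might look as follows. The table data is invented, the markup is a simplified HTML-style table rather than full CALS or XHTML, and the function names are hypothetical; the point is only that cells can be read and re-presented as a list without rendering a grid.

```python
import xml.etree.ElementTree as ET

# Invented data in simplified HTML-style row-and-column markup.
TABLE = """<table>
<tr><th>Format</th><th>Medium</th></tr>
<tr><td>HTML</td><td>screen</td></tr>
<tr><td>PDF</td><td>paper</td></tr>
</table>"""


def table_records(table):
    """Pair each data cell with its column header, one dict per row."""
    rows = table.findall("tr")
    headers = [cell.text for cell in rows[0].findall("th")]
    return [
        dict(zip(headers, (cell.text for cell in row.findall("td"))))
        for row in rows[1:]
    ]


def as_list(records):
    """Produce a list-based presentation for a text-only environment."""
    return "\n".join(
        "- " + "; ".join(f"{name}: {value}" for name, value in record.items())
        for record in records
    )


records = table_records(ET.fromstring(TABLE))
listing = as_list(records)
```

The records can be queried directly (for example, looking up the medium for a given format), and the list form is just one of several presentations that could be generated from them.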

3.2 Practically Impossible

Leaving aside matters of logical possibility, Hillesund would have us believe that the separation is a practical impossibility: that "for the author, content structure and text appearance are mutually dependent qualities" and that "in publications there is no clear distinction between content and appearance".

I don't believe that either of these arguments stands up to scrutiny. My own experience with authors and editors who work with structured markup such as DocBook (Walsh 2002) and XML Spec (Maler and Walsh 1998) is that they quickly adapt to the notion that it is the semantic construct that is important, not the ultimate appearance in some medium.

Asked why they used <filename> or <citetitle>, they answer "because that's what it is" or "because it's a filename or a title citation", not "because it will eventually be in a particular font".

Hillesund's argument for the inseparable nature of content and appearance is rooted firmly in the history of typographic presentation going all the way back to Gutenberg. That the distinction between content and appearance has historically been unimportant seems a poor argument.

On the printed page, where all information is lost except that preserved in presentation, the structural characteristics of the document may be lost, but that doesn't mean they aren't there, or they aren't valuable.

4 Reuse

There's no question that reuse is one of the often-touted benefits of XML. The successful application of reuse depends on your content, how much reuse is applicable, and whether you can overcome the editorial issues.

4.1 Reuse is Rare

To the extent that a document is created exactly once, published exactly once and never updated or changed, it makes a poor motivational example for selecting XML.

At least, it's a poor example for motivating the creation of an XML publishing system. If a system already exists, it may still be cost-effective to use XML.

The extent to which reuse is applicable depends on your content and the applications you design. It has little to do with XML. What can be said is that if your content is boxed away in a proprietary format, your ability to reuse it will be dramatically reduced. Using XML at least gives you the option of reuse.

It's also important to note that reuse occurs in many forms. When discussing reuse, people often think of grand designs, reusing content from several documents to create whole new works. That's one form of reuse, clearly. But much less dramatic forms, such as making an annotated table of contents with titles and abstracts, are reuse as well. Smaller forms of reuse are equally amenable to XML construction.
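The annotated table of contents is small enough to sketch. The Python below is a minimal illustration, assuming each document is an `<article>` with a `<title>` and an `<abstract>` child; those element names, the sample documents, and the `annotated_toc` function are assumptions for the example, not a description of any particular publishing system.

```python
import xml.etree.ElementTree as ET

# Two invented documents; only the title and abstract are reused.
DOCUMENTS = [
    "<article><title>One Input, Many Outputs</title>"
    "<abstract>Defends XML for multi-format publishing.</abstract>"
    "<para>Body text that the table of contents ignores.</para></article>",
    "<article><title>Reuse in Practice</title>"
    "<abstract>Notes on editorial challenges.</abstract>"
    "<para>More body text.</para></article>",
]


def annotated_toc(sources):
    """Build a numbered list of titles, each followed by its abstract."""
    entries = []
    for number, source in enumerate(sources, start=1):
        root = ET.fromstring(source)
        title = root.findtext("title", default="(untitled)")
        abstract = root.findtext("abstract", default="")
        entries.append(f"{number}. {title}\n   {abstract}".rstrip())
    return "\n".join(entries)


toc = annotated_toc(DOCUMENTS)
```

Nothing in the source documents was written with a table of contents in mind; the reuse works because the titles and abstracts are identified as such.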

4.2 Reuse is Hard

Yes, it is. Or, more precisely, the technical challenges to reuse are reasonably easy to solve with XML, but the technical challenges pale into insignificance when compared with the editorial challenges.

Successful applications of large-scale reuse generally require considerable author and editor training. Hillesund asserts that "XML does not solve all problems", and he's absolutely right.

5 Conclusion

XML certainly can be used to achieve "one input - many outputs". Hillesund's position that this doctrine "is basically wrong" is a dramatic overstatement.

There are problems, mostly editorial in nature, for which there are no technical solutions. As such, XML as a technology does not solve them. However, I think it does provide a platform on which to build solutions.

If authors and publishers develop new and unique genres of publishing as information technology evolves, as Hillesund suggests, the ability to transform and reuse content will become more important. Although the editorial extremes may not be amenable to automated translation, it seems likely to me that some of the middle ground will.

It may not be possible to achieve one input - all outputs, but surely one input - many outputs is an entirely practical goal.

References

Bingham, H. (ed.) (1995) CALS Table Model Document Type Definition. OASIS Technical Memorandum TM 9502:1995, OASIS, Inc. http://oasis-open.org/specs/a502.htm

Hillesund, T. (2002) "Many Outputs - Many Inputs: XML for Publishers and E-book Designers". Journal of Digital Information, Vol. 3, No. 1 http://jodi.tamu.edu/Articles/v03/i01/Hillesund/

Maler, E. and N. Walsh (eds) (1998) The XML Spec Schema and Stylesheets, World Wide Web Consortium http://www.w3.org/2002/xmlspec/

Pemberton, S., et al. (2002) XHTML 1.0 The Extensible HyperText Markup Language (Second Edition), World Wide Web Consortium http://www.w3.org/TR/html/

Walsh, N. (ed.) (2002) The DocBook Document Type: Committee Specification 4.2, OASIS, Inc. http://www.oasis-open.org/docbook/specs/cs-docbook-docbook-4.2.html

Author Details

Norman Walsh is an active participant in a number of standards efforts worldwide. At the World Wide Web Consortium (W3C), he is an elected member of the Technical Architecture Group and serves on the XML Core, XSL and XML Linking Working Groups. At the Organization for the Advancement of Structured Information Standards (OASIS), he serves on the RELAX NG Technical Committee (TC), the Entity Resolution TC, for which he is the editor, and the DocBook TC, which he chairs. He is the principal author of DocBook: The Definitive Guide, published by O'Reilly & Associates.