Why We Need Structured Content

Do you remember Jon Bosak’s famous quip from the late-’90s article XML, Java, and the future of the Web? It was about XML “giving Java something to do.” The phrase was meant to characterize the benefits of processing semantic XML content close to the user, which was the vision and expected architecture for the Semantic Web. But the phrase fell into disuse as technologies like HTML microformats and asynchronous JavaScript libraries replaced Java for big data visualization in the browser. And while content may be king of something or other, HTML has remained the undisputed King of Content, at least for gluing together the hodgepodge Web we’ve come to know.

Alas, there are no words of wisdom in the Weather Underground Doppler radar data. There is no path for success in Google Maps. And Twitter trends are to physics what a roomful of monkeys banging on typewriters is to a Shakespeare sonnet. Great Data does not mean Great Knowledge.

The information that propels humanity’s greatest achievements has always been coded in words and scripts, and the meaning has always been conveyed by organizing and structuring those words into knowledge. Knowledge, and especially now knowledge as structured content, is the basis for the advancement and resplendence of human culture.

And the most outstanding thing about cultural data, as opposed to weather data or most Twitter content, is that we create places of storage for it: libraries, archives, and hard drives. We create addressing for it: floor number, shelf number, catalog number, tracks, sectors, offsets, and registers. We index it so that we can search for and find that knowledge right where we stored it, ready to be accessed again and again. And within the information, we provide structures (title pages, indexes, chapters, and headings) that make its inherent organization visible within the information itself.

One more layer of structure, semantics, is the exercise of identifying not just the physical address of a chunk of information, but also its properties, and what its relationships with other semantic information structures might mean. Numbered lists become Steps, and thus a procedure is defined. A highlight tag around a number becomes Voltage or Ounces, and thus a parameter is defined for that procedure. And when our computers have this kind of enriched data to process, we can take that knowledge to amazing new places, such as placing a car-sized rover on a planet halfway across the Solar System from us, extending our reach and influence far beyond our world.
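
To make the Steps and Voltage idea concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The element and attribute names (procedure, step, param, name, unit) and the sample values are invented for illustration; they are not taken from any particular schema. The point is only that once the markup says what a number means, a program can ask for it by meaning rather than by position.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic markup: every element name, attribute name, and
# value here is an assumption made for illustration.
SEMANTIC_XML = """
<procedure id="charge-battery">
  <step>Set the supply to <param name="voltage" unit="V">12.6</param>.</step>
  <step>Add <param name="electrolyte" unit="oz">4</param> of electrolyte.</step>
</procedure>
"""


def extract_parameters(xml_text):
    """Return (step_number, name, value, unit) for each <param> inside a <step>."""
    root = ET.fromstring(xml_text.strip())
    results = []
    # Direct <step> children are numbered in document order, like a numbered list.
    for number, step in enumerate(root.findall("step"), start=1):
        # Any <param> inside the step is a typed, named value, not just highlighted text.
        for param in step.iter("param"):
            results.append(
                (number, param.get("name"), float(param.text), param.get("unit"))
            )
    return results


if __name__ == "__main__":
    for row in extract_parameters(SEMANTIC_XML):
        print(row)
```

The same text marked up only with a numbered list and bold numbers would give a program nothing to query except formatting.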

You see, XML still gives all computer languages something to do. And the richer that structure, the more interesting things we can do.

So yes, we still need XML to hold that knowledge as structured content. For in that content is not only our heritage, but also our future.

(And if you still think you can get by with unstructured data, then you need this kind of search: http://mrdoob.com/projects/chromeexperiments/google_gravity/)

2 Responses to Why We Need Structured Content

  1. Don says:

    Thanks to Scott Abel for pointing out this timely related link:
    Pieces of our common history, proofs of our collected life

  2. Mark Lewis says:

    I agree Don. It’s about find-ability. Structure helps make content findable. In your blog “Worth Repeating: Why Transformations”, Mark Baker’s comment about “making content into a database” so we can run queries against the content shows support of this.

    You can write the best content in the world, but if the reader can’t find it, you’ve failed... a statement “worth repeating”.
