5.1 What Is the DOM?

The Document Object Model (DOM) is an interface for working with XML content, structure, and style in an object-oriented fashion. It provides a standardized way of manipulating XML documents, including accessing elements and other nodes, taking actions on an object tree based on events, applying styles to documents, loading documents into object trees and saving object trees to documents, and more.

The DOM is language- and platform-neutral, meaning that it can be applied to any programming language on any hardware platform or operating system. Since its start in 1997, the DOM Working Group has made it a specific goal to ensure the DOM's language- and platform-neutrality. They've been successful; you can easily find a DOM implementation in just about any modern programming language, on any modern hardware platform.

The DOM represents an XML document as a tree of objects. Each object in the tree is called a node. The types of nodes that the DOM specifies are Document, DocumentFragment, DocumentType, EntityReference, Element, Attr, ProcessingInstruction, Comment, Text, CDATASection, Entity, and Notation. Some of these node types can have subnodes, and the types of subnodes that a particular node type can have are specified. To handle collections of nodes, the DOM also specifies a NodeList object and, for dictionaries of nodes (keyed by their names), the NamedNodeMap object. Figure 5-1 shows the DOM inheritance hierarchy.

Figure 5-1. The DOM inheritance hierarchy
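
The node tree described above can be sketched in a few lines. Because the DOM is language-neutral, the sketch uses Python's standard xml.dom.minidom module purely for brevity; the sample document and the describe helper are hypothetical, invented for illustration. It walks the tree through childNodes (a NodeList) and reads attributes through a NamedNodeMap.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical sample document, used only for illustration.
XML = '<inventory><item id="a1" qty="3">Widget</item><!-- restock soon --></inventory>'

# Map a few numeric nodeType codes to the DOM interface names.
TYPE_NAMES = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
    Node.COMMENT_NODE: "Comment",
}


def describe(node, depth=0, lines=None):
    """Collect one indented line per node, depth-first."""
    if lines is None:
        lines = []
    label = TYPE_NAMES.get(node.nodeType, "Other")
    lines.append("  " * depth + label + ": " + node.nodeName)
    # childNodes is the NodeList holding this node's subnodes.
    for child in node.childNodes:
        describe(child, depth + 1, lines)
    return lines


doc = parseString(XML)
tree_lines = describe(doc)
print("\n".join(tree_lines))

# attributes is a NamedNodeMap: Attr nodes keyed by their names.
item = doc.documentElement.firstChild
attrs = item.attributes
print(attrs.length, attrs.getNamedItem("qty").value)
```

Note that the Text and Comment nodes have empty child lists, which reflects the rule that only some node types may have subnodes.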

The DOM specifies a group of interfaces, not actual objects. This means that the implementation of the objects is not mandated, only the methods that must be accessible from a client of the DOM. Because the objects are specified by their interfaces, they cannot be created with traditional constructors; instead, factory methods are commonly used.
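
As a minimal sketch of the factory-method pattern, the following uses Python's xml.dom.minidom (chosen only as a compact DOM implementation; the element and attribute names are hypothetical). The document is obtained from a DOMImplementation, and every node is created through a method on the owning Document rather than through a constructor.

```python
from xml.dom.minidom import getDOMImplementation

# Ask the implementation for a new Document rather than constructing one.
impl = getDOMImplementation()
doc = impl.createDocument(None, "inventory", None)  # no namespace, no doctype

# Nodes are created by factory methods on the owning Document.
item = doc.createElement("item")
item.setAttribute("id", "a1")
item.appendChild(doc.createTextNode("Widget"))
doc.documentElement.appendChild(item)

# toxml() is a minidom convenience, not part of the DOM interfaces.
xml_text = doc.documentElement.toxml()
print(xml_text)
```

A node created this way belongs to its document from the start, which is one reason the interfaces route creation through the Document.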

The DOM also specifies a number of lower-level types, such as DOMString and DOMTimeStamp. These are used internally in the DOM recommendation, but particular language bindings are free to use their own native formats for these types. In C#, these are string and DateTime, respectively.

5.1.1 A Brief Introduction to the DOM Specification

The DOM architecture is divided into several modules. Although the term has no formal definition, a module of the DOM can be thought of simply as a group of related functionality. The modules as defined by the W3C DOM Working Group are:

DOM Core: DOM Core defines the actual tree-like object model you can use to navigate an XML document.
DOM XML: The XML DOM extends the DOM Core to deal with XML 1.0-specific features and requirements, such as entities, processing instructions, and character data sections.
DOM HTML: The HTML DOM extends the DOM Core to deal with HTML-specific requirements. These include the ability to identify a particular link in an HTML document.
DOM Events: This module enables you to respond to mouse, keyboard, and HTML-specific events on nodes in the DOM tree.
DOM Cascading Style Sheets: DOM CSS allows you to manipulate the formatting of documents through Cascading Style Sheets (CSS), as well as the style sheets themselves. For information on CSS, see Cascading Style Sheets: The Definitive Guide, by Eric A. Meyer (O'Reilly).
DOM Load and Save: Loading and saving documents is an integral part of XML work, and this is the part of the DOM that allows you to do so.
Document Editing: This module includes methods for manipulating a DOM tree while still maintaining its validity.
DOM XPath: DOM XPath includes a set of functions for querying a DOM tree using XPath 1.0 expressions. Although we will use some XPath features in this chapter, XPath is discussed in detail in Chapter 6.
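
To make the split between DOM Core and DOM XML more concrete, here is a small sketch using Python's xml.dom.minidom, with a hypothetical document invented for the purpose. It parses a document containing a processing instruction and a character data section, then inspects the resulting nodes, which are among the XML 1.0-specific node types that the XML module adds to the core tree model.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical document with a processing instruction and a CDATA section.
XML = '<?xml version="1.0"?><?render mode="fast"?><doc><![CDATA[a < b]]></doc>'
doc = parseString(XML)

# The XML declaration is not a node; the first child is the processing instruction.
pi = doc.firstChild
cdata = doc.documentElement.firstChild

print(pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE, pi.target, pi.data)
print(cdata.nodeType == Node.CDATA_SECTION_NODE, cdata.data)
```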

In addition, the DOM Working Group has defined several levels of functionality. The requirements for each level are formally documented by the W3C at http://www.w3.org/DOM/DOMTR.

Level 0: DOM Level 0 is not an official standard or recommendation of the W3C. Level 0 actually represents the object-oriented document functionality as implemented in Netscape Navigator 3.0 and Microsoft Internet Explorer 3.0.

HTML DOM is also sometimes referred to as DOM Level 0, although the DOM Level 1 documents use the term only for this pre-standard browser functionality, which was never formally specified.

Level 1: DOM Level 1 specifies the DOM Core and HTML DOM modules. The recommendation itself, like all the DOM recommendations, includes IDL (Interface Definition Language) definitions and Java and ECMAScript bindings. The DOM Level 1 Core specification includes such things as the actual tree structure, memory management, and naming conventions. The Level 1 HTML DOM includes naming conventions and HTML-specific elements.
Level 2: DOM Level 2 includes recommendations for DOM Core, Views, Events, Style, Traversal and Range, and HTML (still in progress as of this writing). The changes in DOM Level 2 Core include new types and changes to interfaces and exceptions, and the specification uses a more recent version of IDL.
Level 3: DOM Level 3 includes more changes to DOM Core and Events, as well as new Load and Save and XPath recommendations. As of this writing, all of the DOM Level 3 recommendations are still in the Working Draft stage, so there is no support for Level 3 in the .NET Framework.
Other Levels: The future holds any number of additional levels. Anything that you see in the list of DOM modules that is not listed in Levels 1 through 3 is fair game for some future level. Stay tuned to http://www.w3.org/DOM/ for the latest news about DOM.
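
The DOM Core defines a hasFeature method on the DOMImplementation interface for asking which modules and levels an implementation claims to support. The sketch below probes Python's xml.dom.minidom with a hypothetical list of feature and version pairs; the answers describe that implementation only, and other implementations will report different sets.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# (feature, version) pairs to probe; each implementation answers for itself.
probes = [("Core", "1.0"), ("Core", "2.0"), ("XML", "1.0"), ("XPath", "3.0")]
support = {probe: impl.hasFeature(*probe) for probe in probes}

for (feature, version), ok in support.items():
    print(feature, version, "supported" if ok else "not supported")
```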

For more information on the DOM generally, refer to XML in a Nutshell, 2nd Edition, by Elliotte Rusty Harold and W. Scott Means (O'Reilly).

5.1.2 When to Use the DOM

Because the DOM represents an XML document as a tree in memory, it is best used for small documents or documents for which the memory footprint is known in advance, and when the application needs to manipulate the document's structure rather than just reading in the XML data.

One thing to keep in mind if you are considering using the DOM is that the entire document must be read into memory before any of it is available for use. This differs from the read-only, forward-only model of XmlReader, which allows you to read a single node at a time, and thus gives you the ability to deal with very large XML documents efficiently.

For this reason, the DOM is also appropriate when you need to access XML elements or attributes non-sequentially. The entire document is resident in memory, so searching for a particular node does not require disk access.
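
The following sketch illustrates both points, again using Python's xml.dom.minidom as a stand-in DOM implementation with a hypothetical job queue document. Once the tree is in memory, nodes can be read in any order, and the structure itself can be rearranged, which a forward-only reader cannot do.

```python
from xml.dom.minidom import parseString

# Hypothetical document, used only for illustration.
XML = "<queue><job>first</job><job>second</job><job>third</job></queue>"
doc = parseString(XML)
jobs = doc.getElementsByTagName("job")

# Read nodes in any order: the last job, then the first.
last_text = jobs.item(2).firstChild.data
first_text = jobs.item(0).firstChild.data

# Restructure the tree: move the last job to the front of the queue.
root = doc.documentElement
root.insertBefore(jobs.item(2), root.firstChild)

order = [job.firstChild.data for job in root.getElementsByTagName("job")]
print(last_text, first_text, order)
```

The cost of this flexibility is the one already noted: the whole document was parsed into memory before the first node could be touched.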
