Chapter 11. XML and Databases

XML is good for many things. It makes an excellent data interchange format for sharing data between disparate systems, whether through files on disk or through web services on a network. It can be used to share data among homogeneous systems, as in .NET remoting. It can even be used to present data to a person using a text editor for review and modification. In the end, though, the uses of XML are limited by the underlying data storage associated with the XML data; whether it's in a file or accessed across a network, it usually comes down to some sort of I/O stream.

Relational databases are optimized to store large amounts of data, provide non-sequential access to it, and search and sort it, none of which XML does well. Ultimately, this comes down to a structural difference: a relational database is software built specifically to provide this sort of data storage, whereas XML is a data format that is not optimized for any one task.

In addition to the structural differences, relational databases provide several properties that XML by itself cannot. The main properties of a relational database are usually referred to by their acronym ACID:

Atomicity: Any group of actions taken on the database (called a transaction) is performed as a unit and can only be undone as a unit. If any part of a transaction fails, the entire transaction fails and the actions already taken are rolled back.
Consistency: Any transaction must cause the database to move from one consistent state to another. If a transaction causes the database to enter an inconsistent state, the whole transaction must fail atomically.
Isolation: Each transaction takes place in its own transaction space, and changes that are made within one transaction are invisible to other transactions until the transaction is complete. This ensures that other transactions always see the rest of the database in a consistent state.
Durability: The results of each completed transaction are permanent and will survive a subsequent system failure, such as a crash or a loss of power.
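
The atomicity and consistency properties above can be seen in a few lines of code. The following is a minimal sketch, not .NET code: it assumes Python's standard sqlite3 module and an in-memory database, and the accounts table, the transfer function, and the account names are all hypothetical, invented only for illustration. A CHECK constraint stands in for the database's consistency rule, and a transfer that would violate it is rolled back as a whole.

```python
import sqlite3


def make_db():
    """Create a hypothetical in-memory accounts table with a consistency rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as a single transaction.

    Returns True if the transaction committed, False if it was rolled back.
    """
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when src lacks funds,
            # which undoes the credit above as well.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dictionary keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If transfer is asked to move more than the source account holds, the credit to the destination account has already been executed by the time the debit fails, yet neither change is visible afterward. That is the behavior a plain XML file cannot give you without extra logic layered on top of it.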

Obviously, XML is only a data format, so it cannot by itself guarantee any of the ACID properties. Alongside those guarantees, relational databases provide fast, direct access to data in a way that XML cannot.

It's important to note that XML could be used as the underlying storage format for a relational database, if the database designer were willing to implement the layers of logic needed to enforce ACID. XML can also be stored within a database to take advantage of ACID. By itself, however, XML does not provide a reliable data store for the sorts of mission-critical applications that relational databases are designed for.
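
As a rough illustration of storing XML within a database, the sketch below serializes an XML element into a text column and parses it back on retrieval. It assumes Python's standard sqlite3 and xml.etree.ElementTree modules rather than any .NET API, and the documents table and the function names are hypothetical. The insert runs inside a transaction, so the stored document gets the same guarantees as any other row.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_store():
    """Create a hypothetical in-memory table that holds XML documents as text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def store_document(conn, doc_id, root):
    """Serialize an XML element and insert it inside a transaction."""
    xml_text = ET.tostring(root, encoding="unicode")
    with conn:  # commit on success, roll back on any exception
        conn.execute(
            "INSERT INTO documents (id, body) VALUES (?, ?)", (doc_id, xml_text)
        )


def load_document(conn, doc_id):
    """Fetch a stored document and parse it, or return None if it is absent."""
    row = conn.execute(
        "SELECT body FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    return ET.fromstring(row[0]) if row else None
```

Here the database treats the XML as an opaque string; querying inside the document would require either parsing it in application code, as load_document does, or a database with native XML support.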

The .NET Framework contains support for relational database access in the form of ADO.NET, and, as you might suspect, this support includes a rich set of XML-related features. I can't hope to tell you everything about using XML in databases with .NET, but I hope to give you a good introduction and tell you where to look for more information.

In addition to ADO.NET, SQL Server and Microsoft Access both have their own native methods of accessing their data as XML. For basic information on SQL Server, the Microsoft SQL Server home page at http://www.microsoft.com/sql/ contains links to a wealth of information. SQL Server Magazine, at http://www.sqlmag.com/, is an excellent resource for SQL Server database administrators. The Microsoft Access home page is at http://www.microsoft.com/office/access/.

I assume some knowledge of relational databases and the Structured Query Language (SQL) in this chapter. If you don't already know what SQL is, I suggest picking up SQL in a Nutshell, by Kevin Kline with Daniel Kline, Ph.D. (O'Reilly). For specific information on the flavor of SQL used in Microsoft SQL Server, look at Transact-SQL Programming, by Kevin Kline, Lee Gould, and Andrew Zanevsky (O'Reilly).
