Previous Page
Next Page

Hack 64. Export XML Data Sanely

Working around the thorny issue of exporting related data to XML.

Exporting a single table to XML produces some easily reusable data. Exporting multiple tables to XML, however, might not produce data that other applications can use; it all depends on how you structured your tables and relationships. You can solve this problem in two ways: restructure your data or use a query to export data that's been unnormalized.

For our initial example, we'll start with a database containing a table that defines a list of books. Figure 7-30 shows the Design view for that table. It includes six fields of three different types.

Figure 7-30. A simple table for export


For the initial tests, this table contains just a little bit of information. Exporting mature tables with thousands of records can quickly produce large XML filesdefinitely useful in real life but difficult for initial analysis. Figure 7-31shows a partial view of the content in the test table.

Figure 7-31. Test data in the books table


Exporting this table to XML involves a few steps, most of which will be familiar to developers who have exported information from Access databases before. The process starts by selecting the books table in the database, then selecting Export… from the File menu. The dialog box shown in Figure 7-32 will appear, and you'll need to select XML (*.xml) from the "Save as type" drop-down box.

Figure 7-32. Selecting the destination for the export


When you perform the export, Access might actually create more files than just the XML file, but they'll all appear in the same directory together with the XML. Once you click the Export button, a small dialog box with basic options, shown in Figure 7-33, appears.

Figure 7-33. Basic export options


For now, we'll accept the defaults and just click OK. This results in two files: books.xml and books.xsd. The books.xml file contains the information from the table, and books.xsd contains an XML Schema description of that content, annotated with a bit of information specific to Access and its Jet database engine.

The books.xml file, shown in Example 7-5, reflects the structure and content of the original table closely.

Example 7-5. A simple table export

<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="books.xsd"
generated="2003-03-26T13:49:17">
<books>
<ISBN>0596005385</ISBN>
<Title>Office 2003 XML Essentials</Title>
<Tagline>Integrating Office with the World</Tagline>
<Short_x0020_Description>Microsoft has added enormous XML functionality to Word,
Excel, and Access, as well as a new application, Microsoft InfoPath. This book
gets readers started in using those features.</Short_x0020_Description>
<Long_x0020_Description>Microsoft has added enormous XML functionality to Word,
Excel, and Access, as well as a new application, Microsoft InfoPath. This book
gets readers started in using those features.</Long_x0020_Description>
<PriceUS>34.95</PriceUS>
</books>
<books>
<ISBN>0596002920</ISBN>
<Title>XML in a Nutshell, 2nd Edition	</Title>
<Tagline>A Desktop Quick Reference</Tagline>
<Short_x0020_Description>This authoritative new edition of XML in a Nutshell
provides developers with a complete guide to the rapidly evolving XML space.</
Short_x0020_Description>
<Long_x0020_Description>This authoritative new edition of XML in a Nutshell
provides developers with a complete guide to the rapidly evolving XML space.
Serious users of XML will find topics on just about everything they need,
including fundamental syntax rules, details of DTD and XML Schema creation, XSLT
transformations, and APIs used for processing XML documents. Simply put, this is
the only references of its kind among XML books.</Long_x0020_Description>
<PriceUS>39.95</PriceUS>
</books>
<books>
<ISBN>0596002378</ISBN>
<Title>SAX2</Title>
<Tagline>Processing XML Efficiently with Java</Tagline>
<Short_x0020_Description>This concise book gives you the information you need to
effectively use the Simple API for XML, the dominant API for efficient XML
processing with Java.</Short_x0020_Description>
<Long_x0020_Description>This concise book gives you the information you need to
effectively use the Simple API for XML, the dominant API for efficient XML
processing with Java.</Long_x0020_Description>
<PriceUS>29.95</PriceUS>
</books>
</dataroot>

This document's root element, dataroot, is the only piece of this document specific to Access:

<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="books.
xsd" generated="2003-03-26T13:49:17">

It makes a namespace declaration for the od prefix, which isn't actually used in this document, and it includes a pointer to the XML Schema describing this document's structure. Because the element names used here aren't in any namespace, the document uses the xsi:noNamespaceSchemaLocation attribute to identify the schema that should be used for all the elements in this document that have no namespace. It also includes one small bit of metadata in the generated attribute, that identifies the time and date when this XML document was created.

The dataroot element contains three child books elements, each indicating a row in the books table. Their contents map fairly simply to the names and values of the table columns:

<books>
<ISBN>0596002920</ISBN>
<Title>XML in a Nutshell, 2nd Edition</Title>
<Tagline>A Desktop Quick Reference</Tagline>
<Short_x0020_Description>This authoritative new edition of XML in a Nutshell
provides developers with a complete guide to the rapidly evolving XML space.
</Short_x0020_Description>
<Long_x0020_Description>This authoritative new edition of XML in a Nutshell
provides developers with a complete guide to the rapidly evolving XML space.
Serious users of XML will find topics on just about everything they need,
including fundamental syntax rules, details of DTD and XML Schema creation,
XSLT transformations, and APIs used for processing XML documents. Simply
put, this is the only references of its kind among XML books.</Long_x0020_
Description>
<PriceUS>39.95</PriceUS>
</books>

The only significant variation here involves the column names, which include spaces. Instead of Short Description, now we have Short_x0020_Description, following a convention Microsoft developed for representing spaces in XML element names.

XML forbids spaces in element names because they make it difficult to separate the element name from the attributes, so Access uses _x0020_, the Unicode hex number, for the space.


Exporting individual tables is useful, but sometimes you might want to export multiple tables and preserve the relationships between them. Access allows you to export a set of tables, though it works most easily when only two tables are involved.

7.7.1. Exporting from Tables in a One-to-Many Relationship

For our first example, we'll add a table that contains information about (very fictional) promotions for various books. Figure 7-34 shows what this table looks like.

Figure 7-34. The promotions table


The promotions table links to the books table through its BookID field, as shown in Figure 7-35.

Figure 7-35. Relationship between the books and promotions tables


Exporting this pair of tables takes a few more steps because Access lets you choose how the export works. The choice of which table is the base table makes a big difference in the export results, so the following examples will export it both ways.

We'll start by exporting the books table again, but this time, we'll select More Options from the dialog box shown in Figure 7-36.

Figure 7-36. Basic export options


Clicking More Options brings up a larger dialog with a lot more choices, as shown in Figure 7-37.

In this case, all the information we need is on the first (Data) tab. Checking the Promotions box and clicking the OK button tells Access to export both the books table and the linked records of the promotions tablein this case, all of them. Example 7-6 shows an abbreviated version of the export, with the new content from the promotions table in bold.

Figure 7-37. The full version of the Export XML dialog box


Example 7-6. Exported linked tables

<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="ch0804.xsd"
generated="2003-03-31T16:37:01">
<books>
<ISBN>0596005385</ISBN>
<Title>Office 2003 XML Essentials</Title>
<Tagline>Integrating Office with the World</Tagline>
<Short_x0020_Description>...</Short_x0020_Description>
<Long_x0020_Description>...</Long_x0020_Description>
<PriceUS>34.95</PriceUS>

<promotions>
<PromotionID>1</PromotionID>
<BookID>0596005385</BookID>
<Name>Palm civet bonus</Name>
<Venue>Anywhere interested</Venue>
<Description>A stuffed-animal palm civet,
lovingly screen-printed to match the cover,
with every copy of the book.</Description>
<Cost>10000</Cost>
</promotions>
<promotions>
<PromotionID>3</PromotionID>
<BookID>0596005385</BookID>
<Name>Key chains</Name>
<Venue>Conferences</Venue>
<Description>keychains adorned with lovely palm civets
and the title of the book.</Description>
<Cost>1000</Cost>
</promotions>

</books>
<books>
<ISBN>0596002920</ISBN>
<Title>XML in a Nutshell, 2nd Edition</Title>
<Tagline>A Desk top Quick Reference</Tagline>
<Short_x0020_Description>...</Short_x0020_Description>
<Long_x0020_Description>...</Long_x0020_Description>
<PriceUS>39.95</PriceUS>
</books>
<books>
<ISBN>0596002378</ISBN>
<Title>SAX2</Title>
<Tagline>Processing XML Efficiently with Java</Tagline>
<Short_x0020_Description>...</Short_x0020_Description>
<Long_x0020_Description>...</Long_x0020_Description>
<PriceUS>29.95</PriceUS>

<promotions>
<PromotionID>2</PromotionID>
<BookID>0596002378</BookID>
<Name>Free filters</Name>
<Venue>Online/Safari</Venue>
<Description>Bonus SAX filters, open source-licensed,
for developers who visit the SAX2 book site.</Description>
<Cost>0</Cost>
</promotions>

</books>
</dataroot>

The general pattern here is much like the original export of the books table, except that zero or more promotions elementswhose BookID holds the same value as the containing books element's ISBN elementnow appear inside each books element. This works the same way that zero or more books elements appeared inside the dataroot element. All the table columns are listed inside each promotions element, making it easy to reconstruct the information in the promotions table or to treat the information as a complete set of information about each book. There's no need to reconstruct the original tables and calculate primary key/foreign key links.

As soon as you step beyond the one-to-many relationship, however, this kind of simple containment will fail you.

7.7.2. Exporting from Tables in a Many-to-Many Relationship

A many-to-many relationship, implemented with an intermediary table, as shown in Figure 7-38, produces XML that most likely will be useful only if someone reimports it into Access and works with it there.

Access lets you traverse this relationship in an XML export, as shown in Figure 7-39. This time, the export uses a [Lookup Data] element to indicate that simply nesting the data in the XML document structures isn't going to work. One-to-many relationships are represented using containment, and many-to-one relationships are represented as separate pieces. In this case, the many-to-many relationship includes both of those choices.

Figure 7-38. Related tables with a many-to-many relationship, expressed as two one-to many relationships


Figure 7-39. Exporting related tables with a many-to-many relationship


[Lookup Data] provides a warning that reassembling some of these relationships is going to require extra lookup work on the part of the consuming application.

If you reimport this data into Access, it'll do that work, so this might not be a problem.


Example 7-7 shows the results of this export.

Example 7-7. A many-to-many export combining containment and lookup

<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="ch0806.xsd"
generated="2003-04-01T21:01:50">
<books>
<ISBN>0596005385</ISBN>
<Title>Office 2003 XML Essentials</Title>
<Tagline>Integrating Office with the World</Tagline>
<Short_x0020_Description>...</Short_x0020_Description>
<Long_x0020_Description>...</Long_x0020_Description>
<PriceUS>34.95</PriceUS>
<authorBookLink>
<bookISBN>0596005385</bookISBN>
<authorID>1</authorID>
</authorBookLink>
</books>
<books>
<ISBN>0596002920</ISBN>
<Title>XML in a Nutshell, 2nd Edition</Title>
<Tagline>A Desktop Quick Reference</Tagline>
<Short_x0020_Description>...</Short_x0020_Description>
<Long_x0020_Description>...</Long_x0020_Description>
<PriceUS>39.95</PriceUS>
<authorBookLink>
<bookISBN>0596002920</bookISBN>
<authorID>3</authorID>
</authorBookLink>
<authorBookLink>
<bookISBN>0596002920</bookISBN>
<authorID>4</authorID>
</authorBookLink>
</books>
<books>
<ISBN>0596002378</ISBN>
<Title>SAX2</Title>
<Tagline>Processing XML Efficiently with Java</Tagline>
<Short_x0020_Description>...</Short_x0020_Description>
<Long_x0020_Description>...</Long_x0020_Description>
<PriceUS>29.95</PriceUS>
<authorBookLink>
<bookISBN>0596002378</bookISBN>
<authorID>2</authorID>
</authorBookLink>
</books>
<authors>
<AuthorID>1</AuthorID>
<GivenName>Simon</GivenName>
<FamilyName>St.Laurent</FamilyName>
<FullName>Simon St.Laurent</FullName>
</authors>
<authors>
<AuthorID>2</AuthorID>
<GivenName>David</GivenName>
<FamilyName>Brownell</FamilyName>
<FullName>David Brownell</FullName>
</authors>
<authors>
<AuthorID>3</AuthorID>
<GivenName>Elliotte</GivenName>
<FamilyName>Harold</FamilyName>
<FullName>Elliotte Rusty Harold</FullName>
</authors>
<authors>
<AuthorID>4</AuthorID>
<GivenName>Scott</GivenName>
<FamilyName>Means</FamilyName>
<FullName>W. Scott Means</FullName>
</authors>
</dataroot>

Now each books element contains one or more authorBookLink elements, each holding an authorID element. The value of that authorID element maps to an authorID element inside an authors element. If the data is going back into Access, this is fine, but if it's going to another applicationExcel, perhaps, or an XSLT transformation into HTML for a browserthis isn't much fun.

This might feel like a case in which it would make sense to store repetitive (nonnormalized) data in the tables, but fortunately, there's a better option: exporting a query instead of a table.

7.7.3. Using a Query to Tame the Export

By themselves, queries don't provide nested views, but they certainly make it easier to present some kinds of informationnotably, many-to-many relationships. The mechanics of exporting queries are much like those of exporting single tables, and the results are similar.

Access supports SQL queries, obviously, because that's at the heart of its functionality. Access doesn't, however, support other standards for querying, such as XQuery.


To demonstrate, let's export a SQL query named booksByAuthor, which uses the books, authors, and authorBookLink tables to create a list of books sorted by author. The SQL for the query expresses the relationships an XML processor working with the linked table export would otherwise have to deal with:

SELECT authors.GivenName, authors.FamilyName, books.ISBN, books.Title
FROM books INNER JOIN (authors INNER JOIN authorBookLink ON authors.AuthorID
= authorBookLink.authorID) ON books.ISBN = authorBookLink.bookISBN
ORDER BY authors.FamilyName;

The interface for exporting a query is the same as the interface for a table, except there is no option for exporting linked information. When you export a query, all the information you want to export must be in that query. Exporting the query produces the result shown in Example 7-8.

Example 7-8. An exported query

<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="booksByAuthor.
xsd" generated="2003-04-02T14:47:59">
<booksByAuthor>
<GivenName>David</GivenName>
<FamilyName>Brownell</FamilyName>
<ISBN>0596002378</ISBN>
<Title>SAX2</Title>
</booksByAuthor>
<booksByAuthor>
<GivenName>Elliotte</GivenName>
<FamilyName>Harold</FamilyName>
<ISBN>0596002920</ISBN>
<Title>XML in a Nutshell, 2nd Edition</Title>
</booksByAuthor>
<booksByAuthor>
<GivenName>Scott</GivenName>
<FamilyName>Means</FamilyName>
<ISBN>0596002920</ISBN>
<Title>XML in a Nutshell, 2nd Edition</Title>
</booksByAuthor>
<booksByAuthor>
<GivenName>Simon</GivenName>
<FamilyName>St.Laurent</FamilyName>
<ISBN>0596005385</ISBN>
<Title>Office 2003 XML Essentials</Title>
</booksByAuthor>
</dataroot>

Just as in a tabular representation of the query, information repeatsnotably, the ISBN and title of XML in a Nutshell, which has two authors. If you're sending data to an application that lacks Access's appreciation for relations between tables, this approach will probably work much more easily.

7.7.4. See Also

  • "Import Varied XML Data into Access" [Hack #63]

  • "Break Through VBA's Transformation Barrier" [Hack #65]

Simon St. Laurent

    Previous Page
    Next Page