Previous Page
Next Page

Hack 63. Import Varied XML Data into Access

Access is pretty good at importing simple XML data, but sometimes you want to import data that isn't precisely the way Access expects it to be.

Access lets you import data from XML files into its tables. For example, let's consider a database containing a table that defines a list of books. Figure 7-19 shows the Design view for this table. It includes six fields of three different types.

To begin, the table contains a few books, as shown in Figure 7-20.

The easiest way to see the XML format Access expects to receive when it imports data to this table is to export some of the data, which you can do by selecting a table in the database and then selecting Export... from the File menu. In this case, the XML format we'll need to let Access import automatically looks like the data that was just exported as XML. In other words, exporting records into XML shows the XML node structure any XML data being imported back in should have. Example 7-1 shows the exported data.

Figure 7-19. A simple table to which we'll import data


Figure 7-20. Test data in the books table


Example 7-1. New data for import

<?xml version="1.0" encoding="UTF-8"?>
<dataroot>
<books>
<ISBN>0596002637</ISBN>
<Title>Practical RDF</Title>
<Tagline>Solving Problems with the Resource Description Framework</Tagline>
<Short_x0020_Description>The Resource Description Framework (RDF) is a structure
for describing and interchanging metadata on the Web.</Short_x0020_Description>
<Long_x0020_Description>The Resource Description Framework (RDF) is a structure
for describing and interchanging metadata on the Web - anything from library
catalogs and worldwide directories to bioinformatics, Mozilla internal data
structures, and knowledge bases for artificial intelligence projects.</Long_
x0020_Description>
<PriceUS>39.95</PriceUS>
</books>
<books>
<ISBN>0596003838</ISBN>
<Title>Content Syndication with RSS</Title>
<Tagline>Sharing Headlines and Information Using XML</Tagline>
<Short_x0020_Description>RSS is sprouting all over the Web, connecting weblogs
and providing news feeds.</Short_x0020_Description>
<Long_x0020_Description>RSS is sprouting all over the Web, connecting weblogs and
providing news feeds. Originally developed by Netscape in 1999, RSS (which can
stand for RDF Site Summary, Rich Site Summary, or Really Simple Syndication) is
an XML-based format that allows Web developers to create a data feed that
supplies headlines, links, and article summaries from a web site</Long_x0020_
Description>
<PriceUS>29.95</PriceUS>
</books>
<books>
<ISBN>0596002912</ISBN>
<Title>XPath and XPointer</Title>
<Tagline>Locating Content in XML Documents</Tagline>
<Short_x0020_Description>Referring to specific information inside an XML document
can be like looking for a needle in a haystack: how do you differentiate the
information you need from everything else?</Short_x0020_Description>
<Long_x0020_Description>Referring to specific information inside an XML document
can be like looking for a needle in a haystack: how do you differentiate the
information you need from everything else? XPath and XPointer are two closely
related tools that play a key role in XML processing by allowing developers to
find these needles and manipulate embedded information.</Long_x0020_Description>
<PriceUS>24.95</PriceUS>
</books>
</dataroot>

The structure begins with the dataroot element, though Access doesn't actually care what that container element's name is. The books element tells Access this information goes into the books table, and the ISBN, Title, Tagline, and other elements inside each books element go to fields in the books table. The only trick is in the Short Description and Long Description fields, which, because XML won't accept spaces in tag names, Access prefers to see as Short_x0020_Description and Long_x0020_Description. Access doesn't care what order the fields come in, but it will recognize them only if they're child elements, not attributes.

To get started, select Get External Data from the File menu, and then select Import. The dialog box shown in Figure 7-21 will appear.

You might need to select XML from the "Files of type" drop-down menu at the bottom because the dialog initially defaults to Access formats. Select a file, and click Import. The Import XML dialog box shown in Figure 7-22 will appear.

You can click the plus sign to the left of the books if you want to inspect their structure. If you just click OK, Access creates a new table called books1 (or whatever number avoids a conflict) to import the XML into Access without conflicting with the prior XML table.

Figure 7-21. Initial Import dialog box


Figure 7-22. Import dialog box showing structure of XML documents


That might be perfectly fine because it gives you a chance to compare the new data with the old before merging the two. Access provides two more options, however: one that lets you just create a new table based on the structure of the XML file, and another that lets you append the data in the XML file to an existing table. In this case, we know the new books are different from the old books, so click Options, and select Append Data to Existing Table(s), as shown in Figure 7-23.

If you click OK now, the extra books will be added to the existing books table, as shown in Figure 7-24.

Access refuses to import XML data, which causes a conflict with existing key relationships. For example, if you import that same document again in the same way, you'll be rewarded with the ImportErrors table shown in Figure 7-25.

Figure 7-23. Import dialog box showing more complex structure of XML documents and append options


Figure 7-24. The results of importing a document and appending its data


Figure 7-25. The results of importing a document and appending its data when the data is already there


Using the Transform… button shown in Figure 7-23, you can also perform conversions, which make it easier to import data that doesn't arrive in a form that meets Access's expectations. For example, suppose information about a new book arrived in the form shown in Example 7-2.

Example 7-2. ch0812.xml, an attribute-based XML document for import

<update>
<books ISBN="0596003277" Title="Learning XSLT" Tagline="A Hands-On
Introduction to XSLT and XPath" Short_x0020_Description="A gentle
introduction to the complex intricacies of XSLT" Long_x0020
_Description="A gentle introduction to the complex intricacies of
XSLT and XPath, walking through the spec from simple work to
complex." PriceUS="34.95" />
</update>

In Example 7-2, all data is stored in attributes, and Access won't even look at attributes during an import. To get this information into Access, you need to use a transformation, such as the generic one shown in Example 7-3, which converts all attributes to child elements.

Example 7-3. ch0813.xsl, a stylesheet for transforming attributes into elements

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!--Derived from recipe 6.1 of Sal Mangano's XSLT Cookbook-->

<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="@*">
  <xsl:element name="{local-name(.)}" namespace="{namespace-uri(..)}">
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>

<xsl:template match="node( )">
  <xsl:copy>
    <xsl:apply-templates select="@* | node( )"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

When applied to Example 7-2, the stylesheet in Example 7-3 produces the result shown in Example 7-4, which Access can import easily.

Again, Access doesn't care what the root element's name is; update is an appropriate description for human consumption.


Example 7-4. An "elementized" version of the data in Example 7-2

<?xml version="1.0" encoding="UTF-8"?>
<update>
<books>
<ISBN>0596003277</ISBN>
<Title>Learning XSLT</Title>
<Tagline>A Hands-On Introduction to XSLT and XPath</Tagline>
<Short_x0020_Description>A gentle introduction to the complex intricacies of
XSLT</Short_x0020_Description>
<Long_x0020_Description>A gentle introduction to the complex intricacies of XSLT
and XPath, walking through the spec from simple work to complex.</Long_x0020_
Description>
<PriceUS>34.95</PriceUS>
</books>
</update>

If you tell Access to import ch0812.xml, the file shown in Example 7-2, you won't have much to choose from in the Import XML dialog box, as shown in Figure 7-26.

Figure 7-26. Access's initial reaction to the document that stores data in attributes


If you choose Options Transform…, you canadd the stylesheet, much as you did for the export transformation. Add the stylesheet to the list of transformations, and select ch0813, as shown in Figure 7-27.

When you click OK, Access applies the transformation to the document, modifying the display of components you see and producing the result in Figure 7-28.

In this case, the table already exists, so be sure to select Append Data to Existing Table(s). When you click OK, the data from Example 7-1 is added to the books table, as shown in Figure 7-29.

Transformations are a powerful tool in pretty much any area of XML development. Using a bit of XSLTadmittedly, a bit challenging to learnyou can convert the structures you have into the structures Access expects.

Figure 7-27. Selecting a stylesheet for transformation


Figure 7-28. A transformed document ready for import


7.6.1. See Also

Simon St. Laurent

Figure 7-29. The result of importing a transformed document


    Previous Page
    Next Page