8.3. Sarissa

Sarissa is a GPL-licensed library that provides a cross-browser wrapper for the native JavaScript XML APIs. It provides an ECMA-style API on all browsers it supports, which allows you to write to the standard no matter what browser you might be using. Its major features are AJAX communications, XPath, and XSLT support. Sarissa supports most major browsers, including Firefox and other Mozilla-based browsers, Internet Explorer (MSXML 3.0+), Konqueror (KDE 3.3+), Safari, and Opera. The code has reached a stable level and no longer has frequent releases, but the forums are busy and the developers respond to questions. Sarissa can be downloaded from http://sourceforge.net/projects/sarissa, and it has online documentation available at http://sarissa.sourceforge.net/.

8.3.1. Installation

Sarissa is a pure JavaScript library, so it's quite easy to install. Download the zip file from the SourceForge.net download page, and extract its contents to an accessible location on your Web server. The examples in this chapter use Sarissa version 0.9.6.1 installed at http://localhost/sarissa/; the Sarissa code is extracted into a subdirectory below that.

The release includes API documentation, including a basic tutorial located in the doc directory. It also includes unit tests that can be run by loading testsarissa.html and a sample application, minesweeper, in the sample-apps/minesweeper directory.

8.3.2. Making an AJAX Request

Sarissa gives you the ability to access XMLHttpRequest directly (or on IE6, a wrapper class that looks the same), but that's not how you usually want to use it to make AJAX requests. Sarissa is designed around loading XML documents, so you can easily use the load command on its DOM documents to make a remote request.

Listing 8-1 does three main tasks: it includes the Sarissa library, creates a loadDoc function (which does an AJAX load of an XML file), and provides a simple UI for running the loadDoc function. The Sarissa library is included on line 5; in this example, the library is installed in the sarissa subdirectory. Lines 9-21 define the loadDoc function, which is made up of a number of subtasks. Line 10 gets an empty Sarissa DomDocument. Lines 12-17 define a handler function that is called each time the ready state of the DomDocument changes. This ready state handler is just like the one on XMLHttpRequest; state 4 is reached when the document is fully loaded. When this state is reached (line 13), we use the Sarissa.serialize method to turn the loaded document back into its textual XML representation and then turn each < into its entity form (&lt;) so that we can show the XML document in an HTML document (lines 14-15). Line 19 attaches the handler we defined to the DomDocument, and line 20 loads the sarissaNews.xml file from the server. In most cases, this XML file would be dynamically generated, but to keep this example simple, a static file is used.

Listing 8-1. `SarissaMakingAnAJAXRequest.html`

     1  <html>
     2  <head>
     3  <title>Making an AJAX Request with Sarissa</title>
     4
     5  <script type="text/javascript" src="sarissa/sarissa.js">
     6  </script>
     7
     8  <script type="text/javascript">
     9  function loadDoc() {
    10      var oDomDoc = Sarissa.getDomDocument();
    11
    12      var rHandler = function() {
    13          if(oDomDoc.readyState == 4) {
    14              document.getElementById('target').innerHTML =
    15              Sarissa.serialize(oDomDoc).replace(/</g,'&lt;');
    16          }
    17      }
    18
    19      oDomDoc.onreadystatechange = rHandler;
    20      oDomDoc.load("sarissaNews.xml");
    21  }
    22  </script>
    23  </head>
    24  <body>
    25  <a href="javascript:loadDoc()">Load news.xml</a>
    26  <pre id="target"></pre>
    27  </body>
    28  </html>
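
The listing escapes only the < character, which is enough for a quick demo. If the XML might contain ampersands or you want the output to be safe in more contexts, a slightly fuller escape can help. The following is a minimal sketch; escapeXml is a hypothetical helper name and is not part of Sarissa.

```javascript
// Hypothetical helper (not part of Sarissa): escape the characters that an
// HTML parser would otherwise treat as markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;') // must run first so the entities added below are not escaped again
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Possible use inside the ready state handler of Listing 8-1:
//   document.getElementById('target').innerHTML =
//       escapeXml(Sarissa.serialize(oDomDoc));
```

Escaping the ampersand first matters; doing it last would turn the &lt; produced by the earlier replacements into &amp;lt;.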

8.3.3. Basic XML Features

The Sarissa library focuses on providing good cross-browser XML support. To provide this, it creates a standardized interface to DOM documents loaded from any source. Most of this work is providing compatibility methods for Internet Explorer, hiding the fact that the XML capabilities are provided by the MSXML ActiveX control instead of by native JavaScript objects.

8.3.4. Working with DOM Documents

DOM documents are created in Sarissa through the use of the Sarissa.getDomDocument() method. Once you have a document, you can load content into it using three different methods. You can load remote data using AJAX (as shown in Listing 8-1), you can parse a string that contains XML data, or you can create the elements using standard DOM functions. Sarissa also includes a utility method, Sarissa.serialize(), for working with DOM documents. This prints out the document as its XML output, which is useful for debugging or in cases in which you want to send XML payloads to the server. To use the serialize method, just send the method a DOM document; a basic example is shown here:

Sarissa.serialize(domDoc);

8.3.4.1. Loading DOM Documents from a String

Loading DOM documents from a string gives you the ability to load a number of XML documents in a single request and then parse them into DOM documents to work with them. This can be a useful strategy for preloading XML during the normal page load, or it can be used with XMLHttpRequests that return data other than XML. (An example of such data is JSON.) A small example HTML page, which loads a short XML string into a Sarissa DOM document, is shown in Listing 8-2.

Listing 8-2. `SarissaDOMDocumentString.html`

     1  <html><head>
     2  <title>Loading a DOM document with an XML string</title>
     3
     4  <script type="text/javascript" src="sarissa/sarissa.js">
     5  </script>
     6
     7  <script type="text/javascript">
     8  var xmlData = '<rss version="2.0"></rss>';
     9
    10  function loadDoc() {
    11      var parser = new DOMParser();
    12      var domDoc = parser.parseFromString(
    13          xmlData, "text/xml");
    14
    15      document.getElementById('target').innerHTML =
    16      Sarissa.serialize(domDoc).replace(/</g,'&lt;');
    17  }
    18  </script>
    19  </head>
    20  <body>
    21  <a href="javascript:loadDoc()">Load XML String</a>
    22  <pre id="target"></pre>
    23  </body>
    24  </html>

In Listing 8-2, all the Sarissa interaction takes place within the loadDoc function, which is defined on lines 10-17. The Sarissa library is loaded on lines 4-5, and an example XML string is defined on line 8. In practice, this string would be generated from a server-side language like PHP, allowing XML data to be accessed without an extra HTTP request. Line 10 starts our worker loadDoc function. First we create a DOMParser (line 11), and then we use its parseFromString method to parse the XML string data contained in the xmlData variable (lines 12-13). parseFromString takes two parameters: the XML string and its content type. The content type is usually text/xml, but application/xml and application/xhtml+xml can also be used. The parseFromString method returns a DOM document, which can be used just like the one from Sarissa.getDomDocument().

On lines 15-16, we print out the document using some basic entity replacement so that we can see the output in the browser. The rest of the HTML is a link to run the example (line 21) and a pre element that we use as a target for the printed-out DOM node (line 22).
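
The idea of delivering several XML documents in one response and parsing them on the client can be sketched as follows. This is only an illustration under stated assumptions: parseXmlStrings is a hypothetical helper, and the parser argument is assumed to be a DOMParser like the one created on line 11 of Listing 8-2.

```javascript
// Hypothetical helper: parse several named XML strings into DOM documents.
// `parser` is assumed to be a DOMParser (native, or the one Sarissa supplies).
function parseXmlStrings(xmlByName, parser) {
  var docs = {};
  for (var name in xmlByName) {
    // Skip properties inherited from the prototype chain.
    if (Object.prototype.hasOwnProperty.call(xmlByName, name)) {
      docs[name] = parser.parseFromString(xmlByName[name], 'text/xml');
    }
  }
  return docs;
}

// Possible use in the browser, with strings written into the page by the server:
//   var docs = parseXmlStrings({ news: newsXml, menu: menuXml }, new DOMParser());
//   Sarissa.serialize(docs.news);
```

Passing the parser in as an argument keeps the helper independent of how the DOMParser is obtained.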

8.3.4.2. Creating a DOM Document Manually

Because Sarissa works with DOM documents, all the normal DOM methods and properties are available. This allows you to create a DOM document with just its root node specified and then append additional nodes to it. In most cases, you won't use this functionality to create a complete DOM document; instead, you will use it to update a document loaded by one of the other methods. When creating a document manually, you'll want to tell the getDomDocument method which root node to create; this is done by filling in getDomDocument's optional parameters. Sarissa.getDomDocument takes two parameters: the namespace of the root and the local name of the root node. Listing 8-3 shows a small example using this method.

Listing 8-3. `SarissaCreateNodesWithDom.html`

     1  <html>
     2  <head>
     3  <title>Sarissa: Create elements on a DomDocument</title>
     4
     5  <script type="text/javascript" src="sarissa/sarissa.js">
     6  </script>
     7
     8  <script type="text/javascript">
     9  function loadDoc() {
    10      var domDoc = Sarissa.getDomDocument(null,'foo');
    11
    12      var elBar = domDoc.createElement('bar');
    13      domDoc.firstChild.appendChild(elBar);
    14
    15      var elBaz = domDoc.createElement('baz');
    16      var text = domDoc.createTextNode('Some Text');
    17      elBaz.appendChild(text);
    18
    19      domDoc.firstChild.appendChild(elBaz);
    20
    21      document.getElementById('target').innerHTML =
    22      Sarissa.serialize(domDoc).replace(/</g,'&lt;');
    23  }
    24  </script>
    25  </head>
    26  <body>
    27  <a href="javascript:loadDoc()">Create an
    28  XML document manually</a>
    29  <pre id="target"></pre>
    30  </body>
    31  </html>

Listing 8-3 follows the same pattern as the previous examples: a loadDoc function is called by a small HTML interface. On lines 5-6, we include the Sarissa library, followed by the main JavaScript block, which defines loadDoc (lines 8-24). Line 10 creates the empty DOM document; we're not setting the XML namespace, so we pass null for that parameter, and the root node is named foo. Line 12 creates a new element with a tag name of bar; this is appended to the document on line 13. The bar element is appended to the firstChild of the DOM document (the root foo element), not directly to the document, because an XML document can have only a single root element.

Lines 15-19 repeat the same process for an element with the tag name of "baz". This time, however, we add a child node to "baz". In this case, it is a DOM text node with the value of "Some Text", but it could also be any other XML element. There are two main types of nodes you work with in an XML document: element nodes, which represent the XML tags, and text nodes, which hold the content within tags. This distinction also exists in HTML, but you don't see it as often because you can use the innerHTML property to grab the text content without worrying about DOM nodes. Lines 21-22 use Sarissa.serialize to output the generated document to the target element.
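
When you add many elements this way, the create-then-append steps repeat quickly. One way to reduce the repetition is a small helper such as the one sketched below. The name appendTextElement is hypothetical; it uses only the standard DOM calls already shown in Listing 8-3.

```javascript
// Hypothetical helper: create <tagName>, optionally give it a text node
// child, and append it to `parent`. Returns the new element.
function appendTextElement(domDoc, parent, tagName, text) {
  var el = domDoc.createElement(tagName);
  if (text !== undefined && text !== null) {
    // Text content lives in its own text node, not on the element itself.
    el.appendChild(domDoc.createTextNode(String(text)));
  }
  parent.appendChild(el);
  return el;
}

// Possible use, mirroring lines 10-19 of Listing 8-3:
//   var domDoc = Sarissa.getDomDocument(null, 'foo');
//   appendTextElement(domDoc, domDoc.firstChild, 'bar');
//   appendTextElement(domDoc, domDoc.firstChild, 'baz', 'Some Text');
```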

8.3.5. Using XPath to Find Nodes in a Document

Many times, when you're displaying data from an XML document, you'll want to look only at specific portions of the document. This is especially true for formats such as RSS that contain a number of news entries. XPath is an XML technology that allows you to select specific nodes within a document. A basic XPath follows the nodes from the root of the document to the element you're specifying. Each element can be directly addressed by a path; these paths start with a / and contain a / between each node (/rss/item). Further specificity can be provided by adding a bracketed number after the node name (/rss/item[1]). This path selects a particular occurrence of the node when there are multiple instances of a tag in this particular branch of the document. XPath can also query a document by starting with a double slash (//); these paths return any matching nodes (//item). Listing 8-4 shows an XML document that is used in some subsequent examples in this chapter.

Listing 8-4. `An Example XML File`

    1  <rss>
    2      <item>
    3          <title>AJAX Defined</title>
    4      </item>
    5      <item new="true">
    6          <title>Web 2.0 News</title>
    7      </item>
    8  </rss>

You can refer to the nodes of this document in a number of different ways. First, there are absolute paths. The path /rss/item[1] refers to the item node that starts on line 2 and ends on line 4. The path /rss/item[2]/title refers to the title node on line 6. You can also use query-style paths; the path //item refers to both the item node on lines 2-4 and the item node on lines 5-7. These queries can also look at attributes by using an "@"; the path //item[@new="true"]/title refers to the title node on line 6.
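
To make the way absolute paths are resolved more concrete, the sketch below walks such a path over a plain-object copy of Listing 8-4. It is a toy for illustration only, not an XPath engine and not how Sarissa or any browser implements XPath: it handles element names and 1-based positional predicates and nothing else. The names rssDoc and resolveAbsolutePath are hypothetical.

```javascript
// The document from Listing 8-4 as plain objects (a stand-in for a real DOM).
var rssDoc = {
  name: 'rss', children: [
    { name: 'item', children: [
      { name: 'title', text: 'AJAX Defined', children: [] }
    ] },
    { name: 'item', children: [
      { name: 'title', text: 'Web 2.0 News', children: [] }
    ] }
  ]
};

// Resolve an absolute path such as "/rss/item[2]/title" to an array of nodes.
function resolveAbsolutePath(root, path) {
  var steps = path.split('/').slice(1);   // drop the empty string before the leading "/"
  var current = [{ children: [root] }];   // virtual document node above the root element
  for (var i = 0; i < steps.length; i++) {
    var match = /^([^\[\]]+)(?:\[(\d+)\])?$/.exec(steps[i]);
    if (!match) {
      throw new Error('Unsupported step: ' + steps[i]);
    }
    var next = [];
    for (var j = 0; j < current.length; j++) {
      var sameName = current[j].children.filter(function (child) {
        return child.name === match[1];
      });
      if (match[2]) {
        // Positions are counted per parent and start at 1, not 0.
        var pos = parseInt(match[2], 10);
        sameName = sameName.slice(pos - 1, pos);
      }
      next = next.concat(sameName);
    }
    current = next;
  }
  return current;
}

// resolveAbsolutePath(rssDoc, '/rss/item[2]/title')[0].text is 'Web 2.0 News'
```

Each step narrows the current node set to the matching children, which is why /rss/item without a bracketed number selects both item nodes.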

XPath is able to do more complex queries than what is shown in this simple overview. If you're dealing with XML documents in the browser, you will find XPath to be an important tool. XPath is a W3C standard, so you can easily find more information to move past the basics.

Sarissa provides the IE XPath API to all the browsers it supports, which gives you an easy-to-use cross-browser API. The API consists of two methods on a DOM document: the selectSingleNode method and the selectNodes method. Each method takes an XPath, with selectSingleNode returning a single DOM node and selectNodes returning a node collection that you can iterate over to access all the nodes. Listing 8-5 is a small example page that shows how to use these XPath methods.

Listing 8-5. `SarissaSearchingWithXpath.html`

     1  <html>
     2  <head>
     3  <title>Sarissa: Searching XML with XPath</title>
     4
     5  <script type="text/javascript" src="sarissa/sarissa.js">
     6  </script>
     7  <script type="text/javascript"
     8      src="sarissa/sarissa_ieemu_xpath.js"></script>
     9
    10  <script type="text/javascript">
    11  var domDoc;
    12  function loadDoc() {
    13      domDoc = Sarissa.getDomDocument();
    14
    15      var rHandler = function() {
    16          if(domDoc.readyState == 4) {
    17              document.getElementById('target').innerHTML =
    18                  "Document Loaded, ready to Search";
    19
    20              document.getElementById('afterLoad'
    21                  ).style.display = 'block';
    22          }
    23      }
    24
    25      domDoc.onreadystatechange = rHandler;
    26      domDoc.load("sarissaNews.xml");
    27  }
    28

Lines 1-8 perform the basic HTML setup. Besides including the main Sarissa library file, we also include the sarissa_ieemu_xpath.js file. This file provides the IE XPath API to other browsers, and it is how Sarissa provides cross-browser XPath support. Lines 12-27 define a loadDoc function, which loads the remote XML document we will be searching in this example. This code is nearly identical to the earlier AJAX XML loading examples. The first difference is that we now define the domDoc variable outside of the function so that it can be used elsewhere. The second is that, instead of printing the document out when it is loaded, we show a DIV element that contains more links. This file is continued in Listing 8-6, where the logic appears for searching the DOM using XPath.

Listing 8-6. `SarissaSearchingWithXpath.html` Continued

    29  function searchBuildDate() {
    30      var el = domDoc.selectSingleNode('//lastBuildDate');
    31      document.getElementById('target').innerHTML =
    32          "Build date = " + el.firstChild.nodeValue;
    33  }
    34
    35  function searchItems() {
    36      var list = domDoc.selectNodes('//item/title');
    37
    38      var target = document.getElementById('target');
    39      target.innerHTML = "Number of Items = "+ list.length+
    40          "<br>Titles:<br>";
    41
    42      for(var i = 0; i < list.length; i++) {
    43          target.innerHTML +=
    44              list[i].firstChild.nodeValue + "<br>";
    45      }
    46  }
    47  </script>
    48  </head>
    49  <body>
    50  <a href="javascript:loadDoc()">Load news.xml</a>
    51  <div id="afterLoad" style="display: none">
    52  <a href="javascript:searchBuildDate()">Last build date</a>
    53  <a href="javascript:searchItems()">List item titles</a>
    54  </div>
    55  <pre id="target"></pre>
    56  </body>
    57  </html>

Lines 29-33 define the searchBuildDate function; this function performs an XPath query against the loaded document to find the last build date of the document. This information is provided in a single tag called lastBuildDate, so the XPath to get the information is //lastBuildDate. The XPath query happens on line 30 when we call selectSingleNode. The value of the resulting node is then displayed in the target element. Because the lastBuildDate node is from an XML document, we can't just use the innerHTML property. Instead, we access the text node inside the returned element and get its value (line 32).

Lines 35-46 define the searchItems function; this function performs an XPath query that selects all the title nodes that are inside item nodes in the document and then outputs their values in the target element. The XPath query takes place on line 36; it returns a node collection to the list variable. On line 39, we use the collection's length property to output the number of items in the loaded RSS document. Lines 42-45 loop over the returned nodes, outputting the value of each node to the target; this lists the title of each item in the RSS feed.

Lines 50-55 create the document's basic user interface. Links are provided to run each JavaScript function, with the search links accessible only after the RSS document is loaded. This delay is accomplished by putting them inside a DIV that is hidden until the document's onreadystatechange callback shows it on lines 20-21 of Listing 8-5.
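
The loop in searchItems can also be written so that it collects the values first and touches the page only once. The following is a sketch, assuming a document that exposes selectNodes as described above; getNodeTexts is a hypothetical helper name.

```javascript
// Hypothetical helper: run an XPath query with selectNodes and return the
// text of each matching element as an array of strings.
function getNodeTexts(domDoc, xpath) {
  var nodes = domDoc.selectNodes(xpath);
  var texts = [];
  for (var i = 0; i < nodes.length; i++) {
    // An empty element such as <title/> has no text node child.
    var textNode = nodes[i].firstChild;
    texts.push(textNode ? textNode.nodeValue : '');
  }
  return texts;
}

// Possible use in place of lines 36-45 of Listing 8-6:
//   var titles = getNodeTexts(domDoc, '//item/title');
//   document.getElementById('target').innerHTML =
//       "Number of Items = " + titles.length +
//       "<br>Titles:<br>" + titles.join("<br>");
```

Building the string first avoids re-parsing the target's contents on every pass through the loop, and the firstChild check keeps an empty title from causing an error.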

8.3.6. Transforming XML with XSLT

XSLT is a powerful XML-based template language. XPaths are used inside the template, which allows you to easily apply multiple subtemplates to different parts of an XML document. Describing how to create an XSLT template could take a book as long as this one, so we focus only on the API that Sarissa provides to transform documents. The API is easy to use; you create a new XSLTProcessor, load a stylesheet that contains the transformation rules, and then transform the document using the processor's transformToDocument method. You'll usually want to import the resulting document into the main HTML document using the HTML document's importNode method so that you can add it to the DOM and display the results. A short example is shown in Listing 8-8. The data is the same RSS feed of the Sarissa news used earlier; the stylesheet that transforms it is shown in Listing 8-7.

Listing 8-7. `transform.xsl`

     1  <?xml version="1.0"?>
     2  <xsl:stylesheet version="1.0"
     3      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     4
     5  <xsl:output method="html" />
     6  <xsl:template match="/rss">
     7      <div>
     8      <xsl:for-each select="//item">
     9          <h2><xsl:value-of select="title"/></h2>
    10      </xsl:for-each>
    11      </div>
    12  </xsl:template>
    13  </xsl:stylesheet>

This is a really basic stylesheet with a single template that matches the root rss element in the document (lines 6-12). Inside this template, we output a DIV container so that we have an HTML element encasing the rest of the output, which will make it easy to add to the main document. Lines 8-10 loop over the results from an XPath query. The query //item selects each item node in the document. The code then displays the value of the title of each item inside an h2 tag (line 9). The rest of the file is basic XSLT boilerplate. This XSLT stylesheet is used by an HTML and JavaScript page to transform an XML document; this page is shown in Listing 8-8.

Listing 8-8. `SarissaTransformWithXSLT.html`

     1  <html>
     2  <head>
     3  <title>Sarissa: Transforming Documents with XSLT</title>
     4
     5  <script type="text/javascript" src="sarissa/sarissa.js">
     6  </script>
     7  <script type="text/javascript"
     8      src="sarissa/sarissa_ieemu_xslt.js"></script>
     9
    10  <script type="text/javascript">
    11  var domDoc = Sarissa.getDomDocument();
    12  var styleSheet = Sarissa.getDomDocument();
    13  styleSheet.load("transform.xsl");
    14  var processor = new XSLTProcessor();
    15
    16  function loadDoc() {
    17      var rHandler = function() {
    18          if (domDoc.readyState == 4) {
    19
    20              document.getElementById('target').innerHTML =
    21                  "Document Loaded, ready to transform";
    22
    23              document.getElementById('afterLoad'
    24                  ).style.display = 'block';
    25          }
    26      }
    27
    28      domDoc.onreadystatechange = rHandler;
    29      domDoc.load("sarissaNews.xml");
    30  }
    31
    32  function transform() {
    33      processor.importStylesheet(styleSheet);
    34      var output = processor.transformToDocument(domDoc);
    35
    36      var target = document.getElementById('target');
    37      target.appendChild(document.importNode(
    38          output.firstChild,true));
    39  }
    40  </script>
    41  </head>
    42  <body>
    43  <a href="javascript:loadDoc()">Load news.xml</a>
    44  <div id="afterLoad" style="display: none">
    45  <a href="javascript:transform()">Display Items</a>
    46  </div>
    47  <div id="target"></div>
    48  </body>
    49  </html>

Listing 8-8 takes the sarissaNews.xml file, transforms it with the transform.xsl XSLT stylesheet, and then adds its results to the main document's DOM. The Sarissa library is included on lines 5-8. Notice that we're including the cross-browser XSLT support file as well as the main library file. On lines 11-14, we set up the objects we will use in the rest of the transformation process. On line 11, we set up an empty DomDocument into which we will load our RSS feed; then, on line 12, we create a similar object into which to load the stylesheet. On line 13, we load transform.xsl into the styleSheet document; you could also use the string parser to load transform.xsl. This would be accomplished by loading the contents of transform.xsl into a JavaScript variable and then creating the DomDocument using the DOMParser. Doing this would let you reduce the number of HTTP requests needed to load the document, which is helpful from a performance standpoint as long as the stylesheet is small. Finishing the basic setup, we create a new XSLTProcessor on line 14.

Lines 16-30 define the loadDoc function, which loads sarissaNews.xml so that it can later be transformed. This works the same as the earlier examples; we're just adding a few more actions to perform after the document is loaded. On lines 20-21, we output a message saying the document is loaded, giving the user feedback that something has happened. Then, on lines 23-24, we show a DIV in the main HTML document. This DIV contains the link that does the actual transformation; by keeping it hidden until the document is loaded, we prevent the user from running the transformation before there is anything to transform. The rest of the function contains the simple Sarissa document loading process; on line 28, we register the callback function, and on line 29, we load the sarissaNews.xml document.

Lines 32-39 define a JavaScript function that does the transformation. This is a three-part process. On line 33, we import the stylesheet we previously set up, and then on line 34, we transform the document, assigning the result to a variable. We finish the process on lines 36-38, selecting an output element and then appending the output to it after importing it into the HTML document. When importing the nodes, passing a Boolean value of true as the second parameter to importNode makes the method perform a deep import. A deep import imports the element passed in and all its children; without this flag, only the top-level element is imported.

The rest of the document is the basic HTML user interface. A link is provided on line 43 to load the sarissaNews.xml document, with the transform link enclosed in a hidden DIV so that it will be available only after the news document is loaded (lines 44-46). We finish up with a target DIV on line 47 that we use for giving messages to the user and for showing the transformed document.
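
Because the same ready state pattern appears in Listings 8-1, 8-5, and 8-8, it can be pulled into one function. The sketch below is one way to do that; loadXml is a hypothetical name, and the function relies only on the onreadystatechange, readyState, and load members used in those listings.

```javascript
// Hypothetical helper: load `url` into `domDoc` and call `onLoaded(domDoc)`
// once the document reports ready state 4 (fully loaded).
function loadXml(domDoc, url, onLoaded) {
  domDoc.onreadystatechange = function () {
    // The handler fires on every state change; only state 4 is of interest.
    if (domDoc.readyState == 4) {
      onLoaded(domDoc);
    }
  };
  // Register the handler before starting the load so no change is missed.
  domDoc.load(url);
}

// Possible use in place of lines 16-30 of Listing 8-8:
//   loadXml(domDoc, "sarissaNews.xml", function () {
//       document.getElementById('target').innerHTML =
//           "Document Loaded, ready to transform";
//       document.getElementById('afterLoad').style.display = 'block';
//   });
```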
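
As written, the transform function appends its output to whatever the target already contains, so the status message stays in place and a second click adds the items again. If that is not what you want, the target can be emptied first. The following sketch shows one way to do so; transformInto is a hypothetical name, and the processor, stylesheet, and document arguments are assumed to be the objects created in Listing 8-8.

```javascript
// Hypothetical helper: transform `sourceDoc` with `styleSheet` and make the
// result the only content of `target`, an element belonging to `htmlDoc`.
function transformInto(processor, styleSheet, sourceDoc, htmlDoc, target) {
  processor.importStylesheet(styleSheet);
  var output = processor.transformToDocument(sourceDoc);

  // Remove earlier content so repeated calls do not pile up output.
  while (target.firstChild) {
    target.removeChild(target.firstChild);
  }

  // true requests a deep import: the element and all of its children.
  target.appendChild(htmlDoc.importNode(output.firstChild, true));
}

// Possible use in place of lines 33-38 of Listing 8-8:
//   transformInto(processor, styleSheet, domDoc, document,
//       document.getElementById('target'));
```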

8.3.7. Sarissa Development Tips

Sarissa is a highly focused library that provides an easy-to-use, cross-browser API to the major browsers' XML functionality. If you're looking to use XML technologies such as XSLT or XPath, then Sarissa is a perfect solution for you. While using Sarissa, keep in mind these tips:

Be sure to include the sarissa_ieemu_xpath.js or sarissa_ieemu_xslt.js files if you're working with XPath or XSLT. Without them, your scripts will work only in Internet Explorer.
Use the XML string-loading capabilities to cut down on the number of individual XML files that you need to load.
Run the test cases in testsarissa.html to make sure your browser is supported if you're on a less commonly used browser.
Mix Sarissa with other libraries if Sarissa meets only some of your needs; Sarissa is focused on XML.
XPath is extremely effective at searching XML documents; try using it before creating custom solutions to search XML.
If you have a question about what method to use, check out the project's Web site; it contains complete API documentation.
