
XML Schema Definition

Before looking at the code involved in working with XSD, you need to spend some time learning about XSD itself. This section of the chapter will show you the basics of XSD, and the basic syntax for defining the elements, attributes, and data types that exist within structured XML documents.

Introduction to XSD

As mentioned earlier, XSD itself is written in XML. You can think of it as a standardized XML format for describing other XML documents. When defining an XML schema, you always start with the <schema> element. Before getting into the details of the syntax and structure of an XSD document, look at a few samples of XSD documents to get a frame of reference for the rest of this section. Listing 29.1 shows a sample XSD file.

Listing 29.1. A Sample XSD File
<?xml version="1.0" encoding="IBM437"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation>
    <xs:documentation xml:lang="en">
      This is an example schema that dictates the format of an XML
      document that contains information on books and their authors.
    </xs:documentation>
  </xs:annotation>

  <xs:complexType name="AuthorType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="Age" type="xs:positiveInteger" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ChapterType">
    <xs:sequence>
      <xs:element name="Title" type="xs:string" />
      <xs:element name="Pages" type="xs:positiveInteger" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ChaptersType">
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" name="Chapter"
      type="ChapterType" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="LibraryType">
    <xs:sequence>
      <xs:element name="Book" type="BookType" />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="BookType">
    <xs:sequence>
      <xs:element name="Author" type="AuthorType" />
      <xs:element name="Chapters" type="ChaptersType" />
    </xs:sequence>
    <xs:attribute name="title" type="xs:string" />
    <xs:attribute name="publisher" type="xs:string" />
  </xs:complexType>

  <xs:element name="Library" type="LibraryType" />
</xs:schema>


Each piece of this schema will be discussed as you progress through this section of the chapter, but you need to see a working XSD file before moving on to the details to give you some context. An XML file that conforms to the preceding XSD might appear as follows:

  <Book title="My Summer Vacation" publisher="Small Shop Press">
      <Name>Kevin Hoffman</Name>
        <Title>Chapter Two</Title>
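
Written out in full, an instance document for the schema in Listing 29.1 might look like the following sketch. The Age and Pages values, and the title of the first chapter, are assumptions chosen purely for illustration.

```xml
<Library>
  <Book title="My Summer Vacation" publisher="Small Shop Press">
    <Author>
      <Name>Kevin Hoffman</Name>
      <Age>30</Age>                  <!-- assumed value -->
    </Author>
    <Chapters>
      <Chapter>
        <Title>Chapter One</Title>   <!-- assumed value -->
        <Pages>12</Pages>            <!-- assumed value -->
      </Chapter>
      <Chapter>
        <Title>Chapter Two</Title>
        <Pages>15</Pages>            <!-- assumed value -->
      </Chapter>
    </Chapters>
  </Book>
</Library>
```

Because ChaptersType declares Chapter with minOccurs="0" and maxOccurs="unbounded", the Chapters element may hold any number of Chapter children, including none.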

Primitive Data Types in XSD

As you saw in the XSD document in Listing 29.1, you are defining the position, location, name, and data type of elements and attributes within a document. When you define elements and attributes, you also define the data type. Data types can be simple primitive types, such as integers or strings, or they can be complex types or user-defined types. Table 29.1 shows a list of the most commonly used primitive types. For a complete list of all types, you should consult a book or reference guide dedicated to XML schemas. When indicating a data type in an XSD document, prefix the name of the data type with the xs: XML schema namespace prefix.

Table 29.1. Commonly Used XSD Primitive Types

Type            Description

anyURI          Represents a uniform resource identifier (URI) reference, as defined by RFC 2396.

boolean         Indicates a Boolean. Values are written as true or false.

dateTime        Represents a fixed timestamp including both date and time.

decimal         Represents decimal numbers of arbitrary precision.

double          Indicates a double-precision floating-point number (64-bit).

duration        Indicates a length of time. The pattern for storing lengths of time is PnYnMnDTnHnMnS, where each n is a numeric value and the capital letters are fixed pieces of the pattern. For example, P1Y2M3DT1H3M1S is equivalent to the duration 1 year, 2 months, 3 days, 1 hour, 3 minutes, and 1 second.

date            Indicates a date. This value cannot contain a time portion, only a valid date.

float           A floating-point number in single precision (32-bit).

string          Indicates a variable-length character string.

time            Contains a time of day with no date portion.

hexBinary       Indicates a string of binary data encoded in hexadecimal notation.

base64Binary    Indicates binary data encoded in base-64.
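
As a rough illustration of how the primitive types are referenced, the sketch below declares a hypothetical LoanType (the type and its element names are invented for this example and do not appear in Listing 29.1). Each element names a built-in type with the xs: prefix.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- LoanType and its children are hypothetical names used only here -->
  <xs:complexType name="LoanType">
    <xs:sequence>
      <xs:element name="Homepage" type="xs:anyURI" />
      <xs:element name="Returned" type="xs:boolean" />
      <xs:element name="CheckedOut" type="xs:dateTime" />
      <xs:element name="LateFee" type="xs:decimal" />
      <xs:element name="LoanPeriod" type="xs:duration" />
      <xs:element name="DueDate" type="xs:date" />
    </xs:sequence>
  </xs:complexType>
  <xs:element name="Loan" type="LoanType" />
</xs:schema>
```

In a conforming instance document, CheckedOut would hold a value such as 2004-03-15T09:30:00, LoanPeriod a value such as P14D (14 days), and DueDate a value such as 2004-03-29.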

Derived Data Types

Also available to you is a set of more specialized data types that are part of the XSD language definition. If you are building an XSD, you might save yourself a lot of time and effort if you take a look at the list of derived types to see whether the XSD language definition hasn't already taken care of a particular data scenario for which you are preparing. If it makes things clearer, you can think of derived types as XML data types that inherit from the primitive XML data types listed in Table 29.1, and restrict, constrain, or limit those types in some way. Table 29.2 lists some of the more common derived data types.

Table 29.2. Common Derived Data Types in XSD

Type                  Description

language              Contains a string that represents a standard language identifier. Language identifiers are enumerated in RFC 1766.

Name                  Indicates a valid XML name. Starts with a letter, underscore, or colon. Derived from the token type.

token                 Derived from normalizedString. In addition, runs of spaces are reduced to a single space, and leading and trailing spaces are trimmed.

normalizedString      Indicates a whitespace-normalized string: every tab, carriage return, and line feed is replaced by a space. Derived from string.

integer               A whole number that can be preceded by a + or - sign; derived from decimal.

nonPositiveInteger    Any integer that is either 0 or negative. Can be preceded by a negative (-) sign.

positiveInteger       An integer that is greater than 0.

unsignedLong          An unsigned long. The space normally used for storing values below zero is used for positive numbers, allowing a maximum value of 18,446,744,073,709,551,615.

unsignedInt           Like unsignedLong, but 32-bit. Maximum value is 4,294,967,295.

unsignedShort         Unsigned 16-bit value, maximum 65,535.

unsignedByte          Unsigned 8-bit value, maximum 255.

Remember that in any place that you need to define the data type of an element or attribute, you can use any of the data types in Table 29.2 without having to include extra files because they are part of the XSD language specification.
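
The derived types are referenced exactly like the primitive ones. The following sketch uses a hypothetical EditionType (the type and element names are assumptions made for illustration) to show a few of them in place.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- EditionType and its children are hypothetical names used only here -->
  <xs:complexType name="EditionType">
    <xs:sequence>
      <xs:element name="Language" type="xs:language" />        <!-- e.g. en-US -->
      <xs:element name="Code" type="xs:token" />
      <xs:element name="PrintRun" type="xs:unsignedInt" />
      <xs:element name="EditionNumber" type="xs:unsignedByte" />
      <xs:element name="Adjustment" type="xs:nonPositiveInteger" />
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

With these declarations, an EditionNumber of 256 or an Adjustment of 5 would fail validation, because each value falls outside the range that its derived type allows.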

Complex Data Types

Primitive data types and derived data types belong to a single category of data types called simple types. A simple type describes plain text content only: the value of an attribute, or of an element that has no attributes and no child elements.

A complex type defines the set of attributes and child content of an element. It is called complex because the element holds more than a single simple value: its children can be other elements, each of which is either simple or complex. You saw several complex types in the sample in Listing 29.1. A complex type is declared in an XSD document with the <complexType> element. For example:

<xs:complexType name="AuthorType">
  <xs:sequence>
    <xs:element name="Name" type="xs:string" />
    <xs:element name="Age" type="xs:positiveInteger" />
  </xs:sequence>
</xs:complexType>

It should be fairly evident from the preceding sample that the complex type called AuthorType defines a sequence of child elements. These elements are called Name and Age. Any time an <xs:element> is used in a schema with the type AuthorType, the corresponding element in the instance document must contain both a Name and an Age child element. The term instance document always refers to the XML document containing the data that conforms to the structure defined by the XSD document. The XSD document is often referred to as the schema document.
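
Because AuthorType is a named type, more than one element declaration can refer to it. The sketch below assumes a hypothetical CreditsType with a hypothetical Editor element alongside Author; neither appears in Listing 29.1.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="AuthorType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string" />
      <xs:element name="Age" type="xs:positiveInteger" />
    </xs:sequence>
  </xs:complexType>

  <!-- CreditsType and Editor are hypothetical; both children reuse AuthorType -->
  <xs:complexType name="CreditsType">
    <xs:sequence>
      <xs:element name="Author" type="AuthorType" />
      <xs:element name="Editor" type="AuthorType" />
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```

In an instance document, both the Author element and the Editor element would then have to contain a Name child followed by an Age child.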

Grouping Elements

In the previous section, you saw how to define a complex type. A complex type is a nesting of other complex or simple types. The use of complex types enables you to control very intricate structures, but also enables you to keep your schema document clear and easy to read.

You can define the grouping structure of the child elements of a complex type by using one of four group control elements: <group>, <all>, <sequence>, and <choice>.

  • group: This element defines a named grouping of elements that a complex type can contain. You can define the properties of a grouping, such as the min and max occurrence values, as well as the name. You place the simple or complex type definitions that you want to appear as children beneath the group element.

  • choice: This element indicates that one and only one of the elements defined as a child can appear in the instance document.

  • all: This element indicates that the child elements beneath it can appear in the containing element in any order. Each child can appear at most once, and a child can be left out of the instance document only if it is declared with minOccurs="0".

  • sequence: This element indicates that all child elements beneath this element must appear in the order listed in the schema document. If the document order in the instance document does not match the order defined in the schema document, document validation will fail.
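
The four grouping elements can be sketched together as follows. All of the type, group, and element names here (ContactGroup, PublisherType, AddressType, and their children) are hypothetical and serve only to illustrate the syntax.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- group: a named, reusable set of elements -->
  <xs:group name="ContactGroup">
    <xs:sequence>
      <xs:element name="Email" type="xs:string" />
      <xs:element name="Phone" type="xs:string" />
    </xs:sequence>
  </xs:group>

  <xs:complexType name="PublisherType">
    <!-- sequence: children must appear in exactly this order -->
    <xs:sequence>
      <xs:element name="Name" type="xs:string" />
      <!-- the whole group is optional here -->
      <xs:group ref="ContactGroup" minOccurs="0" />
      <!-- choice: exactly one of Imprint or Division -->
      <xs:choice>
        <xs:element name="Imprint" type="xs:string" />
        <xs:element name="Division" type="xs:string" />
      </xs:choice>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="AddressType">
    <!-- all: children in any order; only PostalCode may be omitted -->
    <xs:all>
      <xs:element name="City" type="xs:string" />
      <xs:element name="Country" type="xs:string" />
      <xs:element name="PostalCode" type="xs:string" minOccurs="0" />
    </xs:all>
  </xs:complexType>
</xs:schema>
```

An element of type AddressType may list Country before City, whereas an element of type PublisherType that places Imprint before Name fails validation.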

Annotating XML Schemas

Even if you know every attribute and element defined by the XSD language definition, reading someone else's XSD file can often be a daunting task. You might not be able to infer the intent of the schema author from the schema itself. In such a case, the schema author can provide annotation, which enables them to include documentation in the schema document to complement external documentation such as MS Word or PDF documents.

An annotation element (<xs:annotation>) occurs as a child element of the element it is documenting. For example, the following section of XSD provides an annotation for a complex type:

<xs:complexType name="MyComplexType">
  <xs:annotation><xs:documentation>
      The MyComplexType allows you to do something. It is a very
      valuable element, etc, etc, etc.
  </xs:documentation></xs:annotation>
</xs:complexType>

Even if your code is the only code using the schema and humans might never see it, you can still use annotation to record information about the application for which the schema was created. The <xs:appinfo> element (a child of <xs:annotation>) can be used to store application-specific information that can be read programmatically from the schema file.
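
A minimal sketch of <xs:appinfo> follows. The tableName element inside it is a hypothetical, application-specific hint invented for this example. A schema processor does not interpret it, so it is up to your own code to read it and act on it.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="BookType">
    <xs:annotation>
      <!-- documentation is meant for human readers -->
      <xs:documentation xml:lang="en">Describes a single book.</xs:documentation>
      <!-- appinfo is meant for programs; tableName is a hypothetical hint -->
      <xs:appinfo>
        <tableName>Books</tableName>
      </xs:appinfo>
    </xs:annotation>
    <xs:attribute name="title" type="xs:string" />
  </xs:complexType>
</xs:schema>
```

Keeping both children under the same <xs:annotation> element lets the human-readable and machine-readable notes for a type sit side by side.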


There are tools to make creating and annotating schemas much easier, such as XML Spy, a tool capable of producing graphical, hierarchical XML rendering as well as creating Microsoft Word documents containing schema annotations.

XML Schema Facets

Any simple type (either primitive or derived, as shown in the earlier tables) can have a facet. According to Webster's Dictionary, a facet is a definable aspect that makes up a subject or an object. In the case of XSD, facets are definable aspects of data types that further refine the definition of that data type.

Just like annotations, a facet describes or constrains the element in which it is defined (its parent element). Facets are described in XSD using elements. For example:

<xs:simpleType name="AgeOfWoman">
  <xs:restriction base="xs:integer">
    <xs:maxExclusive value="29" />
  </xs:restriction>
</xs:simpleType>

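The preceding facet restricts all elements defined as the type AgeOfWoman so that the value must be strictly less than 29. Because maxExclusive excludes the boundary value itself, the largest valid age is 28. To allow 29 as well, use the maxInclusive facet instead.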
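
Other facets follow the same pattern. The sketch below shows three more of them: pattern, enumeration, and maxInclusive. The type names and the chosen limits are hypothetical, picked only to illustrate the syntax.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- pattern: nine digits followed by a digit or X (hypothetical format) -->
  <xs:simpleType name="IsbnType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{9}[0-9X]" />
    </xs:restriction>
  </xs:simpleType>

  <!-- enumeration: the value must be one of the listed strings -->
  <xs:simpleType name="BindingType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="hardcover" />
      <xs:enumeration value="paperback" />
    </xs:restriction>
  </xs:simpleType>

  <!-- maxInclusive: 2000 itself is still a valid value -->
  <xs:simpleType name="PageCountType">
    <xs:restriction base="xs:positiveInteger">
      <xs:maxInclusive value="2000" />
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

Each restriction starts from a base type and narrows it, so a PageCountType value must satisfy both the positiveInteger rules and the upper bound.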

Programming XML Schemas: The XmlSchema Class

The XmlSchema class is provided with the .NET Framework to give you an object-oriented view of the structure and construction of an XML schema definition file. The code in Listing 29.2 illustrates how to create the XSD file shown earlier in the chapter in Listing 29.1. The code might seem overly complex, but when you compare the output file and the code, you'll see the relationship between each class and each element in the XSD file, and the code will become clear and useful.

Listing 29.2. The Code to Create the XSD File in Listing 29.1
using System;
using System.Xml;
using System.Xml.Schema;

namespace XmlSchemaBuilder
{
  class Class1
  {
    /// <summary>
    /// The main entry point for the application.
    /// </summary>
    static void Main(string[] args)
    {
      // Map the xs prefix to the XML Schema namespace for the output.
      XmlNamespaceManager nsMan = new XmlNamespaceManager( new NameTable() );
      nsMan.AddNamespace("xs", "http://www.w3.org/2001/XMLSchema");
      XmlSchema schema = new XmlSchema();

      // AuthorType: a sequence of Name and Age.
      XmlSchemaComplexType ctAuthorType = new XmlSchemaComplexType();
      ctAuthorType.Name = "AuthorType";
      XmlSchemaSequence seqAuthor = new XmlSchemaSequence();
      XmlSchemaElement elemName = new XmlSchemaElement();
      elemName.Name = "Name";
      elemName.SchemaTypeName = new XmlQualifiedName("string",
        "http://www.w3.org/2001/XMLSchema");
      seqAuthor.Items.Add( elemName );
      XmlSchemaElement elemAge = new XmlSchemaElement();
      elemAge.Name = "Age";
      elemAge.SchemaTypeName = new XmlQualifiedName("positiveInteger",
        "http://www.w3.org/2001/XMLSchema");
      seqAuthor.Items.Add( elemAge );
      ctAuthorType.Particle = seqAuthor;
      schema.Items.Add( ctAuthorType );

      // ChapterType: a sequence of Title and Pages.
      XmlSchemaComplexType ctChaptersNested = new XmlSchemaComplexType();
      ctChaptersNested.Name = "ChapterType";
      schema.Items.Add( ctChaptersNested );
      XmlSchemaSequence seqChapter = new XmlSchemaSequence();
      XmlSchemaElement elemTitle = new XmlSchemaElement();
      elemTitle.Name = "Title";
      elemTitle.SchemaTypeName = new XmlQualifiedName("string",
        "http://www.w3.org/2001/XMLSchema");
      seqChapter.Items.Add( elemTitle );

      XmlSchemaElement elemPages = new XmlSchemaElement();
      elemPages.Name = "Pages";
      elemPages.SchemaTypeName = new XmlQualifiedName("positiveInteger",
        "http://www.w3.org/2001/XMLSchema");
      seqChapter.Items.Add( elemPages );
      ctChaptersNested.Particle = seqChapter;

      // ChaptersType: zero or more Chapter elements.
      XmlSchemaComplexType ctChapters = new XmlSchemaComplexType();
      ctChapters.Name = "ChaptersType";
      XmlSchemaSequence seqChapters = new XmlSchemaSequence();

      XmlSchemaElement elemChapter = new XmlSchemaElement();
      elemChapter.Name = "Chapter";
      elemChapter.SchemaTypeName = new XmlQualifiedName("ChapterType");
      elemChapter.MinOccursString = "0";
      elemChapter.MaxOccursString = "unbounded";
      seqChapters.Items.Add( elemChapter );

      ctChapters.Particle = seqChapters;
      schema.Items.Add( ctChapters );

      // LibraryType: contains a single Book.
      XmlSchemaComplexType ctLibrary = new XmlSchemaComplexType();
      ctLibrary.Name = "LibraryType";
      XmlSchemaSequence seqLibrary = new XmlSchemaSequence();
      ctLibrary.Particle = seqLibrary;
      XmlSchemaElement elemBook = new XmlSchemaElement();
      elemBook.Name = "Book";
      elemBook.SchemaTypeName = new XmlQualifiedName("BookType");
      seqLibrary.Items.Add( elemBook );
      schema.Items.Add( ctLibrary );

      // BookType: Author and Chapters, plus title and publisher attributes.
      XmlSchemaComplexType ctBook = new XmlSchemaComplexType();
      ctBook.Name = "BookType";
      XmlSchemaSequence seqBook = new XmlSchemaSequence();
      ctBook.Particle = seqBook;

      XmlSchemaElement elemAuthor = new XmlSchemaElement();
      elemAuthor.Name = "Author";
      elemAuthor.SchemaTypeName = new XmlQualifiedName("AuthorType");
      seqBook.Items.Add( elemAuthor );
      XmlSchemaElement elemChapters = new XmlSchemaElement();
      elemChapters.Name = "Chapters";
      elemChapters.SchemaTypeName = new XmlQualifiedName("ChaptersType");
      seqBook.Items.Add( elemChapters );

      XmlSchemaAttribute attribTitle = new XmlSchemaAttribute();
      attribTitle.Name = "title";
      attribTitle.SchemaTypeName = new XmlQualifiedName("string",
        "http://www.w3.org/2001/XMLSchema");
      ctBook.Attributes.Add( attribTitle );
      XmlSchemaAttribute attribPublisher = new XmlSchemaAttribute();
      attribPublisher.Name = "publisher";
      attribPublisher.SchemaTypeName = new XmlQualifiedName("string",
        "http://www.w3.org/2001/XMLSchema");
      ctBook.Attributes.Add( attribPublisher );
      schema.Items.Add( ctBook );

      // The root Library element.
      XmlSchemaElement elemLibrary = new XmlSchemaElement();
      elemLibrary.Name = "Library";
      elemLibrary.SchemaTypeName = new XmlQualifiedName("LibraryType");
      schema.Items.Add( elemLibrary );

      // Compile checks the schema and reports problems to the callback.
      // Later framework versions mark Compile obsolete in favor of XmlSchemaSet.
      schema.Compile( new ValidationEventHandler( ValidationCallback ));

      schema.Write( Console.Out, nsMan );
    }

    public static void ValidationCallback(object sender, ValidationEventArgs e )
    {
      // Assumed minimal body: report each problem and keep going.
      Console.WriteLine( e.Message );
    }
  }
}

Figure 29.1 shows the console output of the preceding code.

Figure 29.1. Console output of the XSD builder console application.

There's a lot of code to take in from the preceding sample, but if you take a little time to read it, it should make perfect sense. There is a class in the System.Xml.Schema namespace for every single type of structure that you can place in an XSD document. If you know what you want your XSD to look like, you know what classes you want to instantiate and insert into the document.

The XmlSchema class forces you to think about everything the way an XSD parser would think about it. For example, even though you can create elements in an XSD that don't appear to have a namespace qualification, everything you do with the XmlSchema class has to be namespace qualified. The output document will remove redundant or unnecessary declarations, which is something XSD authors do without thinking about what they're doing.

Another handy feature of the XmlSchema class is that you can specify a callback method. This method will be invoked whenever there is a schema validation failure. Instead of throwing an exception and stopping the validation process, it makes a callback for every validation issue so that you can decide for yourself which validation problems are worth halting program execution for.
