The Building Blocks of Web Services

In addition to representing functions that serve information of any kind, Web services can also be defined as a standard platform for building interoperable, distributed applications.

As a developer, you have probably built component-based distributed applications using COM, DCOM, or CORBA. Although COM is an excellent component technology, in certain scenarios it does not work very well.

The Web services platform is a set of standards that applications follow to achieve interoperability, typically over HTTP. You can write your Web services in whatever language and on whatever platform you like, as long as they can be viewed and accessed according to the Web services standards set by standards bodies such as the W3C. For Web services implemented on different operating systems to communicate with each other, they must agree on a common standard. A platform must have a data representation format and a type system. To enable interoperability, Web services must agree on a standard type system that bridges the differences between the type systems of different platforms. Web services must also be able to find other Web services that have been developed. So how can you achieve this? By using XML, XSD, SOAP, WSDL, and UDDI. If all these protocols make you feel panicky, do not fear! In the next sections we will briefly go through the technologies that make Web services so powerful.

XML

Extensible Markup Language (XML) is the basic format for representing data in Web services. XML was developed from SGML (Standard Generalized Markup Language), an older markup language with roots in the sixties. Its strength is its simplicity when it comes to creating and parsing a document, and XML was chosen for Web services because it is neither platform nor vendor specific.

Being neutral is more important than being technically superior. Software vendors are much more likely to adopt a neutral technology, rather than one invented by a competitor. A noncomplex language will probably also result in noncomplex problems.

XML provides a simple way of representing data by using tags, but says nothing about the standard set of data types available or how to extend that set. To enable interoperability, it is important to be able to describe, for example, what exactly an integer is. Is it 16, 32, or 64 bits? The World Wide Web Consortium (W3C) XML Schema Definition Language (XSD), which we will take a look at in the next section, is a language that defines data types.

Here is a simple example of an XML document taken from the weather report example shown earlier in this chapter:

<WeatherReports>
    <WeatherReport>
        <Degrees Scale="C">
            23
        </Degrees>
        <Humidity>
            67
        </Humidity>
        <PrecipitationPossibility Type="Rain">
            45
        </PrecipitationPossibility>
    </WeatherReport>
</WeatherReports>

This XML document contains something called tags, or elements. Each item appearing in angle brackets (such as Humidity) is a tag. There are many rules an XML document should follow. One of the most important is that each start tag needs to have an end tag. In the preceding example, WeatherReport is a start tag, and its corresponding end tag, which starts with a slash, is /WeatherReport. The XML document also needs to have a root element into which every child element is placed. The root element in the preceding XML document is WeatherReports.

Each element can either be a complex type or a simple type. The Humidity element is a simple type—it contains only a value. When an element contains other elements or has attributes, it is said to be a complex type. The WeatherReport element is a complex type, because it has other element tags beneath it.

These "subelements" are considered child elements to the WeatherReport element. Child elements to WeatherReport are Degrees, Humidity, and PrecipitationPossibility.
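To make the terms root element, child element, and attribute concrete, here is a minimal sketch that reads the weather report with Python's standard xml.etree.ElementTree module. The describe function and the WEATHER_XML constant are our own names, introduced only for illustration, and the values are written on one line per element to keep the listing short.

```python
import xml.etree.ElementTree as ET

WEATHER_XML = """\
<WeatherReports>
    <WeatherReport>
        <Degrees Scale="C">23</Degrees>
        <Humidity>67</Humidity>
        <PrecipitationPossibility Type="Rain">45</PrecipitationPossibility>
    </WeatherReport>
</WeatherReports>
"""

def describe(xml_text):
    """Return the root tag and, for each report, its child elements."""
    root = ET.fromstring(xml_text)
    reports = []
    for report in root.findall("WeatherReport"):
        # Each child becomes (tag name, attributes, text with whitespace trimmed).
        reports.append(
            [(child.tag, dict(child.attrib), (child.text or "").strip())
             for child in report]
        )
    return root.tag, reports

root_tag, reports = describe(WEATHER_XML)
print(root_tag)
for tag, attributes, value in reports[0]:
    print(tag, attributes, value)
```

Iterating over an element yields its direct children in document order, which is why Degrees, Humidity, and PrecipitationPossibility come back in the same order as in the document.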

An XML document can have two different states: well formed and valid. The XML document is well formed when it follows XML standards regarding start and end tags and so on. An XML document is said to be valid when it is well formed and conforms to the rules in the XSD document to which the XML document is linked.
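The well-formedness half of this distinction is easy to check with any XML parser. The following sketch assumes Python's standard xml.etree.ElementTree module; is_well_formed and the three sample strings are hypothetical names of our own. Note that it says nothing about validity: checking a document against an XSD requires a schema-aware validator, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Every start tag has a matching end tag and there is a single root.
GOOD = "<WeatherReport><Humidity>67</Humidity></WeatherReport>"
# The Humidity start tag is never closed.
MISSING_END_TAG = "<WeatherReport><Humidity>67</WeatherReport>"
# Two top-level elements, so there is no single root element.
TWO_ROOTS = "<Humidity>67</Humidity><Humidity>70</Humidity>"

for sample in (GOOD, MISSING_END_TAG, TWO_ROOTS):
    print(is_well_formed(sample))
```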

XSD

Without the capability to define types, XML would be less useful in a cross-platform world. The World Wide Web Consortium finalized a standard for an XML-based type system known as XML Schema in May 2001. The language used to define a schema is an XML grammar known as XML Schema Definition Language. Since Web services use XML as the underlying format for representing messages and data, XML was a natural choice for expressing the type definitions in XSD as well.

The W3C XML Schema standard consists logically of two things:

A set of predefined or built-in types such as string, date, and integer
An XML language for defining new types and for describing the structure of a class of XML documents such as an invoice or a purchase order

To help you understand the role of XSD in the world of XML, consider this analogy to your favorite programming language: The programming language you use to define application classes (e.g., VB 6 or VB .NET) is an analogue to XSD. When you define a class named COrder, this would correspond to a type definition for a COrder type in XSD. Finally, when you instantiate an object from the COrder class, this object instance is analogous to an XML document that represents an order—that is, an XML instance document. The instance follows the definitions (rules) of the class (it has the properties and methods defined in the class).

Another way to view the relationship between the XML document and the XSD document is by looking at the use of XSD together with the XML document, which exists in a Simple Object Access Protocol (SOAP) message sent to or from a Web service.

When an XML document conforms to an XSD and has been verified against that XSD document, the XML document is said to be valid. (It has been validated with respect to its XSD document and has not broken any of the rules/types defined in the XSD document.) The XSD document is used to check that the information in the XML document occurs in the right order and has the right data types. The XSD document can also be used to verify that certain elements are present, and that they stay within a minimum and maximum limit in the XML document.

When you build a Web service, the data types you use must be translated to XSD types so that they conform to the Web services standards. The developer tools you use may automate this, but you likely have to tweak the result a bit to meet your needs. Therefore, we will give a brief tour here of the type system of XSD.

The XSD types are either of a complex or simple nature. The simple ones are scalar values, or nonstructured ones. Examples are integers, strings, and dates. Complex types, on the other hand, contain a structure. An example of a complex type is the aforementioned XML element WeatherReport because it contains child elements and types. An easy rule to remember for determining whether a type is a simple one or a complex one is that an element with attributes and/or child elements is a complex type. Here is an example to clarify the difference:

<examples>
<!-- This element is of a complex type because it has child elements -->
<example>
<childelement1>some child text</childelement1>
<childelement2>more child text here</childelement2>
</example>

<!-- This element is of a complex type because it has an attribute-->
<example exampleid="2">some text</example>
<!--This element is of a simple type since it
has no attribute nor child elements-->
<example>some text</example>
</examples>
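The rule of thumb above can be written down in a few lines. This is only a sketch of the rule, not of how an XSD processor works: kind_of is a hypothetical helper of our own, and it classifies elements of an instance document by looking at their attributes and children, using Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

EXAMPLES = """\
<examples>
  <example>
    <childelement1>some child text</childelement1>
    <childelement2>more child text here</childelement2>
  </example>
  <example exampleid="2">some text</example>
  <example>some text</example>
</examples>
"""

def kind_of(element):
    """Apply the rule of thumb: attributes or child elements mean complex."""
    has_children = len(element) > 0
    has_attributes = len(element.attrib) > 0
    return "complex" if (has_children or has_attributes) else "simple"

# Classify the three example elements in document order.
kinds = [kind_of(example) for example in ET.fromstring(EXAMPLES)]
print(kinds)
```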

The XSD system is similar to the type system found in .NET, as every type derives from a base type regardless of whether it is a simple type or a complex type. In .NET the base type is System.Object, and in the XSD type system it is the built-in type anyType. The complex types in XSD derive directly from anyType, and the simple types derive from anySimpleType, which in turn derives from anyType.

When building Web services, you need to create XSD schemas to define the structure and the types that are contained in the request and the response messages from the Web service. In the weather example previously shown, the WeatherReport element will contain one Degree element of type integer. To declare this in the XSD, write the following:

<element name="Degree" type="int"/>

This line says that the Degree element is of type integer and will occur exactly once in the XML document, which is the default for an element declaration. By changing the minOccurs and maxOccurs values on the element, it is possible to specify that Degree is optional:

<element name="Degree" type="int" minOccurs="0"/>

or that there should be one or two measurements of the temperature:

<element name="Degree" type="int" minOccurs="1" maxOccurs="2"/>

If you want at least one measurement of the temperature, but do not want to set a maximum number, you can do so by including the unbounded value as shown here:

<element name="Degree" type="int" minOccurs="1" maxOccurs="unbounded"/>
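To illustrate what the minOccurs and maxOccurs declarations above ask a validator to check, here is a small hand-written sketch in Python. It is not an XSD validator; occurrence_ok and report_with are hypothetical helpers of our own that only count child elements, with defaults of 1 to mirror the XSD defaults.

```python
import xml.etree.ElementTree as ET

def occurrence_ok(parent, child_name, min_occurs=1, max_occurs=1):
    """Check how often child_name appears directly under parent.

    min_occurs and max_occurs default to 1, like the XSD attributes;
    max_occurs may also be the string "unbounded".
    """
    count = len(parent.findall(child_name))
    if count < min_occurs:
        return False
    return max_occurs == "unbounded" or count <= max_occurs

def report_with(n):
    """Build a WeatherReport element holding n Degree measurements."""
    report = ET.Element("WeatherReport")
    for i in range(n):
        ET.SubElement(report, "Degree").text = str(20 + i)
    return report

# Three measurements break maxOccurs="2" but satisfy maxOccurs="unbounded".
print(occurrence_ok(report_with(3), "Degree", 1, 2))
print(occurrence_ok(report_with(3), "Degree", 1, "unbounded"))
```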

SOAP

When you finally have created your Web service, other developers or systems need to be able to connect to it in an easy way; SOAP allows this.

As mentioned earlier, SOAP stands for Simple Object Access Protocol, and this protocol makes it easy for you to use a Web service. SOAP provides the standard remote procedure call mechanism used for invoking Web services. You can look at SOAP as a proxy for the Web service. The SOAP SDK from Microsoft will create a proxy object of your Web service that exposes functions for each Web service function you have declared. The proxy object finds this information in the WSDL file that we will look at later in our discussion of WSDL. SOAP then takes care of the packing and unpacking of messages sent and received from the Web service. SOAP uses HTTP as a carrier of the data and packs the requests or responses to the Web service in a SOAP envelope using XML and XSD.

Since 1993, the Web has grown tremendously, and it continues to grow. The Internet itself provides basic network connectivity between millions of computers using TCP/IP and HTTP, as shown in Figure 6-2. This connectivity is not of much value unless applications running on different machines decide to communicate with one another, leveraging the underlying network and exchanging information.

Figure 6-2: The Internet protocol stack for browser/server communication

Traditionally each type of application has invented and used its own application-level protocol that sits on top of TCP. For example, HTTP and HTTPS are application-level protocols designed for use between the Web browser and the Web server as shown in Figure 6-2. The arrows in Figure 6-2 show the logical communication between peer layers on different hosts. The actual information flow goes down the stack on one host and then up the stack on the other.

Despite the huge success of HTTP as the Internet's main application protocol, it is limited to fairly simple commands centered on requesting and sending resources, for example, GET, POST, and PUT. The result is that we have millions of interconnected computers today that leverage the Internet primarily for browsing the Web but cannot, despite the connectivity, freely exchange data between applications. SOAP proposes to solve this problem by defining a standard protocol that any application can use to communicate and exchange data with any other application. Figure 6-3 shows how SOAP can be used over TCP/IP, leveraging the current Internet infrastructure. Because SOAP is an application-level protocol, it can work directly over any transport protocol such as TCP.

Figure 6-3: SOAP over TCP/IP

Today's Internet infrastructure is unfortunately scattered with proxies and firewalls that typically allow only HTTP traffic. In order for Internet-connected applications to communicate, SOAP must be able to flow over the current Internet infrastructure including firewalls and proxies. To achieve this, SOAP can be layered over HTTP as shown in Figure 6-4.

Figure 6-4: SOAP layered over HTTP

Layering SOAP over HTTP means that a SOAP message is sent as part of an HTTP request or response, which makes it easy to communicate over any network that permits HTTP traffic. Since HTTP is pervasive on all computing platforms, it is a good choice for transport protocol. To achieve platform independence and maximum interoperability, SOAP uses XML to represent messages exchanged between the client and the Web service. Like HTTP, XML is also well known, and you can find an XML parser for nearly any computing platform on the market (or write your own, if necessary). By leveraging HTTP and XML, SOAP provides application-to-application communications between applications running on any platform and connected over the existing Internet infrastructure. The SOAP SDK and the Web services in .NET from Microsoft by default use HTTP as their transport protocol.

We will later talk about using HTTP/HTTPS for SOAP in the section "SOAP over HTTP," but first we will take a closer look at the SOAP architecture and the different ways you can manipulate a SOAP message to see the benefits of using SOAP.

SOAP Architecture

The strength of the SOAP architecture is its simplicity. The goal of the SOAP architecture is not to solve all the problems with distributed applications you have today; instead SOAP focuses on using the minimum of standards to send messages from one application to another. Old distributed applications require secure connections and/or transaction support—neither of which is supported by SOAP in its standard version.

You can really think of SOAP as a method for sending a message from one application to another application, nothing more. Items like RPCs are actually a combination of several SOAP messages with different contents. Security is often provided by the transport protocol used for SOAP, but can also be integrated into the SOAP message through WS-Security, as we will discuss later in the section "WSE and Security."

The SOAP Message

Let us take a look at the SOAP message in Figure 6-5. This message is constructed with a SOAP envelope, a SOAP header, and a SOAP body. A basic SOAP message is a well-formed XML document consisting of at least an envelope and a body that both belong to the envelope namespace, defined as http://schemas.xmlsoap.org/soap/envelope/. The header and the SOAP fault parts are optional. Since tags named Envelope and Body may also exist in the XML document being transferred, the SOAP tags use namespaces. A namespace is a collection of names that can be used as element or attribute names in an XML document. The purpose of a namespace is to make the combination of the namespace and the tag name unique in the XML document. You need to be able to identify the namespace, and that is done with a Uniform Resource Identifier (URI), which is either a Uniform Resource Locator (URL) or a Uniform Resource Name (URN). The important thing is not what the URI points to, but that the URI is globally unique across the Internet.

Figure 6-5: The architecture of the SOAP message

A namespace can be declared as either implicit or explicit. The explicit declaration is used in the SOAP message. With an explicit declaration, you define a shorthand, or prefix, to substitute for the full name of the namespace. You use this prefix to qualify elements belonging to that namespace. Explicit declarations are useful when a node contains elements from different namespaces.

We will refer to SOAP elements using soapenv as the namespace prefix later on in the examples in this chapter.

A simple SOAP message will look like the following:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <SubmitInvoice
      xmlns="http://mycompany.com/shop.net">
      <invoiceData>
          Data goes here!!
      </invoiceData>
    </SubmitInvoice>
  </soapenv:Body>
</soapenv:Envelope>
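A message like this one can also be assembled programmatically instead of being typed by hand. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, which writes namespaced names as {namespace}tag; build_envelope is a hypothetical helper of our own. The serializer may choose a generated prefix for the shop namespace instead of a default namespace declaration, which is equivalent as far as XML namespaces are concerned.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
SHOP_NS = "http://mycompany.com/shop.net"

def build_envelope(payload_text):
    """Build an Envelope/Body pair that wraps a SubmitInvoice request."""
    # Ask ElementTree to write the envelope namespace with the soapenv prefix.
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # SubmitInvoice and invoiceData both live in the shop namespace.
    submit = ET.SubElement(body, f"{{{SHOP_NS}}}SubmitInvoice")
    ET.SubElement(submit, f"{{{SHOP_NS}}}invoiceData").text = payload_text
    return envelope

envelope = build_envelope("Data goes here!!")
message = ET.tostring(envelope, encoding="unicode")
print(message)
```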

Envelope

The envelope is the container for the whole SOAP message. It is the root element of the XML document and has the name Envelope. The namespace is specified in the Envelope start tag. The xmlns attribute declares a namespace, and the text following the colon is the namespace prefix that will be used in this XML document. It is possible to declare many different namespaces in the same XML document by adding them one after another like this:

xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:economy="http://mycomp.com/soap/economy/"

This will often occur, because the information sent in the SOAP message may use different namespaces to distinguish elements that share the same tag name.

Header

The header is optional and can contain information about the SOAP message, such as expiration date, encoding information, and so on. Here it is possible to include your own information, and later in the section "SOAP Extensions" you will see how the header can be used for security verification and RPC calls. Here we will show you how to add some header information to a SOAP message. In Listing 6-1, some general information is sent in the header: the date when the SOAP message was posted.

Listing 6-1: Header Information in the SOAP Message

<soapenv:Envelope
  xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Header>
    <postedDate>
      12th of May 2002
    </postedDate>
  </soapenv:Header>
  <soapenv:Body>
    <SubmitInvoice xmlns="http://mycompany.com/shop">
      <invoiceData>
        Data goes here!
      </invoiceData>
    </SubmitInvoice>
  </soapenv:Body>
</soapenv:Envelope>
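On the receiving side, the header entries can be read back with any XML parser. Here is a minimal sketch with Python's standard xml.etree.ElementTree module; header_entries is a hypothetical helper of our own, and the message is a compact copy of Listing 6-1. Because the header is optional, the helper returns an empty result when no Header element is present.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

MESSAGE = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Header>
    <postedDate>12th of May 2002</postedDate>
  </soapenv:Header>
  <soapenv:Body>
    <SubmitInvoice xmlns="http://mycompany.com/shop">
      <invoiceData>Data goes here!</invoiceData>
    </SubmitInvoice>
  </soapenv:Body>
</soapenv:Envelope>
"""

def header_entries(message_text):
    """Return {tag: text} for each header entry; empty if no Header exists."""
    envelope = ET.fromstring(message_text)
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    if header is None:  # the Header element is optional
        return {}
    return {entry.tag: (entry.text or "").strip() for entry in header}

entries = header_entries(MESSAGE)
print(entries)
```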

Fault Part

The last part of the SOAP message might be a fault part. This is a child element of the body part of the SOAP message. The information is generated on the server when an error occurs in the processing of the request. A fault message is easy to create inside a Web service, as you will see in later examples. The fault is put into the response message, and it tells the user of the Web service what kind of error has occurred. A typical fault message looks like this:


<soap:Fault>
  <faultcode>soap:Client</faultcode>
  <faultstring>System.Web.Services.Protocols.SoapException:
    A validation error occurred:
    The 'http://www.mycompany.com/info.net/schemas/invoice:ItemTotal'
    element has an invalid value according to its data type.
  </faultstring>
  <detail>
    <Procedure xmlns="http://services.mycompany.com/Invoice/">
      Validation
    </Procedure>
    <Line xmlns="http://services.mycompany.com/Invoice/">13</Line>
    <Position xmlns="http://services.mycompany.com/Invoice/">15</Position>
  </detail>
</soap:Fault>
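A client typically inspects the response body for a fault before it looks for a result. The sketch below, written with Python's standard xml.etree.ElementTree module, shows one way to do that. read_fault is a hypothetical helper of our own, and the response is a shortened version of the fault above, wrapped in the Envelope and Body elements that would surround it in a real response.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

RESPONSE = """\
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Client</faultcode>
      <faultstring>A validation error occurred.</faultstring>
      <detail>
        <Line xmlns="http://services.mycompany.com/Invoice/">13</Line>
        <Position xmlns="http://services.mycompany.com/Invoice/">15</Position>
      </detail>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
"""

def read_fault(response_text):
    """Return (faultcode, faultstring) if the Body holds a Fault, else None."""
    envelope = ET.fromstring(response_text)
    fault = envelope.find(f"{{{SOAP_ENV}}}Body/{{{SOAP_ENV}}}Fault")
    if fault is None:
        return None
    # In this message faultcode and faultstring carry no namespace of their own.
    return fault.findtext("faultcode"), fault.findtext("faultstring")

fault = read_fault(RESPONSE)
print(fault)
```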

SOAP Message Formats

Today many different kinds of SOAP message formats exist. The formats differ by the way the body element and the data in the header elements are formatted or encoded.

To send a SOAP message, you need to serialize the data into a format that can be understood by the receiver of the SOAP message. If you serialize the data to an XML document and send it in a SOAP message by putting the XML document inside the Body tag, the SOAP message that you end up with is said to be a document-style SOAP message and has a literal payload. This is more often called a document/literal SOAP message. Normally the use of SOAP is separated into two different areas: document/literal use and RPC use.

In document/literal use, the document usually contains one or more children called parts. There are no rules for what the body part of the SOAP message should contain in a document/literal SOAP message—it can contain anything that can be serialized and that the receiver and you agree upon. In RPC use, the SOAP body contains a method name that should be invoked on the receiver's server and an element for each parameter that the remote procedure needs. Section 7 in the SOAP specification defines exactly what the body should contain when it is used for RPC. In addition to the previously mentioned two SOAP message styles, there are two formats for serializing data into XML. The format you choose determines how your data is serialized into the soapenv:Body and soapenv:Header elements. We will now show you the different ways you can serialize the SOAP message.

When the serialization method is literal, the data is serialized according to an XML Schema. No specific encoding rules dictate how the data should be serialized in a literal SOAP message. With the literal format, the communication is based on XML documents rather than objects and structures, and the documents may contain anything.

The second serialization method is the encoded format. The data in this case is serialized according to rules that dictate the internal structure of the message. The most common rules are the Section 5 rules of the SOAP specification, which define how objects and object graphs should be serialized. Using section 5 encoding results in the data between the client and the services being based on objects and structures.
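The difference between the two message styles is easiest to see in the shape of the body. The following sketch, using Python's standard xml.etree.ElementTree module, builds one body of each style. The helpers rpc_body and document_body, the GetPrice method, and the urn:example-pricing namespace are all hypothetical names chosen for illustration, and the sketch deliberately leaves out the type and encodingStyle attributes that Section 5 encoding would add.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

def rpc_body(method_ns, method_name, params):
    """RPC style: one element named after the method, one child per parameter."""
    body = ET.Element(f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{method_ns}}}{method_name}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return body

def document_body(document):
    """Document style: the Body simply wraps an already prepared XML document."""
    body = ET.Element(f"{{{SOAP_ENV}}}Body")
    body.append(document)
    return body

rpc = rpc_body("urn:example-pricing", "GetPrice", {"itemId": 42, "quantity": 3})
invoice = ET.fromstring("<Invoice><Total>10</Total></Invoice>")
doc = document_body(invoice)
print(ET.tostring(rpc, encoding="unicode"))
print(ET.tostring(doc, encoding="unicode"))
```

In the RPC case the structure of the body is dictated by the method signature; in the document case it is whatever the sender and receiver have agreed on, usually expressed as an XML Schema.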

The decision of what format to use in a communication between a client and a service is open. In theory, the choice of using document or RPC format is unconnected to your selection of literal or encoded format. This gives you four possible combinations to choose among: document/literal, document/encoded, RPC/literal, or RPC/encoded. In the real world, the choice is reduced down to two—document/literal or RPC/encoded. Why? Because normally you would break down the use of Web services and SOAP into two groups:

The first group is the one that deals with document exchanges between different applications. Our example application in Chapter 9 exposes Web services that send complete time reports (in the form of documents) back to the caller. Such solutions need to describe the data in XML, and the content often does not have complex types in it. The preferred choice for these solutions is the document/literal combination.

The second group uses SOAP for invoking remote objects. These solutions normally use RPC/encoded SOAP messages instead of document/literal SOAP messages, since the information in the SOAP message is of a more technical nature and includes complex types.

In the beginning, SOAP mainly functioned as a replacement for DCOM. Because SOAP was easy to use and less error prone than DCOM, the market became saturated with implementations of SOAP that mainly focused on its DCOM-replacement aspects. SOAP became popular because of its use of the HTTP protocol for transporting information between a remote object and a server. There was suddenly no need for altering the firewall to allow RPC calls to go through, as there was with DCOM.

In .NET you have two ways you can use Web services: either as Remoting services or as .NET Web services. Remoting, the replacement for DCOM, mainly focuses on remote access to objects on a server. Remoting services are based on RPC/encoded SOAP. The other option, .NET Web services, uses the document/literal format and is intended for message-based Web services, like transferring information back and forth between a client and the server.

Document/literal is the format of choice when working with message-based Web services, because you have full control of the format of the message. By using the document/literal format in a business-to-business communication in which orders and invoices are sent between different companies, you can ensure that the documents are valid against a predefined, agreed-upon XML Schema.

So when do you use the RPC/encoded SOAP messages?

The only time we recommend you use RPC/encoded messages is when you expose objects on your server that the client should be able to run (invoking as RPC). This may, for instance, be a calculation object that calculates a customer's price for items at the current time. If the server-side objects are based on COM, you want to use SOAP messages that are RPC/encoded. If you have developed them in .NET, you would use .NET Remoting.

SOAP is very easy to use. All you need to do is write a request as an XML document and put that document into a SOAP body encoded in the document/literal format. SOAP is mostly used when the need for message communication between different programs arises. This feature of SOAP is very important, since it makes it possible for you to tie different applications together, and reuse and extend legacy information in your business environment in a simple way—no matter what environment the legacy system is running in. One of the strengths of SOAP and Web services is the use of standard communication protocols as a carrier for the data. In the next section, we will talk about the benefits of using HTTP for SOAP messages.

SOAP over HTTP

SOAP over HTTP gives you a benefit that many of those who have used DCOM or CORBA sometimes wish they had—free access through firewalls. Since HTTP is a standard protocol that mostly is allowed to pass through firewalls using port 80, the use of HTTP as a protocol for SOAP messages automatically gives you access through the firewalls and proxy servers.

When using SOAP over HTTP, you need to set the HTTP Content-Type header to text/xml to identify to the receiver of the HTTP request that it contains XML. Also, when sending a SOAP request, you need to set the SOAPAction header, which always contains a quoted value and communicates additional information about the SOAP message at the HTTP level. This can be useful for firewalls that filter SOAP/HTTP traffic based on certain values of the SOAPAction header. In this way, it is quite easy for the administrator of a firewall to redirect different SOAP requests or even deny access for specific requests.

One example of this is to use the SOAP action header for extra customer information that validates the request to access the Web service. In this way, access to the Web services is filtered at an early stage. (This is not always the best solution; we mention it merely as an example of the use of a SOAP action header.) Normally you should not design your Web services to rely on information located in the SOAP action header, as there are other places to put it, for instance, in the SOAP header. Another use of the SOAP action header is to block SOAP messages from passing through a firewall and only allow "standard" HTTP traffic. This enables a company to make sure that no Web services are accidentally invoked from outside the firewall. Here is a short SOAP message sent via HTTP:


POST /msyhop/orders/orderWS/orders.aspx HTTP/1.0
SOAPAction: "urn:OrderAction"
Content-Type: text/xml
Accept-Language: en-us
Content-Length: 800
Accept: */*
User-Agent: Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)
Host: localhost
Connection: Keep-Alive

<soapenv:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">..</soapenv:Envelope>
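To show how the pieces of such a request fit together, here is a minimal Python sketch that assembles the raw request by hand. In practice an HTTP library would do this for you; build_soap_request is a hypothetical helper of our own, the envelope is an empty placeholder, and nothing is sent over the network.

```python
def build_soap_request(path, host, soap_action, envelope_xml):
    """Return the raw bytes of an HTTP POST that carries a SOAP envelope."""
    body = envelope_xml.encode("utf-8")
    header_lines = [
        f"POST {path} HTTP/1.0",
        f"Host: {host}",
        "Content-Type: text/xml; charset=utf-8",
        f'SOAPAction: "{soap_action}"',      # the value is sent in quotes
        f"Content-Length: {len(body)}",      # length of the body in bytes
    ]
    # A blank line (CRLF CRLF) separates the headers from the body.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body

ENVELOPE = (
    "<soapenv:Envelope "
    'xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soapenv:Body/></soapenv:Envelope>"
)

request = build_soap_request(
    "/msyhop/orders/orderWS/orders.aspx", "localhost", "urn:OrderAction", ENVELOPE
)
print(request.decode("utf-8"))
```

Computing Content-Length from the encoded body, rather than from the number of characters in the string, keeps the header correct when the envelope contains non-ASCII text.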

SOAP over HTTPS

HTTP is perfect to use as a transport protocol for SOAP, but SOAP has no built-in security. Hence the SOAP message is transferred via HTTP, open for anyone to explore, which is not the best solution for critical business information. Anyone who knows how to sniff TCP/IP packets can easily collect your business-critical information, or even worse, manipulate it and send it on to its destination! One solution is to use HTTPS. HTTPS employs Secure Sockets Layer (SSL), normally over port 443.

The encryption of packets prevents most people from decoding your information. The strength of the protection depends on the length of the encryption key. The length is measured in bits, and longer is better. The key is based on prime numbers and has, until now, been quite safe to use. This security solution works fine for point-to-point communication, that is, when the sender and the receiver are aware of each other and are the only ones interacting with the SOAP messages. This security solution can also be very fast, because it is possible to buy hardware that handles the SSL encryption.

When more actors are involved on the SOAP message path and your SOAP messages are rerouted, specifications such as WS-Security, WS-Policy, and WS-Routing are a better choice than relying on HTTPS security. The Web services architecture in a more complex application cannot rely on transport layer security, because a SOAP intermediary node may need to process the document before forwarding it on to the endpoint. This means that SSL can only be deployed in point-to-point environments. The important thing to note is that the number of potential SOAP intermediaries increases as the deployment scale increases.

This decreases the likelihood of encrypting your data successfully at the transport layer, increases the amount of access to each SOAP envelope, and increases the possibility of a passive sniffing attack against your text-based SOAP messaging framework. Message layer encryption and digital signatures are a must for a secure deployment. By using WS-Security, you ensure that the message itself is encrypted—no matter what kind of transport protocol is in use.

Later in the section "Web Services Enhancements (WSE) SDK," we will show you examples of how to use the WS-Security specifications from VB .NET via the Web Services Enhancements (WSE) software development kit (SDK) from Microsoft. This is an implementation of the WS-Security, WS-Policy, and WS-Attachment specifications.

RPC and SOAP

Rewriting an existing application to move from DCOM to SOAP is often too arduous, since many parts in such an application would need to be changed. You would have to make changes to the server side of the application to receive and process incoming XML documents and send responses as outgoing XML documents back to the client. You would also have to rewrite the client side of the application to format and send requests as XML documents and receive and parse responses.

Not only that, but most difficult of all, you would have to adopt a programming model that is probably different from the one you are used to: Serialize all data to XML, send it as a request to the server, and when the answer arrives back from the server, deserialize the incoming XML document and process the return values. Finally, if the existing application architecture does not lend itself to messaging, by using stateful server objects for example, replacing DCOM with SOAP messaging requires a major rearchitecture and/or some custom SOAP extensions.

To ease the transition process from DCOM to SOAP in an application, you could add an extra layer to the client application. The purpose of this layer is to simulate the previously used DCOM object, but instead of calling the object via DCOM, this proxy makes a SOAP message of the request and sends it to the server. On the server side resides a similar proxy that receives the SOAP message and deserializes it. The server then calls the object on the server, serializes the result, and returns it to the calling client proxy.
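The proxy arrangement just described can be sketched in a few dozen lines. The following Python sketch is only an illustration of the idea, not the way any particular SOAP toolkit implements it: ClientProxy, ServerProxy, PriceCalculator, and GetPrice are hypothetical names, the transport is a plain in-process function call instead of HTTP, and all values travel as text because the sketch leaves out the type information that Section 5 encoding would carry.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
BODY = f"{{{SOAP_ENV}}}Body"

def wrap(element):
    """Put element inside a SOAP Envelope/Body and return the XML text."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, BODY).append(element)
    return ET.tostring(envelope, encoding="unicode")

def unwrap(message):
    """Return the first child of the Body in a SOAP message."""
    return ET.fromstring(message).find(BODY)[0]

class ClientProxy:
    """Looks like the remote object but turns each call into a SOAP message."""

    def __init__(self, transport):
        self._transport = transport  # callable: request text -> response text

    def __getattr__(self, method_name):
        def call(**params):
            request = ET.Element(method_name)
            for name, value in params.items():
                ET.SubElement(request, name).text = str(value)
            response = unwrap(self._transport(wrap(request)))
            return response.findtext("result")
        return call

class ServerProxy:
    """Deserializes the request, calls the real object, serializes the result."""

    def __init__(self, target):
        self._target = target

    def handle(self, message):
        request = unwrap(message)
        params = {child.tag: child.text for child in request}
        result = getattr(self._target, request.tag)(**params)
        response = ET.Element(request.tag + "Response")
        ET.SubElement(response, "result").text = str(result)
        return wrap(response)

class PriceCalculator:
    """Stand-in for the existing server-side object (hypothetical)."""

    def GetPrice(self, itemId, quantity):
        return int(quantity) * 10  # values arrive as text and must be converted

server = ServerProxy(PriceCalculator())
calculator = ClientProxy(server.handle)  # in-process transport instead of HTTP
price = calculator.GetPrice(itemId="A-1", quantity=3)
print(price)
```

The calling code only sees calculator.GetPrice(...), just as it previously saw a DCOM object. A production server proxy would also restrict which methods may be invoked rather than dispatching on any element name it receives.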

The specification for using SOAP for RPC (Section 7 of the SOAP specification) defines a standard format for communicating this kind of information. In addition, the SOAP specification contains many definitions, including one that defines a standard way of serializing/deserializing native application types (arrays, strings, and the like), which is important to enable interoperation among clients and servers from different vendors. This standard serialization format is commonly called Section 5 encoding, after the SOAP specification section where it is defined. Although the SOAP specification indicates you can use SOAP for RPC without necessarily using the standard Section 5 encoding, we do not recommend this because Section 5 encoding ensures your application will be able to operate with other applications. Common sense is to stick to a standard if one exists. Most SOAP stacks and SOAP tools combine RPC with Section 5 encoding. Instead of writing your own proxy layers, you can use existing third-party proxies, like those found in Microsoft's SOAP SDK. These proxies enable remote communication via SOAP messages in old Windows DNA environments, where applications are built on COM objects.

We will not continue to talk about older technologies here; instead we will focus on all the new features in the Windows .NET architecture that make Web services and SOAP even simpler to use, yet more powerful than they were in Windows DNA.

Note

Previously we mentioned Section 5 encoding, named after the section that defines it in the SOAP specification. A large part of the SOAP specification is found in Section 5 (which has approximately 40 percent of the total number of pages in the specification). Section 5 includes detailed rules for how application information, including arrays and objects, should be serialized and deserialized. In this book, we will not dive into the encoding specification; if you mostly use the .NET Framework for your Web services, as we recommend, the .NET SDK takes care of the encoding.

Error Messages in SOAP

Sometimes things do not work out the way they should, and you need to report an error back to the caller. You can do this by using the fault element defined in the SOAP specification. In .NET this element is populated with error information from the thrown exception, which makes it easy to return an error message. All you need to do is throw an exception inside the called Web service method. The fault message contains a fault code, fault string, fault actor, and detail element, as described in the following list:

Fault code: The fault code element is intended for software, and gives it an algorithmic way to identify the error. It must be present in the SOAP fault element, and the fault code value must be a qualified name (not a number).
Fault string: The fault string is provided to give you a chance to see what kind of error has occurred and is not intended for algorithmic processing. For processing, use a fault code. The fault string is similar to the description string found in the error object in VB 6 in that it contains a description of the reason why the error occurred.
Fault actor: The fault actor element is intended to provide information about what caused a fault within the message path. It contains a URI that identifies the source. Applications that are not the final destination for the SOAP message must include the fault actor in the response. The final destination may use it to indicate explicitly that it generated the fault itself.
Fault detail: The fault detail element is intended to carry application-specific error information related to the body of the SOAP message. It must be present if the contents of the message body could not be processed successfully. The detail element must not carry information about faults that have occurred in SOAP header entries; detailed error information about header entries must instead be carried within header entries. In the section "SOAP Exceptions," we will give you a more extensive look into SOAP and SOAP errors and how you can use them to return fault information to the caller.
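
To make the four fault parts concrete, here is a minimal Python sketch that reads them out of a SOAP 1.1 fault message. The sample message, its fault string, and the function name `parse_fault` are made up for illustration; only the element names (`faultcode`, `faultstring`, `faultactor`, `detail`) and the envelope namespace come from the SOAP specification.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# Hypothetical fault response; the values are invented for this example.
FAULT_MESSAGE = """\
<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP:Body>
    <SOAP:Fault>
      <faultcode>SOAP:Server</faultcode>
      <faultstring>Product not found</faultstring>
      <faultactor>http://www.yourcompany.com/productinfo/</faultactor>
      <detail>
        <productid>123-4567-890</productid>
      </detail>
    </SOAP:Fault>
  </SOAP:Body>
</SOAP:Envelope>
"""


def parse_fault(xml_text):
    """Return the fault parts as a dict, or None if the body holds no fault."""
    root = ET.fromstring(xml_text)
    fault = root.find(f"{{{SOAP_ENV}}}Body/{{{SOAP_ENV}}}Fault")
    if fault is None:
        return None
    detail = fault.find("detail")
    return {
        "faultcode": fault.findtext("faultcode"),      # for software
        "faultstring": fault.findtext("faultstring"),  # for humans
        "faultactor": fault.findtext("faultactor"),    # URI of the source
        "detail": None if detail is None
        else {child.tag: child.text for child in detail},
    }


fault = parse_fault(FAULT_MESSAGE)
```

A client would typically branch on `fault["faultcode"]` and show or log `fault["faultstring"]`, in line with the division of roles described in the list above.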

WSDL

Say you now have a Web service that communicates by using SOAP. The request and the result are described in XML, and the internal structure of the XML document is defined in XSD. But how do your users and customers know which functions your Web service exposes and which parameters they take? You can easily write this information down in a Word document, but such a document will probably not be possible to parse programmatically. Even worse, a developer might not find your specification document and have to spend a couple of days figuring out what your clever Web service is supposed to do, and how he or she can use it!

The answer to this problem is Web Services Description Language (WSDL). WSDL describes Web services and the functions they expose in plain XML. It also tells the user of such Web services the parameters of each function, together with their data types and return types. One of the benefits of using this formal approach is that both humans and machines can read the WSDL file. In addition, the WSDL file can be auto-generated from a Web service and parsed programmatically. By auto-generating the WSDL file, you can avoid errors that may be introduced by humans in a manual translation. The SOAP proxy also uses the WSDL file to decide what methods the proxy should have.
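
As a rough illustration of what "parsed programmatically" means, the following Python sketch reads the operations and their parameters from a small WSDL 1.1 fragment. The fragment is a hand-written assumption, not the output of any tool, and it omits the binding and service sections that a complete WSDL file also contains. The function name `describe_operations` is likewise made up.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hand-written, incomplete WSDL fragment used only for this example.
SAMPLE_WSDL = """\
<definitions name="ProductInfo"
    targetNamespace="http://www.yourcompany.com/productinfo/"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://www.yourcompany.com/productinfo/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="GetProductStatusesRequest">
    <part name="productid" type="xsd:string"/>
  </message>
  <message name="GetProductStatusesResponse">
    <part name="result" type="xsd:string"/>
  </message>
  <portType name="ProductInfoPortType">
    <operation name="GetProductStatuses">
      <input message="tns:GetProductStatusesRequest"/>
      <output message="tns:GetProductStatusesResponse"/>
    </operation>
  </portType>
</definitions>
"""


def message_parts(operation, direction, messages):
    """Look up the (name, type) parts of an operation's input or output."""
    reference = operation.find(f"wsdl:{direction}", WSDL_NS)
    if reference is None:
        return []
    local_name = reference.get("message").split(":")[-1]  # drop the prefix
    return messages.get(local_name, [])


def describe_operations(wsdl_text):
    """Map each operation name to its input and output parts."""
    root = ET.fromstring(wsdl_text)
    messages = {
        message.get("name"): [
            (part.get("name"), part.get("type"))
            for part in message.findall("wsdl:part", WSDL_NS)
        ]
        for message in root.findall("wsdl:message", WSDL_NS)
    }
    return {
        operation.get("name"): {
            "input": message_parts(operation, "input", messages),
            "output": message_parts(operation, "output", messages),
        }
        for operation in root.findall("wsdl:portType/wsdl:operation", WSDL_NS)
    }


operations = describe_operations(SAMPLE_WSDL)
```

A SOAP proxy does essentially the same kind of lookup when it decides which methods to expose and which parameters each one accepts.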

UDDI

The last link in the Web service process chain is Universal Description, Discovery, and Integration (UDDI), which lets you publish your Web services and makes it possible for potential customers or users to find them.

UDDI should be viewed as a means of finding particular Web services. UDDI categorizes and catalogs providers and their Web services. A developer can find WSDL files and access points in UDDI, and then incorporate those Web services into his or her client applications. The UDDI servers that host information about all Web services conform to the specification governed by the UDDI.org consortium. Today there is one UDDI server hosted by Microsoft, one hosted by IBM, and one hosted by HP. Replication between them occurs every 24 hours, but will probably become more frequent in the future, as more applications come to depend on accurate UDDI data. You need to perform a couple of steps if you want to add your Web services to UDDI and make them searchable. We will walk you through these steps in the next sections.

Step 1: Modeling the UDDI Entry

The first step is to model your UDDI entry. Several key pieces of data need to be collected before establishing a UDDI entry.

First you need to determine which WSDL files your Web service implementation uses. Similar to a traditional COM or CORBA application, your Web service has been developed based on an existing interface or a proprietary one. The WSDL file can be generated in .NET by adding the WSDL parameter to the query string on the Web service as in this example: http://www.mycompany.com/SampleWebService.asmx?wsdl.

You then need to specify the name of your company and give a brief description of it as well as the central contacts for its Web services. This gives companies that would like to use your Web service the ability to get in touch with your support team if they need to.

Next, you define the categories and identifications that are appropriate to your company. To do this you browse, for instance, Microsoft's UDDI server (http://uddi.microsoft.com/default.aspx). The currently supported taxonomies are North American Industry Classification System (NAICS), ISO 3166, Universal Standard Products and Services Codes (UNSPSC), Standard Industrial Classification (SIC), and GeoWeb Geographic Classification. Here you choose the categories that best represent your company, so that others can narrow their searches for your Web service.

At this point you determine the Web services your company should provide through UDDI. You add the Web services you want to publish and define the different access points they may have (a Web service may have different access points). It is important to realize that registering a Web service in UDDI does not mean that everyone has access to it. Security, authorization, and authentication can exist in tandem with a UDDI registry entry. Just because someone knows your Web service exists does not mean he or she can actually invoke it.

Finally, you categorize the Web services you would like to publish, just as you previously categorized your company, so that it will be possible for potential users to find a Web service by searching on a category.

Now you have a complete UDDI registry entry. The next important step is to register it in the UDDI server.
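
The pieces of data collected in this step can be summarized as a simple data structure. The Python sketch below is only a checklist in code form: the class and field names are our own and do not mirror the UDDI schema or any UDDI SDK, and the contact details and category value are placeholders.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass
class Category:
    taxonomy: str  # for example "NAICS", "ISO 3166", "UNSPSC"
    value: str


@dataclass
class Contact:
    name: str
    email: str


@dataclass
class WebService:
    name: str
    wsdl_url: str
    access_points: List[str]  # a service may have several access points
    categories: List[Category] = field(default_factory=list)


@dataclass
class UddiEntry:
    business_name: str
    description: str
    contacts: List[Contact]
    categories: List[Category]
    services: List[WebService]


def missing_pieces(entry):
    """List the parts of the model that are still empty."""
    missing = [name for name in
               ("business_name", "description", "contacts", "categories", "services")
               if not getattr(entry, name)]
    for service in entry.services:
        if not service.access_points:
            missing.append(f"access points for {service.name}")
        if not service.categories:
            missing.append(f"categories for {service.name}")
    return missing


# Placeholder values, assumed for illustration only.
entry = UddiEntry(
    business_name="My Company",
    description="Provides inventory information",
    contacts=[Contact("Support team", "support@mycompany.com")],
    categories=[Category("ISO 3166", "US")],
    services=[WebService(
        name="SampleWebService",
        wsdl_url="http://www.mycompany.com/SampleWebService.asmx?wsdl",
        access_points=["http://www.mycompany.com/SampleWebService.asmx"],
        categories=[Category("ISO 3166", "US")],
    )],
)
incomplete = replace(entry, services=[])
```

Calling `missing_pieces(entry)` before registration gives a quick check that nothing from the modeling step was forgotten.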

Step 2: Registering the UDDI Entry

All communications with the UDDI publishing server holding information about your Web services use authenticated connections, and therefore you need to register to get a username and a password. The registration contains simple data about you and your business or organization, including the name of the business and an optional description of it. You must also agree to the terms of use statement, so this step cannot be done programmatically. Microsoft requires a Passport for authentication, so you need to have one before you can proceed to register your UDDI entry in the UDDI server.

The main purpose of this registration is to let others know your Web service exists. There are two ways to do the registration: either using a Web-based registration form or performing the registration programmatically.

If you are making frequent updates to your UDDI entry, you should use the UDDI .NET SDK to do so programmatically. After you have downloaded and installed the SDK, it is quite easy to register your information in the UDDI server. But if you do not need to update the information frequently, you can use a Web-based registration form.

Due to space constraints, we will not show the complete registration process here; instead, we suggest you take a look at Microsoft's copy of a real UDDI registration at http://test.uddi.microsoft.com/default.aspx.

After you complete the registration of your UDDI entry, Microsoft's UDDI server will replicate your registration to the other UDDI servers around the world within 24 hours, and your Web service will be searchable and ready for others to use.

Transactions and Web Services

Transactions are not supported by SOAP today; however, the WS-I specification group is working on designing transactional support for Web services. Transaction support will not be a traditional two-phase commit. (A two-phase commit occurs when a transaction coordinator keeps track of all updates to several databases in the network. If all updates are successful, they are committed; otherwise all updates are rolled back.) Transaction support in Web services will be more of a coordination issue in most cases. The WS-I specification group is currently designing WS-Transaction and WS-Coordination to help in handling transactions via Web services.
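
For readers who have not met it before, the traditional two-phase commit mentioned in the parentheses above can be sketched as follows. This is a deliberately simplified Python illustration with made-up class and function names; it ignores time-outs, logging, and crash recovery, and it has nothing to do with how Web service transactions will eventually be specified.

```python
class Participant:
    """A database (or other resource) taking part in the transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "pending"

    def prepare(self):
        # Phase 1: vote on whether the local update can be committed.
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(participants):
    """Act as the coordinator; return True if everything was committed."""
    # Phase 1: collect a vote from every participant.
    votes = [participant.prepare() for participant in participants]
    # Phase 2: commit only if all voted yes, otherwise roll everything back.
    if all(votes):
        for participant in participants:
            participant.commit()
        return True
    for participant in participants:
        participant.rollback()
    return False


orders, stock = Participant("orders"), Participant("stock")
committed = two_phase_commit([orders, stock])

billing, shipping = Participant("billing"), Participant("shipping", can_commit=False)
aborted = two_phase_commit([billing, shipping])
```

The coordinator has to hold every participant in a prepared state until all votes are in, which is one reason this tightly coupled model is a poor fit for loosely coupled Web services.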

Putting It All Together

Finally, we will describe in this section the complete flow for a request of a Web service. Imagine that a third-party company, called Good Products Inc., would like to get information from your company regarding your inventory to be able to make special offers that suit your company.

Having agreed to this, you install a Web service that exposes one function: GetProductStatuses (Optional ProductID as long). It takes one parameter, which is optional. If you do not send in a particular product ID, this means you want to retrieve all products and their statuses. This is a simple Web service that only returns an XML document containing the different product numbers and their quantities.

Good Products Inc. has a Windows service running that regularly polls its internal list of subscribing partners. When it reaches your company's server in the list, the following will happen:

The Windows service creates a SOAP message (see Listing 6-2). The first element in this SOAP message is an envelope. It identifies an XML document as being a SOAP message, and encapsulates all the other parts of a message within itself. The namespace of the envelope identifies the SOAP version of the message, and its encodingStyle attribute identifies the rules used by the application to serialize the message.

A SOAP header element may be placed inside the envelope, but this is optional. It can contain authorization information and transactional information. It may also contain extra information for the receiving application, such as priority and so on. If a header exists, it must be the first child element of the envelope, directly before the body.

The last element is the body element. This area contains application-specific information. In Listing 6-2, which shows a complete SOAP message, an RPC call to a Web service requests GetProductStatuses.

Listing 6-2: A Complete SOAP Message

<SOAP:Envelope
     xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/'
     SOAP:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'
     xmlns:p='http://www.GoodProducts.com/productInfo/'>
    <SOAP:Header>
        <p:From SOAP:mustUnderstand='1'>
            info@goodproducts.com
        </p:From>
    </SOAP:Header>
    <SOAP:Body>
        <p:GetProductStatuses>
            <productid>123-4567-890</productid>
        </p:GetProductStatuses>
    </SOAP:Body>
</SOAP:Envelope>
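
To show how the three parts of Listing 6-2 map onto code, the following Python sketch parses the same message and pulls out the header entries, the requested method, and its parameters. The function names `split_tag` and `read_message` are our own, and the sketch assumes a SOAP 1.1 message with a single method element in the body.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

# The message from Listing 6-2.
LISTING_6_2 = """\
<SOAP:Envelope
     xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/'
     SOAP:encodingStyle='http://schemas.xmlsoap.org/soap/encoding/'
     xmlns:p='http://www.GoodProducts.com/productInfo/'>
    <SOAP:Header>
        <p:From SOAP:mustUnderstand='1'>
            info@goodproducts.com
        </p:From>
    </SOAP:Header>
    <SOAP:Body>
        <p:GetProductStatuses>
            <productid>123-4567-890</productid>
        </p:GetProductStatuses>
    </SOAP:Body>
</SOAP:Envelope>
"""


def split_tag(tag):
    """Split an ElementTree tag '{namespace}local' into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def read_message(xml_text):
    """Extract header entries, method name, and parameters from a message."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{SOAP_ENV}}}Envelope":
        raise ValueError("not a SOAP 1.1 envelope")

    headers = []
    header = root.find(f"{{{SOAP_ENV}}}Header")  # optional
    if header is not None:
        for entry in header:
            must_understand = entry.get(f"{{{SOAP_ENV}}}mustUnderstand") == "1"
            headers.append(
                (split_tag(entry.tag)[1], must_understand, (entry.text or "").strip())
            )

    call = root.find(f"{{{SOAP_ENV}}}Body")[0]  # the RPC method element
    namespace, method = split_tag(call.tag)
    params = {split_tag(child.tag)[1]: child.text for child in call}
    return {"headers": headers, "namespace": namespace,
            "method": method, "params": params}


message = read_message(LISTING_6_2)
```

Because the `From` header entry carries `mustUnderstand='1'`, a receiver that does not know how to process that entry is required to reject the message with a fault rather than ignore it.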

After you create a SOAP message, the next step is to send this message via some kind of protocol. Normally you use HTTP or HTTPS, but SOAP doesn't dictate what kind of transport protocol you should use. You could print out the message and send it by ordinary mail if you wanted (but we don't recommend doing this). In our examples we use HTTP, since it is well suited for SOAP messages.
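
When HTTP is the transport, a SOAP 1.1 message is sent as the body of a POST request with an XML content type and a `SOAPAction` header. The Python sketch below only builds such a request and does not send it. The endpoint URL and the SOAPAction value are assumptions chosen for this example, since the chapter does not specify them.

```python
import urllib.request

# Assumed values for illustration; a real service publishes its own.
ENDPOINT = "http://www.yourcompany.com/productinfo/"
SOAP_ACTION = "http://www.GoodProducts.com/productInfo/GetProductStatuses"

ENVELOPE = (
    "<SOAP:Envelope xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/' "
    "xmlns:p='http://www.GoodProducts.com/productInfo/'>"
    "<SOAP:Body><p:GetProductStatuses/></SOAP:Body>"
    "</SOAP:Envelope>"
)


def make_http_request(url, soap_action, envelope):
    """Wrap a SOAP envelope in an HTTP POST request (nothing is sent here)."""
    return urllib.request.Request(
        url,
        data=envelope.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{soap_action}"',  # the value is a quoted URI
        },
        method="POST",
    )


request = make_http_request(ENDPOINT, SOAP_ACTION, ENVELOPE)
```

Passing `request` to `urllib.request.urlopen` would perform the actual POST and return the response that carries the reply envelope.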

At this point, you create a SOAP proxy, as mentioned earlier in this chapter. This proxy will expose the functions of the Web service found at your company as functions on this SOAP proxy object.
```
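' Create the SOAP client (the proxy) from the Microsoft SOAP Toolkit
Dim c As New MSSOAPLib.SoapClient
' Initialize the proxy with the WSDL file that describes the Web service
c.mssoapinit ("http://www.yourcompany.com/productinfo/services/productinfo.XML")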
Dim strProductInfos As String
```
Invoke the method GetProductStatuses that the WSDL file describes and that is exposed as a function on the SOAP Client.
```
strProductInfos = c.GetProductStatuses()
```
The productinfo.XML file specified in the SOAP proxy is the WSDL file that contains information about the Web service, so the proxy (the SOAP client) can create an interface for you with all the functions visible. The result from the Web service at your company is found in the strProductInfos variable in the preceding code line. Now you are ready to parse!

It is possible to work with a Web service without using Microsoft's SOAP Toolkit. On the server side, all you need to do is parse the incoming request stream to the Web page, extract the information, and execute the requested method. However, this takes some work and requires a lot of code, so we will not show this process here. Figure 6-6 contains a simple schematic of the flow of actions for invoking a Web service via HTTP and SOAP.

Figure 6-6: The flow for a Web service request

As this figure illustrates, the steps for the Web service request flow are as follows:

First the request is packaged into a SOAP message. The SOAP body is filled with the information that the client sends to, or requests from, the server.
The SOAP message is sent via HTTP to the server, where the Web service receives the message.
The server unpacks the incoming message and executes the requested function.
When the function has returned its result, the server packs the result into a new SOAP message and returns it to the caller.
The response in the form of a SOAP message from the server is unpacked by the SOAP client, and the client parses the result.
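
The server-side steps of this flow (unpack, execute, pack) and the final client-side unpacking can be sketched in Python as follows. Everything here is an assumption made for illustration: the in-memory `INVENTORY`, the handler table, the shape of the response document, and the choice to skip HTTP and pass the message directly to `handle_request`. Error handling is left out.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("SOAP", SOAP_ENV)

# Hypothetical stock levels standing in for a real inventory database.
INVENTORY = {"123-4567-890": 12, "987-6543-210": 0}


def get_product_statuses(productid=None):
    """Return one product's quantity, or all of them if no ID is given."""
    if productid is None:
        return dict(INVENTORY)
    return {productid: INVENTORY[productid]}


HANDLERS = {"GetProductStatuses": get_product_statuses}


def handle_request(request_xml):
    """Unpack a SOAP request, run the method, and pack the result."""
    # Unpack: find the method element and its parameters in the body.
    # Assumes the method element is namespace-qualified, as in Listing 6-2.
    call = ET.fromstring(request_xml).find(f"{{{SOAP_ENV}}}Body")[0]
    namespace, _, method = call.tag[1:].partition("}")
    params = {child.tag: child.text for child in call}

    # Execute the requested function.
    result = HANDLERS[method](**params)

    # Pack the result into a new SOAP message.
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    response = ET.SubElement(body, f"{{{namespace}}}{method}Response")
    for product, quantity in result.items():
        item = ET.SubElement(response, "product")
        ET.SubElement(item, "productid").text = product
        ET.SubElement(item, "quantity").text = str(quantity)
    return ET.tostring(envelope, encoding="unicode")


REQUEST = (
    "<SOAP:Envelope xmlns:SOAP='http://schemas.xmlsoap.org/soap/envelope/' "
    "xmlns:p='http://www.GoodProducts.com/productInfo/'>"
    "<SOAP:Body><p:GetProductStatuses>"
    "<productid>123-4567-890</productid>"
    "</p:GetProductStatuses></SOAP:Body></SOAP:Envelope>"
)

# Client side: unpack the response and parse the result.
response_xml = handle_request(REQUEST)
reply = ET.fromstring(response_xml).find(f"{{{SOAP_ENV}}}Body")[0]
statuses = {p.findtext("productid"): int(p.findtext("quantity")) for p in reply}
```

Naming the response element after the method with `Response` appended follows a common SOAP RPC convention; the name itself is not significant to the protocol.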