Lecture Notes Last

Fall Semester 2002

(Notes under development while you can see this.)


Building Web Services with Java. Making Sense of XML, SOAP, WSDL, and UDDI
This is where the semester ends.

These are likely to be the longest lecture notes of the season though.

This is where your professional life might start, too.

Web services are a rapidly evolving set of standards and implementation technologies that hold great promise for application integration and distributed computing. From our perspective, a web service is a platform- and implementation-independent software component that can be:

  1. Described using a service description language
  2. Published to a registry of services
  3. Discovered through a standard mechanism (at runtime or design time)
  4. Invoked through a declared API, usually over a network
  5. Composed with other services

One important point is that a web service need not exist on the World Wide Web. (This is an unfortunate historical naming issue.) A web service can live anywhere on the network, inter- or intra-net; some web services can be invoked by a simple method invocation in the same operating system process, or perhaps using shared memory between tightly coupled processes running on the same machine.

Another important point is that a web service's implementation and deployment platform details are not relevant to a program that is invoking the service. A web service is available through its declared API and invocation mechanism (network protocol, data encoding schemes, and so on). This is analogous to the relationship between a web browser and a web application server: very little shared understanding exists between the two components. (The only shared understanding is that they both speak HTTP and converse in HTML or a very limited set of MIME types.)

Organizations use web services in two broad categories:

  1. Enterprise Application Integration (EAI)
  2. Business-to-business (B2B)

It helps to compare B2B with the more familiar business-to-consumer (B2C) model:

Area                     B2C application              B2B application
Backend logic            Java classes and EJBs        Java classes and EJBs
Custom logic             Servlets and JSPs            Web service engine
Communication protocol   HTTP                         HTTP, SMTP, FTP, TCP/IP, EDI, JMS, RMI/IIOP
Data input               HTTP GET/POST parameters     XML
Data output              HTML                         XML
User interface           HTML + script                N/A
Client                   Human behind a browser       Software application

Why are web services superior to existing distributed computing approaches?

There are three key reasons.

  1. Web services have evolved not around pre-defined architectures (like DCOM, CORBA, or RMI) but around the problem of application integration.
  2. They use more flexible technologies, especially as far as data encoding and data description are concerned.
  3. The dynamics around their standards control and innovation are different: the standards are open, so the many industry players can innovate in parallel.

We should be careful to distinguish and describe the Service Oriented Architecture (SOA).

Any SOA (service oriented architecture) contains three roles:

  1. A Service Requestor
  2. A Service Provider, and
  3. A Service Registry

A service provider is responsible for creating a service description, publishing that service description to one or more service registries, and receiving web service invocation messages from one or more service requestors. We should describe the other two, also.

Any SOA also includes three operations: publish, find and bind. These operations define the contracts between the SOA roles and we should describe them in greater detail soon, too.
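
The three roles and the three operations can be sketched in a few lines of Java. This is only an in-process illustration, not a real web services toolkit: every class and method name below (SoaSketch, ServiceDescription, ServiceRegistry, request) is hypothetical, and the behavior given to the requestor and the registry is the conventional reading of "find" and "bind", which these notes have not yet spelled out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical in-process sketch of the three SOA roles; none of these
// names come from a real web services toolkit.
final class SoaSketch {

    // What a provider publishes: a name plus a way to invoke the service.
    static final class ServiceDescription {
        final String name;
        final Function<String, String> endpoint;

        ServiceDescription(String name, Function<String, String> endpoint) {
            this.name = name;
            this.endpoint = endpoint;
        }
    }

    // Service registry: stores descriptions and answers lookups.
    static final class ServiceRegistry {
        private final Map<String, ServiceDescription> entries = new HashMap<>();

        // "publish": the provider registers a description.
        void publish(ServiceDescription description) {
            entries.put(description.name, description);
        }

        // "find": the requestor looks a description up by name.
        Optional<ServiceDescription> find(String name) {
            return Optional.ofNullable(entries.get(name));
        }
    }

    // Service requestor: finds a description, then binds to it and invokes it.
    static String request(ServiceRegistry registry, String name, String message) {
        ServiceDescription description = registry.find(name)
                .orElseThrow(() -> new IllegalStateException("No such service: " + name));
        // "bind": use the details in the description to invoke the provider.
        return description.endpoint.apply(message);
    }

    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        // Service provider: creates a description and publishes it.
        registry.publish(new ServiceDescription("echo", msg -> "echo: " + msg));
        System.out.println(request(registry, "echo", "hello")); // prints "echo: hello"
    }
}
```

Here the "network" is a plain method call, which matches the earlier remark that a web service may be invoked within a single operating system process. In a real deployment the description would be a document in a service description language and the endpoint a network address.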

Web services technologies can be factored in the following three stacks:

  1. The wire stack
  2. The description stack
  3. The discovery stack

We now have to start exploring the root of all web services technologies: XML.

Here's a sample purchase order from the Skateboard Warehouse, retailer of skateboards to SkatesTown. The order is for 5 backpacks, 12 skateboards, and 1000 SkatesTown promotional stickers (this is what the stock keeping unit [SKU] of 008-PR stands for).

<po xmlns="http://www.skatestown.com/ns/po" id="50383" submitted="2001-12-06">
   <billTo>
      <company>The Skateboard Warehouse</company>
      <street>One Warehouse Park</street>
      <street>Building 17</street>
      <city>Boston</city>
      <state>MA</state>
      <postalCode>01775</postalCode>
   </billTo>
   <shipTo>
      <company>The Skateboard Warehouse</company>
      <street>One Warehouse Park</street>
      <street>Building 17</street>
      <city>Boston</city>
      <state>MA</state>
      <postalCode>01775</postalCode>
   </shipTo>
   <order>
      <item sku="318-BP" quantity="5">
         <description>Skateboard backpack; five pockets</description>
      </item>
      <item sku="947-TI" quantity="12">
         <description>Street-style titanium skateboard.</description>
      </item>
      <item sku="008-PR" quantity="1000"/>
   </order>
</po>

This purchase order is an example of data-centric XML, as opposed to document-centric (HTML-like) XML.
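
To see how readily a program can consume the purchase order, here is a minimal sketch that reads an abbreviated copy of it with the standard JAXP DOM API. The class name PurchaseOrderReader and its method are hypothetical, and the embedded sample leaves out the addresses and descriptions to stay short; only the element names, attribute names, and namespace come from the document above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class; only the XML vocabulary comes from the notes.
final class PurchaseOrderReader {

    static final String PO_NS = "http://www.skatestown.com/ns/po";

    // Abbreviated copy of the purchase order (addresses and descriptions omitted).
    static final String SAMPLE =
        "<po xmlns=\"http://www.skatestown.com/ns/po\" id=\"50383\" submitted=\"2001-12-06\">"
      + "<order>"
      + "<item sku=\"318-BP\" quantity=\"5\"/>"
      + "<item sku=\"947-TI\" quantity=\"12\"/>"
      + "<item sku=\"008-PR\" quantity=\"1000\"/>"
      + "</order>"
      + "</po>";

    // Returns SKU -> quantity for every <item>, in document order.
    static Map<String, Integer> quantitiesBySku(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the <po> vocabulary lives in a namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            Map<String, Integer> result = new LinkedHashMap<>();
            NodeList items = doc.getElementsByTagNameNS(PO_NS, "item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                result.put(item.getAttribute("sku"),
                           Integer.valueOf(item.getAttribute("quantity")));
            }
            return result;
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Could not read purchase order", e);
        }
    }

    public static void main(String[] args) {
        // Prints {318-BP=5, 947-TI=12, 008-PR=1000}
        System.out.println(quantitiesBySku(SAMPLE));
    }
}
```

The three quantities match the order described above: 5 backpacks, 12 skateboards, and 1000 promotional stickers.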

Chapter 2: XML Primer

In this chapter:
  1. Origins of XML
  2. Document- Versus Data-Centric XML
  3. XML Instances
  4. XML Namespaces
  5. Document Type Definitions
  6. XML Schemas
  7. Processing XML
(OK, I need to redo the beginning. But for now let's go full force ahead).

XML was introduced in 1998. It is about structuring, describing, and exchanging information, and all key web services technologies are based on it. This chapter will develop a set of examples around SkatesTown's purchase order submission and invoice generation process. Perhaps this is a good time to introduce SkatesTown. SkatesTown is a small but growing business in New York founded by three mechanically inclined friends with a passion for cars and skateboards.

[Photos: Larry, Michael, Toni]

They started by designing and selling custom pre-built boards out of Larry's garage, and word soon spread about the quality of their work. They came up with some innovative new construction techniques, and within months they had orders piling up. Now SkatesTown has a small manufacturing operation in Brooklyn, and the company is selling boards, clothing and equipment to stores around the city. Larry, Michael, and Toni couldn't be happier about how their business has grown.

Of the three, Toni is the real gearhead, and he has been responsible for most of the daring construction and design choices that have helped SkatesTown get where it is today. He's the president and head of the team. Michael, gregarious and a smooth talker ever since childhood, now handles marketing and sales. Larry has tightly tracked the computer revolution over the years, and is chief technical officer for the company.

A few years back, Larry realized that networking technology was going to be big, and he wanted to make sure that SkatesTown could catch the wave and utilize distributed computing to leverage its business. This focus turned out to be a great move. Larry set up a web presence so SkatesTown could help its customers stay up-to-date without requiring a large staff to answer phones and questions. He also built an online order-processing system to help streamline the actual flow of the business with network-enabled clients. In recent months, more and more stores that carry SkatesTown products have been using the system to great effect.

At present, Larry is pretty happy with the way things are working with SkatesTown's electronic commerce systems. But there have been a few problems, and Larry is sure that things could be even better. He realizes that as the business grows, the manual tasks associated with order gathering and inventory resupply will limit the company's success. Always one to watch the horizon, Larry has heard the buzz about web services and wants to know more. At the urging of a friend, he got in touch with Al Rosen, a contractor for Silver Bullet Consulting. Silver Bullet specializes in web services solutions, and after a couple of meetings with Al, Larry was convinced - he hired SBC to come in, evaluate SkatesTown's systems, and help the company grow into a web service-enabled business.

[Photo: Al Rosen]

As we move through the rest of the notes, we'll keep an eye on how SkatesTown uses technologies like XML and, later, SOAP, WSDL, and UDDI to increase efficiency and productivity and to establish new and valuable relationships with its customers and business partners. Silver Bullet, as we'll see, usually lives up to its name.

Work on XML started in 1996. The need was for a simple yet extensible mechanism that would allow the textual representation of structured and semi-structured information. The design inspiration for XML came from two main sources: Standard Generalized Markup Language (SGML) and HTML.

HTML is an SGML application (perhaps the most popular).

XML is a lightweight version of SGML.

The rest of this chapter introduces the set of XML technologies and standards that are the foundation of web services:

  1. XML Instances - the rules for creating syntactically correct XML documents
  2. XML Schema - a recent standard that enables detailed validation of XML documents as well as the specification of XML datatypes
  3. XML Namespaces - definitions of the mechanisms for combining XML from multiple sources in a single document
  4. XML processing - the core architecture and mechanisms for creating, parsing, and manipulating XML documents from programming languages

Document- Versus Data-Centric XML

Document-centric XML is semi-structured, marked-up text whose content is typically meant for human consumption. Here's an example from the FastGlide skateboard user guide:

<h1>Skateboard Usage Requirements</h1>

<p>In order to use the <b>FastGlide</b> skateboard you must have:</p> 

<list>
  <item>A strong pair of legs.</item>
  <item>A reasonably long stretch of smooth road surface.</item>
  <item>The impulse to impress others.</item>
</list> 

<p>If you have all of the above, you can proceed to 
   <link href="Chapter2.xml">Getting on the Board</link>. 
</p> 
By contrast, data-centric XML is typically generated by machines and is meant for machine consumption. Consider the purchase order that we presented a few screens above. The use of XML in it is very different from the previous user guide:

  1. The ratio of markup to content is high. The XML includes many different types of tags. There is no long-running text.
  2. The XML includes machine generated information (the date, for example).
  3. The tags are organized in a highly structured manner.
  4. Markup is used to describe what a piece of information means rather than how it should be presented to a human.
In short, if you can easily imagine the XML as a data structure in your favourite programming language then you are probably looking at a data-centric use of XML. Here's such an analog for the purchase order of a few screens above (written in Java):

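import java.util.Date;

// Java analog of the <po> element: attributes become fields, child elements become nested objects.
class PO {
  int id;                  // the id attribute of <po>
  Date submitted;          // the submitted attribute of <po>
  Address billTo, shipTo;  // the <billTo> and <shipTo> elements
  Item[] order;            // the <item> children of <order>
}
class Address { String company, city, state, postalCode; String[] street; }
class Item { String sku; int quantity; String description; }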
Typically, XML documents for human consumption live a long time, whereas some data-centric XML could live for only a few milliseconds, and that's another relevant difference. Web services are about data-centric uses of XML, and that's what we are going to focus on in what follows.

XML Instances

The structure and formatting of XML in an XML document must follow the rules of the XML instance syntax. XML documents contain an optional prolog followed by a root element that contains the actual document. Typically the prolog serves up to three roles:

  1. identifies the document as an XML document
  2. includes any comments about the document
  3. includes any meta-information about the content of the document

A document can be identified as an XML document through the use of a processing instruction. Processing instructions (PIs) are special directives to the application that will process the XML document. In general, data-oriented XML applications do not use application-specific processing instructions. Instead, they tend to put all information in elements and attributes. However, you should use one standard processing instruction: the XML declaration (in the XML document prolog), which states two very important pieces of information:

  1. the version of XML in the document and
  2. the character encoding.

Here's an example:

<?xml version="1.0" encoding="UTF-8"?>
The version parameter of the xml PI tells the processing application the version of the XML specification to which the document conforms. Currently, there is only one version: "1.0". The encoding parameter is optional. It identifies the character set of the document. The default value is "UTF-8". If you omit the XML declaration, the XML version is assumed to be 1.0, and the processing application will try to guess the encoding of the document based on clues (such as the raw byte order of the data stream). This approach has problems, and whenever interoperability is of high importance - such as for Web Services - applications should always provide an explicit XML declaration and use UTF-8 encoding.
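
The advice above can be followed from Java by asking the serializer for an explicit declaration and UTF-8 output. The following is a minimal sketch using the standard JAXP transformation API; the class name XmlDeclarationDemo and its method are hypothetical, and the document it writes is a bare, namespace-free po element created only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: writes a tiny document with an explicit XML declaration.
final class XmlDeclarationDemo {

    static String serializeWithDeclaration() {
        try {
            // Build a one-element document in memory (illustrative content only).
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element po = doc.createElement("po");
            po.setAttribute("id", "50383");
            doc.appendChild(po);

            // An identity transformer copies the DOM tree to the output stream.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            // Decode with the same encoding that was declared and written.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            // Wrap checked JAXP exceptions to keep the sketch short.
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializeWithDeclaration());
    }
}
```

The output begins with an XML declaration naming version 1.0 and the UTF-8 encoding, so a receiving application does not have to guess either one. Some serializers also add a standalone attribute to the declaration.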

Stopped working on this document on Sun Nov 10 21:48:45 EST 2002.

Anticipate coming back to it on Mon Nov 11 2002


Last updated: Nov 6, 2002 by Adrian German for A348/A548