CSCI A348/548
Lecture Notes Nine

Spring 2001 (Second semester 2000-2001)


Java and XML

1. Introduction

XML is the Extensible Markup Language. Like its predecessor SGML, XML is a metalanguage used to define other languages. However, XML is much simpler and more straightforward than SGML. It promises to bring to a data format what Java brought to a programming language: complete portability.

Definition: tag set of a markup language. The tag set for a markup language defines the markup tags that have meaning to a language parser. For example HTML has a strict set of tags that are allowed. You may use <TABLE> but not <CHAIR> (the second tag has no specific meaning in HTML, and although most browsers will ignore it, unexpected things can happen when it appears).

Definition: grammar of a markup language. The grammar of a markup language defines the correct use of the language's tags. For example, when using the <TABLE> tag, several attributes may be included, such as the width, the background color, and the alignment. However, you cannot define the type of the table (say, with a TYPE attribute) because the grammar of HTML does not allow it.

XML, by defining neither the tags nor the grammar, is completely extensible. So you could have something like this:

<?xml version="1.0"?>

<dining-room>
    <table type="round" wood="maple">
        <manufacturer>The Wood Shop</manufacturer>
        <price>$1999.99</price>
    </table>

    <chair wood="maple">
        <quantity>2</quantity>
        <quality>excellent</quality>
        <cushion included="true">
            <color>blue</color>
        </cushion>
    </chair>

    <chair wood="oak">
        <quantity>3</quantity>
        <quality>average</quality>
    </chair>
</dining-room>
This is a well-formed XML file. The tags and the grammar being used are completely made up. That's the power of XML: it allows you to define the content of your data in a variety of ways as long as you conform to the general structure that XML requires (which we will explore in detail). XML is built to allow flexibility of data formatting.

The power of XML lies in transmitting data from system to system, application to application, and business to business. We are going to focus on XML as data; that is the only sense in which XML is used in these notes.

2. Creating XML

Here's a practical, real-world example of an XML document.

<?xml version="1.0"?>
<?xml-stylesheet href="XSL\JavaXML.html.xsl" type="text/xsl"?>
<?xml-stylesheet href="XSL\JavaXML.wml.xsl" type="text/xsl"
                                            media="wap"?>
<?cocoon-process type="xslt"?>

<!DOCTYPE JavaXML:Book SYSTEM "DTD\JavaXML.dtd">

<!-- Java and XML -->
<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml/">
    <JavaXML:Title>Java and XML</JavaXML:Title>

    <JavaXML:Contents>
        <JavaXML:Chapter focus="XML">
            <JavaXML:Heading>Introduction</JavaXML:Heading>
            <JavaXML:Topic subSections="7">What Is It?</JavaXML:Topic>
            <JavaXML:Topic subSections="3">How Do I Use It?</JavaXML:Topic>
            <JavaXML:Topic subSections="4">Why Should I Use It?</JavaXML:Topic>
            <JavaXML:Topic subSections="0">What’s Next?</JavaXML:Topic>
        </JavaXML:Chapter>

        <JavaXML:Chapter focus="XML">
            <JavaXML:Heading>Creating XML</JavaXML:Heading>
            <JavaXML:Topic subSections="0">An XML Document</JavaXML:Topic>
            <JavaXML:Topic subSections="2">The Header</JavaXML:Topic>
            <JavaXML:Topic subSections="6">The Content</JavaXML:Topic>
            <JavaXML:Topic subSections="1">What’s Next?</JavaXML:Topic>
        </JavaXML:Chapter>

        <JavaXML:Chapter focus="Java">
            <JavaXML:Heading>Parsing XML</JavaXML:Heading>
            <JavaXML:Topic subSections="3">Getting Prepared</JavaXML:Topic>
            <JavaXML:Topic subSections="3">SAX Readers</JavaXML:Topic>
            <JavaXML:Topic subSections="9">Content Handlers</JavaXML:Topic>
            <JavaXML:Topic subSections="4">Error Handlers</JavaXML:Topic>
            <JavaXML:Topic subSections="0">
                A Better Way to Load a Parser
            </JavaXML:Topic>
            <JavaXML:Topic subSections="4">"Gotcha!"</JavaXML:Topic>
            <JavaXML:Topic subSections="0">What’s Next?</JavaXML:Topic>
        </JavaXML:Chapter>

        <JavaXML:SectionBreak/>

        <JavaXML:Chapter focus="Java">
            <JavaXML:Heading>Web Publishing Frameworks</JavaXML:Heading>
            <JavaXML:Topic subSections="4">Selecting a Framework</JavaXML:Topic>
            <JavaXML:Topic subSections="4">Installation</JavaXML:Topic>
            <JavaXML:Topic subSections="3">
                Using a Publishing Framework
            </JavaXML:Topic>
            <JavaXML:Topic subSections="2">XSP</JavaXML:Topic>
            <JavaXML:Topic subSections="3">Cocoon 2.0 and Beyond</JavaXML:Topic>
            <JavaXML:Topic subSections="0">What’s Next?</JavaXML:Topic>
        </JavaXML:Chapter>
    </JavaXML:Contents>

    <JavaXML:Copyright>&OReillyCopyright;</JavaXML:Copyright>

</JavaXML:Book>
It represents part of the table of contents of the book that we are following: Java and XML, by Brett McLaughlin (O'Reilly & Associates; ISBN: 0596000162).

An XML document can be broken into two basic pieces: the header and the content.

Let's take a closer look at each one of them now.

The header gives an XML parser and XML applications information about how to handle the document. One of the most important layers of an XML-aware application is the parser. This component handles the extremely important task of taking a raw XML document as input and making sense of the document. What results from an XML document being parsed is typically a data structure, in our case a Java-based one, that can easily be manipulated and handled by other XML tools or Java APIs. The header teaches the parser the structure it needs to read the XML document.

The header is composed of processing instructions (PIs).

<?xml version="1.0"?>
This is an XML instruction to the parser, specifying the version of XML being used.
<?xml-stylesheet href="XSL\JavaXML.html.xsl" type="text/xsl"?>
This and the next instruction refer to the stylesheets being used.
<?xml-stylesheet href="XSL\JavaXML.wml.xsl" type="text/xsl"
                                            media="wap"?>
The second one is an alternate stylesheet, for a specific client media.

<?cocoon-process type="xslt"?>
This is an application-specific instruction, aimed here at the Cocoon publishing framework.
<!DOCTYPE JavaXML:Book SYSTEM "DTD\JavaXML.dtd">
This declaration specifies a DTD for the XML document. Then the next is a comment.
<!-- Java and XML -->
A DTD is a document type definition. It establishes a set of constraints for the document. For example it might define that for the "wood" attribute, only "maple", "oak", "pine", and "mahogany" are acceptable values. In this case the file JavaXML.dtd is the DTD for the document.
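
To make the "wood" constraint concrete, here is a minimal sketch of DTD validation. It rests on several assumptions that are not part of these notes: the DTD is a made-up one, written inline as an internal subset instead of living in JavaXML.dtd; the class name DtdEnumerationDemo is hypothetical; and the parser is the one bundled with a current JDK, obtained through javax.xml.parsers, not the Xerces 1.3.1 build used later on.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class DtdEnumerationDemo {

    // Hypothetical DTD (an assumption for illustration): a <chair> element
    // whose "wood" attribute accepts only four values.
    static final String DTD =
        "<!DOCTYPE chair [\n"
        + "  <!ELEMENT chair EMPTY>\n"
        + "  <!ATTLIST chair wood (maple|oak|pine|mahogany) #REQUIRED>\n"
        + "]>\n";

    /** Parses with validation turned on and returns the validity errors reported. */
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);   // check the document against its DTD
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) {
                        // Validity errors are recoverable, so we just collect them.
                        errors.add(e.getMessage());
                    }
                });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate(DTD + "<chair wood=\"oak\"/>"));
        System.out.println(validate(DTD + "<chair wood=\"plastic\"/>"));
    }
}
```

The first document should come back with an empty list, while the second should produce an error message about the "wood" attribute; the exact wording of that message depends on the parser.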

What follows the header is the content.

The content consists of all the data elements in the document.

The root element is <JavaXML:Book>.

There may be only one root element in a document.

In other words, the root element must enclose all other elements in the document.

The root element also declares a namespace (through its xmlns:JavaXML attribute), which is used in referring to the other elements.

A namespace is a mapping between an element prefix and a URI. This mapping is used for handling name collisions. For example, consider an XML document that contains a <price> tag for a <chair> and a <price> tag for a <cushion>: a reader that meets a bare <price> tag cannot tell which one is meant.

Namespaces allow elements from various groupings to be used, yet remain identified as a part of their specific grouping. In the case above, each grouping is given its own namespace and prefix, which results in tags such as <chair:price> and <cushion:price> that a parser can tell apart.

The xmlns attribute is what introduces an XML namespace.
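
The chair and cushion case can be sketched in Java as follows. This is only an illustration under stated assumptions: the document, the two prefixes, and the urn:example URIs are made up, the class name NamespaceDemo is hypothetical, and the parser is the JDK's bundled one reached through javax.xml.parsers instead of the Xerces distribution used in section 3.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class NamespaceDemo {

    // Hypothetical document: the prefixes and the urn:example URIs are made up.
    static final String XML =
        "<room xmlns:chair='urn:example:chair' xmlns:cushion='urn:example:cushion'>"
        + "<chair:price>120</chair:price>"
        + "<cushion:price>15</cushion:price>"
        + "</room>";

    /** Lists every element as {namespace URI}localName, in document order. */
    static List<String> qualifiedNames(String xml) {
        final List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);   // namespace processing is off by default here
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        names.add("{" + uri + "}" + localName);
                    }
                });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : qualifiedNames(XML)) {
            System.out.println(name);
        }
    }
}
```

Both price elements have the same local name, yet each is reported with a different namespace URI, which is what lets an application keep them apart.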

Let's take a look at XML data elements now.

Elements are represented by arbitrary names and must be enclosed in angle brackets. Every opened element must in turn be closed. There are no exceptions to this rule, as there are in other markup languages, such as HTML. (As you can see, XML does have a specific syntax. Documents need to be well-formed. When they also satisfy DTD or schema constraints they are valid.)
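
One way to see the "every opened element must be closed" rule in action is to hand a parser a few small documents and watch which ones it rejects. The sketch below assumes the JDK's bundled parser (via javax.xml.parsers) and uses a hypothetical class name, WellFormedCheck; it checks well-formedness only, not validity.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    /** Returns true if the parser reads the whole document without a fatal error. */
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler());   // its fatalError() rethrows the exception
            return true;
        } catch (SAXException e) {
            return false;                // a well-formedness rule was broken
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Parser could not be set up or read input", e);
        }
    }

    public static void main(String[] args) {
        // Properly nested and closed.
        System.out.println(isWellFormed("<list><item>one</item></list>"));
        // <item> is never closed.
        System.out.println(isWellFormed("<list><item>one</list>"));
        // HTML-style <br> without a closing tag.
        System.out.println(isWellFormed("<list><br></list>"));
        // The empty-element form is fine.
        System.out.println(isWellFormed("<list><br/></list>"));
    }
}
```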

Elements can also be empty, in which case the opening and closing tags may be collapsed into a single tag:

<SectionBreak/>
This form spares us an unnecessary empty pair of tags, <SectionBreak></SectionBreak>, which would otherwise be required.
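
As far as a SAX parser is concerned, the short form and the paired form of an empty element are the same thing. The following sketch records the element callbacks for both spellings so they can be compared; the class name EmptyElementDemo is hypothetical, and the JDK's bundled parser is assumed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EmptyElementDemo {

    /** Records the start/end element callbacks the parser fires for a document. */
    static List<String> events(String xml) {
        final List<String> events = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        events.add("start " + qName);
                    }

                    @Override
                    public void endElement(String uri, String localName, String qName) {
                        events.add("end " + qName);
                    }
                });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(events("<SectionBreak/>"));
        System.out.println(events("<SectionBreak></SectionBreak>"));
    }
}
```

Both calls should report one start callback followed by one end callback for SectionBreak.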

In addition to text contained within an element's tags, an element can also have attributes. Attributes are included with their respective values within the element's opening declaration. Regarding XML "constants", XML defines five entities: &lt; (for <), &gt; (for >), &amp; (for &), &quot; (for ") and &apos; (for '), and you should already be familiar with most of them from HTML.
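
The five predefined entities are resolved by the parser itself, so an application sees the plain characters. A minimal sketch, assuming the JDK's bundled parser and a hypothetical class name, EntityDemo:

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EntityDemo {

    /** Returns all character data in the document, as delivered by the parser. */
    static String textOf(String xml) {
        final StringBuilder text = new StringBuilder();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void characters(char[] ch, int start, int length) {
                        // May be called several times for one run of text,
                        // so we append instead of overwriting.
                        text.append(ch, start, length);
                    }
                });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // The five entity references become the five literal characters.
        System.out.println(textOf("<t>&lt; &gt; &amp; &quot; &apos;</t>"));
    }
}
```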

With this mini-primer on creating XML documents we are ready to begin writing our first Java code. We now take a look at the Java SAX API. SAX means Simple API for XML.

3. Parsing XML

One of the first things you will have to do when dealing with XML programmatically is to take an XML document and parse it. As the document is parsed, the data in the document becomes available to the application using the parser, and suddenly we are within an XML-aware application!

First we must obtain an XML parser. Writing a parser for XML is a serious task, and there are several efforts going on to provide excellent XML parsers. After selecting a parser we must ensure that a copy of the SAX classes is on hand. Finally we need an XML document to parse.

We will use the XML document that we discussed in the previous section.

In the spirit of the open source community, all of the examples in this book will use the Apache Xerces parser. Freely available in binary and source form, it can be downloaded from

http://xml.apache.org
Once you have selected and downloaded an XML parser, make sure that your Java environment has the XML parser classes in its class path. This will be a basic requirement for all further examples.
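
A quick way to confirm that the class path is set up is to ask the JVM whether it can find the classes in question. The sketch below is an assumption-laden helper, not part of the Xerces distribution: the class name ClasspathCheck is hypothetical, and the two class names it probes are the SAX interface and the Xerces parser class used later in these notes.

```java
class ClasspathCheck {

    /** Returns true if the named class can be located on the current class path. */
    static boolean isAvailable(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.xml.sax.XMLReader",                // SAX interface
            "org.apache.xerces.parsers.SAXParser"   // Xerces parser class
        };
        for (String name : names) {
            System.out.println(name + ": "
                + (isAvailable(name) ? "found" : "NOT found"));
        }
    }
}
```

If the Xerces class is reported as not found, the parser classes are not on the class path and the examples that follow will fail when they try to create the parser.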

Download the Xerces parser from

http://xml.apache.org/dist/xerces-j/
I will be using the following distribution here:
Xerces-J-src.1.3.1.tar.gz           16-Mar-2001 15:11   879k  Latest sources
I placed a copy of it in /u/dgerman/public for download.

frilled.cs.indiana.edu%pwd
/nfs/moose/home/user3/dgerman/public
frilled.cs.indiana.edu%ls -ld *
-rw-------   1 dgerman    900263 Apr  1 14:59 Xerces-J-src.1.3.1.tar.gz
frilled.cs.indiana.edu%gunzip *
frilled.cs.indiana.edu%ls -l
total 6320
-rw-------   1 dgerman   6461440 Apr  1 14:59 Xerces-J-src.1.3.1.tar
frilled.cs.indiana.edu%tar xf *
frilled.cs.indiana.edu%ls -ld *
-rw-------   1 dgerman   6461440 Apr  1 14:59 Xerces-J-src.1.3.1.tar
drwx------   6 dgerman       512 Mar 16 16:40 xerces-1_3_1
frilled.cs.indiana.edu%
Then remove the archive.

To build Xerces one needs to run the following command from the top of the Xerces Java tree:

make jars
This takes a few minutes (let's say, about 10).

Take this program, call it SAXParserDemo.java and place it in some directory where you will be working out all examples. Let's say, for me, that location will be (inside an XML directory):

/u/dgerman/XML/chapterThree
Here's the program:
import java.io.IOException;

import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.ErrorHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Import your vendor's XMLReader implementation here
//import org.apache.xerces.parsers.SAXParser;


/**
 * <b><code>SAXParserDemo</code></b> will take an XML file and parse it
 *   using SAX, displaying the callbacks in the parsing lifecycle.
 *
 * @author
 *   <a href="mailto:brettmclaughlin@earthlink.net">Brett McLaughlin</a>
 * @version 1.0
 */
public class SAXParserDemo {

    /**
     * <p>
     *   This parses the file, using registered SAX handlers, and outputs
     *     the events in the parsing process cycle.
     * </p>
     *
     * @param uri <code>String</code> URI of file to parse.
     */
    public void performDemo(String uri) {
        System.out.println("Parsing XML File: " + uri + "\n\n");

        // Get instances of our handlers
        ContentHandler contentHandler = new MyContentHandler();
        ErrorHandler errorHandler = new MyErrorHandler();

        try {
            // Instantiate a parser
            XMLReader parser =
                XMLReaderFactory.createXMLReader(
                    "org.apache.xerces.parsers.SAXParser");

            // Register the content handler
            parser.setContentHandler(contentHandler);

            // Register the error handler
            parser.setErrorHandler(errorHandler);

            // Parse the document
            parser.parse(uri);

        } catch (IOException e) {
            System.out.println("Error reading URI: " + e.getMessage());
        } catch (SAXException e) {
            System.out.println("Error in parsing: " + e.getMessage());
        }
    }

    /**
     * <p>
     *   This provides a command-line entry point for this demo.
     * </p>
     */
    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Usage: java SAXParserDemo [XML URI]");
            System.exit(0);
        }
        String uri = args[0];
        SAXParserDemo parserDemo = new SAXParserDemo();
        parserDemo.performDemo(uri);
    }
}

/**
 * <b><code>MyContentHandler</code></b> implements the SAX
 *   <code>ContentHandler</code> interface and defines callback
 *   behavior for the SAX callbacks associated with an XML
 *   document's content.
 */
class MyContentHandler implements ContentHandler {

    /** Hold onto the locator for location information */
    private Locator locator;

    /**
     * <p>
     * Provide reference to <code>Locator</code> which provides
     * information about where in a document callbacks occur.
     * </p>
     *
     * @param locator <code>Locator</code> object tied to callback
     * process
     */
    public void setDocumentLocator(Locator locator) {
        System.out.println(" * setDocumentLocator() called");

        // We save this for later use if desired.
        this.locator = locator;
    }

    /**
     * <p>
     * This indicates the start of a Document parse—this precedes
     * all callbacks in all SAX Handlers with the sole exception
     * of <code>{@link #setDocumentLocator}</code>.
     * </p>
     *
     * @throws <code>SAXException</code> when things go wrong
     */
    public void startDocument() throws SAXException {
        System.out.println("Parsing begins...");
    }

    /**
     * <p>
     *   This indicates the end of a Document parse—this occurs after
     *     all callbacks in all SAX Handlers.
     * </p>
     *
     * @throws <code>SAXException</code> when things go wrong
     */
    public void endDocument() throws SAXException {
        System.out.println("...Parsing ends.");
    }

    /**
     * <p>
     *   This indicates that a processing instruction (other than
     *     the XML declaration) has been encountered.
     * </p>
     *
     * @param target <code>String</code> target of PI
     * @param data <code>String</code> containing all data sent to the PI.
     *               This typically looks like one or more attribute value
     *               pairs.
     * @throws <code>SAXException</code> when things go wrong
     */
    public void processingInstruction(String target, String data)
        throws SAXException {

        System.out.println("PI: Target:" + target + " and Data:" + data);
    }

    /**
     * <p>
     *   This indicates the beginning of an XML Namespace prefix
     *     mapping. Although this typically occurs within the root element
     *     of an XML document, it can occur at any point within the
     *     document. Note that a prefix mapping on an element triggers
     *     this callback <i>before</i> the callback for the actual element
     *     itself (<code>{@link #startElement}</code>) occurs.
     * </p>
     *
     * @param prefix <code>String</code> prefix used for the namespace
     *                being reported
     * @param uri <code>String</code> URI for the namespace
     *               being reported
     * @throws <code>SAXException</code> when things go wrong
     */
    public void startPrefixMapping(String prefix, String uri) {
        System.out.println("Mapping starts for prefix " + prefix +
            " mapped to URI " + uri);
    }

    /**
     * <p>
     *   This indicates the end of a prefix mapping, when the namespace
     *     reported in a <code>{@link #startPrefixMapping}</code> callback
     *     is no longer available.
     * </p>
     *
     * @param prefix <code>String</code> of namespace being reported
     * @throws <code>SAXException</code> when things go wrong
     */
    public void endPrefixMapping(String prefix) {
        System.out.println("Mapping ends for prefix " + prefix);
    }

    /**
     * <p>
     *   This reports the occurrence of an actual element. It includes
     *     the element's attributes, with the exception of XML vocabulary
     *     specific attributes, such as
     *     <code>xmlns:[namespace prefix]</code> and
     *     <code>xsi:schemaLocation</code>.
     * </p>
     *
     * @param namespaceURI <code>String</code> namespace URI this element
     *               is associated with, or an empty <code>String</code>
     * @param localName <code>String</code> name of element (with no
     *               namespace prefix, if one is present)
     * @param rawName <code>String</code> XML 1.0 version of element name:
     *                [namespace prefix]:[localName]
     * @param atts <code>Attributes</code> list for this element
     * @throws <code>SAXException</code> when things go wrong
     */
    public void startElement(String namespaceURI, String localName,
                                           String rawName, Attributes atts)
        throws SAXException {

        System.out.print("startElement: " + localName);
        if (!namespaceURI.equals("")) {
            System.out.println(" in namespace " + namespaceURI +
                " (" + rawName + ")");
        } else {
            System.out.println(" has no associated namespace");
        }

        for (int i=0; i<atts.getLength(); i++) {
            System.out.println(" Attribute: " + atts.getLocalName(i) +
                "=" + atts.getValue(i));
        }
    }

    /**
     * <p>
     *   Indicates the end of an element
     *     (<code></[element name]></code>) is reached. Note that
     *     the parser does not distinguish between empty
     *     elements and non-empty elements, so this occurs uniformly.
     * </p>
     *
     * @param namespaceURI <code>String</code> URI of namespace this
     *                element is associated with
     * @param localName <code>String</code> name of element without prefix
     * @param rawName <code>String</code> name of element in XML 1.0 form
     * @throws <code>SAXException</code> when things go wrong
     */
    public void endElement(String namespaceURI, String localName,
                                          String rawName)
        throws SAXException {

        System.out.println("endElement: " + localName + "\n");
    }

    /**
     * <p>
     *   This reports character data (within an element).
     * </p>
     *
     * @param ch <code>char[]</code> character array with character data
     * @param start <code>int</code> index in array where data starts.
     * @param length <code>int</code> number of characters to read from the array.
     * @throws <code>SAXException</code> when things go wrong
     */
    public void characters(char[] ch, int start, int length)
        throws SAXException {

        String s = new String(ch, start, length);
        System.out.println("characters: " + s);
    }

    /**
     * <p>
     * This reports whitespace that can be ignored in the
     * originating document. This is typically invoked only when
     * validation is occurring in the parsing process.
     * </p>
     *
     * @param ch <code>char[]</code> character array with character data
     * @param start <code>int</code> index in array where data starts.
     * @param length <code>int</code> number of characters to read from the array.
     * @throws <code>SAXException</code> when things go wrong
     */
    public void ignorableWhitespace(char[] ch, int start, int length)
        throws SAXException {

        String s = new String(ch, start, length);
        System.out.println("ignorableWhitespace: [" + s + "]");
    }

    /**
     * <p>
     *   This reports an entity that is skipped by the parser. This
     *     should only occur for non-validating parsers, and then is still
     *     implementation-dependent behavior.
     * </p>
     *
     * @param name <code>String</code> name of entity being skipped
     * @throws <code>SAXException</code> when things go wrong
     */
    public void skippedEntity(String name) throws SAXException {
        System.out.println("Skipping entity " + name);
    }
}

/**
 * <b><code>MyErrorHandler</code></b> implements the SAX
 *   <code>ErrorHandler</code> interface and defines callback
 *   behavior for the SAX callbacks associated with an XML
 *   document's errors.
 */
class MyErrorHandler implements ErrorHandler {

    /**
     * <p>
     *   This will report a warning that has occurred; this indicates
     *     that while no XML rules were broken, something appears
     *     to be incorrect or missing.
     * </p>
     *
     * @param exception <code>SAXParseException</code> that occurred.
     * @throws <code>SAXException</code> when things go wrong
     */
    public void warning(SAXParseException exception)
        throws SAXException {

        System.out.println("**Parsing Warning**\n" +
            " Line: " +
            exception.getLineNumber() + "\n" +
            " URI: " +
            exception.getSystemId() + "\n" +
            " Message: " +
            exception.getMessage());
        throw new SAXException("Warning encountered");
    }

    /**
     * <p>
     *   This will report an error that has occurred; this indicates
     *     that a rule was broken, typically in validation, but that
     *     parsing can reasonably continue.
     * </p>
     *
     * @param exception <code>SAXParseException</code> that occurred.
     * @throws <code>SAXException</code> when things go wrong
     */
    public void error(SAXParseException exception)
        throws SAXException {

        System.out.println("**Parsing Error**\n" +
            " Line: " +
            exception.getLineNumber() + "\n" +
            " URI: " +
            exception.getSystemId() + "\n" +
            " Message: " +
            exception.getMessage());
        throw new SAXException("Error encountered");
    }

    /**
     * <p>
     *   This will report a fatal error that has occurred; this indicates
     *     that a rule has been broken that makes continued parsing either
     *     impossible or an almost certain waste of time.
     * </p>
     *
     * @param exception <code>SAXParseException</code> that occurred.
     * @throws <code>SAXException</code> when things go wrong
     */
    public void fatalError(SAXParseException exception)
        throws SAXException {

        System.out.println("**Parsing Fatal Error**\n" +
            " Line: " +
            exception.getLineNumber() + "\n" +
            " URI: " +
            exception.getSystemId() + "\n" +
            " Message: " +
            exception.getMessage());
        throw new SAXException("Fatal Error encountered");
    }
}
You also need the XML document from the previous section, saved as contents.xml. To be able to run the example below you must first modify the first part of the file as follows:
<?xml version="1.0"?>

<!-- We don't need these yet
    <?xml-stylesheet href="XSL\JavaXML.html.xsl" type="text/xsl"?>
    <?xml-stylesheet href="XSL\JavaXML.wml.xsl" type="text/xsl"
                                media="wap"?>
    <?cocoon-process type="xslt"?>
    <!DOCTYPE JavaXML:Book SYSTEM "DTD\JavaXML.dtd">
-->

<!-- Java and XML -->
<JavaXML:Book xmlns:JavaXML="http://www.oreilly.com/catalog/javaxml/">
    <JavaXML:Title>Java and XML</JavaXML:Title>

The next-to-last line in the file, the one with the copyright entity reference, also needs to be commented out:
        </JavaXML:Chapter>
    </JavaXML:Contents>

    <!-- Leave out until DTD section
        <JavaXML:Copyright>&OReillyCopyright;</JavaXML:Copyright>
    -->
</JavaXML:Book>
Once you have the files, compile the program:

frilled.cs.indiana.edu%pwd
/nfs/moose/home/user3/dgerman/XML/three
frilled.cs.indiana.edu%emacs contents.xml
frilled.cs.indiana.edu%emacs SAXParserDemo.java
frilled.cs.indiana.edu%javac SAXParserDemo.java
frilled.cs.indiana.edu%ls -l 
total 26
-rw-------   1 dgerman      2703 Apr  1 16:07 MyContentHandler.class
-rw-------   1 dgerman      1483 Apr  1 16:07 MyErrorHandler.class
-rw-------   1 dgerman      1525 Apr  1 16:07 SAXParserDemo.class
-rw-------   1 dgerman     14851 Apr  1 16:07 SAXParserDemo.java
-rw-------   1 dgerman      3086 Apr  1 16:06 contents.xml
frilled.cs.indiana.edu%
Note that you must have the SAX classes in your CLASSPATH.

For me that means the path to the following directory:

/u/dgerman/public/xerces-1_3_1/class
Running the program makes it report the SAX callbacks triggered by the contents of the prepared XML file:

frilled.cs.indiana.edu%java SAXParserDemo
Usage: java SAXParserDemo [XML URI]
frilled.cs.indiana.edu%java SAXParserDemo contents.xml
Parsing XML File: contents.xml


 * setDocumentLocator() called
Parsing begins...
Mapping starts for prefix JavaXML mapped to URI http://www.oreilly.com/catalog/javaxml/
startElement: Book in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Book)
characters: 
           
startElement: Title in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Title)
characters: Java and XML
endElement: Title

characters: 

           
startElement: Contents in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Contents)
characters: 
               
startElement: Chapter in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Chapter)
 Attribute: focus=XML
characters: 
                   
startElement: Heading in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Heading)
characters: Introduction
endElement: Heading

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=7
characters: What Is It?
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=3
characters: How Do I Use It?
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=4
characters: Why Should I Use It?
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=0
characters: What?s Next?
endElement: Topic

characters: 
               
endElement: Chapter

characters: 

               
startElement: Chapter in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Chapter)
 Attribute: focus=XML
characters: 
                   
startElement: Heading in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Heading)
characters: Creating XML
endElement: Heading

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=0
characters: An XML Document
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=2
characters: The Header
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=6
characters: The Content
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=1
characters: What?s Next?
endElement: Topic

characters: 
               
endElement: Chapter

characters: 

               
startElement: Chapter in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Chapter)
 Attribute: focus=Java
characters: 
                   
startElement: Heading in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Heading)
characters: Parsing XML
endElement: Heading

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=3
characters: Getting Prepared
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=3
characters: SAX Readers
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=9
characters: Content Handlers
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=4
characters: Error Handlers
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=0
characters: 
                       A Better Way to Load a Parser
                   
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=4
characters: "Gotcha!"
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=0
characters: What?s Next?
endElement: Topic

characters: 
               
endElement: Chapter

characters: 

               
startElement: SectionBreak in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:SectionBreak)
endElement: SectionBreak

characters: 

               
startElement: Chapter in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Chapter)
 Attribute: focus=Java
characters: 
                   
startElement: Heading in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Heading)
characters: Web Publishing Frameworks
endElement: Heading

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=4
characters: Selecting a Framework
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=4
characters: Installation
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=3
characters: 
                       Using a Publishing Framework
                   
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=2
characters: XSP
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=3
characters: Cocoon 2.0 and Beyond
endElement: Topic

characters: 
                   
startElement: Topic in namespace http://www.oreilly.com/catalog/javaxml/ (JavaXML:Topic)
 Attribute: subSections=0
characters: What?s Next?
endElement: Topic

characters: 
               
endElement: Chapter

characters: 
           
endElement: Contents

characters: 


characters: 

       
endElement: Book

Mapping ends for prefix JavaXML
...Parsing ends.
frilled.cs.indiana.edu%
This should give us a good understanding of the SAX interfaces and how they interact with an XML parser and the parsing process (with regard to a non-validated document).
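
The output above shows that character data arrives in pieces, with runs of whitespace between elements reported just like real text. A handler that wants the text of one kind of element therefore has to buffer and trim it. Here is a minimal sketch of that idea. It is not the book's code: the class name TopicCollector and the SAMPLE document are made up for illustration, it extends the SAX helper class DefaultHandler so that only three callbacks need to be written, and it uses the JDK's bundled parser instead of Xerces 1.3.1.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TopicCollector extends DefaultHandler {

    // Hypothetical excerpt in the style of contents.xml.
    static final String SAMPLE =
        "<JavaXML:Chapter xmlns:JavaXML='http://www.oreilly.com/catalog/javaxml/'"
        + " focus='Java'>\n"
        + "  <JavaXML:Topic subSections='3'>SAX Readers</JavaXML:Topic>\n"
        + "  <JavaXML:Topic subSections='0'>\n"
        + "    A Better Way to Load a Parser\n"
        + "  </JavaXML:Topic>\n"
        + "</JavaXML:Chapter>";

    private final List<String> topics = new ArrayList<>();
    private StringBuilder buffer;   // non-null only while inside a Topic element

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        if ("Topic".equals(localName)) {
            buffer = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text between other elements (mostly indentation) is dropped.
        if (buffer != null) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Topic".equals(localName)) {
            topics.add(buffer.toString().trim());
            buffer = null;
        }
    }

    /** Parses the document and returns the trimmed text of each Topic element. */
    static List<String> collect(String xml) {
        TopicCollector handler = new TopicCollector();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);   // so localName is filled in
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
        return handler.topics;
    }

    public static void main(String[] args) {
        for (String topic : collect(SAMPLE)) {
            System.out.println(topic);
        }
    }
}
```

Extending DefaultHandler is a convenience only; the same three methods could be written in a class that implements ContentHandler directly, as MyContentHandler does.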

These interfaces are key to the rest of our discussions and Java code, as we will expand on our knowledge of SAX and add additional SAX classes to our example program.

Next we look at how an XML document can be validated, and cover an XML document's DTD and schema. These will teach you how to constrain an XML document, and then in the section after that we will look at implementing validation in our example parsing code.

4. Constraining XML


Last updated on Apr 1, 2001, by Adrian German for A348/A548