
Oracle9i Application Developer's Guide - XML
Release 1 (9.0.1)

Part Number A88894-01

20
Using XML Parser for Java

This chapter contains the following sections:

XML Parser for Java: Features

Oracle provides a set of XML parsers for Java, C, C++, and PL/SQL. Each of these parsers is a stand-alone XML component that parses an XML document (or a standalone DTD or XML Schema) so that it can be processed by an application. Library and command-line versions are provided supporting the following standards and features:

Additional features include:

The parsers are available on all Oracle platforms.

Figure 20-1 shows an XML document as input to XML Parser for Java. The DOM or SAX parser interface parses the XML document. The parsed XML is then passed to the application for further processing.

If a stylesheet is used, the DOM or SAX interface also parses and outputs the XSL commands. These are sent together with the parsed XML to the XSLT Processor where the selected stylesheet is applied and the transformed (new) XML document is then output.

See Also:

Appendix C, "XDK for Java: Specifications and Cheat Sheets"

Figure 20-1 Oracle XML Parser



DOM and SAX APIs are explained in "DOM and SAX APIs".

The classes and methods used to parse an XML document are illustrated in the following diagrams:

The classes and methods used by the XSLT Processor to apply stylesheets are illustrated in the following diagram:

XSL Transformation (XSLT) Processor

The V2 versions of the XML Parsers include an integrated XSL Transformation (XSLT) Processor for transforming XML data using XSL stylesheets. Using the XSLT processor, you can transform XML documents from XML to XML, XML to HTML, or to virtually any other text-based format. See Figure 20-1.

The processor supports the following standards and features:

Namespace Support

The Java, C, and C++ XML parsers also support XML Namespaces. Namespaces are a mechanism to resolve or avoid name collisions between element types (tags) or attributes in XML documents.

This mechanism provides "universal" namespace element types and attribute names whose scope extends beyond the document that contains them.

Such tags are qualified by uniform resource identifiers (URIs), such as:

<oracle:EMP xmlns:oracle="http://www.oracle.com/xml"/>

For example, namespaces can be used to identify an Oracle <EMP> data element as distinct from another company's definition of an <EMP> data element.

This enables an application to more easily identify elements and attributes it is designed to process. The Java, C, and C++ parsers support namespaces by being able to recognize and parse universal element types and attribute names, as well as unqualified "local" element types and attribute names.
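The distinction between a qualified name and its namespace URI can be seen with a short sketch. It uses the standard JAXP DOM API that ships with the JDK rather than the Oracle parser classes, so the class name NamespaceSketch and the helper describeRoot() are illustrative assumptions, not part of the XDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of the Oracle XDK.
class NamespaceSketch {

    // Parses the XML text with namespace processing enabled and returns
    // "{namespace-uri}local-name" for the document element.
    static String describeRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // JAXP factories are not namespace-aware by default
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<oracle:EMP xmlns:oracle=\"http://www.oracle.com/xml\"/>";
        System.out.println(describeRoot(xml));
    }
}
```

Two EMP elements from different vendors would share the local name EMP but differ in the namespace URI, which is what lets an application tell them apart.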

See Also:

Chapter 21, "Using XML Schema Processor for Java" 

Oracle XML Parsers Support Four Validation Modes

The Java, C, and C++ parsers can parse XML in validating or non-validating modes.

Validation involves checking whether or not the attribute names and element tags are legal, whether nested elements belong where they are, and so on.
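As a rough illustration of the difference between the two modes, the sketch below parses the same text with and without DTD validation and collects the validation errors that are reported. It relies on the standard JAXP API in the JDK, not on the Oracle parser classes; ValidationSketch and its parse() method are names made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; not part of the Oracle XDK.
class ValidationSketch {

    // Parses the XML text and returns the messages of all recoverable
    // (validity) errors. Well-formedness errors are rethrown.
    static List<String> parse(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating); // true = validate against the DTD
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<String>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // Warnings are ignored in this sketch.
            }
            public void error(SAXParseException e) {
                errors.add(e.getMessage()); // validity violation
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // document is not well-formed
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }
}
```

A document whose DTD requires (Name, Dept, Title) but which omits Title parses cleanly in non-validating mode and yields at least one error in validating mode.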

See Also:

Oracle9i XML Reference 

Parsers Access XML Document's Content and Structure

XML documents are made up of storage units called entities, which contain either parsed or unparsed data. Parsed data is made up of characters, some of which form character data, and some of which form markup.

Markup encodes a description of the document's storage layout and logical structure. XML provides a mechanism to impose constraints on the storage layout and logical structure.

A software module called an XML processor is used to read XML documents and provide access to their content and structure. It is assumed that an XML processor is doing its work on behalf of another module, called the application.

This parsing process is illustrated in Figure 20-2.

Figure 20-2 XML Parsing Process



DOM and SAX APIs

XML APIs generally fall into the following two categories:

See Figure 20-3. Consider the following simple XML document:

<?xml version="1.0"?>
  <EMPLIST>
    <EMP>
     <ENAME>MARY</ENAME>
    </EMP>
    <EMP>
     <ENAME>SCOTT</ENAME>
    </EMP>
  </EMPLIST>

DOM: Tree-Based API

A tree-based API (such as Document Object Model, DOM) builds an in-memory tree representation of the XML document. It provides classes and methods for an application to navigate and process the tree.

In general, the DOM interface is most useful for structural manipulations of the XML tree, such as reordering elements, adding or deleting elements and attributes, renaming elements, and so on. For example, for the XML document above, the DOM creates an in-memory tree structure as shown in Figure 20-3.
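The following sketch shows what navigating and then modifying such a tree might look like for the EMPLIST document. It is written against the standard W3C DOM interfaces through JAXP, so it runs on a plain JDK; the class DomTreeSketch and its methods are assumptions made for illustration and do not come from the XDK samples.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of the Oracle XDK.
class DomTreeSketch {

    // Builds the in-memory tree for the given XML text.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Navigation: collects the text of every ENAME element in document order.
    static List<String> employeeNames(Document doc) {
        List<String> names = new ArrayList<String>();
        NodeList nodes = doc.getElementsByTagName("ENAME");
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    // Structural manipulation: appends <EMP><ENAME>name</ENAME></EMP>
    // to the document element.
    static void addEmployee(Document doc, String name) {
        Element emp = doc.createElement("EMP");
        Element ename = doc.createElement("ENAME");
        ename.appendChild(doc.createTextNode(name));
        emp.appendChild(ename);
        doc.getDocumentElement().appendChild(emp);
    }
}
```

Because the whole tree is held in memory, the new EMP element is immediately visible to later calls to employeeNames().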

SAX: Event-Based API

An event-based API (such as SAX) uses calls to report parsing events to the application. The application deals with these events through customized event handlers. Events include the start and end of elements and characters.

Unlike tree-based APIs, event-based APIs usually do not build in-memory tree representations of the XML documents. Therefore, in general, SAX is useful for applications that do not need to manipulate the XML tree, such as search operations, among others.

The above XML document becomes a series of linear events as shown in Figure 20-3.
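To make the idea of a linear event series concrete, here is a minimal sketch that records each callback as a line of text. It uses the JDK's JAXP SAX parser and the SAX2 DefaultHandler class instead of the Oracle SAXParser; the class SaxEventsSketch and the wording of the logged lines are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; not part of the Oracle XDK.
class SaxEventsSketch {

    // Parses the XML text and returns one log line per SAX event.
    // Whitespace-only character data is skipped to keep the log short.
    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() {
                log.add("start document");
            }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                log.add("start element: " + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("characters: " + text);
                }
            }
            @Override public void endElement(String uri, String localName, String qName) {
                log.add("end element: " + qName);
            }
            @Override public void endDocument() {
                log.add("end document");
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }
}
```

No tree is built here: once a callback returns, the parser keeps nothing about that element unless the handler stores it.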

Figure 20-3 Comparing DOM (Tree-Based) and SAX (Event-Based) APIs



Guidelines for Using DOM and SAX APIs

Here are some guidelines for using the DOM and SAX APIs:

DOM:

SAX:

Use the SAX API when your data is mostly streaming data.

XML Parser and Data Compression

Oracle XML Parser can also compress XML documents. Using the compression feature, an in-memory DOM tree or the SAX events generated from an XML document can be compressed to generate a binary compressed output.

The compressed streams generated from DOM and SAX are compatible; that is, the compressed stream generated from SAX can be used to generate the DOM tree, and vice versa. The compression is based on tokenizing the XML tags, on the assumption that XML files typically contain repeated tags, so replacing each tag with a token reduces the size of the data. The degree of compression depends on the input XML document: the larger the number of tags and the smaller the text content, the better the compression.
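The general idea of tag tokenization can be sketched in a few lines. The code below is a toy illustration only: it does not reproduce Oracle's binary compressed format, and the class TagTokenSketch, the dictionary layout, and the token spellings (S, E, T) are all invented for this example. It uses the JDK's SAX parser to walk the document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Toy illustration of tag tokenizing; NOT the Oracle compressed stream format.
class TagTokenSketch {

    // Replaces every element name with a small integer taken from the
    // dictionary (filled in on first use) and returns the token stream:
    //   "S<n>"  start of the element whose name has index n
    //   "E"     end of the current element
    //   "T:..." character data
    static List<String> tokenize(String xml, final Map<String, Integer> dictionary)
            throws Exception {
        final List<String> tokens = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                Integer id = dictionary.get(qName);
                if (id == null) {
                    id = Integer.valueOf(dictionary.size());
                    dictionary.put(qName, id); // first occurrence: store the name once
                }
                tokens.add("S" + id);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                tokens.add("E");
            }
            @Override public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    tokens.add("T:" + text);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return tokens;
    }
}
```

Each tag name is stored once in the dictionary however often it repeats, which is why documents with many repeated tags and little text benefit most.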

As with XML documents in general, you can store the compressed XML data output as a CLOB (Character Large Object) in the database.

XML Serialization/Compression

An XML document is compressed into a binary stream by means of the serialization of an in-memory DOM tree. When a large XML document is parsed and a DOM tree is created in memory corresponding to it, it may be difficult to satisfy memory requirements and this could affect performance. The XML document is compressed into a byte stream and stored in an in-memory DOM tree. This can be expanded at a later time into a DOM tree without performing validation on the XML data stored in the compressed stream.

The compressed stream can be treated as a serialized stream, but note that the information in the stream is more controlled and managed, compared to the compression implemented by Java's default serialization.

In this release, there are two kinds of XML compressed streams:

The compressed stream generated using SAX events and the one generated using DOM serialization are compatible. You can use the compressed stream generated by SAX events to create a DOM tree, and vice versa. The compression algorithm is based on tokenizing the XML tags. The assumption is that an XML file contains many repeated tags, so tokenizing the tags gives considerable compression.


Upgrading XDK for Java

Upgrading XDK for Java from a Previous Release to Oracle9i

If you already have XDK for Java installed, and are upgrading to Oracle9i, follow these steps:

  1. Make sure you have successfully upgraded JServer.

  2. Change to the ORACLE_HOME/rdbms/admin directory.

  3. Start SQL*Plus.

  4. Connect to the database instance as a user with SYSDBA privileges.

  5. Run STARTUP:

    SQL> STARTUP
    
    

    You may need to use the PFILE option to specify the location of your initialization parameter file.

  6. Run the appropriate upgrade script depending on the release from which you are upgrading.

    If you are upgrading from release 8.1.5, run xmlu815.sql:

    SQL> @xmlu815.sql
    
    

    If you are upgrading from release 8.1.6, run xmlu816.sql:

    SQL> @xmlu816.sql
    
    

    If you are upgrading from release 8.1.7, run xmlu817.sql:

    SQL> @xmlu817.sql
    
    
  7. Shut down all instances using SHUTDOWN:

    SQL> SHUTDOWN
    
    
  8. Exit SQL*Plus.

The XDK for Java component is upgraded to the new release.

Upgrading Session Namespace, CORBA, and OSE

  1. Make sure you have successfully upgraded JServer and XDK for Java.

  2. At a system prompt, change to the ORACLE_HOME/javavm/install directory.

Upgrading JSP

If the Oracle system has JSP installed, then complete the following steps:

  1. Make sure you have successfully upgraded JServer, XDK for Java, and Session Namespace, CORBA, and OSE.

  2. At a system prompt, change to the ORACLE_HOME/javavm/install directory.

Downgrading to Oracle Release 8.1

See Chapter 13 of the Oracle9i Migration manual.

Running the XML Parser for Java Samples

Table 20-1 lists the XML Parser for Java examples provided with XDK for Java software. The samples are located in the sample/ subdirectory. They illustrate how to use Oracle XML Parser for Java.

Table 20-1 XML Parser for Java Samples  
Name of Sample File    Description
DOMSample.java         A sample application using DOM APIs.
SAXSample.java         A sample application using SAX APIs.
XSLSample.java         A sample application using XSL APIs.
DOMNamespace.java      A sample application using Namespace extensions to DOM APIs.
SAXNamespace.java      A sample application using Namespace extensions to SAX APIs.

Note that because some package names are different in V2, different files were generated to show the differences between V2 and V1 of the XML Parser for Java.

To run the sample programs:

  1. Use "make" to generate .class files.

  2. Add xmlparserv2.jar and the current directory to the CLASSPATH.

  3. Run the sample program for DOM/SAX APIs as follows:

    java <classname> <sample xml file>
    
    
  4. Run the sample program for XSL APIs as follows:

    java XSLSample <sample xsl file> <sample xml file>
    
    

A few XML files, such as class.xml, empl.xml, and family.xml, are provided as test cases.

The XSL stylesheet iden.xsl can be used to achieve an identity transformation of the supplied XML files.

XML Parser for Java - XML Sample 1: class.xml

<?xml version = "1.0"?>
<!DOCTYPE course [
<!ELEMENT course (Name, Dept, Instructor, Student)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Dept (#PCDATA)>
<!ELEMENT Instructor (Name)>
<!ELEMENT Student (Name*)>
]>
<course>
<Name>Calculus</Name>
<Dept>Math</Dept>
<Instructor>
<Name>Jim Green</Name>
</Instructor>
<Student>
<Name>Jack</Name>
<Name>Mary</Name>
<Name>Paul</Name>
</Student>
</course>

XML Parser for Java - XML Example 2: Using DTD employee -- employee.xml

<?xml version="1.0"?>
<!DOCTYPE employee [
<!ELEMENT employee (Name, Dept, Title)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Dept (#PCDATA)>
<!ELEMENT Title (#PCDATA)>
]>
<employee>
<Name>John Goodman</Name>
<Dept>Manufacturing</Dept>
<Title>Supervisor</Title>
</employee>

XML Parser for Java - XML Example 3: Using DTD family.dtd -- family.xml

<?xml version="1.0" standalone="no"?>
<!DOCTYPE family SYSTEM "family.dtd">
<family lastname="Smith">
<member memberid="m1">Sarah</member>
<member memberid="m2">Bob</member>
<member memberid="m3" mom="m1" dad="m2">Joanne</member>
<member memberid="m4" mom="m1" dad="m2">Jim</member>
</family>

DTD: family.dtd

<!ELEMENT family (member*)>
<!ATTLIST family lastname CDATA #REQUIRED>
<!ELEMENT member (#PCDATA)>
<!ATTLIST member memberid ID #REQUIRED>
<!ATTLIST member dad IDREF #IMPLIED>
<!ATTLIST member mom IDREF #IMPLIED>

XML Parser for Java -- XSL Example 1: XSL (iden.xsl)

<?xml version="1.0"?> 
<!-- Identity transformation -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="*|@*|comment()|processing-instruction()|text()">
      <xsl:copy>
          <xsl:apply-templates 
select="*|@*|comment()|processing-instruction()|text()"/>
      </xsl:copy>
  </xsl:template>
     
</xsl:stylesheet>
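One way to try the iden.xsl stylesheet without the XDK is through the standard JAXP transformation API in the JDK. The sketch below is written under that assumption: it does not use the Oracle XSLProcessor class, and the class IdentityTransformSketch and its transform() method are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; uses JAXP, not the Oracle XSLProcessor.
class IdentityTransformSketch {

    // The identity stylesheet from iden.xsl, inlined as a string.
    static final String IDEN_XSL =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
      + "<xsl:template match=\"*|@*|comment()|processing-instruction()|text()\">"
      + "<xsl:copy>"
      + "<xsl:apply-templates select=\"*|@*|comment()|processing-instruction()|text()\"/>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result.
    static String transform(String xsl, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        // Leave out the XML declaration so the output is easy to compare.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(IDEN_XSL,
            "<employee><Name>John Goodman</Name><Dept>Manufacturing</Dept></employee>"));
    }
}
```

Since every node is copied and its children are processed by the same template, the elements, attributes, and text of the input reappear unchanged in the output.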

XML Parser for Java - DTD Example 1: (NSExample)

<!DOCTYPE doc [
<!ELEMENT doc (child*)>
<!ATTLIST doc xmlns:nsprefix CDATA #IMPLIED>
<!ATTLIST doc xmlns CDATA #IMPLIED>
<!ATTLIST doc nsprefix:a1 CDATA #IMPLIED>
<!ELEMENT child (#PCDATA)>
]>
<doc nsprefix:a1 = "v1" xmlns="http://www.w3c.org" 
xmlns:nsprefix="http://www.oracle.com">
<child>
This element inherits the default Namespace of doc.
</child>
</doc>

Using XML Parser for Java: DOMParser() Class

To write DOM based parser applications you can use the following classes:

Since DOMParser extends XMLParser, all methods of XMLParser are also available to DOMParser. Figure 20-4 shows the main steps you need when coding with the DOMParser() class:

The example, "XML Parser for Java Example 1: Using the Parser and DOM API (DomSample.java)", shows how to use the DOMParser() class.

Figure 20-4 XML Parser for Java: DOMParser()



XML Parser for Java Example 1: Using the Parser and DOM API (DomSample.java)

The examples represent the way we write code so it is required to present the examples with Java coding standards (like all imports expanded), with documentation headers before the methods, and so on.

// This file demonstrates a simple use of the parser and DOM API.
// The XML file given to the application is parsed.
// The elements and attributes in the document are printed.
// This demonstrates setting the parser options.
//

import java.io.*;
import java.net.*;
import org.w3c.dom.*;
import org.w3c.dom.Node;

import oracle.xml.parser.v2.*;

public class DOMSample
{
   static public void main(String[] argv)
   {
      try
      {
         if (argv.length != 1) 
         {
            // Must pass in the name of the XML file.
            System.err.println("Usage: java DOMSample filename");
            System.exit(1);
         }

         // Get an instance of the parser
         DOMParser parser = new DOMParser();

         // Generate a URL from the filename.
         URL url = createURL(argv[0]);

         // Set various parser options: validation on,
         // warnings shown, error stream set to stderr.
         parser.setErrorStream(System.err);
         parser.setValidationMode(XMLParser.DTD_VALIDATION);
         parser.showWarnings(true);

         // Parse the document.
         parser.parse(url);

         // Obtain the document.
         XMLDocument doc = parser.getDocument();

         // Print document elements
         System.out.print("The elements are: ");
         printElements(doc);

         // Print document element attributes
         System.out.println("The attributes of each element are: ");
         printElementAttributes(doc);
         parser.reset();
      }
      catch (Exception e)
      {
         System.out.println(e.toString());
      }
   }

   static void printElements(Document doc)
   {
      NodeList nl = doc.getElementsByTagName("*");
      Node n;
         
      for (int i=0; i<nl.getLength(); i++)
      {
         n = nl.item(i);
         System.out.print(n.getNodeName() + " ");
      }

      System.out.println();
   }

   static void printElementAttributes(Document doc)
   {
      NodeList nl = doc.getElementsByTagName("*");
      Element e;
      Node n;
      NamedNodeMap nnm;

      String attrname;
      String attrval;
      int i, len;

      len = nl.getLength();
      for (int j=0; j < len; j++)
      {
         e = (Element)nl.item(j);
         System.out.println(e.getTagName() + ":");
         nnm = e.getAttributes();
         if (nnm != null)
         {
            for (i=0; i<nnm.getLength(); i++)
            {
               n = nnm.item(i);
               attrname = n.getNodeName();
               attrval = n.getNodeValue();
               System.out.print(" " + attrname + " = " + attrval);
            }
         }
         System.out.println();
      }
   }

   static URL createURL(String fileName)
   {
      URL url = null;
      try 
      {
         url = new URL(fileName);
      } 
      catch (MalformedURLException ex) 
      {
         File f = new File(fileName);
         try 
         {
            String path = f.getAbsolutePath();
            String fs = System.getProperty("file.separator");
            if (fs.length() == 1)
            {
               char sep = fs.charAt(0);
               if (sep != '/')
                  path = path.replace(sep, '/');
               if (path.charAt(0) != '/')
                  path = '/' + path;
            }
            path = "file://" + path;
            url = new URL(path);
         } 
         catch (MalformedURLException e) 
         {
            System.out.println("Cannot create url for: " + fileName);
            System.exit(0);
         }
      }
      return url;
   }
}

Comments on DOMParser() Example 1

See also Figure 20-4. The following provides comments for Example 1:

  1. Declare a new DOMParser(). In Example 1, see the line:

    DOMParser parser = new DOMParser();
    
    

    This class has several properties you can use. Here the example uses:

    parser.setErrorStream(System.err);
    parser.setValidationMode(XMLParser.DTD_VALIDATION);
    parser.showWarnings(true);
    
    
  2. The XML input is a URL as declared by:

    URL url = createURL(argv[0])
    
    
  3. The XML document is input as a URL. This is parsed using parser.parse():

     parser.parse(url);
    
    
  4. Gets the document:

    XMLDocument doc = parser.getDocument();
    
    
  5. Applies other DOM methods. In this case:

    • DOM interface methods:

      • getElementsByTagName()

      • getAttributes()

      • getNodeName()

      • getNodeValue()

    • Method, createURL() to convert the string name into a URL.

  6. parser.reset() is called to clean up any data structure created during the parse process, after the DOM tree has been created. Note that this is a new method with this release.

  7. Generates the DOM tree (parsed XML) document for further processing by your application.


    Note:

    No DTD input is shown in Example 1. 
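The createURL() helper in Example 1 builds a file URL by hand. On current JDKs the same conversion can be sketched more briefly with java.io.File.toURI(). The class FileUrlSketch and the method toUrl() below are hypothetical names introduced here; they are not part of the XDK sample.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper class; not part of the Oracle XDK samples.
class FileUrlSketch {

    // Returns the argument as a URL if it already is an absolute URL
    // with a known protocol; otherwise treats it as a file name and
    // converts the absolute file path to a file: URL.
    static URL toUrl(String fileNameOrUrl) throws MalformedURLException {
        try {
            URI uri = new URI(fileNameOrUrl);
            if (uri.isAbsolute()) {
                return uri.toURL();
            }
        } catch (URISyntaxException e) {
            // Not a valid URI (for example, a path with backslashes): treat as a file name.
        } catch (MalformedURLException e) {
            // Unknown protocol (for example, a drive letter): treat as a file name.
        }
        return new File(fileNameOrUrl).toURI().toURL();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toUrl("class.xml"));
    }
}
```

File.toURI() takes care of the platform file separator and of characters that must be escaped in a URL, which the hand-written version handles only in part.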


Using XML Parser for Java: DOMNamespace() Class

Figure 20-4 illustrates the main processes involved when parsing an XML document using the DOM interface. The DOMNamespace() method is applied in the parser process at the "bubble" that states "Apply other DOM methods". The following example illustrates how to use DOMNamespace():

XML Parser for Java Example 2: Parsing a URL -- DOMNamespace.java

// This file demonstrates a simple use of the parser and Namespace
// extensions to the DOM APIs. 
// The XML file given to the application is parsed and the
// elements and attributes in the document are printed.
//

import java.io.*;
import java.net.*;

import oracle.xml.parser.v2.DOMParser;

import org.w3c.dom.*;
import org.w3c.dom.Node;

// Extensions to DOM Interfaces for Namespace support.
import oracle.xml.parser.v2.XMLElement;
import oracle.xml.parser.v2.XMLAttr;


public class DOMNamespace
{
   static public void main(String[] argv)
   {
      try
      {
         if (argv.length != 1) 
         {
            // Must pass in the name of the XML file.
            System.err.println("Usage: DOMNamespace filename");
            System.exit(1);
         }

         // Get an instance of the parser
         Class cls = Class.forName("oracle.xml.parser.v2.DOMParser");
         DOMParser parser = (DOMParser)cls.newInstance();

         // Generate a URL from the filename.
         URL url = createURL(argv[0]);

         // Parse the document.
         parser.parse(url);

         // Obtain the document.
         Document doc = parser.getDocument();

         // Print document elements
         printElements(doc);

         // Print document element attributes
         System.out.println("The attributes of each element are: ");
         printElementAttributes(doc);
      }
      catch (Exception e)
      {
         System.out.println(e.toString());
      }
   }

   static void printElements(Document doc)
   {
      NodeList nl = doc.getElementsByTagName("*");
      XMLElement nsElement;

      String qName;
      String localName;
      String nsName;
      String expName;
      
      System.out.println("The elements are: ");
      for (int i=0; i < nl.getLength(); i++)
      {
         nsElement = (XMLElement)nl.item(i);

         // Use the methods getQualifiedName(), getLocalName(), getNamespace()
         // and getExpandedName() in NSName interface to get Namespace
         // information.
         
         qName = nsElement.getQualifiedName();
         System.out.println("  ELEMENT Qualified Name:" + qName);
         
         localName = nsElement.getLocalName();
         System.out.println("  ELEMENT Local Name    :" + localName);
         
         nsName = nsElement.getNamespace();
         System.out.println("  ELEMENT Namespace     :" + nsName);
         
         expName = nsElement.getExpandedName();
         System.out.println("  ELEMENT Expanded Name :" + expName);
      }
      
      System.out.println();
   }

   static void printElementAttributes(Document doc)
   {
      NodeList nl = doc.getElementsByTagName("*");
      Element e;
      XMLAttr nsAttr;
      String attrname;
      String attrval;
      String attrqname;

      NamedNodeMap nnm;
      int i, len;
      len = nl.getLength();
      for (int j=0; j < len; j++)
      {
         e = (Element) nl.item(j);
         System.out.println(e.getTagName() + ":");
         nnm = e.getAttributes();

         if (nnm != null)
         {
            for (i=0; i < nnm.getLength(); i++)
            {
               nsAttr = (XMLAttr) nnm.item(i);

               // Use the methods getQualifiedName(), getLocalName(), 
               // getNamespace() and getExpandedName() in NSName 
               // interface to get Namespace information.

               attrname = nsAttr.getExpandedName();
               attrqname = nsAttr.getQualifiedName();
               attrval = nsAttr.getNodeValue();

               System.out.println(" " + attrqname + "(" + attrname + ")" + " = "
                                  + attrval);
            }
         }
         System.out.println();
      }
   }

   static URL createURL(String fileName)
   {
      URL url = null;
      try 
      {
         url = new URL(fileName);
      } 
      catch (MalformedURLException ex) 
      {
         File f = new File(fileName);
         try 
         {
            String path = f.getAbsolutePath();
            String fs = System.getProperty("file.separator");
            if (fs.length() == 1)
            {
               char sep = fs.charAt(0);
               if (sep != '/')
                  path = path.replace(sep, '/');
               if (path.charAt(0) != '/')
                  path = '/' + path;
            }
            path = "file://" + path;
            url = new URL(path);
         } 
         catch (MalformedURLException e) 
         {
            System.out.println("Cannot create url for: " + fileName);
            System.exit(0);
         }
      }
      return url;
   }
}


Note:

No DTD input is shown in Example 2.


Using XML Parser for Java: SAXParser() Class

Applications can register a SAX handler to receive notification of various parser events. XMLReader is the interface that an XML parser's SAX2 driver must implement. This interface allows an application to set and query features and properties in the parser, to register event handlers for document processing, and to initiate a document parse.

All SAX interfaces are assumed synchronous: the parse methods must not return until parsing is complete, and readers must wait for an event-handler callback to return before reporting the next event.

This interface replaces the (now deprecated) SAX 1.0 Parser interface. The XMLReader interface contains two important enhancements over the old Parser interface:

Table 20-2 lists the class SAXParser() methods.

Table 20-2 Class SAXParser() Methods 
Method                                                      Description
getContentHandler()                                         Returns the current content handler.
getDTDHandler()                                             Returns the current DTD handler.
getEntityResolver()                                         Returns the current entity resolver.
getErrorHandler()                                           Returns the current error handler.
getFeature(java.lang.String name)                           Looks up the value of a feature.
getProperty(java.lang.String name)                          Looks up the value of a property.
setContentHandler(ContentHandler handler)                   Allows an application to register a content event handler.
setDocumentHandler(DocumentHandler handler)                 Deprecated as of SAX 2.0; replaced by setContentHandler().
setDTDHandler(DTDHandler handler)                           Allows an application to register a DTD event handler.
setEntityResolver(EntityResolver resolver)                  Allows an application to register an entity resolver.
setErrorHandler(ErrorHandler handler)                       Allows an application to register an error event handler.
setFeature(java.lang.String name, boolean value)            Sets the state of a feature.
setProperty(java.lang.String name, java.lang.Object value)  Sets the value of a property.

Figure 20-5 shows the main steps for coding with the SAXParser() class.

  1. Declare a new SAXParser() class. Table 20-2 lists the available methods.

  2. The results of 1) are passed to .parse() along with the XML input in the form of a file, string, or URL.

  3. Parse methods return when parsing completes. Meanwhile the process waits for an event-handler callback to return before reporting the next event.

  4. The parsed XML document is available for further processing by your application.
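The steps above can be sketched with the SAX2 XMLReader interface, using setContentHandler() in place of the deprecated setDocumentHandler(). This sketch obtains its XMLReader from the JDK's JAXP factory rather than from the Oracle SAXParser class, and the class XmlReaderSketch and its countElements() method are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; uses the JDK's SAX2 driver, not the Oracle SAXParser.
class XmlReaderSketch {

    // Registers handlers, parses the XML text, and returns the number
    // of elements reported. parse() returns only after the whole
    // document has been processed.
    static int countElements(String xml) throws Exception {
        final int[] count = {0};

        // Step 1: obtain the parser (an XMLReader).
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // Step 2: register the handlers. DefaultHandler implements
        // ContentHandler, ErrorHandler, DTDHandler, and EntityResolver.
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                count[0]++; // the parser waits for this callback to return
            }
        };
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler);

        // Step 3: parse. A document that is not well-formed causes
        // DefaultHandler.fatalError() to throw a SAXParseException.
        reader.parse(new InputSource(new StringReader(xml)));

        // Step 4: results gathered by the handler are now available.
        return count[0];
    }
}
```

Because the interface is synchronous, the count is complete as soon as parse() returns, with no extra synchronization needed.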

The example, "XML Parser for Java Example 3: Using the Parser and SAX API (SAXSample.java)", illustrates how you can use SAXParser() class and several handler interfaces.

Figure 20-5 Using SAXParser() Class



XML Parser for Java Example 3: Using the Parser and SAX API (SAXSample.java)

// This file demonstrates a simple use of the parser and SAX API.
// The XML file given to the application is parsed, and some
// information about the contents of the file is printed.
//

import org.xml.sax.*;
import java.io.*;
import java.net.*;
import oracle.xml.parser.v2.*;

public class SAXSample extends HandlerBase
{
   // Store the locator
   Locator locator;

   static public void main(String[] argv)
   {
      try
      {
         if (argv.length != 1)
         {
            // Must pass in the name of the XML file.
            System.err.println("Usage: SAXSample filename");
            System.exit(1);
         }
         // (1) Create a new handler for the parser
         SAXSample sample = new SAXSample();

         // (2) Get an instance of the parser
         Parser parser = new SAXParser();

         // (3) Set Handlers in the parser
         parser.setDocumentHandler(sample);
         parser.setEntityResolver(sample);
         parser.setDTDHandler(sample);
         parser.setErrorHandler(sample);
    
         // (4) Convert file to URL and parse
         try
         {
            parser.parse(fileToURL(new File(argv[0])).toString());
         }
         catch (SAXParseException e) 
         {
            System.out.println(e.getMessage());
         }
         catch (SAXException e) 
         {
            System.out.println(e.getMessage());
         }  
      }
      catch (Exception e)
      {
         System.out.println(e.toString());
      }
   }

   static URL fileToURL(File file) 
   {
      String path = file.getAbsolutePath();
      String fSep = System.getProperty("file.separator");
      if (fSep != null && fSep.length() == 1)
         path = path.replace(fSep.charAt(0), '/');
      if (path.length() > 0 && path.charAt(0) != '/')
         path = '/' + path;
      try  
      {
         return new URL("file", null, path);
      }
      catch (java.net.MalformedURLException e) 
      {
         throw new Error("unexpected MalformedURLException");
      }
   }

   //////////////////////////////////////////////////////////////////////
   // (5) Sample implementation of DocumentHandler interface.
   //////////////////////////////////////////////////////////////////////

   public void setDocumentLocator (Locator locator)
   {
      System.out.println("SetDocumentLocator:");
      this.locator = locator;
   }

   public void startDocument() 
   {
      System.out.println("StartDocument");
   }

   public void endDocument() throws SAXException 
   {
      System.out.println("EndDocument");
   }
      
   public void startElement(String name, AttributeList atts) 
                                                  throws SAXException 
   {
      System.out.println("StartElement:"+name);
      for (int i=0;i<atts.getLength();i++)
      {
         String aname = atts.getName(i);
         String type = atts.getType(i);
         String value = atts.getValue(i);

         System.out.println("   "+aname+"("+type+")"+"="+value);
      }
      
   }

   public void endElement(String name) throws SAXException 
   {
      System.out.println("EndElement:"+name);
   }

   public void characters(char[] cbuf, int start, int len) 
   {
      System.out.print("Characters:");
      System.out.println(new String(cbuf,start,len));
   }

   public void ignorableWhitespace(char[] cbuf, int start, int len) 
   {
      System.out.println("IgnorableWhiteSpace");
   }
   
   
   public void processingInstruction(String target, String data) 
              throws SAXException 
   {
      System.out.println("ProcessingInstruction:"+target+" "+data);
   }
   
   //////////////////////////////////////////////////////////////////////
   // (6) Sample implementation of the EntityResolver interface.
   //////////////////////////////////////////////////////////////////////

   public InputSource resolveEntity (String publicId, String systemId)
                      throws SAXException
   {
      System.out.println("ResolveEntity:"+publicId+" "+systemId);
      System.out.println("Locator:"+locator.getPublicId()+" "+
                  locator.getSystemId()+
                  " "+locator.getLineNumber()+" "+locator.getColumnNumber());
      return null;
   }

   //////////////////////////////////////////////////////////////////////
   // (7) Sample implementation of the DTDHandler interface.
   //////////////////////////////////////////////////////////////////////

   public void notationDecl (String name, String publicId, String systemId)
   {
      System.out.println("NotationDecl:"+name+" "+publicId+" "+systemId);
   }

   public void unparsedEntityDecl (String name, String publicId,
         String systemId, String notationName)
   {
      System.out.println("UnparsedEntityDecl:"+name + " "+publicId+" "+
         systemId+" "+notationName);
   }

   //////////////////////////////////////////////////////////////////////
   // (8) Sample implementation of the ErrorHandler interface.
   //////////////////////////////////////////////////////////////////////

   public void warning (SAXParseException e)
         throws SAXException
   {
      System.out.println("Warning:"+e.getMessage());
   }

   public void error (SAXParseException e)
         throws SAXException
   {
      throw new SAXException(e.getMessage());
   }


   public void fatalError (SAXParseException e)
         throws SAXException
   {
      System.out.println("Fatal error");
      throw new SAXException(e.getMessage());
   }
}

Using XML Parser for Java: XSLT Processor

To use the XSLT Processor in the XML Parser for Java, use the XSLProcessor class.

Figure 20-6 shows the overall process used by the XSLProcessor class.

  1. Creating a new XSLProcessor() object begins the XSLT process.

  2. There are two inputs:

    • "Stylesheet". First, a stylesheet is built by creating a new XSLStylesheet() object. The following methods are available on it:

      • removeParam()

      • resetParam()

      • setParam()

    • "XML input". The XML input can be supplied 1 through n times for a particular stylesheet. It feeds the "Process Stylesheet" step.

    Both inputs can be one of four types:

    • input stream

    • URL

    • XML document

    • Reader

  3. The resulting stylesheet object and the XML input feed the "Process Stylesheet" step, namely:

    XSLProcessor.processXSL(xslstylesheet, xml instance)
    
    
  4. The XSLProcessor.processXSL() method processes the XML input 1 through n times, using the selected stylesheet.

  5. XSLProcessor.processXSL() outputs either an output stream or a DOM document.
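
The "build the stylesheet once, apply it 1 through n times" flow described in the steps above can be sketched with the JDK's standard JAXP API (javax.xml.transform). This is an illustration of the same pattern under stated assumptions, not the Oracle XSLProcessor API: the class name ReuseStylesheetSketch, the stylesheet text, and the input documents are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ReuseStylesheetSketch {
    // Hypothetical stylesheet: emits the text of <name> as plain text.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>Hello, <xsl:value-of select='/greeting/name'/>!</xsl:template>"
      + "</xsl:stylesheet>";

    // Build the stylesheet object once (JAXP's counterpart of the stylesheet input).
    static Templates compile(String xsl) throws Exception {
        return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xsl)));
    }

    // Apply the already-built stylesheet to one XML input and return the output text.
    static String apply(Templates stylesheet, String xml) throws Exception {
        Transformer transformer = stylesheet.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates stylesheet = compile(XSL);
        String[] inputs = {
            "<greeting><name>Ann</name></greeting>",
            "<greeting><name>Bob</name></greeting>"
        };
        // The same stylesheet object processes each XML input in turn.
        for (String xml : inputs) {
            System.out.println(apply(stylesheet, xml));
        }
    }
}
```

Compiling the stylesheet once and reusing it avoids re-parsing the XSL for every input document, which is the point of separating the "Stylesheet" input from the repeated "XML input".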

XML Parser for Java XSLT Processor is illustrated by the following examples:

Figure 20-6 XSLProcessor Class Process



XML Parser for Java Example 4: (XSLSample.java)

/**
 * This file gives a simple example of how to use the XSL processing 
 * capabilities of the Oracle XML Parser V2.0. An input XML document is
 * transformed using a given input stylesheet
 */

import org.w3c.dom.*;
import java.util.*;
import java.io.*;
import java.net.*;
import oracle.xml.parser.v2.*;

public class XSLSample 
{
   /**
    * Transforms an xml document using a stylesheet
    * @param args the input XSL stylesheet file and the input XML document file
    */
   public static void main (String args[]) throws Exception
   {
      DOMParser parser;
      XMLDocument xml, xsldoc, out;
      URL xslURL;
      URL xmlURL;

      try 
      {

         if (args.length != 2) 
         {
            // Must pass in the names of the XSL and XML files
            System.err.println("Usage: java XSLSample xslfile xmlfile");
            System.exit(1);
         }

         // Parse xsl and xml documents
         
         parser = new DOMParser();
         parser.setPreserveWhitespace(true);

         // parse the input XSL file
         xslURL = createURL(args[0]);
         parser.parse(xslURL);
         xsldoc = parser.getDocument();
         
         // parse the input XML file
         xmlURL = createURL(args[1]);
         parser.parse(xmlURL);
         xml = parser.getDocument();

         // instantiate a stylesheet
         XSLStylesheet xsl = new XSLStylesheet(xsldoc, xslURL);
         XSLProcessor processor = new XSLProcessor();

         // display any warnings that may occur
         processor.showWarnings(true);
         processor.setErrorStream(System.err);

         // Process XSL
         DocumentFragment result = processor.processXSL(xsl, xml);

         // create an output document to hold the result
         out = new XMLDocument();

         // create a dummy document element for the output document
         Element root = out.createElement("root");
         out.appendChild(root);

         // append the transformed tree to the dummy document element
         root.appendChild(result);
         
         // print the transformed document
         out.print(System.out);
      }
      catch (Exception e)
      {
         e.printStackTrace();
      }
   }

   // Helper method to create a URL from a file name
   static URL createURL(String fileName)
   {
      URL url = null;
      try 
      {
         url = new URL(fileName);
      } 
      catch (MalformedURLException ex) 
      {
         File f = new File(fileName);
         try 
         {
            String path = f.getAbsolutePath();
            // This is a bunch of weird code that is required to
            // make a valid URL on the Windows platform, due
            // to inconsistencies in what getAbsolutePath returns.
            String fs = System.getProperty("file.separator");
            if (fs.length() == 1)
            {
               char sep = fs.charAt(0);
               if (sep != '/')
                  path = path.replace(sep, '/');
               if (path.charAt(0) != '/')
                  path = '/' + path;
            }
            path = "file://" + path;
            url = new URL(path);
         } 
         catch (MalformedURLException e) 
         {
            System.err.println("Cannot create url for: " + fileName);
            System.exit(1);
         }
      }
      return url;
   }
}
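
The createURL() helper in the example builds the file URL by hand. On newer JDKs a shorter route is available through java.io.File.toURI(). The following is a minimal sketch of that alternative, not part of the Oracle sample; the class name FileUrlSketch and the method name toURL() are hypothetical.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class FileUrlSketch {
    // Accepts either a URL string (http://..., file:...) or a plain file name.
    static URL toURL(String nameOrUrl) throws MalformedURLException {
        try {
            // Succeeds when the argument already has a known protocol.
            return new URL(nameOrUrl);
        } catch (MalformedURLException notAUrl) {
            // Otherwise treat it as a file name; toURI() handles the
            // platform-specific separators and the leading slash.
            return new File(nameOrUrl).getAbsoluteFile().toURI().toURL();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toURL("http://example.com/style.xsl"));
        System.out.println(toURL("doc.xml"));
    }
}
```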

XML Parser for Java Example 5: Using the DOM API and XSLT Processor

This example code is not included in the sample/ subdirectory. It uses the XML Parser for Java v2 to perform the following tasks.

Comments on XSLT Example 5

See Figure 20-4 and Figure 20-6. The following provides comments for Example 5:

  1. The program inputs two URL documents:

    • URL xmlURL;

    • URL xslURL;

  2. Create the DOM parser and set its preserve-whitespace property:

    parser = new DOMParser();
    parser.setPreserveWhitespace(true);
    
    
  3. Get the XSL and XML documents

    xslURL = createURL(args[0]);
    parser.parse(xslURL);
    xsldoc = parser.getDocument();
    
    xmlURL = createURL(args[1]);
    parser.parse(xmlURL);
    xml = parser.getDocument();
    
    
  4. Create a new XSLStylesheet object and a new XSLProcessor object:

    XSLStylesheet xsl = new XSLStylesheet(xsldoc, xslURL);
    XSLProcessor processor = new XSLProcessor();
    processor.setErrorStream(System.err);
    
    
  5. Process the stylesheet

    DocumentFragment result = processor.processXSL(xsl, xml);
    
    
  6. Create the output DOM document and append the transformed result to it:

    out = new XMLDocument();
    Element root = out.createElement("root");
    out.appendChild(root);
    root.appendChild(result);
    
    

Using XML Parser for Java: SAXNamespace() Class

Using the SAXNamespace() class is illustrated in the following example:

XML Parser for Java Example 6: (SAXNamespace.java)

// This file demonstrates a simple use of the Namespace extensions to 
// the SAX APIs.

import org.xml.sax.*;
import java.io.*;
import java.net.URL;
import java.net.MalformedURLException;

// Extensions to the SAX Interfaces for Namespace support.
import oracle.xml.parser.v2.XMLDocumentHandler;
import oracle.xml.parser.v2.DefaultXMLDocumentHandler;
import oracle.xml.parser.v2.NSName;
import oracle.xml.parser.v2.SAXAttrList;

import oracle.xml.parser.v2.SAXParser;

public class SAXNamespace {
  static public void main(String[] args) {
     String fileName;

     //Get the file name
     if (args.length == 0)
     {
        System.err.println("No file Specified!!!");
        System.err.println("USAGE: java SAXNamespace <filename>");
        return;
     }
     else
     {
        fileName = args[0];
     }
     
     try {
        // Create handlers for the parser
        // Use the XMLDocumentHandler interface for namespace support 
        // instead of org.xml.sax.DocumentHandler
        XMLDocumentHandler xmlDocHandler = new XMLDocumentHandlerImpl();

        // For all the other interface use the default provided by
        // Handler base
        HandlerBase defHandler = new HandlerBase();

        // Get an instance of the parser
        SAXParser parser = new SAXParser();
           
        // Set Handlers in the parser
        // Set the DocumentHandler to XMLDocumentHandler
        parser.setDocumentHandler(xmlDocHandler);

        // Set the other Handler to the defHandler
        parser.setErrorHandler(defHandler);
        parser.setEntityResolver(defHandler);
        parser.setDTDHandler(defHandler);
           
        try 
        {
           parser.parse(fileToURL(new File(fileName)).toString());
        }
        catch (SAXParseException e) 
        {
           System.err.println(args[0] + ": " + e.getMessage());
        }
        catch (SAXException e) 
        {
           System.err.println(args[0] + ": " + e.getMessage());
        }  
     }
     catch (Exception e) 
     {
        System.err.println(e.toString());
     }
  }
  
  static public URL fileToURL(File file) 
  {
    String path = file.getAbsolutePath();
    String fSep = System.getProperty("file.separator");
    if (fSep != null && fSep.length() == 1)
      path = path.replace(fSep.charAt(0), '/');
    if (path.length() > 0 && path.charAt(0) != '/')
      path = '/' + path;
    try {
      return new URL("file", null, path);
    }
    catch (java.net.MalformedURLException e) {
      /* According to the spec this could only happen if the file
	 protocol were not recognized. */
      throw new Error("unexpected MalformedURLException");
    }
  }

  private SAXNamespace() throws IOException 
   {
   }
   
}
   /***********************************************************************
     Implementation of XMLDocumentHandler interface. Only the new
     startElement and endElement interfaces are implemented here. All other
     interfaces are implemented in the class HandlerBase.
     **********************************************************************/

class XMLDocumentHandlerImpl extends DefaultXMLDocumentHandler
{

   // Constructor (a constructor must not declare a return type)
   public XMLDocumentHandlerImpl()
   {
   }

      
   public void startElement(NSName name, SAXAttrList atts) throws SAXException 
   {

      // Use the methods getQualifiedName(), getLocalName(), getNamespace()
      // and getExpandedName() in NSName interface to get Namespace
      // information.
      String qName;
      String localName;
      String nsName;
      String expName;
      qName = name.getQualifiedName();
      System.out.println("ELEMENT Qualified Name:" + qName);
      localName = name.getLocalName();
      System.out.println("ELEMENT Local Name    :" + localName);

      nsName = name.getNamespace();
      System.out.println("ELEMENT Namespace     :" + nsName);

      expName = name.getExpandedName();
      System.out.println("ELEMENT Expanded Name :" + expName);

      for (int i=0; i<atts.getLength(); i++)
      {

      // Use the methods getQualifiedName(), getLocalName(), getNamespace()
      // and getExpandedName() in SAXAttrList interface to get Namespace
      // information.
         qName = atts.getQualifiedName(i);
         localName = atts.getLocalName(i);
         nsName = atts.getNamespace(i);
         expName = atts.getExpandedName(i);

         System.out.println(" ATTRIBUTE Qualified Name   :" + qName);
         System.out.println(" ATTRIBUTE Local Name       :" + localName);
         System.out.println(" ATTRIBUTE Namespace        :" + nsName);
         System.out.println(" ATTRIBUTE Expanded Name    :" + expName);

         // You can get the type and value of the attributes either
         // by index or by the Qualified Name.
         String type = atts.getType(qName);
         String value = atts.getValue(qName);

         System.out.println(" ATTRIBUTE Type             :" + type);
         System.out.println(" ATTRIBUTE Value            :" + value);
         System.out.println();
      }      
   }

  public void endElement(NSName name) throws SAXException 
   {
      // Use the methods getQualifiedName(), getLocalName(), getNamespace()
      // and getExpandedName() in NSName interface to get Namespace
      // information.
      String expName = name.getExpandedName();
      System.out.println("ELEMENT Expanded Name  :" + expName);
   }
}
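
For comparison, the same namespace information is available through the standard SAX2 API bundled with the JDK, where startElement() receives the namespace URI, local name, and qualified name directly. The sketch below uses javax.xml.parsers and org.xml.sax rather than the Oracle NSName and SAXAttrList extensions; the class name Sax2NamespaceSketch and the sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class Sax2NamespaceSketch {
    // Parses the XML text and returns one description line per element and attribute.
    static List<String> describe(String xml) throws Exception {
        final List<String> lines = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // required for uri/localName to be filled in
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                lines.add("ELEMENT " + qName + " local=" + localName + " ns=" + uri);
                for (int i = 0; i < atts.getLength(); i++) {
                    lines.add(" ATTRIBUTE " + atts.getQName(i)
                            + " local=" + atts.getLocalName(i)
                            + " ns=" + atts.getURI(i)
                            + " value=" + atts.getValue(i));
                }
            }
        });
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<po:order xmlns:po='urn:example:po' po:id='7'><item/></po:order>";
        for (String line : describe(xml)) {
            System.out.println(line);
        }
    }
}
```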

XML Parser for Java: Command Line Interface

oraxml - Oracle XML parser

oraxml is a command-line interface to parse an XML document. It checks for well-formedness and validity.

To use oraxml ensure the following:

Use the following syntax to invoke oraxml:

oraxml options* source 

oraxml expects to be given an XML file to parse. Table 20-3 lists oraxml's command line options.

Table 20-3 oraxml: Command Line Options
Option  Purpose
-h  Help mode. Prints oraxml invocation syntax.
-v  Partial validation mode. If this option is not used, the parser checks only for well-formedness.
-s  Strict validation mode.
-w  Show warnings. By default, warnings are turned off.
-debug  Debug mode. By default, debug mode is turned off.
-e <error log>  A file to write errors to. Specify a log file to write errors and warnings.
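
The difference between checking only well-formedness and also validating against a DTD can be sketched in Java with the JDK's standard JAXP parser. This illustrates the two kinds of check only; it is not how oraxml is implemented, and the class name ValidationModeSketch and the sample documents are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class ValidationModeSketch {
    // Returns true if the document parses without errors.
    // With validate=false only well-formedness is checked;
    // with validate=true the document is also checked against its DTD.
    static boolean parses(String xml, boolean validate) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validate);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new DefaultHandler() {
            // Treat validity errors as failures instead of ignoring them.
            @Override public void error(SAXParseException e) throws SAXException { throw e; }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]>";
        String invalid = dtd + "<a><c/></a>";   // well-formed, but <c> is not declared
        System.out.println("well-formed only: " + parses(invalid, false));
        System.out.println("validated:        " + parses(invalid, true));
    }
}
```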

oraxsl - Oracle XSL processor

oraxsl is a command-line interface used to apply a stylesheet to multiple XML documents. It accepts a number of command-line options that control its behavior.

To use oraxsl ensure the following:

Use the following syntax to invoke oraxsl:

oraxsl options* source? stylesheet? result? 

oraxsl expects to be given a stylesheet, an XML file to transform, and optionally, a result file. If no result file is specified, it writes the transformed document to standard output. To transform multiple XML documents with one stylesheet, use the -l or -d option in conjunction with the -s and -r options instead. These and other options are described in Table 20-4.

Table 20-4 oraxsl: Command Line Options
Option  Purpose
-h  Help mode. Prints oraxsl invocation syntax.
-v  Verbose mode. Prints some debugging information that can help in tracing problems encountered during processing.
-w  Show warnings. By default, warnings are turned off.
-debug  Debug mode. By default, debug mode is turned off.
-e <error log>  A file to write errors to. Specify a log file to write errors and warnings.
-t <# of threads>  Number of threads to use for processing. Using multiple threads could improve performance when processing multiple documents.
-l <xml file list>  List of files to transform. Allows you to explicitly list the files to be processed.
-d <directory>  Directory with files to transform. The default behavior is to process all files in the directory. If only a subset of the files in that directory (for example, one file) needs to be processed, use -l instead and specify just those files. You can also use the -x or -i option to select files based on their extension.
-x <source extension>  Extensions to exclude. Used in conjunction with -d. Files with the specified extension are not selected.
-i <source extension>  Extensions to include. Used in conjunction with -d. Only files with the specified extension are selected.
-s <stylesheet>  Stylesheet to use. If -d or -l is specified, this option must be used to specify the stylesheet. The complete path must be specified.
-r <result extension>  Extension to use for results. If -d or -l is specified, this option must be used to specify the extension for the results of the transformation. For example, with the extension "out", an input document "foo" is transformed to "foo.out". By default, the results are placed in the current directory. This can be changed with the -o option, which specifies a directory to hold the results.
-o <result directory>  Directory to place results in. This must be used in conjunction with the -r option.
-p  List of parameters.

XML Extension Functions for XSLT Processing

XSLT Processor Extension Functions: Introduction

XML extension functions for XSLT processing allow users of the XSLT processor to call any Java method from XSL expressions. Java extension functions must belong to a namespace that starts with the following:

http://www.oracle.com/XSL/Transform/java/


An extension function that belongs to the following namespace:

http://www.oracle.com/XSL/Transform/java/classname

refers to methods in class classname. For example, the following namespace:

http://www.oracle.com/XSL/Transform/java/java.lang.String 

can be used to call java.lang.String methods from XSL expressions.

Static Versus Non-static Methods

If the method is a non-static method of the class, then the first parameter will be used as the instance on which the method is invoked, and the rest of the parameters are passed on to the method.

If the extension function is a static method, then all the parameters of the extension function are passed on as parameters to the static function.
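
The dispatch rule for static and non-static methods can be sketched with plain Java reflection. This only illustrates the rule and is not the XSLT processor's actual implementation: the class name ExtensionCallSketch and the helper call() are hypothetical, and overload resolution is simplified to matching on method name and argument count.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;

public class ExtensionCallSketch {
    // Calls className.methodName the way the rule above describes:
    // static method     -> all arguments are passed to the method;
    // non-static method -> the first argument is the instance, the rest are passed on.
    static Object call(String className, String methodName, Object... args) throws Exception {
        Class<?> cls = Class.forName(className);
        for (Method m : cls.getMethods()) {
            if (!m.getName().equals(methodName)) {
                continue;
            }
            boolean isStatic = Modifier.isStatic(m.getModifiers());
            int expectedParams = isStatic ? args.length : args.length - 1;
            if (m.getParameterCount() != expectedParams) {
                continue;
            }
            if (isStatic) {
                return m.invoke(null, args);
            }
            Object instance = args[0];
            Object[] rest = Arrays.copyOfRange(args, 1, args.length);
            return m.invoke(instance, rest);
        }
        throw new NoSuchMethodException(className + "." + methodName);
    }

    public static void main(String[] args) throws Exception {
        // Static: java.lang.Math.ceil(12.34)
        System.out.println(call("java.lang.Math", "ceil", 12.34));
        // Non-static: "Hello World".toUpperCase()
        System.out.println(call("java.lang.String", "toUpperCase", "Hello World"));
    }
}
```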

XML Parser for Java - XSL Example 1: Static function

The following XSL static function example:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:math="http://www.oracle.com/XSL/Transform/java/java.lang.Math"> 
  <xsl:template match="/"> 
    <xsl:value-of select="math:ceil('12.34')"/> 
  </xsl:template> 
</xsl:stylesheet> 

prints out '13'.

Constructor Extension Function

The extension function 'new' creates a new instance of the class and acts as the constructor.

XML Parser for Java - XSL Example 2: Constructor Extension Function

The following constructor function example:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:jstring="http://www.oracle.com/XSL/Transform/java/java.lang.String"> 
  <xsl:template match="/"> 
    <!-- creates a new java.lang.String and stores it in the variable str1 --> 
    <xsl:variable name="str1" select="jstring:new('Hello World')"/> 
    <xsl:value-of select="jstring:toUpperCase($str1)"/> 
  </xsl:template> 
</xsl:stylesheet> 

prints out 'HELLO WORLD'.

Return Value Extension Function

The result of an extension function can be of any type, including the five types defined in XSL:

They can be stored in variables or passed on to other extension functions.

If the result is of one of the five types defined in XSL, then the result can be returned as the result of an XSL expression.

XML Parser for Java XSL Example 3: Return Value Extension Function

Here is an XSL example illustrating the Return value extension function:

<!-- Declare extension function namespaces --> 
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:parser = 
"http://www.oracle.com/XSL/Transform/java/oracle.xml.parser.v2.DOMParser" 
xmlns:document = 
"http://www.oracle.com/XSL/Transform/java/oracle.xml.parser.v2.XMLDocument" > 

<xsl:template match ="/"> 
<!-- Create a new instance of the parser and store it in the myparser variable --> 
<xsl:variable name="myparser" select="parser:new()"/> 
<!-- Call a non-static method of DOMParser. Since the method is a non-static 
method, the first parameter is the instance on which the method is called. This 
is equivalent to $myparser.parse('test.xml') --> 
<xsl:value-of select="parser:parse($myparser, 'test.xml')"/> 
<!-- Get the document node of the XML DOM tree --> 
<xsl:variable name="mydocument" select="parser:getDocument($myparser)"/> 
<!-- Invoke getElementsByTagName on mydocument --> 
<xsl:for-each select="document:getElementsByTagName($mydocument,'elementname')"> 
...... 
</xsl:for-each> 
</xsl:template>
</xsl:stylesheet> 

Datatypes Extension Function

Overloading based on number of parameters and type is supported. Implicit type conversion is done between the five XSL types as defined in XSL.

Type conversion is done implicitly between (String, Number, Boolean, ResultTree) and from NodeSet to (String, Number, Boolean, ResultTree).

Overloading based on two types which can be implicitly converted to each other is not permitted.

XML Parser for Java Example 4: Datatype Extension Function

The following overloading will result in an error in XSL, since String and Number can be implicitly converted to each other:

XSL types are mapped to Java types as follows:

String -> java.lang.String
Number -> int, float, double
Boolean -> boolean
NodeSet -> XMLNodeList
ResultTree -> XMLDocumentFragment

Frequently Asked Questions (FAQs): XML Parser for Java

The XML Parser for Java Frequently Asked Questions (FAQs) are organized into the following topics:

DTDs

Checking DTD Syntax: Suggestions for Editors

Question

I was wondering if someone could help me verify the syntax for the following DTD. I realize that I can use a DTD editor to do this for me, but the editor I'm using is not very good.

 <?xml version="1.0"?>
 <!DOCTYPE CATALOG [
 
 <!ELEMENT CATALOG ( ADMIN, SCHEMA?, DATA? ) >
 <!ATTLIST CATALOG xml:lang NMTOKEN #IMPLIED >
 
 <!ELEMENT ADMIN ( NAME, INFORMATION) >
 <!ELEMENT SCHEMA (CATEGORY | DESCRIPTOR)* >
 <!ELEMENT DATA (ITEM)*>
 
 <!ELEMENT NAME (#PCDATA) >
 <!ELEMENT INFORMATION ( DATE, SOURCE )  >
 <!ELEMENT DATE (#PCDATA) >
 <!ELEMENT SOURCE (#PCDATA) >
 
 <!ELEMENT CATEGORY (NAME | KEY | TYPE | UPDATE  )* >
 <!ATTLIST CATEGORY ACTION (ADD|DELETE|UPDATE) #REQUIRED>
 <!ELEMENT DESCRIPTOR (NAME | KEY | UPDATE | OWNER  | TYPE )* >
 <!ATTLIST DESCRIPTOR ACTION (ADD|DELETE|UPDATE) #REQUIRED>
 <!ELEMENT OWNER (NAME?, KEY? ) >
 <!ELEMENT KEY (#PCDATA) >
 <!ELEMENT TYPE (#PCDATA) >
 
 <!ELEMENT ITEM (OWNER?, NAMEVALUE*, UPDATE ) >
 <!ATTLIST ITEM ACTION (ADD | DELETE | UPDATE) #REQUIRED>
 <!ELEMENT UPDATE (NAME | KEY | NAMEVALUE )* >
 
 <!ELEMENT NAMEVALUE ( NAME, VALUE ) >
 <!ELEMENT VALUE (#PCDATA)* >
 ]>

I'm unsure about the ATTLIST syntax.

Answer

I loaded this into XML Authority 1.1 and did a Save As. XML Authority lets you visually inspect and edit DTDs and XML Schemas. Highly recommended. http://www.extensibility.com ($99.00).

It came back with:

<!ELEMENT CATALOG  (ADMIN , SCHEMA? , DATA? )>
<!ATTLIST CATALOG  xml:lang NMTOKEN  #IMPLIED >
<!ELEMENT ADMIN  (NAME , INFORMATION )>
<!ELEMENT SCHEMA  (CATEGORY | DESCRIPTOR )*>
<!ELEMENT DATA  (ITEM )*>
<!ELEMENT NAME  (#PCDATA )>
<!ELEMENT INFORMATION  (DATE , SOURCE )>
<!ELEMENT DATE  (#PCDATA )>
<!ELEMENT SOURCE  (#PCDATA )>
<!ELEMENT CATEGORY  (NAME | KEY | TYPE | UPDATE )*>
<!ATTLIST CATEGORY  ACTION  (ADD | DELETE | UPDATE )  #REQUIRED >
<!ELEMENT DESCRIPTOR  (NAME | KEY | UPDATE | OWNER | TYPE )*>
<!ATTLIST DESCRIPTOR  ACTION  (ADD | DELETE | UPDATE )  #REQUIRED >
<!ELEMENT OWNER  (NAME? , KEY? )>
<!ELEMENT KEY  (#PCDATA )>
<!ELEMENT TYPE  (#PCDATA )>
<!ELEMENT ITEM  (OWNER? , NAMEVALUE* , UPDATE )>
<!ATTLIST ITEM  ACTION  (ADD | DELETE | UPDATE )  #REQUIRED >
<!ELEMENT UPDATE  (NAME | KEY | NAMEVALUE )*>
<!ELEMENT NAMEVALUE  (NAME , VALUE )>
<!ELEMENT VALUE  (#PCDATA )*>

DTD File in DOCTYPE Must be Relative to XML Document Location

Question

My parser doesn't find the DTD file.

Answer

The DTD file named in the <!DOCTYPE> declaration is resolved relative to the location of the input XML document. If the input comes from an InputStream, the parser has no document location to resolve against, so you must call setBaseURL(url) to set the base URL used to resolve the relative address of the DTD.
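
The same idea can be sketched with the JDK's standard JAXP parser, where the base location of a stream is supplied through InputSource.setSystemId(). This is an analogue of setBaseURL() under stated assumptions, not the Oracle API itself; the class name BaseUrlSketch, the file names note.dtd and note.xml, and the temporary directory are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class BaseUrlSketch {
    // Parses XML that arrives as a stream, telling the parser which URL
    // to treat as the document's location when resolving relative references.
    static Document parseWithBase(byte[] xml, String baseUrl) throws Exception {
        InputSource in = new InputSource(new ByteArrayInputStream(xml));
        in.setSystemId(baseUrl);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
    }

    // Writes a small DTD to a temporary directory, then parses a stream whose
    // DOCTYPE refers to that DTD by relative name. Returns the defaulted attribute.
    static String demo() throws Exception {
        Path dir = Files.createTempDirectory("dtd-demo");
        String dtd = "<!ELEMENT note (#PCDATA)>\n<!ATTLIST note lang CDATA \"en\">\n";
        Files.write(dir.resolve("note.dtd"), dtd.getBytes(StandardCharsets.UTF_8));

        byte[] xml = "<!DOCTYPE note SYSTEM \"note.dtd\"><note>hi</note>"
                .getBytes(StandardCharsets.UTF_8);
        // note.xml need not exist; it only provides the base for "note.dtd".
        Document doc = parseWithBase(xml, dir.resolve("note.xml").toUri().toString());
        // The lang attribute is supplied by the DTD default, so it is present
        // only if the parser actually located and read note.dtd.
        return doc.getDocumentElement().getAttribute("lang");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("lang attribute from DTD default: " + demo());
    }
}
```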

Validating an XML File Using External DTD

Question

Can I validate an XML file using an external DTD?

Answer

You need to include a reference to the applicable DTD in your XML document. Without it there is no way that the parser knows what to validate against. Including the reference is the XML standard way of specifying an external DTD. Otherwise you need to embed the DTD in your XML Document.

DTD Caching

Question

Do you have DTD caching? How do I set the DTD using v2 parser for DTD Cache purpose?

Answer

Yes, DTD caching is optional and is not enabled automatically.

The method to set the DTD is setDoctype(). Here is an example:

// Test using InputSource 
parser = new DOMParser(); 
parser.setErrorStream(System.out); 
parser.showWarnings(true); 
 
FileReader r = new FileReader(args[0]); 
InputSource inSource = new InputSource(r); 
inSource.setSystemId(createURL(args[0]).toString()); 
parser.parseDTD(inSource, args[1]); 
dtd = (DTD)parser.getDoctype(); 
 
r = new FileReader(args[2]); 
inSource = new InputSource(r); 
inSource.setSystemId(createURL(args[2]).toString()); 
parser.setDoctype(dtd); 
parser.setValidationMode(DTD_validation); 
parser.parse(inSource); 

 
doc = (XMLDocument)parser.getDocument(); 
doc.print(new PrintWriter(System.out));

Recognizing External DTDs

Question

How can XML Parser for Java (V2) recognize external DTDs when running from the server? The Java code has been loaded with loadjava and runs in the Oracle9i server process. My XML file has an external DTD reference.

  1. But is there a more generic way, as with the SAX parser, to redirect it to a stream or string or something if my DTD is in the database?

  2. Do you have a more generic way to redirect the DTD, analogous to that offered by the SAXParser with resolveEntity().

Answer

  1. We only have the setBaseURL() method at this time.

  2. You can achieve your desired result using the following:

    1. Parse your External DTD using a DOMParser's parseDTD() method.

    2. Call getDoctype() to get an instance of oracle.xml.parser.v2.DTD

    3. For the document where you want to set your DTD programmatically, call setDoctype(yourDTD). We use this technique to read a DTD out of our product's JAR file.

Loading External DTDs from a JAR File

Question

I would like to put all my DTDs in a JAR file, so that when the XML Parser needs a DTD it can get it from the JAR. The current XML Parser supports a base URL (setBaseURL()), but that just points to a place where all the DTDs are exposed.

Answer

The solution involves a combination of:

  1. Load DTD as InputStream using:

    InputStream is =      
    YourClass.class.getResourceAsStream("/foo/bar/your.dtd");  
    

    This opens foo/bar/your.dtd from the first location on the CLASSPATH where it can be found, including from inside your JAR file if the JAR is on the CLASSPATH.

  2. Parse the DTD with the code:

    DOMParser d = new DOMParser();
    d.parseDTD(is, "rootelementname");
    d.setDoctype(d.getDoctype());
    
    
  3. Now parse your document with:

    d.parse("yourdoc");
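
A more generic way to redirect DTD lookups, in the spirit of the resolveEntity() approach mentioned earlier, is the standard SAX EntityResolver interface. The sketch below uses the JDK's JAXP parser rather than the Oracle parseDTD()/setDoctype() calls, and serves the DTD from an in-memory map so that it is self-contained. In a real application the resolver would more likely return an InputSource wrapping getResourceAsStream(). The class name DtdResolverSketch and the DTD content are hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class DtdResolverSketch {
    // Parses xml, answering requests for external DTDs from the given map
    // (file name -> DTD text) instead of from the file system or network.
    static Document parse(String xml, final Map<String, String> dtdsByFileName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // The parser may have expanded the system ID to an absolute URL;
                // match on the trailing file name only.
                String fileName = systemId.substring(systemId.lastIndexOf('/') + 1);
                String dtd = dtdsByFileName.get(fileName);
                if (dtd == null) {
                    return null; // fall back to the parser's default lookup
                }
                InputSource source = new InputSource(new StringReader(dtd));
                source.setSystemId(systemId);
                return source;
            }
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the attribute value that only the redirected DTD can supply.
    static String demo() throws Exception {
        Map<String, String> dtds = Collections.singletonMap("note.dtd",
                "<!ELEMENT note (#PCDATA)>\n<!ATTLIST note lang CDATA \"en\">\n");
        Document doc = parse("<!DOCTYPE note SYSTEM \"note.dtd\"><note>hi</note>", dtds);
        return doc.getDocumentElement().getAttribute("lang");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("lang attribute from resolved DTD: " + demo());
    }
}
```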
    

Can I Check the Correctness of an XML Document Using Its DTD?

Question

I am exporting Java objects to XML. I can construct a DOM with an XML Document and use its print method to export it. But, I am unable to set the DTD of these documents. I construct a parser, parse the DTD, and then get the DTD via Document doc = parser.getDocument() and DocType dtd = doc.getDocumentType().

How do I set the DTD of the freshly constructed XML Documents to use this one in order to be able to check the correctness of the documents using this DTD at a later time?

Answer

Your method of getting the DTD object is correct. However, we do not do any validation while creating the DOM tree using DOM APIs. So setting the DTD in the Document will not help validate the DOM tree that is constructed. The only way to validate an XML file is to parse the XML document using DOMParser or SAXParser.

Parsing a DTD Object Separately from XML Document

Question

How do I parse and get a DTD Object separately from parsing my XML document?

Answer

The parseDTD() method allows you to parse a DTD file separately and get a DTD object. Here is sample code that does this:

DOMParser domparser = new DOMParser();
domparser.setValidationMode(DTD_validation); 
/* parse the DTD file; rootname is the name of the root element */
domparser.parseDTD(new FileReader(dtdfile), rootname);
DTD dtd = (DTD)domparser.getDoctype();

Case-Sensitivity in Parser Validation against DTD?

Question

The XML file has a tag like: <xn:subjectcode>. In the DTD, it is defined as <xn:subjectCode>. When the file is parsed and validated against the DTD, it gives an error: XML-0148: (Error) Invalid element 'xn:subjectcode' in content of 'xn:Resource',...

When I changed the element name to <xn:subjectCode> instead of <xn:subjectcode> it works. Is the parser case-sensitive as far as validation against DTD's go - or is it because, there is a namespace also in the tag definition of the element and when a element is defined along with its namespace, the case-sensitivity comes into effect?

Answer

XML is inherently case-sensitive, therefore our parsers enforce case sensitivity in order to be compliant. When you run in non-validation mode only well-formedness counts. However <test></Test> would signal an error even in non-validation mode.
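
The point that a case mismatch between start and end tags breaks well-formedness, even without validation, can be checked with any conforming parser. The following minimal sketch uses the JDK's non-validating JAXP parser; the class name CaseSensitivitySketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CaseSensitivitySketch {
    // Returns true if the text is well-formed XML (no DTD validation is performed).
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormed("<test></test>")); // matching case
        System.out.println(isWellFormed("<test></Test>")); // end tag differs in case
    }
}
```");
assert !CaseSensitivitySketch.isWellFormed("<test></Test>");
</test>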

Extracting Embedded XML From a CDATA Section

Question

  1. I want to extract PAYLOAD and do extra processing on it.

  2. When I select the value of PAYLOAD it does not parse the data because it is in a CDATA section.

  3. How do I extract embedded XML using just XSLT. I have done this using SAX before but in the current setup all I can use is XSLT.

Answer

  1. Here are the answers:

    <PAYLOAD>
    <![CDATA[<?xml version = '1.0' encoding = 'ASCII' standalone = 'no'?>
    <ADD_PO_003>
       <CNTROLAREA>
          <BSR>
             <VERB value="ADD">ADD</VERB>
             <NOUN value="PO">PO</NOUN>
             <REVISION value="003">003</REVISION>
          </BSR>
       </CNTROLAREA>
    </ADD_PO_003>]]>
    </PAYLOAD>
    
    

    The CDATA strategy is kind of odd. You won't be able to use a different encoding on the nested XML document included as text inside the CDATA, so having the XML Declaration of the embedded document seems of little value to me. If you don't need the XML Declaration, then why not just embed the message as real elements in the <PAYLOAD> instead of as a text chunk, which is what CDATA gives you?

    Just do:

    // valueOf() returns the string value of the first node that matches the pattern
    String s = YourDocumentObject.valueOf("/OES_MESSAGE/PAYLOAD");
    
    
  2. It shouldn't parse the data; you've asked for it to be a big text chunk, which is what it will give you. You'll have to parse the text chunk yourself (another benefit of not using the CDATA approach) by doing something like:

      YourParser.parse( new StringReader(s));
    
    

    where s is the string you got in the previous step.

  3. There's nothing special about what's in your CDATA; it's just text. If you want the text content to be output without escaping the angle brackets, then use:

     <xsl:value-of select="/OES_MESSAGE/PAYLOAD" disable-output-escaping="yes"/>
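
The two-step approach from answers 1 and 2 (read the CDATA content as a string, then parse that string as its own document) can be sketched with the JDK's standard JAXP and XPath APIs. This is an illustration under stated assumptions rather than the Oracle API; the class name EmbeddedPayloadSketch and the shortened sample message are hypothetical. Note the trim() call: any whitespace between <PAYLOAD> and the CDATA section would otherwise precede the embedded XML declaration and make the second parse fail.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EmbeddedPayloadSketch {
    // Hypothetical, shortened version of the message shown above.
    static final String OUTER =
        "<OES_MESSAGE><PAYLOAD>\n"
      + "<![CDATA[<?xml version='1.0'?>"
      + "<ADD_PO_003><VERB value=\"ADD\">ADD</VERB></ADD_PO_003>]]>\n"
      + "</PAYLOAD></OES_MESSAGE>";

    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Step 1: read the PAYLOAD text. Step 2: parse that text as a separate document.
    static Document extractPayload(String outerXml) throws Exception {
        Document outer = parse(outerXml);
        String text = XPathFactory.newInstance().newXPath()
                .evaluate("/OES_MESSAGE/PAYLOAD", outer);
        return parse(text.trim());
    }

    public static void main(String[] args) throws Exception {
        Document inner = extractPayload(OUTER);
        System.out.println("Embedded root element: " + inner.getDocumentElement().getTagName());
    }
}
```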
    

Why Am I Getting an Error When I Call DOMParser.parseDTD()?

Question

I am having trouble creating a DTD and parsing it using Oracle XML Parser for Java v2. I get the following error when I call the DOMParser.parseDTD() function:

Attribute value should start with quote.  

Please check my DTD and tell me what's wrong.

<?xml version = "1.0" encoding="UTF-8" ?> 
<!-- RCS_ID = "$Header: XMLRenderer.dtd 115.0 2000/09/18 03:00:10 fli noship $" 
--> 
<!-- RCS_ID_RECORDED = VersionInfo.recordClassVersion(RCS_ID,   
"oracle.apps.mwa.admin") --> 
<!--  Copyright: This DTD file is owned by Oracle Mobile Application Server   
Group.  --> 
  <!ELEMENT    page    (header?,form,footer?) > 
  <!ATTLIST    page 
               name    CDATA   #REQUIRED 
               lov     (Y|N)   'N' > 
  <!ELEMENT    header EMPTY > 
  <!ATTLIST    header 
               name    CDATA   #REQUIRED 
               title   CDATA 
               home    (Y|N)   'N' 
               portal  (Y|N)   'N' 
               logout  (Y|N)   'N' > 
  <!ELEMENT    footer EMPTY > 
  <!ATTLIST    footer 
               name    CDATA   #REQUIRED 
               home    (Y|N)   'N' 
               portal  (Y|N)   'N' 
               logout  (Y|N)   'N' 
               copyright (Y|N) 'N' > 

  <!ELEMENT    form 
  (styledText|textInput|list|link|menu|submitButton|table|separator)+ > 
  <!ATTLIST    form 
               name    CDATA    #REQUIRED 
               title    CDATA 
               type     CDATA > 

  <!ELEMENT    styledText    (#PCDATA) > 

  <!ELEMENT    textInput    EMPTY > 
  <!ATTLIST    textInput 
               name    CDATA    #REQUIRED 
               prompt    CDATA    #IMPLIED 
               password    (Y|N)    'N' 
               required    (Y|N)    'N' 
               maxlength    #IMPLIED 
               size    #IMPLIED 
               format    #IMPLIED 
               default    #IMPLIED > 

  <!ELEMENT    link (postfield*) > 
  <!ATTLIST    link 
               name    CDATA    #REQUIRED 
               title    CDATA    #REQUIRED 
               baseurl    CDATA    #REQUIRED > 

Answer

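Your DTD syntax is not valid. Every attribute declared in an ATTLIST needs a type (for example, CDATA) followed by a default declaration: #REQUIRED, #IMPLIED, #FIXED "value", a quoted default value, or a parameter entity reference that expands to one of these. For example, your DTD contains the following: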

<!ELEMENT  header EMPTY > 
<!ATTLIST  header 
           name    CDATA   #REQUIRED 
           title   CDATA 
           home    (Y|N)   'N' 
           portal  (Y|N)   'N' 
           logout  (Y|N)   'N' >

This should be changed as follows:

<!ELEMENT  header EMPTY > 
<!ATTLIST  header 
           name   CDATA #REQUIRED 
           title  CDATA #REQUIRED <- can be replaced by #FIXED, #IMPLIED, or "title1" 
           home    (Y|N)   'N' 
           portal  (Y|N)   'N' 
           logout  (Y|N)   'N' > 

Is There a Standard Extension To Use for External Entities References in an XML Document?

Question

Is there a standard extension (other than .xml or .txt) that should be used for external entities that are referenced in an XML document? These external entities are not complete XML files, but only part of an XML file, starting with <![CDATA[. Mostly they contain HTML or JavaScript code, but they may also contain just plain text. As an example, the external entity is A.txt, which is referenced in the XML document B.xml.

A.txt:

<![CDATA[<!-- This is just an html comment -->]]>

B.xml:

<?xml version="1.0"?>
<!DOCTYPE B [
<!ENTITY htmlComment SYSTEM "A.txt">
]>

<B>
  &htmlComment;
</B>

Currently we are using .txt as an extension for all such entities, but need to change that, otherwise the translation team assumes that these files need to get translated, whereas they don't. Is there a standard extension that we should be using?

Answer

The file extension for external entities is unimportant, so you can change it to any convenient extension, including no extension at all.

DOM and SAX APIs

Using the DOM API

Question

How do I get the number of elements in a particular tag using the parser?

Answer

You can use the getElementsByTagName() method, which returns a NodeList of all descendant elements with a given tag name. Call getLength() on that NodeList to find out how many elements have that tag.
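The counting approach can be sketched as follows. This is a minimal illustration that assumes the standard JAXP DocumentBuilder in place of Oracle's DOMParser, so that it runs on a plain JDK; the class name CountElements and the sample XML are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CountElements {
    // Parses the XML text and returns how many descendant elements carry tagName.
    static int count(String xml, String tagName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(tagName).getLength();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document
        String xml = "<family><member>Sarah</member><member>Bob</member><pet>Rex</pet></family>";
        System.out.println(count(xml, "member"));
    }
}
```

The same two calls, getElementsByTagName() followed by getLength(), apply to any org.w3c.dom Document or Element.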

How DOM Parser Works

Question

How does the XML DOM parser work?

Answer

The parser accepts an XML formatted document and constructs in memory a DOM tree based on its structure. It will then check whether the document is well-formed and optionally whether it complies with a DTD. It also provides methods to support DOM Level 1 and 2.

Creating a Node With Value to be Set Later

Question

How do I create a node whose value I can set later?

Answer

If you check the DOM spec referring to the table discussing the node type, you will find that if you are creating an element node, its nodeValue is to be null and hence cannot be set. However, you can create a text node and append it to the element node. You can put the value in the text node.
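A minimal sketch of this pattern follows. It assumes the standard JAXP and org.w3c.dom interfaces rather than Oracle's XMLNode constructors; the class name DeferredValue and the element name are illustrative only.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

class DeferredValue {
    // Creates an element with an empty text child, then fills the text in afterwards.
    static Element build(String laterValue) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element node = doc.createElement("Mynode");
        doc.appendChild(node);

        // The element itself has a null nodeValue; the text child holds the value.
        Text value = doc.createTextNode("");
        node.appendChild(value);

        // Later, once the value is known, set it on the text node.
        value.setNodeValue(laterValue);
        return node;
    }

    public static void main(String[] args) throws Exception {
        Element node = build("set later");
        System.out.println(node.getFirstChild().getNodeValue());
    }
}
```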

Traversing the XML Tree

Question

How do I traverse the XML tree?

Answer

You can traverse the tree by using the DOM API. Alternatively, you can use the selectNodes() method, which takes XPath syntax, to navigate through the XML document. selectNodes() is part of oracle.xml.parser.v2.XMLNode.
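For the plain DOM route, a recursive walk over getFirstChild() and getNextSibling() is one common approach. The sketch below assumes the JDK's JAXP parser instead of Oracle's DOMParser; TreeWalk and its method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TreeWalk {
    // Visits node and all of its descendants in document order,
    // recording the name of every element encountered.
    static void walk(Node node, List<String> names) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, names);
        }
    }

    static List<String> elementNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> names = new ArrayList<>();
        walk(doc, names);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b><c/></b><d/></a>"));
    }
}
```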

Extracting Elements from XML File

Question

How do I extract elements from the XML file?

Answer

If you're using DOM, the getElementsByTagName() method can be used to get all of the elements in the document.

Does a DTD Validate the DOM Tree?

Question

If I add a DTD to an XML Document, does it validate the DOM tree?

Answer

No, we do not do any validation while creating the DOM tree using the DOM APIs. So setting the DTD in the Document will not help in validating the DOM tree that is constructed. The only way to validate an XML file is to parse the XML document using the DOMParser or SAXParser.

First Child Node Element Value

Question

How do I efficiently obtain the value of first child node of the element without going through the DOM Tree?

Answer

If you do not need the entire tree, use the SAX interface to return the desired data. Since it is event-driven, it does not have to parse the whole document.

Creating DocType Node

Question

How do I create a DocType Node?

Answer

The only current way of creating a doctype node is by using the parseDTD functions. For example, emp.dtd has the following DTD:

<!ELEMENT employee (Name, Dept, Title)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Dept (#PCDATA)>
<!ELEMENT Title (#PCDATA)>

You can use the following code to create a doctype node:

// The file name is a quoted string; "employee" is the root element name
parser.parseDTD(new FileInputStream("emp.dtd"), "employee");
dtd = parser.getDocType();

XMLNode.selectNodes() Method

Question

How do I use the selectNodes() method in XMLNode class?

Answer

The selectNodes() method is used in XMLElement and XMLDocument nodes. This method is used to extract contents from the tree or subtree based on the select patterns allowed by XSL. The optional second parameter of selectNodes() is used to resolve Namespace prefixes (it returns the expanded namespace URL for a given prefix). XMLElement implements NSResolver, so it can be sent as the second parameter. XMLElement resolves the prefixes based on the input document. You can implement the NSResolver interface if you need to override the namespace definitions. The following sample code uses selectNodes():

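import org.w3c.dom.NodeList;
import oracle.xml.parser.v2.DOMParser;
import oracle.xml.parser.v2.XMLDocument;
import oracle.xml.parser.v2.XMLElement;

public class SelectNodesTest {
  public static void main(String[] args) throws Exception {
    // Default pattern; an optional second argument overrides it
    String pattern = "/family/member/text()";
    String file    = args[0];

    if (args.length == 2)
      pattern = args[1];

    DOMParser dp = new DOMParser();

    dp.parse(createURL(file));  // Include createURL from DOMSample
    XMLDocument xd = dp.getDocument();
    XMLElement e = (XMLElement) xd.getDocumentElement();
    // e is also passed as the NSResolver for namespace prefixes
    NodeList nl = e.selectNodes(pattern, e);
    for (int i = 0; i < nl.getLength(); i++) {
      System.out.println(nl.item(i).getNodeValue());
    }
  }
}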

> java SelectNodesTest family.xml
Sarah
Bob
Joanne
Jim

> java SelectNodesTest family.xml //member/@memberid
m1
m2
m3
m4

Using SAX API to Get the Data Value

Question

I am using SAX to parse an XML document. How does it get the value of the data?

Answer

During a SAX parse the value of an element will be the concatenation of the characters reported from after the startElement event to before the corresponding endElement event is called.
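The concatenation described above might be implemented roughly as follows. This sketch assumes the JDK's JAXP SAXParser and the SAX2 DefaultHandler rather than Oracle's SAXParser class; ElementTextHandler is a hypothetical name, and the handler does not attempt to deal with nested elements of the same name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ElementTextHandler extends DefaultHandler {
    private final String target;                       // element whose text we want
    private final StringBuilder buffer = new StringBuilder();
    private boolean inside;
    private String value;

    ElementTextHandler(String target) { this.target = target; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals(target)) {
            inside = true;
            buffer.setLength(0);                       // start collecting afresh
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one element's text in several chunks.
        if (inside) buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(target)) {
            inside = false;
            value = buffer.toString();                 // the concatenated value
        }
    }

    String getValue() { return value; }

    // Returns the text of the last element named elementName, or null if none.
    static String textOf(String xml, String elementName) throws Exception {
        ElementTextHandler handler = new ElementTextHandler(elementName);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getValue();
    }
}
```

Appending in characters() instead of assigning matters because a parser is free to split one run of text across multiple callbacks, for example around an entity reference.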

SAXSample.java

Question

Inside the SAXSample program, I did not see any line that explicitly calls setDocumentLocator and some other methods. However, these methods are 'run'. Can you explain when they are called and from where?

Answer

SAX is a standard interface for event-based XML parsing. The parser reports parsing events directly through callback functions such as setDocumentLocator() and startDocument(). The application, in this case, the SAXSample, implements handlers to deal with the different events. Here is a good place to help you start learning about the event-driven API, SAX: http://www.megginson.com/SAX/index.html

Does DOMParser implement Parser interface

Question

Does the XML Parser DOMParser implement org.xml.sax.Parser interface at all? The documentation says it implements XML Constants and the API does not include that class at all.

Answer

You'll want oracle.xml.parser.v2.SAXParser to work with SAX and to have something that implements the org.xml.sax.Parser interface.

Creating a New Document Type Node Via DOM

Question

I am trying to create an XML file on the fly. I use the NodeFactory to construct a document (createDocument()). I have then called setStandalone("no") and setVersion("1.0"). When I try to add a DOCTYPE node via appendChild(new XMLNode("test", Node.DOCUMENT_TYPE_NODE)), I get a ClassCastException. What is the mechanism to add a node of this type? I noticed that the NodeFactory did not have a mechanism for creating a DOCTYPE node.

Answer

There is no mechanism to create a new DOCUMENT_TYPE_NODE object via DOM APIs. The only way to get a DTD object is to parse the DTD file or the XML file using the DOMParser, and then use the getDocType() method.

Note that new XMLNode("test", Node.DOCUMENT_TYPE_NODE) does not create a DTD object. It creates an XMLNode object with the type set to DOCUMENT_TYPE_NODE, which in fact should not be allowed. The ClassCastException is raised because appendChild expects a DTD object (based on the type).

Also, we do not do any validation while creating the DOM tree using the DOM APIs. So setting the DTD in the Document will not help in validating the DOM tree that is constructed. The only way to validate an XML file is to parse the XML document using DOMParser or SAXParser.

Querying for First Child Node's Value of a Certain Tag

Question

I am using the XML Parser for Java v2. Given an XML document containing the values Calculus, Math, Jim, Green, Jack, Mary, and Paul, I want to obtain the value of the first child node of an element with a given tag. I could not find any method that can do that efficiently. The nearest match is the method getElementsByTag("Name"), which traverses the entire tree under that element.

Answer

Your best bet, if you do not need the entire tree, is to use the SAX interface to return the desired data. Since it is event driven it does not have to parse the whole document.
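One way to stop a SAX parse early is to throw a SAXException from the handler once the wanted value has been seen. The sketch below shows that technique with the JDK's JAXP SAXParser; it is an assumption that the same approach suits Oracle's SAXParser, and FirstValueFinder is a hypothetical class name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class FirstValueFinder extends DefaultHandler {
    private final String tagName;
    private final StringBuilder text = new StringBuilder();
    private boolean inside;
    private boolean done;

    FirstValueFinder(String tagName) { this.tagName = tagName; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (!done && qName.equals(tagName)) inside = true;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inside) text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (inside && qName.equals(tagName)) {
            inside = false;
            done = true;
            // Abort the parse: the rest of the document is not needed.
            throw new SAXException("first " + tagName + " element found; stopping");
        }
    }

    // Returns the text of the first element named tagName, or null if there is none.
    static String firstValue(String xml, String tagName) throws Exception {
        FirstValueFinder finder = new FirstValueFinder(tagName);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), finder);
        } catch (SAXException e) {
            if (!finder.done) throw e;   // a genuine parse error, not our early stop
        }
        return finder.done ? finder.text.toString() : null;
    }
}
```

The done flag distinguishes the deliberate stop from a real parse error, so the caller does not need to inspect the exception message.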

XML Document Generation From Data in Variables

Question

Is there an example of XML document generation starting from information contained in simple variables? An example would be: A client fills a Java form and wants to obtain an XML document containing the given data.

Answer

Here are two possible interpretations of your question and answers to both. Let's say you have two variables in Java:

String firstname = "Gianfranco";
String lastname = "Pietraforte";

The two ways that come to mind first to get this into an XML document are as follows:

  1. Make an XML document in a string and parse it.

    String xml = "<person><first>" + firstname + "</first>" +
                 "<last>" + lastname + "</last></person>";

    DOMParser d = new DOMParser();
    d.parse(new StringReader(xml));
    Document xmldoc = d.getDocument();
    
    
  2. Use DOM APIs to construct the document and "stitch" it together:

    Document xmldoc = new XMLDocument();
    Element e1 = xmldoc.createElement("person");
    xmldoc.appendChild(e1);
    Element e2 = xmldoc.createElement("first");
    e1.appendChild(e2);
    Text t = xmldoc.createTextNode(firstname);
    e2.appendChild(t);
    // and so on
    

Printing Data in the Element Tags: DOM API

Question

Can you suggest how to get a print out using the DOM API in Java:

<name>macy</name>

I want to print out "macy". I don't know which class and which function to use. I was successful in printing "name" to the console.

Answer

For DOM, you need to first realize that <name>macy</name> is actually an element named "name" with a child node (Text Node) of value "macy".

So, you can do the following:

String value = myElement.getFirstChild().getNodeValue();

Building XML Files from Hashtable Value Pairs

Question

We have a hash table of key-value pairs. How do we build an XML file out of it using the DOM API? We have a hashtable with key = value, name = george, zip = 20000. How do we build this?

<key>value</key><name>george</name><zip>20000</zip>

Is there a utility to do it automatically?

Answer

  1. Get the enumeration of keys from your hashtable

  2. Loop while enum.hasMoreElements()

  3. For each key in the enumeration, use the createElement() on DOM Document to create an element by the name of the key with a child text node with the value of the *value* of the hashtable entry for that key.
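The three steps above can be sketched as follows. The example assumes the standard JAXP DocumentBuilder and wraps the generated elements in a single root element, because a document can have only one root; the class name HashtableToXml and the root name "record" are hypothetical. Each key must also be a legal XML element name.

```java
import java.util.Enumeration;
import java.util.Hashtable;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class HashtableToXml {
    // Builds <rootName><key>value</key>...</rootName> from the table entries.
    static Document toDocument(Hashtable<String, String> table, String rootName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement(rootName);
        doc.appendChild(root);

        // Step 1: get the enumeration of keys
        Enumeration<String> keys = table.keys();
        // Step 2: loop while there are more keys
        while (keys.hasMoreElements()) {
            String key = keys.nextElement();
            // Step 3: element named after the key, with a text child holding the value
            Element entry = doc.createElement(key);
            entry.appendChild(doc.createTextNode(table.get(key)));
            root.appendChild(entry);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("name", "george");
        table.put("zip", "20000");
        Document doc = toDocument(table, "record");
        System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

A Hashtable does not guarantee iteration order, so the child elements may appear in any order.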

XML Parser for Java: wrong_document_err on Node.appendChild()

Question

I have a question regarding our XML parser (v2) implementation. Say if I have the following scenario:

  Document doc1 = new XMLDocument();
  Element element1 = doc1.createElement("foo");
  Document doc2 = new XMLDocument();
  Element element2 = doc2.createElement("bar");
  element1.appendChild(element2);  

My question is whether or not we should get a DOMException of WRONG_DOCUMENT_ERR on calling the appendChild() routine. This comes to my mind when I look at the XSLSample.java distributed with the XMLparser (v2). Any feedback would be greatly appreciated.

Answer

Yes, you should get this error, since the owner document of element1 is doc1 while that of element2 is doc2. appendChild() only works within a single tree, and you are dealing with two different ones.
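If the element really must end up in the other tree, one option is to copy it into the target document first. The sketch below assumes a DOM Level 2 implementation that provides Document.importNode() and uses the JDK's JAXP parser rather than Oracle's XMLDocument; CrossDocumentAppend is a hypothetical name.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class CrossDocumentAppend {
    // Appends an element created by doc2 under an element owned by doc1.
    static Element appendAcrossDocuments() throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        Document doc1 = builder.newDocument();
        Element foo = doc1.createElement("foo");
        doc1.appendChild(foo);

        Document doc2 = builder.newDocument();
        Element bar = doc2.createElement("bar");

        try {
            // Different owner documents: implementations normally reject this.
            foo.appendChild(bar);
        } catch (DOMException e) {
            // Typically e.code == DOMException.WRONG_DOCUMENT_ERR.
            // Copy the node into doc1 first, then append the copy.
            Node copy = doc1.importNode(bar, true);
            foo.appendChild(copy);
        }
        return foo;
    }

    public static void main(String[] args) throws Exception {
        Element foo = appendAcrossDocuments();
        System.out.println(foo.getFirstChild().getNodeName());
    }
}
```

importNode() leaves the original node in doc2 untouched and returns a copy owned by doc1.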

Question 2

In XSLSample.java that's shipped with xmlparser v2:

DocumentFragment result = processor.processXSL(xsl, xml);
// create an output document to hold the result
  out = new XMLDocument();
// create a dummy document element for the output document
  Element root = out.createElement("root");  
  out.appendChild(root);
// append the transformed tree to the dummy document element
   root.appendChild(result);

Nodes root and result are created from different XML Documents, wouldn't this result in the WRONG_DOCUMENT_ERR when we try to append result to root?

Answer 2

This sample uses a document fragment that does not have a root node, therefore there are not two XML documents.

Question 3

When appending a document fragment to a node, only the child nodes of the document fragment (but not the document fragment itself) are inserted. Wouldn't the parser check the owner document of these child nodes?

Comment

A DocumentFragment shouldn't be bound to a 'root' node, since, by definition, a fragment could very well be just a list of nodes. The root node, if any, should be considered a single child. That is, you could, for example, take all the lines of an Invoice document and add them into a ProviderOrder document, without taking the invoice itself. How do we create a DocumentFragment without a root, as the XSLT Processor does, so that we can append it to other documents?

Creating Nodes: DOMException when Setting Node Value

Question

I get the following error:

oracle.xml.parser.XMLDOMException: Node cannot be modified

while trying to set the value of a newly created node as below:

  String eName="Mynode";
  XMLNode aNode = new XMLNode(eName, Node.ELEMENT_NODE);     
  aNode.setNodeValue(eValue);

How do I create a node whose value I can set later on?

Answer

Check the DOM notes where they discuss the node type. You will see that if you are creating an element node, its nodeValue is null and hence cannot be set.

With SAX, How Can I Force the Parser to Not Discard Whitespace?

Question

I see the following problem when reading the attached file using SAX (Oracle XML Parser, v.2.0.2.9.0): if character data starts with whitespace, the characters() method discards the characters that follow the whitespace.

Is this a bug or can I force the parser to not discard those characters?

Answer

Use XMLParser.setPreserveWhitespace(true) to force the parser to not discard whitespace.

Validation

DTD: Understanding DOCTYPE and Validating Parser

Question

I have an XML string that contains the following reference to a DTD, which is physically located in the directory where I start my program. The validating XML parser complains that this file cannot be found.

<!DOCTYPE xyz SYSTEM "xyz.dtd" >

What are the rules for locating DTDs on the disk? Can anyone point me to a decent discussion of DOCTYPE attribute descriptions?

Answer

Are you parsing an InputStream or a URL? If you are parsing an InputStream, the parser doesn't know where that InputStream came from, so it cannot find the DTD in the "same directory as the current file". The solution is to call setBaseURL() on the DOMParser to give the parser the URL "hint" it needs to derive the location of the DTD when it goes to get it.

Can Multiple Threads Use Single XSLProcessor/Stylesheet?

Question

Can multiple threads use a single XSLProcessor/XSLStylesheet instance to perform concurrent (at the same time) transformations?

Answer

As long as you are processing multiple files with no more than one XSLProcessor/XSLStylesheet instance per XML file you can do this simultaneously using threads. If you take a look at the readme.html file in the bin directory, it describes ORAXSL which has a threads parameter for multi-threaded processing.

Is it Safe to Use Document Clones in Multiple Threads?

Question

Is it safe to use clones of a document in multiple threads? Is the public void setParam(String,String) throws XSLException method of class oracle.xml.parser.v2.XSLStylesheet supported? If not, is there another way to pass parameters at runtime to the XSLT Processor?

Answer

If you are copying the global area set up by the constructor to another thread then it should work.

That method is supported since XML Parser release 2.0.2.5.

Comment

You have it in your docs, but it is not implemented in the XSLStylesheet class (windows zip edition). First update your zip download file.

public static void serve(Document template, Document data,
                         Element userdata, PrintWriter out)
{
    XMLDocument clone = (XMLDocument)data.cloneNode(true);
    clone.getDocumentElement().appendChild(userdata.cloneNode(true));
    serve(template, clone, out);
}

Character Sets

Encoding iso-8859-1 in xmlparser

Question

I have some XML-Documents with encoding="iso-8859-1". I am trying to parse these with xmlparser SAX API. In characters (char[], int, int), I would like to output the content in iso-8859-1 (Latin1) too.

With System.out.println() it doesn't work correctly. German umlauts result in '?' in the output stream. Internally, ä, ö, ü, Ä, Ö, Ü, ß are stored as 65508, 65526, 65532, 65476, 65494, 65500, 65503 respectively. What do I have to do to get the output in Latin1? The host system here is SPARC Solaris 2.6.

Answer

You cannot use System.out.println(). You need to use an output stream which is encoding aware, for example, OutputStreamWriter.

You can construct an OutputStreamWriter and use the write(char[], int, int) method to print. For example:

OutputStreamWriter out = new OutputStreamWriter(System.out, "8859_1");
/* Java enc string for ISO8859-1 */
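A slightly fuller sketch of the same idea follows. It assumes the canonical charset name "ISO-8859-1" (the "8859_1" alias above names the same encoding); Latin1Output and its helper methods are hypothetical names, and the encode() helper exists only to make the resulting bytes visible.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

class Latin1Output {
    // Writes a slice of a char array to the stream using the named encoding,
    // mirroring the (char[], int, int) shape of the SAX characters() callback.
    static void write(OutputStream target, String encoding,
                      char[] ch, int start, int length) throws IOException {
        Writer out = new OutputStreamWriter(target, encoding);
        out.write(ch, start, length);
        out.flush();   // flush but do not close, so the target stream stays usable
    }

    // Returns the bytes produced when text is written in the given encoding.
    static byte[] encode(String text, String encoding) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        char[] ch = text.toCharArray();
        write(bytes, encoding, ch, 0, ch.length);
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        char[] ch = "M\u00fcnchen\n".toCharArray();
        write(System.out, "ISO-8859-1", ch, 0, ch.length);
    }
}
```

In ISO-8859-1 the character ü (U+00FC) is written as the single byte 0xFC, whereas UTF-8 would write it as two bytes.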

Parsing XML Stored in NCLOB With UTF-8 Encoding

Question

I'm having trouble with parsing XML stored in NCLOB column using UTF-8 encoding. Here is what I'm running:

The following XML sample that I loaded into the database contains two UTF-8 multi-byte characters:

<?xml version="1.0" encoding="UTF-8"?>
<G>
<A>GÂ,otingen, Brück_W</A>
</G>

G(0xc2, 0x82)otingen, Br(0xc3, 0xbc)ck_W

If I'm not mistaken, both multibyte characters are valid UTF-8 encodings and they are defined in ISO-8859-1 as:

 0xC2 LATIN CAPITAL LETTER A WITH CIRCUMFLEX
0xFC  LATIN SMALL LETTER U WITH DIAERESIS

I wrote a Java stored function that uses the default connection object to connect to the database, runs a Select query, gets the OracleResultSet, calls the getCLOB method and calls the getAsciiStream() method on the CLOB object. Then it executes the following piece of code to get the XML into a DOM object:

DOMParser parser = new DOMParser();
parser.setPreserveWhitespace(true);
parser.parse(istr);   // istr is the stream from getAsciiStream()
XMLDocument xmldoc = parser.getDocument();

Before the stored function can do other tasks, this code throws an exception complaining that the above XML contains "Invalid UTF8 encoding".

I loaded the sample XML into the database using the thin JDBC driver. I tried two database configurations with WE8ISO8859P1/WE8ISO8859P1 and WE8ISO8859P1/UTF8 and both showed the same problem.

Answer

Yes, the character (0xc2, 0x82) is valid UTF-8. We suspect that the character is distorted when getAsciiStream() is called. Try using getUnicodeStream() or getBinaryStream() instead of getAsciiStream().

If this does not work, try printing out the characters first to make sure that they are not distorted before they are sent to the parser in the step parser.parse(istr).

NLS support within XML

Question

I've got Japanese data stored in an nvarchar2 field in the database. I have a dynamic SQL procedure that utilizes the PL/SQL web toolkit that allows me to access data via OAS and a browser. This procedure uses the XML Parser to correctly format the result set in XML before returning it to the browser.

My problem is that the Japanese data is returned and displayed on the browser as upside down question marks. Is there anything I can do so that this data is correctly returned and displayed as Kanji?

Answer

Unfortunately, the default character set for Java and XML is UTF-8, while I have not heard of any operating system that uses UTF-8, nor of many people using it in their databases or writing their web pages in it. All this means is that you have a character code conversion problem. The answer to your last question is yes: we do have both the PL/SQL and Java XML parsers working in Japanese. Unfortunately, we cannot provide a simple solution that will fit in this space.

UTF-16 Encoding with XML Parser for Java V2

Question

This is my XML Document:

Documento de Prueba de gestión de contenidos. Roberto Pérez Lita

This is the way in which I parse the document:

DOMParser parser=new DOMParser(); 
parser.setPreserveWhitespace(true); 
parser.setErrorStream(System.err); 
parser.setValidationMode(false); 
parser.showWarnings(true);
parser.parse ( new FileInputStream(new File("PruebaA3Ingles.xml")));

I get the following error:

XML-0231 : (Error) Encoding 'UTF-16' is not currently supported

I am using the XML Parser for Java V2_0_2_5 and I am confused because the documentation says that UTF-16 encoding is supported in this version of the parser. Does anybody know how I can parse documents containing Spanish accents?

Answer

Oracle just uploaded a new release of the V2 Parser. It should support UTF-16. However, other utilities still have some problems with UTF-16 encoding.

How Can I Read in Accented Characters?

Question

I need to store accented characters in my XML documents. If I manually add an accented character e.g. é, to my XML file and then attempt to parse the XML doc. with Oracle's XML Parser for Java the Parser throws the following exception:

'Invalid UTF-8 encoding' 

Here's my encoding declaration in my xml header:

<?xml version="1.0" encoding="UTF-8"?> 

Aside: If I specify UTF-16 as the default encoding the Oracle XML Parser for Java states that UTF-16 is not currently supported. From within my Java program if I define a Java String object as follows:

String name = "éééé"; 

and programmatically generate an XML document and save it to file then the é character is correctly written out to file. Can you tell me how I can successfully read in character data consisting of accented characters. I know that I can read in accented characters once I represent them in their HEX or Decimal format within the XML document, for example:

&#xe9; 

but I'd prefer not to do this.

Answer 1

You need to set the encoding based on the character set you were using when you created the xml file - I ran into this problem & solved it by setting the encoding to iso-8859-1 (western european ascii) - you may need to use something different depending on the tool and/or operating system you were using.

If you explicitly set the encoding to UTF-8 (or do not specify it at all), the parser interprets your accented character (which has an ascii value > 127) as the first byte of a UTF-8 multi-byte sequence. If the subsequent bytes do not form a valid UTF-8 sequence, you get this error.

Answer 2

This error just means that your editor is not saving the file with UTF-8 encoding. For example, it might be saving it with ISO-8859-1 encoding. Remember that the encoding is a particular scheme used to write the Unicode character number representation to disk. Just adding the string to the top of the document like:

<?xml version="1.0" encoding="UTF-8"?>

does not cause your *editor* to write out the bytes representing the file to disk using UTF-8 encoding. I believe Notepad uses UTF-8, so you might try that.
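The mismatch between the declared encoding and the bytes on disk can be seen in a small sketch. It assumes only the standard java.nio.charset classes; EncodingCheck is a hypothetical name, and the strict decoder stands in for the check a parser performs when it reports 'Invalid UTF-8 encoding'.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // Returns true if the bytes form a valid UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            // A new decoder reports malformed input instead of replacing it.
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u00e9";   // the character é

        // Saved by an editor using ISO-8859-1: the single byte 0xE9
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        // Saved by an editor using UTF-8: the two bytes 0xC3 0xA9
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 bytes valid as UTF-8: " + isValidUtf8(latin1));
        System.out.println("UTF-8 bytes valid as UTF-8: " + isValidUtf8(utf8));
    }
}
```

A lone 0xE9 byte looks like the start of a multi-byte UTF-8 sequence with nothing after it, which is why a file saved as ISO-8859-1 but declared as UTF-8 is rejected.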

Adding XML Document as a Child

Adding an XMLDocument as a Child to Another Element

Question

I am trying to add an XMLDocument as a child to an existing element. Here's an example:

import org.w3c.dom.*;
import java.util.*;
import java.io.*;
import java.net.*;
import oracle.xml.parser.v2.*;
public class ggg {
  public static void main(String[] args) throws Exception {
    new ggg().doWork();
  }

  public void doWork() throws Exception {
    XMLDocument doc1 = new XMLDocument();
    Element root1 = doc1.createElement("root1");
    XMLDocument doc2 = new XMLDocument();
    Element root2 = doc2.createElement("root2");
    root1.appendChild(root2);
    doc1.print(System.out);
  }
}

This reports:

D:\Temp\Oracle\sample>c:\jdk1.2.2\bin\javac -classpath D:\Temp\Oracle\lib\xmlparserv2.jar;. ggg.java

D:\Temp\Oracle\sample>c:\jdk1.2.2\bin\java -classpath D:\Temp\Oracle\lib\xmlparserv2.jar;. ggg
Exception in thread "main" java.lang.NullPointerException
        at oracle.xml.parser.v2.XMLDOMException.<init>(XMLDOMException.java:67)
        at oracle.xml.parser.v2.XMLNode.checkDocument(XMLNode.java:919)
        at oracle.xml.parser.v2.XMLNode.appendChild(XMLNode.java, Compiled Code)
        at oracle.xml.parser.v2.XMLNode.appendChild(XMLNode.java:494)
        at ggg.doWork(ggg.java:20)
        at ggg.main(ggg.java:12)

Answer a

The following works for me:

DocumentFragment rootNode = new XMLDocumentFragment();
DOMParser d = new DOMParser();
d.parse("http://.../stuff.xml");
Document doc = d.getDocument();
Element e = doc.getDocumentElement();
// Important to remove it from the first doc
// before adding it to the other doc.
doc.removeChild(e);
rootNode.appendChild(e);

You need to use the DocumentFragment class to do this as a document cannot have more than one root.

Answer b

Actually, isn't this specifically a problem with appending a node created in another document, since all nodes contain a reference to the document they are created in? While DocumentFragment solves this, it isn't a "more than one root" problem, is it? Is there a quick or easy way to convert an org.w3c.dom.Document to an org.w3c.dom.DocumentFragment?

Adding an XML DocumentFragment as a Child to XMLDocument

Question

I have this piece of code:

XSLStylesheet XSLProcessorStylesheet = new XSLStylesheet(XSLProcessorDoc, XSLProcessorURL);
XSLStylesheet XSLRendererStylesheet = new XSLStylesheet(XSLRendererDoc, XSLRendererURL);
XSLProcessor processor = new XSLProcessor();
// configure the processor
processor.showWarnings(true);
processor.setErrorStream(System.err);
XMLDocumentFragment processedXML = processor.processXSL(XSLProcessorStylesheet, XMLInputDoc);
XMLDocumentFragment renderedXML = processor.processXSL(XSLRendererStylesheet, processedXML);
Document resultXML = new XMLDocument();
resultXML.appendChild(renderedXML);

The last line causes:

Exception in thread "main" oracle.xml.parser.v2.XMLDOMException: Node of this type cannot be added.

Do I have to create a root element _every time_, even if I know that the resulting DocumentFragment is a well formed XML Document (and of course has only one root element!)?

Answer

It happens, as you have guessed, because a Fragment can have more than one "root" element (for lack of a better term). In order to work around this, use the Node functions to extract the one root element from your fragment and cast it into an

Uninstalling Parsers

Removing XML Parser from the Database

Question

I am uninstalling a version of XML Parser and installing a newer version. How do I do that? I know that there is something like dropjava, but still there are other packages which are loaded into the schema. I want to clean out the earlier version and install the new version in a clean manner.

Answer

You'll need to write SQL that generates SQL, based on the USER_OBJECTS table:

SELECT 'drop java class ''' || dbms_java.longname(object_name) || ''';'
  FROM user_objects
 WHERE object_type = 'JAVA CLASS'
   AND dbms_java.longname(object_name) LIKE 'oracle/xml/parser/%';

This will produce a set of DROP JAVA CLASS commands, which you can capture in a file using the SQL*Plus SPOOL somefilename command.

Then run that spool file as a SQL script and all the right classes will be dropped.

XML Parser for Java: Installation

XMLPARSER Fails to Install

Question

I'm getting an error message when I try installing XMLPARSER:

loadjava -user username/manager -r -v xmlparserv2.jar

Error:

Exception in thread "main" java.lang.NoClassDefFoundError: oracle/jdbc/driver/OracleDriver at oracle.aurora.server.tools..

Answer

This is a failure to find the JDBC classes111.zip in your classpath. The loadjava utility connects to the database to load your classes using the JDBC driver.

I checked 'loadjava' and the path to classes111.zip is

<ORACLE_HOME>/jdbc/lib/classes111.zip

In version 8.1.6, classes111.zip resides in:

<ORACLE_HOME>/jdbc/admin

General XML Parser Related Questions

How the XML Parser Works

Question

What does an XML Parser do?

Answer

The parser accepts any XML document and gives you a tree-based API (DOM) to access or modify the document's elements and attributes, as well as an event-based API (SAX) that lets you register a listener to be notified of specific elements, attributes, and other document events.

Converting XML Files to HTML Files

Question

How do I convert XML files into HTML files?

Answer

You need to create an XSL stylesheet to render your XML into HTML. You can start with an HTML document in your desired format and populated with dummy data. Then you can replace this data with the XSLT commands that will populate the HTML with data from the XML document completing your stylesheet.
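As a rough illustration of this workflow, the sketch below applies a small stylesheet to an XML string. It assumes the standard JAXP javax.xml.transform API in place of Oracle's XSLProcessor; the class name XmlToHtml, the stylesheet, and the sample family document are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToHtml {
    // An HTML skeleton in which the dummy data has been replaced by XSLT commands.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\"/>"
      + "<xsl:template match=\"/family\">"
      + "<html><body><ul>"
      + "<xsl:for-each select=\"member\"><li><xsl:value-of select=\".\"/></li></xsl:for-each>"
      + "</ul></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result as a string.
    static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<family><member>Sarah</member><member>Bob</member></family>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```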

Does XML Parser Validate Against XML Schema?

Question

Does the XML Parser v2 validate against an XML Schema?

Answer

Yes. It supports both validating and non-validating modes. XML Schema is still under development by the W3C XML Schema committee and is supported by Oracle9i. Currently, XML Parser for Java supports DTDs and XML Schemas through four modes: non-validating mode, DTD validating mode, partial validation mode, and schema validation mode.

Including Binary Data in an XML Document

Question

How do I include binary data in an XML document?

Answer

There is no way to directly include binary data within the document; however, there are two ways to work around this:

What is XML Schema?

Question

What is the XML Schema?

Answer

XML Schema is a W3C XML standards effort to bring the concept of data types to XML documents and in the process replace the syntax of DTDs to one based on XML. For more details, check out http://www.w3.org/TR/xmlschema-1/ and http://www.w3.org/TR/xmlschema-2/. XML Schema is supported in Oracle9i and higher.

Oracle's Participation in Defining the XML/XSL Standard

Question

Does Oracle participate in defining the XML/XSL standard?

Answer

Oracle has representatives participating actively in the following W3C Working Groups related to XML/XSL: XML Schema, XML Query, XSL, XLink/XPointer, XML Infoset, DOM and XML Core.

XDK Version Numbers

Question

How do I determine the version number of the XDK toolkit that I downloaded?

Answer

You can find out the full version number by looking at the readme.html file included in the archive and linked off of the Release Notes page.

Inserting <, >, >= and <= in XML Documents

Question

How do I insert these characters in the XML documents: >,<,>=, and <=?

Answer

You need to use the entities &lt; for < and &gt; for >. The two-character operators follow from these: write &gt;= for >= and &lt;= for <=.
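When the text is produced by a program, a small helper can apply these replacements. The following is a minimal sketch with a hypothetical class name XmlEscape; it also escapes the ampersand as &amp;, which is not mentioned above but is needed so that the generated entities themselves stay unambiguous.

```java
class XmlEscape {
    // Replaces the characters that are special in XML character data.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a <= b and c >= d"));
    }
}
```

If the document is built through the DOM API instead, the text can be passed to createTextNode() unescaped, because the serializer performs this escaping when the document is written out.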

Are Namespace and Schema Supported

Question

Is support for Namespaces and Schema included?

Answer

The current XML parsers support Namespaces. Schema support is provided in Oracle9i and higher.

Using JDK 1.1.x with XML Parser for Java v2

Question

Can I use JDK 1.1.x with XML Parser v2 for Java?

Answer

v2 of XML Parser for Java has nothing to do with Java2. It is simply a designation that indicates that it is not backwards compatible with the v1 Parser and that it includes XSLT support. The v2 parser will work fine with JDK 1.1.x.

Sorting the Result on the Page

Question

I have a set of, say, 100 records and I am showing 10 at a time. I have made each column name a link; when it is clicked, I want to sort only the data on that page, based on that column. How do I go about it?

Answer

It depends on how you are going about it. If you are writing for IE5 alone and receiving XML data, you could just use Microsoft's XSL to sort data in a page. If you are writing for other browsers and the browsers are getting data as HTML, then you have to have a sort parameter in the XSQL script and use it in the ORDER BY clause. Just pass it along with the skip-rows parameter.

Is Oracle9i Needed to Run XML Parser for Java?

Question

Do I need Oracle9i to run the XML Parser for Java?

Answer

XML Parser for Java can be used with any supported version of the Java VM. The only difference with Oracle9i is that you can load the parser into the database and use JServer, which is an internal JVM. For other database versions or servers, you simply run it in an external JVM and, as necessary, connect to a database through JDBC.

Dynamically Setting the Encoding in an XML File

Question

Is it possible to dynamically set the encodings in the XML file?

Answer

No, you need to include the proper encoding declaration in your document, as the specification requires. You cannot use setEncoding() to set the encoding for your input document. setEncoding() is used with oracle.xml.parser.v2.XMLDocument to set the correct encoding for printing.

Parsing a String

Question

How do I parse a string?

Answer

We do not currently have any method that can directly parse an XML document contained within a String. You would need to convert the String into an InputStream or InputSource before parsing. An easy way is to create a ByteArrayInputStream using the bytes in the String.
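As an illustration of the InputSource route, the following sketch uses the standard JAXP classes (javax.xml.parsers) rather than the Oracle parser classes, so it is a stand-in and not the XDK API. The class name StringParseSketch is hypothetical. Wrapping the String in a StringReader hands characters to the parser directly, which avoids any mismatch between the bytes produced by String.getBytes() and the encoding named in the XML declaration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Sketch only: parses XML held in a String by way of an InputSource.
final class StringParseSketch {
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The InputSource reads characters, so no byte encoding is involved.
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
    }
}
```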

Displaying an XML Document

Question

How do I display my XML document?

Answer

If you are using IE5 as your browser you can display the XML document directly. Otherwise, you can use our XSLT Processor in v2 of the parser to create the HTML document using an XSL Stylesheet. The Oracle XML Transviewer bean also allows you to view your XML document.

System.out.println() and Special Characters

Question

I am having problems using System.out.println() with special character encoding.

Answer

You can't use System.out.println(). You need to use an output stream that is encoding aware (for example, OutputStreamWriter). You can construct an OutputStreamWriter and use the write(char[], int, int) method to print.

/* Example: "8859_1" is the Java encoding name for ISO 8859-1 */
OutputStreamWriter out = new OutputStreamWriter(System.out, "8859_1");
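To see the effect of the encoding argument, the following sketch writes the same text through an OutputStreamWriter with a caller-supplied charset and returns the resulting bytes. The class name EncodedOutputSketch is hypothetical, and a ByteArrayOutputStream stands in for System.out so that the bytes can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Sketch only: shows that the writer, not the stream, decides the byte encoding.
final class EncodedOutputSketch {
    /** Encodes text with the named charset and returns the bytes written. */
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, charsetName)) {
            char[] chars = text.toCharArray();
            out.write(chars, 0, chars.length);   // write(char[], int, int)
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for unknown charset names
            throw new IllegalArgumentException("Cannot encode as " + charsetName, e);
        }
        return sink.toByteArray();               // writer was closed, so all bytes are flushed
    }
}
```

Passing System.out instead of the byte array sink gives the console behavior described above; remember to flush the writer in that case.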

Obtaining Ampersand from Character Data

Question

How do I get an ampersand from character data?

Answer

You cannot have "raw" ampersands in XML data. You need to use the entity, &amp; instead. This is defined in the XML standard.

How Can We Use Special Characters in the Tags?

Question

I have a tag in XML <COMPANYNAME>

When we try to use "A&B", the parser gives an error with invalid character. How do we use special characters when parsing companyname tag? We are using the Oracle XML Parser for C.

Answer 1

You have to represent literal...

& as &amp;

< as &lt;

Answer 2

I think you may want to use special characters as part of XML name. For example: <A&B>abc</A&B>

If this is the case, using an entity reference does not solve the problem. According to the XML 1.0 specification (http://www.w3.org/TR/2000/REC-xml-20001006), NameChar and Name are defined as follows:

[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar |Extender

[5] Name ::= (Letter | '_' | ':') (NameChar)*

To answer your question, special characters such as '&', '$', and '#' are not allowed as NameChar. Hence, if you are creating an XML document from scratch, you can work around this by using only valid NameChars. For example, <A_B>, <AB>, <A_AND_B>...

They are still readable.

If you are generating XML from external data sources such as database tables, then this is a problem. XML 1.0 does not address it.

In Oracle, the new type, XMLType, will help address this problem by offering a function which maps SQL names to XML names. This will address this problem at the application level. The SQL to XML name mapping function will escape invalid XML NameChar in the format of _XHHHH_ where HHHH is a Unicode value of the invalid character. For example, table name "V$SESSION" will be mapped to XML name "V_X0024_SESSION".

Ultimately, escaping invalid characters is a hack that gives people a way to serialize names so that they can reload them somewhere else.
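The _XHHHH_ escaping scheme described above can be sketched in a few lines. This is an illustration of the idea only and is not Oracle's mapping function: the names NameEscapeSketch and toXmlName are hypothetical, the NameChar test is approximated with Character.isLetterOrDigit plus '.', '-', and '_', and the stricter rule for the first character of a Name is ignored.

```java
// Sketch only: approximates the SQL-to-XML name escaping described in the text.
final class NameEscapeSketch {
    static String toXmlName(String sqlName) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < sqlName.length(); i++) {
            char c = sqlName.charAt(i);
            // Simplified NameChar check (assumption, not the full XML 1.0 production)
            boolean allowed = Character.isLetterOrDigit(c)
                    || c == '.' || c == '-' || c == '_';
            if (allowed) {
                out.append(c);
            } else {
                // Replace the character with _XHHHH_, HHHH being its Unicode value in hex
                out.append(String.format("_X%04X_", (int) c));
            }
        }
        return out.toString();
    }
}
```

With this sketch, the table name V$SESSION from the text maps to V_X0024_SESSION, because '$' has the Unicode value 0024.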

Parsing XML from Data of Type String

Question

How do I parse XML from data of type String?

Answer

Check out the following example:

/* xmlDoc is a String containing the XML */
byte[] aByteArr = xmlDoc.getBytes();
ByteArrayInputStream bais = new ByteArrayInputStream(aByteArr, 0, aByteArr.length);
domParser.parse(bais);

Extracting Data from XML Document into a String

Question

How do I extract data from an XML document into type String?

Answer

Here is an example to do that:

XMLDocument yourDocument;
/* Parse the document and make modifications */
// ...
StringWriter sw = new StringWriter();
PrintWriter  pw = new PrintWriter(sw);
yourDocument.print(pw);
pw.flush();
String yourDocInString = sw.toString();

Disabling Output Escaping

Question

Does XML Parser for Java support Disabling Output Escaping?

Answer

Yes, since version 2.022, the parser provides an option to xsl:text to disable output escaping.

Using the XML Parser for Java with Oracle 8.0.5

Question

Is the XML Parser for Java only available for use with Oracle9i? Is it possible to use it with Oracle 8.0.5?

Answer

The XML Parser for Java can be used with any supported version of the Java VM. The only difference with Oracle9i is that you can load the parser into the database and use JServer, which is an internal VM. For 8.0.5, you simply run it externally and connect through JDBC.

Delimiting Multiple XML Documents

Question

We need to be able to read (and separate) several XML documents delivered as a single string. One solution would be to delimit these documents using some (programmatically generated) special character that we know for sure can never occur inside an XML document. The individual documents can then be easily tokenized and extracted or parsed as required.

Has anyone else done this before? Any suggestions for what character can be used as the delimiter? For instance, can characters in the range #x0-#x8 ever occur inside an XML document?

Answer

As far as legality is concerned, if you limit yourself to 8-bit characters, then #x0-#x8, #xB, #xC, and #xE-#x1F are not legal in an XML document. However, this assumes that you preprocess the document yourself rather than depend on parser exceptions, because not all parsers reject all illegal characters.
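A preprocessing check of this kind could be sketched as follows. The test follows the Char production of the XML 1.0 specification; the class and method names (XmlCharSketch, isLegalXmlChar, firstIllegalIndex) are hypothetical.

```java
// Sketch only: locate the first character that XML 1.0 does not allow.
final class XmlCharSketch {
    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first illegal character, or -1 if there is none. */
    static int firstIllegalIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);          // handles surrogate pairs
            if (!isLegalXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

A character for which isLegalXmlChar returns false, such as #x1, is therefore a candidate delimiter, provided that each extracted document is checked (or trusted) not to contain it.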


XML and Entity-references: XML Parser for Java

Question

  1. The XML Parser for Java does not expand entity references, such as &[whatever]. Instead, all values are null. How can I fix this?

  2. It seems that you cannot have international characters (such as Swedish characters) as values for internal entities. How does one solve this problem?

Answer

  1. You probably have a simple error in defining or using your entities, since we have a number of regression tests that handle entity references correctly. A simple example is: ]> Alpha, then &status

  2. What do you set your character set encoding to be?
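A small self-contained check can help separate a mistake in the entity definition from a parser problem. The sketch below uses the standard JAXP DOM parser as a stand-in for the Oracle parser; the class name EntitySketch, the document, and the entity names in the usage note are hypothetical. Because the document is supplied as characters through a StringReader, non-ASCII entity values do not depend on a byte encoding.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Sketch only: returns the text of the root element with entities expanded.
final class EntitySketch {
    static String textOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setExpandEntityReferences(true);   // this is also the default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }
}
```

For example, a document that declares `<!ENTITY status 'Beta'>` in its internal DTD subset and contains `Alpha, then &status;` should yield the text "Alpha, then Beta". Note that the reference needs both the leading ampersand and the trailing semicolon.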

Can I Break up and Store an XML Document without a DDL Insert?

Question

  1. We would like to break apart an arbitrary XML document and store it in the database without creating a DDL to insert. Is this possible?

  2. And as for querying, is it possible to perform hierarchical searches across XML documents?

Answer

  1. No, this is not possible. Either the schema must already exist, or an XSL stylesheet that creates the DDL from the XML must exist.

  2. From Oracle8i Release 8.1.6 and higher, interMedia Text (now called Oracle Text) can do this.

Merging XML Documents

Question

How can I merge two XML Documents?

Answer

The current DOM specification does not provide a direct way to do this; the DOM2 specification may address it.

As a workaround, you can use a DOM-based approach or an XSLT-based approach. If you use DOM, then you will have to remove the node from one document before you append it to the other document, to avoid ownership errors.

Here's an example of the XSL-based approach. Assume your two XML source files are:

demo1.xml

<messages>
  <msg>
    <key>AAA</key>
    <num>01001</num>
  </msg>
  <msg>
    <key>BBB</key>
    <num>01011</num>
  </msg>
</messages>

demo2.xml

<messages>
  <msg>
    <key>AAA</key>
    <text>This is a Message</text>
  </msg>
  <msg>
    <key>BBB</key>
    <text>This is another Message</text>
  </msg>
</messages>

Here is a stylesheet that "joins" demo1.xml to demo2.xml based on matching the "<key>" values.

demomerge.xsl

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:variable name="doc2" select="document('demo2.xml')"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="msg">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
      <text><xsl:value-of select="$doc2/messages/msg[key=current()/key]/text"/></text>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

If you use the command-line "oraxsl" to test this out, you would do:

$ oraxsl demo1.xml demomerge.xsl

And you'll get the merged result of:

<messages>
  <msg>
    <key>AAA</key>
    <num>01001</num>
    <text>This is a Message</text>
  </msg>
  <msg>
    <key>BBB</key>
    <num>01011</num>
    <text>This is another Message</text>
  </msg>
</messages>

This is obviously not as efficient for larger files as an equivalent database "join" between two tables, but it illustrates the technique if you only have XML files to work with.
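For comparison, the DOM-based alternative mentioned earlier in this answer might be sketched as follows. This is an assumption-laden illustration: it uses the standard JAXP parser and the DOM Level 2 importNode() method instead of Oracle-specific classes, the class name DomMergeSketch is hypothetical, and it simply appends the children of the second document's root to the first document's root rather than joining on the key values.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Sketch only: copy the root's children from one document into another.
final class DomMergeSketch {
    static Document merge(String firstXml, String secondXml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document target = builder.parse(new InputSource(new StringReader(firstXml)));
            Document source = builder.parse(new InputSource(new StringReader(secondXml)));
            Node child = source.getDocumentElement().getFirstChild();
            while (child != null) {
                // importNode creates a deep copy owned by the target document,
                // which avoids the ownership error described in the text
                target.getDocumentElement().appendChild(target.importNode(child, true));
                child = child.getNextSibling();
            }
            return target;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not merge documents", e);
        }
    }
}
```

Because importNode copies rather than moves, the source document is left unchanged.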

Getting the Value of a Tag

Question

I am using SAX to parse an XML document. How can I get the value of a particular tag? For example, <title>Java</title>. How do I get the value for title? I know there are startElement, endElement, and characters methods.

Answer

During a SAX parse the value of an element will be the concatenation of the characters reported from after startElement to before the corresponding endElement is called.
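The concatenation described above can be sketched with a handler that buffers the characters() callbacks between startElement and endElement. The sketch uses the standard SAX and JAXP classes; the class name ElementValueSketch and its valueOf helper are hypothetical. It returns the value of the last matching element and does not handle an element nested inside another element of the same name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch only: collects the character data of one named element.
final class ElementValueSketch extends DefaultHandler {
    private final String wanted;
    private final StringBuilder buffer = new StringBuilder();
    private boolean inside;
    private String value;

    ElementValueSketch(String wanted) { this.wanted = wanted; }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (qName.equals(wanted)) { inside = true; buffer.setLength(0); }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append each chunk
        if (inside) buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (qName.equals(wanted)) { inside = false; value = buffer.toString(); }
    }

    static String valueOf(String xml, String elementName) {
        try {
            ElementValueSketch handler = new ElementValueSketch(elementName);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.value;   // null if the element never occurred
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }
}
```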

Granting JAVASYSPRIV to User

Question

We are using Oracle XML Parser for Java on NT 4.0. When we are parsing an XML document with an external DTD we get the following error:

<!DOCTYPE listsamplereceipt SYSTEM "file:/E:/ORACLE/utl_file_dir/dadm/ae.dtd">
java.lang.SecurityException
  at oracle.aurora.rdbms.SecurityManagerImpl.checkFile(SecurityManagerImpl.java)
  at oracle.aurora.rdbms.SecurityManagerImpl.checkRead(SecurityManagerImpl.java)
  at java.io.FileInputStream.<init>(FileInputStream.java)
  at java.io.FileInputStream.<init>(FileInputStream.java)
  at sun.net.www.MimeTable.load(MimeTable.java)
  at sun.net.www.MimeTable.<init>(MimeTable.java)
  at sun.net.www.MimeTable.getDefaultTable(MimeTable.java)
  at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java)
  at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java)
  at java.net.URL.openStream(URL.java)
  at oracle.xml.parser.v2.XMLReader.openURL(XMLReader.java:2313)
  at oracle.xml.parser.v2.XMLReader.pushXMLReader(XMLReader.java:176)
  at ...

What is causing this?

Answer

Grant the JAVASYSPRIV role to your user running this code to allow it to open the external file/URL.

Including an External XML File in Another XML File: External Parsed Entities

Question

  1. I am trying to include an external XML file in another XML file. Does Oracle Parser for Java v1 and v2 support external parsed entities?

  2. We are using version 1.0, because that is what is shipped to the customers with release 10.7 and 11.0 of our application. Can you refer me to this, or some other sample code to do this.

    Shouldn't file b.xml be in the format:

    <?xml version="1.0" ?>
    <b>
      <ok/>
    </b>
    
    

    Does Oracle XML Parser come with a utility to parse an XML file and see the parsed output?

Answer

  1. IE 5.0 will parse an XML file and show the parsed output. Just load the file like you would an HTML page.

    The following works, both browsing it in IE5 as well as parsing it with Oracle XML Parser v2. Even though I'm sure it works fine in Oracle XML Parser 1.0, you should be using the latest parser version as it is faster than v1.

    File: a.xml

    <?xml version="1.0" ?>
    <!DOCTYPE a [<!ENTITY b SYSTEM "b.xml">]>
     <a>&b;</a>
    
    

    File: b.xml

     <ok/>
    
    

    When I browse/parse a.xml I get the following:

    <a>
      <ok/>
    </a>
    
    
  2. Not strictly. The parsed external entity only needs to be a well-formed fragment. The following program (with xmlparser.jar from v1.0 in your CLASSPATH) shows parsing and printing the parsed document. It parses from a String here, but the mechanism would be no different for parsing from a file, given its URL.

    import oracle.xml.parser.*;
    import java.io.*;
    import java.net.*;
    import org.w3c.dom.*;
    import org.xml.sax.*;
    /*
    ** Simple Example of Parsing an XML File from a String
    ** and, if successful, printing the results.
    **
    ** Usage: java ParseXMLFromString <hello><world/></hello>
    */
    public class ParseXMLFromString {
      public static void main( String[] arg ) throws IOException, SAXException {
        String theStringToParse =
           "<?xml version='1.0'?>"+
           "<hello>"+
           "  <world/>"+
           "</hello>";
        XMLDocument theXMLDoc = parseString( theStringToParse );
        // Print the document out to standard out
        theXMLDoc.print(System.out);
      }
      public static XMLDocument parseString( String xmlString ) throws
       IOException, SAXException {
        XMLDocument theXMLDoc = null;
        // Create an oracle.xml.parser.XMLParser (v1) to parse the document.
        XMLParser theParser = new XMLParser();
        // Open an input stream on the string
        ByteArrayInputStream theStream =
             new ByteArrayInputStream( xmlString.getBytes() );
        // Set the parser to work in non-validating mode
        // (assumes the v1 setValidationMode(boolean) signature)
        theParser.setValidationMode(false);
        try {
          // Parse the document from the InputStream
          theParser.parse( theStream );
          // Get the parsed XML Document from the parser
          theXMLDoc = theParser.getDocument();
        }
        catch (SAXParseException s) {
          System.out.println(xmlError(s));
          throw s;
        }
        return theXMLDoc;
      }
      private static String xmlError(SAXParseException s) {
         int lineNum = s.getLineNumber();
         int  colNum = s.getColumnNumber();
         String file = s.getSystemId();
         String  err = s.getMessage();
         return "XML parse error in file " + file +
                "\n" + "at line " + lineNum + ", character " + colNum +
                "\n" + err;
      }
    }
    
    
    

Where Can I Download OraXSL, The Parser's Command Line Interface?

Question

From where I can download oracle.xml.parser.v2.OraXSL?

Answer

It is part of our integrated XML Parser for Java v2 release. Our XML Parser, DOM, XPath implementation, and XSLT engine are integrated into a single, cooperating package. See http://otn.oracle.com/tech/xml/xdk_java/

Will Oracle Support Hierarchical Mapping?

Question

We are interested in using the Oracle database to primarily store XML. We would like to parse incoming XML documents and store data and tags in the database. We are concerned about the following two aspects of XML in Oracle:

Relational mapping of parsed XML data. We prefer hierarchical storage of parsed XML data. Is this a valid concern? Will XMLType in Oracle9i address this concern?

A lack of an "Ambiguous Content Mode" in the Oracle Parser for Java is limiting to our business. Are there plans to add an "Ambiguous Content Mode" to the Oracle Parser for Java?

Answer

Lots of customers initially have this concern. It depends on what kind of XML data you are storing. If you are storing XML datagrams that are really just an encoding of relational information (a purchase order, for example), then you will get much better performance and much better query flexibility (through SQL) by storing the data contained in the XML documents in relational tables, and then reproducing an XML format on demand when any particular data needs to be extracted.

If you are storing documents that are more mixed-content, such as legal proceedings, chapters of a book, or reference manuals, then storing them in chunks and searching them using Oracle Text's XML search capabilities is the best bet.

The book, "Building Oracle XML Applications" by Steve Muench, covers both of these storage and searching techniques with lots of examples.

For the second point, Oracle's XML Parser implements all of the XML 1.0 standard, and the XML 1.0 standard requires XML documents to have unambiguous content models, so there is no way a compliant XML 1.0 parser can implement ambiguous content models. See: http://www.xml.com/axml/target.html#determinism

XSLT Processor and XSL Stylesheets

HTML Error in XSL

Question

I don't know what is wrong here. This is my news_xsl.xsl file:

<?xml version ="1.0"?>
<xsl:stylesheet  xmlns:xsl="http://www.w3.org/TR/WD-xsl">    
<xsl:template match="/"> 
 <HTML>
    <HEAD>
        <TITLE>   Sample Form    </TITLE>        
        </HEAD> 
     <BODY>
      <FORM>  
       <input type="text" name="country" size="15">    </FORM>                         
    </BODY> 
 </HTML>  
</xsl:template>
</xsl:stylesheet>

ERROR: End tag 'FORM' does not match the start tag 'input'. Line 14, Position 12
</FORM>
-----------^

news.xml
<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="news_xsl.xsl"?>
<GREETING/>

Answer

Unlike in HTML, in XML every start tag must have a matching end tag. So even the input element that you are using needs an end tag. Modify your stylesheet like this:

<FORM>
<input type="text" name="country" size="15"> </input>
</FORM> 

OR

<FORM>
<input type="text" name="country" size="15"/>
</FORM>

And also always remember, in XML the tags are case sensitive, unlike in HTML. So be careful.

Is <xsl:output method="html"/> Supported?

Question

Is the output method "html" supported in the recent version of the XML/XSL parser? I was trying to use the <BR> tag with the <xsl:output method="xml"/> declaration, but I got an XSLException error message indicating a not well-formed XML document. Then I tried the following output method declaration: <xsl:output method="html"/>, but I got the same result.

Here's a simple XSL stylesheet I was using:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/">
  <HTML>
    <HEAD></HEAD>
    <BODY>
      <P>
        Blah blah<BR>
        More blah blah<BR>
      </P>
    </BODY>
  </HTML>
</xsl:template>

How do I use a not well-formed tag (like <IMG>, <BR>, etc.) in an XSL stylesheet?

Answer

We fully support all options of <xsl:output>. The problem here is that your XSL stylesheet must be a well-formed XML document, so everywhere you are using the <BR> element, you need to use <BR/> instead. <xsl:output method="html"/> requests that when the XSLT engine writes out the result of your transformation, the result is a proper HTML document. What the XSLT engine reads in must be well-formed XML.

Question 2

Sorry for jumping in on this thread, but I have a question regarding your reply. I have an XSL stylesheet that performs XML to HTML conversion. Everything works correctly with the exception of those HTML tags that are not well formed. Using your example, if I have something like:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
......
<input type="text" name="{NAME}" size="{DISPLAY_LENGTH}" maxlength="{LENGTH}">
</input>
......
</xsl:stylesheet>

It would render HTML in the format of

<HTML>......<input type="text" name="in1" size="10" maxlength="20"/>
......
</HTML>

While IE can handle this, Netscape cannot. Is there any way to generate completely cross-browser compliant HTML with XSL?

Answer 2

If you are seeing:

<input ... />

instead of:

<input>

Then you are likely calling XSLProcessor.processXSL() in the wrong way, since it appears that it is not doing the HTML output for you. Use:

void processXSL(style,sourceDoc,PrintWriter)

instead of:

DocumentFragment processXSL(style,sourceDoc)

and it will work correctly.
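The underlying point is that the html output method only takes effect when the processor serializes the result to a stream or writer, not when it returns a tree. The following sketch illustrates this with the standard JAXP transformation API (javax.xml.transform), used here as a stand-in; it is not the Oracle XSLProcessor API, and the class name HtmlOutputSketch is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Sketch only: transform to a writer so that <xsl:output> is honored.
final class HtmlOutputSketch {
    static String transform(String stylesheet, String sourceXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            // Serializing to a StreamResult applies the stylesheet's output method
            transformer.transform(new StreamSource(new StringReader(sourceXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }
}
```

With a stylesheet that declares `<xsl:output method="html"/>` and contains `<input type="text" name="country"/>`, the serialized result should contain an `<input ...>` start tag without the XML-style `/>` ending.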

Netscape 4.0: Preventing XSL From Outputting <meta> Tag

Question

I'm using <xsl:output method="html" encoding="iso-8859-1" indent="no"/>. Is it possible to prevent XSLT from outputting <META http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> in the head? Netscape 4.0 has difficulties with this statement. It renders the page twice.

Answer

The XSLT 1.0 Recommendation says in Section 16.2 ("HTML Output Method"): "If there is a HEAD element, then the html output method should add a META element immediately after the start-tag of the HEAD element specifying the character encoding actually used."

For example:

<HEAD><META http-equiv="Content-Type" content="text/html; charset=EUC-JP">.

So any XSLT 1.0-compliant engine needs to add this.

Question 2

Netscape 4.0 has following bug:

When Mozilla hits the meta-encoding tag, it stops rendering the page and does a refresh, so you experience an annoying flicker. I will probably have to do a replacement in the servlet's OutputStream, but I don't like doing so. Are there any alternatives?

Answer 2

Only alternatives I can think of are:

Neither is pretty, but either one might provide a workaround.

XSL Error Messages

Question

Where can I find more info on the XSL error messages. I get the error XSL-1900, exception occurred. What does this mean? How can I find out what caused the exception?

Answer

If you are using Java, you could write exception-handling routines to trap errors. Using tools such as JDeveloper also helps.

The error messages of our components are usually more legible. XSL-1900 indicates possible internal error or incorrect usage.

Generating HTML: "<" Character

Question

I am trying to generate an HTML form for inputting data using column names from the user_tab_columns table and the following XSL code:

<xsl:template match="ROW">
<xsl:value-of select="COLUMN_NAME"/>
&lt;INPUT NAME="<xsl:value-of select="COLUMN_NAME"/>"&gt;
</xsl:template>

Although '&gt;' is generated as '>', '&lt;' is generated as '&#60;'. How do I generate the "<" character?

Comment

Using the following:

<xsl:text disable-output-escaping="yes">entity-reference</xsl:text>

does what I need.

HTML "<" Conversion Works in oraxsl but not XSLSample.java?

Question

I can't seem to display HTML from XML. In my XML file, I store the HTML snippet in an XML tag:

<PRE>
<body.htmlcontent>
<&#60;table width="540" border="0" cellpadding="0" 
cellspacing="0">&#60;tr>&#60;td>&#60;font face="Helvetica, Arial" 
size="2">&#60;!-- STILL IMAGE GOES HERE -->&#60;img 
src="graphics/imagegoeshere.jpg"  width="200" height="175" align="right" 
vspace="0" hspace="7">&#60;!-- END STILL IMAGE TAG -->&#60;!-- CITY OR TOWN NAME 
GOES FIRST FOLLOWED BY TWO LETTER STATE ABBREVIATION -->&#60;b>City, state 
abbreviation&#60;/b> - &#60;!-- CITY OR TOWN NAME ENDS HERE -->&#60;!-- STORY 
TEXT STARTS HERE -->Story text goes here.. &#60;!-- STORY TEXT ENDS HERE 
-->&#60;/font>&#60;/td>&#60;/tr>&#60;/table>
</body.htmlcontent>
</PRE>

I use the following in my XSL:

<xsl:value-of select="body.HTMLcontent" disable-output-escaping="yes"/>

However, the HTML output

<PRE>&#60;</PRE>

is still outputted and all of the HTML tags are displayed in the browser. How do I display the HTML properly?

Comment

That doesn't look right. All of the < are #60; in the code with an ampersand in front of them. They are still that way when they are displayed in the browser.

Even more confusing is that it works with oraxsl, but not with XSLSample.java.

Answer

This makes sense. Here's why:

oraxsl supports <xsl:output> and all options related to writing out output that might not be valid XML (including disabling output escaping). XSLSample.java returns a pure XML-to-XML tree, so neither <xsl:output> nor disabled escaping can take effect: nothing is being output, and only a DOM tree fragment of the result is returned.


XSLT Examples

Question

Is there any site which has good examples or small tutorials on XSLT?

Answer

This site is an evolving tutorial on lots of different XML/XSLT/XPath-related subjects:

http://zvon.vscht.cz/ZvonHTML/Zvon/zvonTutorials_en.html

XSLT Features

Question

  1. Is there a list of features of the XSLT that Oracle XDK implements?

  2. So the v2 parsers implement more features of the recommendation than IE5? My first impression supports this, the use of <xsl:choose... and <xsl:if... works with the v2 parser but gives strange messages with IE5.

Answer

  1. Our v2 parsers support the W3C XSLT version 1.0 Recommendation at http://www.w3.org/TR/XSLT.

  2. You are correct. Ours is XSLT Recommendation compliant.

Using XSL To Convert XML Document To Another Form

Question

I am in the process of trying to convert an xml document from one format to another by means of an xsl (or xslt) stylesheet. Before incorporating it into my java code, I tried testing the transformation from the command line:

 > java oracle.xml.parser.v2.oraxsl jwnemp.xml jwnemp.xsl newjwnemp.xml

The problem is that instead of returning the transformed xml file (newjwnemp.xml), the above command just returns a file with the xsl code from jwnemp.xsl in it. I cannot figure out why this is occurring. I have attached the two input files.

 <?xml version="1.0"?>
 <employee_data>
    <employee_row>
       <employee_number>7950</employee_number>
       <employee_name>CLINTON</employee_name>
       <employee_title>PRESIDENT</employee_title>
       <manager>1111</manager>
       <date_of_hire>20-JAN-93</date_of_hire>
       <salary>125000</salary>
       <commission>1000</commission>
       <department_number>10</department_number>
    </employee_row>
 </employee_data>

 <?xml version='1.0'?>
 <ROWSET xmlns:xsl="HTTP://www.w3.org/1999/XSL/Transform">
    <xsl:for-each select="employee_data/employee_row">
    <ROW>
       <EMPNO><xsl:value-of select="employee_number"/></EMPNO>
       <ENAME><xsl:value-of select="employee_name"/></ENAME>
       <JOB><xsl:value-of select="employee_title"/></JOB>
       <MGR><xsl:value-of select="manager"/></MGR>
       <HIREDATE><xsl:value-of select="date_of_hire"/></HIREDATE>
       <SAL><xsl:value-of select="salary"/></SAL>
       <COMM><xsl:value-of select="commission"/></COMM>
       <DEPTNO><xsl:value-of select="department_number"/></DEPTNO>
    </ROW>
    </xsl:for-each>
 </ROWSET>

Answer

This is almost certainly occurring because you have the wrong XSL namespace URI in your xmlns:xsl="..." namespace declaration. Namespace URIs are compared character by character, so the upper-case HTTP:// in your stylesheet does not match.

If you use: xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

everything works.

If you use xmlns:xsl="-- any other string here --"

it will do what you're seeing.

Information on XSL?

Question

I cannot find anything about using XSL. Can you help? I would like to get an XML and XSL file to show my company what they can expect from this technology. XML alone is not very impressive for users.

Answer

A pretty good starting place for XSL is the following page:

http://metalab.unc.edu/xml/books/bible/updates/14.html

It explains in plain English what the gist of XSL is. XSL isn't really anything more than an XML file, so I don't think that it will be any more impressive to show to a customer. There is also the main website for XSL: http://www.w3.org/style/XSL/

XSLProcessor and Multiple Outputs?

Question

I recall seeing discussions about XSLProcessor producing more than one result from one XML document and one XSL stylesheet. How can this be achieved?

Answer

XML Parser 2.0.2.8 supports <ora:output> to handle this.

What Good Books for XML/XSL Can You Recommend?

Question

Can anyone suggest good books for learning about XML/XSL?

Answer

There are many excellent articles, white papers, and books that describe all facets of XML technology. Many of these are available on the world wide web. The following are some of the most useful resources we have found:

XML Developer Kits for HP/UX Platform

Question

I would like to know if there are any release plans for the XML Parser or an XDK for HP/UX platform.

Answer

HP-UX ports for our C/C++ Parser as well as our C++ Class Generator are available. Look for an announcement on http://technet.oracle.com

Compressing Large Volumes of XML Documents

Question

Can we compress XML documents when saving them to the database as a CLOB? If they are compressed, what is the implication of using Oracle Text (intermedia) against the documents? We have large XML documents that go up to 1 megabyte and they need to be minimized.

The main requirement is to save cost in terms of disk storage as the XML documents stored are history information (more of a datawarehouse environment). We could save a lot of disk space if we could compress the documents before storage. The searching capability is only secondary, but a big plus.

Answer a

XDK for Java supports a compression mechanism in Oracle9i, with streaming compression and decompression. The compression is achieved by removing the markup in the XML document. The initial version does not support searching the compressed data. This is planned for a future release.

Answer b

If you want to store and search your XML documents, Oracle Text can handle this. The size of an individual document should not be a problem for Oracle Text.

If you want to compress the 1 megabyte documents to save disk space and cost, Oracle Text will not be able to handle a compressed XML document automatically.

Try looking at XMLZip:

http://www.xmls.com/resources/xmlzip.xml?id=resources_xmlzip

My only concern would be the performance hit of the decompression. If you are just worried about transmitting the XML from client to server or the reverse, then HTTP compression could be easier.

How Can I Generate an XML Document Based on Two Tables?

Question

I would like to generate an XML-document based on 2 tables with a master detail relationship between them. Suppose I have two tables :

And a master detail relationship between PARENT and CHILD. How can I generate a document that looks like this ?

<?xml version = '1.0'?> 
  <ROWSET> 
     <ROW num="1"> 
       <parent_name>Bill</parent_name> 
         <child_name>Child 1 of 2</child_name> 
         <child_name>Child 2 of 2</child_name> 
      </ROW> 
      <ROW num="2"> 
       <parent_name>Larry</parent_name> 
         <child_name>Only one child</child_name> 
      </ROW> 
  </ROWSET>

Answer

You can (should) use an object view to generate an XML document from a master-detail structure. In your case:

create type child_type is object 
(child_name <data type child_name>) ; 
/ 
create type child_type_nst 
is table of child_type ; 
/ 

create view parent_child 
as 
select p.parent_name 
, cast 
  ( multiset 
    ( select c.child_name 
      from   child c 
      where  c.parent_id = p.id 
    ) as child_type_nst 
  ) child_type 
from parent p 
/ 

A SELECT * FROM parent_child, processed by an SQL to XML utility would generate a valid XML document for your parent child relationship. The structure would not look like the one you have presented, though. It would be like:

<?xml version = '1.0'?> 
<ROWSET> 
   <ROW num="1"> 
      <PARENT_NAME>Bill</PARENT_NAME> 
      <CHILD_TYPE> 
         <CHILD_TYPE_ITEM> 
            <CHILD_NAME>Child 1 of 2</CHILD_NAME> 
         </CHILD_TYPE_ITEM> 
         <CHILD_TYPE_ITEM> 
            <CHILD_NAME>Child 2 of 2</CHILD_NAME> 
         </CHILD_TYPE_ITEM> 
      </CHILD_TYPE> 
  </ROW> 
   <ROW num="2"> 
      <PARENT_NAME>Larry</PARENT_NAME> 
      <CHILD_TYPE> 
         <CHILD_TYPE_ITEM> 
            <CHILD_NAME>Only one child</CHILD_NAME> 
         </CHILD_TYPE_ITEM> 
      </CHILD_TYPE> 
  </ROW> 
</ROWSET> 



Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.