
Oracle9i Application Developer's Guide - XML
Release 1 (9.0.1)

Part Number A88894-01

F
XDK for C++: Specifications and Cheat Sheet

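This appendix contains the following sections:

  • XML Parser for C++ Specifications

  • XML Parser for C++ Revision History

  • XML Parser for C++: XMLParser() API

  • XML Parser for C++: DOM API

  • XML Parser for C++: XSLT API

  • XML Parser for C++: SAX API

  • XML C++ Class Generator Specifications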

XML Parser for C++ Specifications

Oracle provides a set of XML parsers for Java, C, C++, and PL/SQL. Each of these parsers is a stand-alone XML component that parses an XML document (or a standalone DTD) so that it can be processed by an application. Library and command-line versions are provided supporting the following standards and features:

Validating and Non-Validating Mode Support

The XML Parser for C++ can parse XML in validating or non-validating modes.

Validation involves checking whether or not the attribute names and element tags are legal, whether nested elements belong where they are, and so on.
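To illustrate the "nested elements belong where they are" part of validation, the following minimal sketch checks a small element tree against a table of permitted child elements. It does not use the XML Parser for C++ API: the Elem struct, the ContentModel map, and isValid() are hypothetical names invented for this illustration, and a real DTD content model also constrains order and repetition, which this sketch ignores.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical element tree; not an XDK type.
struct Elem {
    std::string tag;
    std::vector<Elem> children;
};

// Hypothetical, simplified content model: parent tag -> permitted child tags.
using ContentModel = std::map<std::string, std::set<std::string>>;

// Returns true if every element is declared and every child is permitted
// under its parent. Order and cardinality are ignored in this sketch.
bool isValid(const Elem& e, const ContentModel& model) {
    auto it = model.find(e.tag);
    if (it == model.end()) return false;                      // undeclared element
    for (const Elem& child : e.children) {
        if (it->second.count(child.tag) == 0) return false;   // misplaced child
        if (!isValid(child, model)) return false;             // check the subtree
    }
    return true;
}
```

A non-validating parse would accept any properly nested tree; the check above corresponds to the extra work done only in validating mode.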

Example Code

See Chapter 26, "Using XML Parser for C++" for example code and suggestions on how to use the XML Parser for C++.

Online Documentation

Documentation for Oracle XML Parser for C++ is located in the $ORACLE_HOME/xdk/cpp/parser/doc directory.

Release Specific Notes

The readme.html file in the root directory of the archive contains release specific information including bug fixes, API additions, and so on.

The Oracle XML parser for C++ is written in C with C++ wrappers. It will check if an XML document is well-formed, and optionally validate it against a DTD. The parser will construct an object tree which can be accessed via a DOM interface or operate serially via a SAX interface.

Standards Conformance

XML Parser for C++ conforms to the following standards:

Supported Character Set Encodings

XML Parser for C++ supports documents in the following encodings, in addition to the ones specified in Appendix A, "Character Sets", of Oracle9i Globalization and National Language Support Guide:

Default:

The default encoding is UTF-8. If your documents use only single-byte character sets (such as US-ASCII or any of the ISO-8859 character sets), set the default encoding explicitly: parsing single-byte encodings can be up to 25% faster than parsing multibyte encodings such as UTF-8.
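One reason single-byte input is cheaper to process is that each byte is exactly one character, whereas UTF-8 requires every byte to be inspected to find character boundaries. The sketch below illustrates that difference only; it is not taken from the parser, and the function names countSingleByteChars() and countUtf8Chars() are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Single-byte encodings: one byte is one character, so no inspection is needed.
std::size_t countSingleByteChars(const std::string& bytes) {
    return bytes.size();
}

// UTF-8: count every byte that is not a continuation byte (bit pattern 10xxxxxx).
// Assumes the input is already well-formed UTF-8.
std::size_t countUtf8Chars(const std::string& bytes) {
    std::size_t n = 0;
    for (unsigned char b : bytes) {
        if ((b & 0xC0) != 0x80) ++n;
    }
    return n;
}
```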

XML Parser for C++ Revision History

Table F-1 lists the XML Parser for C++ revision history.


Table F-1 XML Parser for C++: Revision History  
Revision  Description 

Oracle XML Parser 2.0.4.0.0 (C++) 

First production v2 release. Changes are mainly bug fixes.

For XML parser, the following bugs were fixed:

  • 1352943 XMLPARSE() SOMETIMES CHOKES ON FILENAMES

  • 1302311 PROBLEM WITH PARAMETER ENTITY PROCESSING

  • 1323674 INCONSISTENT ERROR HANDLING IN THE C XML PARSER

  • 1328871 LPXPRINTBUFFER UNCONDITIONALLY PREPENDS XML COMMENT TO OUTPUT

  • 1349962 USING FREED MEMORY LOCATION CAUSES TLPXVNSA31.DIF

oraxmldom.h was renamed to oradom.h.

For the XSLT processor, the following bugs were fixed:

  • 1225546 USELESS ERROR MESSAGE NEEDS DETAIL

  • 1267616 TLPXST14.DIF: REPLACE DBL_MAX WITH SBIG_ORAMAXVAL IN LPXXP.C:LPXXPSUBSTRING()

  • 1289228 ERROR CONTEXT REQUIRED FOR DEBUGGING: FILE NAME, LINE#, FUNCTION, ETC

 

 

  • 1289214 XSL:CHOOSE DOESN'T WORK

  • 1298028 XPATH CONSTRUCT NOT(POSITION()=LAST()) NOT WORKING

  • 1298193 XPATH FUNCTIONS DON'T PROVIDE IMPLICIT TYPE CONVERSION OF PARAMS

  • 1323665 C XML PARSER CANNOT SET BASE DIRECTORY OR URI FOR STYLESHEET PARSING

  • 1325452 SEVERE MEMORY CONSUMPTION / LEAK IN XSLPROCESS

  • 1333693 CHAINED TRANSFORMS WITH C XSL PROCESSOR DON'T WORK: LPX-00002

 

Oracle XML Parser 2.0.3.0.0 (C++) 

SAX memory usage: Smaller, and flat for any input size and multiple parses (memory leaks plugged).

XSLT memory usage: Improved.

Validation warnings: Validity Constraint (VC) errors have been changed to warnings and do not terminate parsing. For compatibility with the old behavior (halt on warnings as well as errors), a new flag XML_FLAG_STOP_ON_WARNING (or '-W' to the xml program) has been added.

Performance improvements: Switch to finite automata VC structure validation yields 10% performance gain.

HTTP support: HTTP URIs are now supported; look for FTP in the next release. For other access methods, the user may define their own callbacks with the new xmlaccess() API.  

Oracle XML Parser 2.0.2.0.0 (C++) 

XSLT improvements: Various bugs fixed in the XSLT processor; error messages are improved; xsl:number, xsl:sort, xsl:namespace-alias, xsl:decimal-format, forwards-compatible processing with xsl:version, and literal result element as stylesheet are now available; the following XSLT-specific additions to the core XPath library are now available: current(), format-number(), generate-id(), and system-property().

XML parser bug fixes: Some problems with validation and matching of start and end tags with SAX were fixed. Also, a bug with parameter entity processing in external entities was fixed.  

Oracle XML Parser 2.0.1.0.0 (C++) 

Performance improvements: Major performance improvement over the previous version: about two and a half times faster for UTF-8 parsing and about four times faster for ASCII parsing. Comparison timing against the previous version for parsing (DOM) and validating various standalone files (SPARC Ultra 1 CPU time):

File size Old UTF-8 New UTF-8 Speedup Old ASCII New ASCII Speedup

42K 180ms 70ms 2.6 120ms 40ms 3.0

134K 510ms 210ms 2.4 450ms 100ms 4.5

247K 980ms 400ms 2.5 690ms 180ms 3.8

1M 2860ms 1130ms 2.5 1820ms 380ms 4.8

2.7M 10550ms 4100ms 2.6 7450ms 1930ms 3.9

10.5M 42250ms 16400ms 2.6 29900ms 7800ms 3.8

Conformance improvements: Stricter conformance to the XML 1.0 spec yields higher scores on standard test suites (Jim Clark, Oasis, etc).  

 

Lists, not arrays: Internal parser data structures are now uniformly lists; arrays have been dropped. Therefore, access is now better suited to a firstChild/nextSibling style loop instead of numChildNodes/getChildNode.

DTD parsing: A new API call xmlparsedtd() is added which parses an external DTD directly, without needing an enclosing document. Used mainly by the Class Generator.

 

Error reporting: Error messages are improved and more specific, with nearly twice as many as before. Error location is now described by a stack of line number/entity pairs, showing the final location of the error and intermediate inclusions (e.g. line X of file, line Y of entity).

NOTE: You must use the new error message file (lpxus.msb) provided with this release; the error message file provided with earlier releases is incompatible. See below.

XSL improvements: Various bugs fixed in the XSLT processor; xsl:call-template is now fully supported.  

Oracle XML Parser 2.0.1.0.0 (C++) 

Performance improvements: Major performance improvement over the previous version: about two and a half times faster for UTF-8 parsing and about four times faster for ASCII parsing. Comparison timing against the previous version for parsing (DOM) and validating various standalone files (SPARC Ultra 1 CPU time):

File size Old UTF-8 New UTF-8 Speedup Old ASCII New ASCII Speedup

42K 180ms 70ms 2.6 120ms 40ms 3.0

134K 510ms 210ms 2.4 450ms 100ms 4.5

247K 980ms 400ms 2.5 690ms 180ms 3.8

1M 2860ms 1130ms 2.5 1820ms 380ms 4.8

2.7M 10550ms 4100ms 2.6 7450ms 1930ms 3.9

10.5M 42250ms 16400ms 2.6 29900ms 7800ms 3.8

 

Conformance improvements: Stricter conformance to the XML 1.0 spec yields higher scores on standard test suites (Jim Clark, Oasis, etc).

Lists, not arrays: Internal parser data structures are now uniformly lists; arrays have been dropped. Therefore, access is now better suited to a firstChild/nextSibling style loop instead of numChildNodes/item.  

 

DTD parsing: A new method XMLParser::xmlparseDTD() is added which parses an external DTD directly, without needing an enclosing document. Used mainly by the Class Generator.

 

Error reporting: Error messages are improved and more specific, with nearly twice as many as before. Error location is now described by a stack of line number/entity pairs, showing the final location of the error and intermediate inclusions (e.g. line X of file, line Y of entity).  

 

NOTE: Use the new error message file (lpxus.msb) provided with this release; the error message file provided with earlier releases is incompatible. See below.

XSL improvements: Various bugs fixed in the XSLT processor; xsl:call-template is now fully supported.  

Oracle XML Parser 2.0.0.0.0 (C++) 

The Oracle XML v2 parser is a beta release and is written in C, with a C++ wrapper. The main difference from the Oracle XML v1 parser is the ability to format the XML document according to a stylesheet via an integrated XSLT processor. The XML parser will check if an XML document is well-formed, and optionally validate it against a DTD. The parser will construct an object tree which can be accessed via a DOM interface or operate serially via a SAX interface.
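The 2.0.1.0.0 entries in the revision history recommend a firstChild/nextSibling style loop over indexed access now that children are stored as lists. The sketch below shows why, using a hypothetical ListNode struct as a stand-in for a DOM node (it is not an XDK type, and childAt() and countChildren() are illustrative names): indexed access on a list has to walk from the first child on every call, while the sibling loop visits each child once.

```cpp
#include <cstddef>

// Hypothetical stand-in for a DOM node stored in a sibling-linked list.
struct ListNode {
    ListNode* firstChild = nullptr;
    ListNode* nextSibling = nullptr;
};

// Indexed access on a list must walk from the first child every time,
// so an index-based loop over all children costs quadratic time.
ListNode* childAt(const ListNode& parent, std::size_t index) {
    ListNode* c = parent.firstChild;
    while (c != nullptr && index-- > 0) c = c->nextSibling;
    return c;  // nullptr if index is out of range
}

// Preferred style: visit each child once by following sibling links.
std::size_t countChildren(const ListNode& parent) {
    std::size_t n = 0;
    for (ListNode* c = parent.firstChild; c != nullptr; c = c->nextSibling) ++n;
    return n;
}
```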

XML Parser for C++: XMLParser() API

Table F-2 lists the main methods of the XML Parser for C++ XMLParser class, with a brief description of each. The XMLParser class contains top-level methods that do the following:

XML Parser for C++: DOM API

Table F-3 lists the XML Parser for C++ DOM API methods with a brief description of each.


Table F-3 XML Parser for C++: DOM API Classes (SubClasses)  
Class (Subclass)  Methods  Description 
Attr (Node)

This class contains methods for accessing the name and value of a single document node attribute. 

getName 

Return name of attribute 

getValue 

Return "value" (definition) of attribute 

getSpecified 

Return attribute's "specified" flag value 

setValue 

Set an attribute's value 

CDATASection (Text)

This class implements the CDATA node type, a subclass of Text. There are no methods. 

   
CharacterData (Node)

This class contains methods for accessing and modifying the data associated with text nodes.  

appendData 

Append a string to this node's data 

deleteData 

Remove a substring from this node's data 

getData 

Get data (value) of a text node 

getLength 

Return length of a text node's data 

insertData 

Insert a string into this node's data 

replaceData 

Replace a substring in this node's data 

substringData 

Fetch a substring of this node's data 

Comment (CharacterData)

This class implements the COMMENT node type, a subclass of CharacterData. There are no methods. 

   
Document (Node)

This class contains methods for creating and retrieving nodes.  

createAttribute 

Create an ATTRIBUTE node 

createCDATASection 

Create a CDATA node 

createComment 

Create a COMMENT node 

createDocumentFragment 

Create a DOCUMENT_FRAGMENT node 

createElement 

Create an ELEMENT node 

createEntityReference 

Create an ENTITY_REFERENCE node 

createProcessingInstruction 

Create a PROCESSING_INSTRUCTION node 

createTextNode 

Create a TEXT node 

getElementsByTagName 

Select nodes based on tag name 

getImplementation 

Return the DOM implementation for the document 

DocumentFragment (Node)

This class implements the DOCUMENT_FRAGMENT node type, a subclass of Node. 

   
DocumentType (Node)

This class contains methods for accessing information about the Document Type Definition (DTD) of a document. 

getName 

Return name of DTD 

getEntities 

Return NamedNodeMap of DTD's (general) entities 

getNotations 

Return NamedNodeMap of DTD's notations 

DOMImplementation

This class contains methods relating to the specific DOM implementation supported by the parser. 

hasFeature 

Detect if the named feature is supported 

Element (Node)

This class contains methods pertaining to element nodes.  

getTagName 

Return the node's tag name 

getAttribute 

Select an attribute given its name 

setAttribute 

Create a new attribute given its name and value 

removeAttribute 

Remove an attribute given its name 

getAttributeNode 

Return an attribute node given its name 

setAttributeNode 

Add a new attribute node 

removeAttributeNode 

Remove an attribute node 

getElementsByTagName 

Return a list of element nodes with the given tag name 

normalize 

"Normalize" an element (merge adjacent text nodes) 

Entity (Node)

This class implements the ENTITY node type, a subclass of Node.  

getNotationName 

Return entity's NDATA (notation name) 

getPublicId 

Return entity's public ID 

getSystemId 

Return entity's system ID 

EntityReference (Node)

This class implements the ENTITY_REFERENCE node type, a subclass of Node.  

   
NamedNodeMap

This class contains methods for accessing the number of nodes in a node map and fetching individual nodes. 

item 

Return nth node in map 

getLength 

Return number of nodes in map 

getNamedItem 

Select a node by name 

setNamedItem 

Set a node into the map 

removeNamedItem 

Remove the named node from map 

Node

This class contains methods for accessing details about a document node. 

appendChild 

Append a new child to the end of the current node's list of children 

cloneNode 

Clone an existing node and optionally all its children 

getAttributes 

Return structure containing all defined node attributes 

getChildNode 

Return specific indexed child of given node 

getChildNodes 

Return structure containing all child nodes of given node 

getFirstChild 

Return first child of given node 

getLastChild 

Return last child of given node 

getLocal 

Returns the local name of the node 

getNamespace 

Return a node's namespace 

getNextSibling 

Return a node's next sibling 

getName 

Return name of node 

getType 

Return numeric type-code of node 

getValue 

Return "value" (data) of node 

getOwnerDocument 

Return document node which contains a node 

getParentNode 

Return parent node of given node 

getPrefix 

Returns the namespace prefix for the node 

getPreviousSibling 

Returns the previous sibling of the current node 

getQualifiedName 

Return namespace-qualified name of given node 

hasAttributes 

Determine if node has any defined attributes 

hasChildNodes 

Determine if node has children 

insertBefore 

Insert new child node into a node's list of children 

numChildNodes 

Return count of number of child nodes of given node 

removeChild 

Remove a node from the current node's list of children 

replaceChild 

Replace a child node with another 

setValue 

Sets a node's value (data) 

NodeList

This class contains methods for extracting nodes from a NodeList. 

item 

Return nth node in list 

getLength 

Return number of nodes in list 

Notation (Node)

This class implements the NOTATION node type, a subclass of Node. 

getData 

Return notation's data 

getTarget 

Return notation's target 

setData 

Set notation's data 

ProcessingInstruction (Node)

This class implements the PROCESSING_INSTRUCTION node type, a subclass of Node. 

getData 

Return the PI's data 

getTarget 

Return the PI's target 

setData 

Set the PI's data 

Text (CharacterData)

This class contains methods for accessing and modifying the data associated with text nodes (subclasses CharacterData). 

splitText 

Split a text node into two text nodes at the given offset 
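The CharacterData and Text methods listed in Table F-3 amount to substring edits on a node's data. The following sketch models them with std::string to show the intended effect of each call. The TextData class is hypothetical and is not the XDK class; the offset and count parameters follow the DOM Level 1 convention and are an assumption about the Oracle signatures. Offsets here count bytes, and an out-of-range offset raises std::out_of_range.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical model of a text node's data; not the XDK CharacterData class.
class TextData {
public:
    explicit TextData(std::string s) : data_(std::move(s)) {}

    const std::string& getData() const { return data_; }
    std::size_t getLength() const { return data_.size(); }

    void appendData(const std::string& s) { data_ += s; }
    void insertData(std::size_t offset, const std::string& s) { data_.insert(offset, s); }
    void deleteData(std::size_t offset, std::size_t count) { data_.erase(offset, count); }
    void replaceData(std::size_t offset, std::size_t count, const std::string& s) {
        data_.replace(offset, count, s);
    }
    std::string substringData(std::size_t offset, std::size_t count) const {
        return data_.substr(offset, count);
    }

    // splitText: keep [0, offset) in this node and return the rest as a new node.
    TextData splitText(std::size_t offset) {
        TextData tail(data_.substr(offset));
        data_.erase(offset);
        return tail;
    }

private:
    std::string data_;
};
```

Element::normalize performs the opposite of splitText: it merges adjacent text nodes back into one.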

XML Parser for C++: XSLT API

XSLT is a language for transforming XML documents into other XML documents. It is designed for use as part of XSL, which is a stylesheet language for XML. In addition to XSLT, XSL includes an XML vocabulary for specifying formatting. XSL specifies the styling of an XML document by using XSLT to describe how the document is transformed into another XML document that uses the formatting vocabulary.

XSLT is also designed to be used independently of XSL. However, XSLT is not intended as a completely general-purpose XML transformation language. Rather it is designed primarily for the kinds of transformation that are needed when XSLT is used as part of XSL.

A transformation expressed in XSLT describes rules for transforming a source tree into a result tree. The transformation is achieved by associating patterns with templates. A pattern is matched against elements in the source tree. A template is instantiated to create part of the result tree. The result tree is separate from the source tree. The structure of the result tree can be completely different from the structure of the source tree. In constructing the result tree, elements from the source tree can be filtered and reordered, and arbitrary structure can be added.

Stylesheets

A transformation expressed in XSLT is called a stylesheet. This is because, in the case when XSLT is transforming into the XSL formatting vocabulary, the transformation functions as a stylesheet.

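A stylesheet contains a set of template rules. A template rule has two parts: a pattern, which is matched against nodes in the source tree, and a template, which can be instantiated to form part of the result tree.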

How Stylesheet Templates Are Processed

A template is instantiated for a particular source element to create part of the result tree. A template can contain elements that specify literal result element structure. A template can also contain elements from the XSLT namespace that are instructions for creating result tree fragments. When a template is instantiated, each instruction is executed and replaced by the result tree fragment that it creates.

Instructions can select and process descendant source elements. Processing a descendant element creates a result tree fragment by finding the applicable template rule and instantiating its template. Note that elements are only processed when they have been selected by the execution of an instruction. The result tree is constructed by finding the template rule for the root node and instantiating its template.
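The processing model described above can be sketched in a few lines. This is an illustration of the idea only, not the XSLT API of XML Parser for C++: SrcElem, Template, and Stylesheet are hypothetical names, a "pattern" is reduced to an element name, and the result tree is represented as a string. The fallback for unmatched elements (process the children) is loosely modeled on the built-in rule for elements in XSLT.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical source-tree element.
struct SrcElem {
    std::string name;
    std::string text;
    std::vector<SrcElem> children;
};

class Stylesheet;
// A template receives the matched element and the stylesheet, and returns a
// fragment of the result (serialized as a string in this sketch).
using Template = std::function<std::string(const SrcElem&, const Stylesheet&)>;

class Stylesheet {
public:
    // Register a template rule: pattern (an element name here) plus template.
    void addRule(const std::string& pattern, Template t) { rules_[pattern] = std::move(t); }

    // Find the applicable rule for an element and instantiate its template.
    std::string process(const SrcElem& e) const {
        auto it = rules_.find(e.name);
        if (it == rules_.end()) return applyTemplates(e);  // no rule: recurse into children
        return it->second(e, *this);
    }

    // An instruction that selects and processes the child elements.
    std::string applyTemplates(const SrcElem& e) const {
        std::string out;
        for (const SrcElem& c : e.children) out += process(c);
        return out;
    }

private:
    std::map<std::string, Template> rules_;
};
```

Note that children are processed only when a template calls applyTemplates(); a template that never does so filters its descendants out of the result.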

A software module called an XSL processor reads XML documents and transforms them into other XML documents with different styles.

XML Parser for C++ implementation of the XSL processor follows the XSL Transformations standard (version 1.0, November 16, 1999) and includes the required behavior of an XSL processor as specified in the XSLT specification.

Table F-4 lists the XSLProcessor class methods and syntax summary.

Table F-4 XML Parser for C++: XSLProcessor Class
Class  Method 

XSLProcessor

This class contains top-level methods for invoking the XSL processor.  

xslprocess()

Processes an XSL stylesheet with an XML document source.

Syntax:

uword xslprocess(XMLParser *docctx, XMLParser *xslctx, XMLParser *resctx, Node **result);

where:

docctx (IN/OUT) -- The XML document context

xslctx (IN) -- The XSL stylesheet context

resctx (IN) -- The result document fragment context

result (IN/OUT) -- The result document fragment node  

XML Parser for C++: SAX API

The SAX API is based on callbacks. Instead of the entire document being parsed and turned into a data structure which may be referenced (by the DOM interface), the SAX interface is serial. As the document is processed, appropriate SAX user callback functions are invoked. Each callback function returns an error code, zero meaning success, any non-zero value meaning failure. If a non-zero code is returned, document processing is stopped.

To use SAX, an xmlsaxcb structure is initialized with function pointers and passed to the xmlinit() call. A pointer to a user-defined context structure may also be included; that context pointer will be passed to each SAX function.

This SAX functionality is identical to the XML Parser for C version.
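The callback pattern described above can be sketched as follows. The DemoSaxCallbacks struct and the runDemoEvents() driver are hypothetical stand-ins invented for this example: they are not the real xmlsaxcb layout or the xmlinit() call, and const char* is used in place of oratext. The sketch shows the two properties the text names: the user context pointer is handed back to every callback, and a non-zero return code stops processing.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callback table; NOT the real xmlsaxcb layout.
struct DemoSaxCallbacks {
    int (*startElement)(void* ctx, const char* name);
    int (*endElement)(void* ctx, const char* name);
};

// User-defined context passed back to every callback.
struct DepthTracker {
    int depth = 0;
    int maxDepth = 0;
    int limit = 0;  // abort when nesting exceeds this
};

int onStart(void* ctx, const char* /*name*/) {
    DepthTracker* t = static_cast<DepthTracker*>(ctx);
    ++t->depth;
    if (t->depth > t->maxDepth) t->maxDepth = t->depth;
    return t->depth > t->limit ? 1 : 0;  // non-zero stops processing
}

int onEnd(void* ctx, const char* /*name*/) {
    --static_cast<DepthTracker*>(ctx)->depth;
    return 0;
}

// Toy driver standing in for the parser: events are "+name" or "-name".
// Returns the first non-zero callback code, or 0 if all events were delivered.
int runDemoEvents(const std::vector<std::string>& events,
                  const DemoSaxCallbacks& cb, void* ctx) {
    for (const std::string& ev : events) {
        if (ev.empty()) continue;
        const char* name = ev.c_str() + 1;
        int rc = (ev[0] == '+') ? cb.startElement(ctx, name)
                                : cb.endElement(ctx, name);
        if (rc != 0) return rc;
    }
    return 0;
}
```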

Table F-5 lists the XML Parser for C++, SAX API functions.

Table F-5 XML Parser for C++: SAX API Functions
SAX Function  Brief Description 

characters(void *ctx, const oratext *ch, size_t len) 

Receive notification of character data inside an element.  

endDocument(void *ctx)  

Receive notification of the end of the document.  

endElement(void *ctx, const oratext *name)  

Receive notification of the end of an element.  

ignorableWhitespace(void *ctx, const oratext *ch, size_t len) 

Receive notification of ignorable whitespace in element content.  

notationDecl(void *ctx, const oratext *name, const oratext *publicId, const oratext *systemId) 

Receive notification of a notation declaration.  

processingInstruction(void *ctx, const oratext *target, const oratext *data) 

Receive notification of a processing instruction.  

startDocument(void *ctx) 

Receive notification of the beginning of the document.  

startElement(void *ctx, const oratext *name, const struct xmlattrs *attrs) 

Receive notification of the start of an element.  

unparsedEntityDecl(void *ctx, const oratext *name, const oratext *publicId, const oratext *systemId, const oratext *notationName) 

Receive notification of an unparsed entity declaration.  

 

 

Non-SAX Callback Functions  

 

nsStartElement(void *ctx, const oratext *qname, const oratext *local, const oratext *namespace, const struct xmlattrs *attrs) 

Receive notification of the start of a namespace for an element.  
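The qname and local arguments of nsStartElement are related in a simple way: the local part is what follows the first colon of the qualified name. The helper below sketches that split. splitQName() is a hypothetical name, not a parser function, and resolving the prefix to a namespace URI (the namespace argument) depends on the in-scope namespace declarations, which this sketch does not model.

```cpp
#include <string>
#include <utility>

// Split "prefix:local" into {prefix, local}.
// An unprefixed name yields an empty prefix.
std::pair<std::string, std::string> splitQName(const std::string& qname) {
    const std::string::size_type colon = qname.find(':');
    if (colon == std::string::npos) return {"", qname};
    return {qname.substr(0, colon), qname.substr(colon + 1)};
}
```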

XML C++ Class Generator Specifications

Working in conjunction with the XML Parser for C++, the XML Class Generator generates a set of C++ source files based on an input DTD. The generated C++ source files can then be used to construct, optionally validate, and print an XML document that complies with the specified DTD. The Class Generator supports validation mode to assist debugging.

Input to the XML C++ Class Generator

Input is an XML document containing a DTD. The document body itself is ignored; only the DTD is relevant, though the dummy document must conform to the DTD. The underlying XML parser only accepts file names for the document and associated external entities. In future releases, no dummy document will be required, and URIs for additional protocols will be accepted.

Character Set Support

The following lists supported Character Set Encoding for files input to XML C++ Class Generator. These are in addition to the character sets specified in Appendix A, "Character Sets", of Oracle9i Globalization and National Language Support Guide.

Default:

The default encoding is UTF-8. If your input files use only single-byte character sets (such as US-ASCII or any of the ISO-8859 character sets), set the default encoding explicitly: processing single-byte encodings can be up to 25% faster than processing multibyte encodings such as UTF-8.

Output of XML C++ Class Generator

XML C++ Class Generator output is a pair of C++ source files, .cpp and .h, named after the DTD. Constructors are provided for each class (element) that allow an object to be created in two different ways: initially empty, with children or data added after creation, or with the full initial set of children or initial data. A method is provided for #PCDATA (and Mixed) elements to set the data and, when appropriate, set an element's attributes.
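The two construction styles can be pictured with a small sketch. The classes below are hypothetical and hand-written for illustration; actual generated code, class names, and method names will differ. They assume a DTD that declares list as (item*) and item as #PCDATA, and toXml() does no character escaping.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical class for a #PCDATA element <item>; real generated code differs.
class Item {
public:
    Item() = default;                                             // initially empty
    explicit Item(std::string data) : data_(std::move(data)) {}   // with initial data
    void setData(const std::string& data) { data_ = data; }       // set #PCDATA later
    std::string toXml() const { return "<item>" + data_ + "</item>"; }
private:
    std::string data_;
};

// Hypothetical class for <list>, assumed to be declared as (item*).
class List {
public:
    List() = default;                                                     // initially empty
    explicit List(std::vector<Item> items) : items_(std::move(items)) {}  // full set of children
    void addItem(const Item& item) { items_.push_back(item); }            // add children later
    std::string toXml() const {
        std::string out = "<list>";
        for (const Item& i : items_) out += i.toXml();
        return out + "</list>";
    }
private:
    std::vector<Item> items_;
};
```

Either style yields the same document; the empty-then-fill form suits data that arrives incrementally, while the all-at-once form suits data already in hand.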

Standards Conformance

XML C++ Class Generator conforms to the following "Standards":

Directory Structure

The XML C++ Class Generator has the following file and directory structure:

license.html licensing agreement
bin/         Standalone Class Generator "xmlcg" 
doc/         API documentation
include/     Header files
lib/         XML and support libraries 
mesg/        Error message files (including cause/action information in the
            .msg)
sample/      Example usage 

Table F-6 lists the libraries included with XML C++ Class Generator.

Table F-6 XML C++ Class Generator Libraries  
XML C++ Class Generator Library  Description 

libxml8.a 

XML Parser/XSL Processor 

libxmlg8.a 

XML Class Generator  

libxmlc8.a 

Compatibility library needed to link with Oracle 8.1.5  

libcore8.a 

CORE functions 

libnls8.a 

National Language Support  


Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.