org.apache.xml.utils
Interface DTM


public interface DTM

**For internal use only** DTM is an XML document model expressed as a table rather than an object tree. It aims to provide an interface to a parse tree while creating very few objects.

Nodes in the DTM are identified by integer "handles". A handle must be unique within a document. A processing application must be careful, because a handle is not unique within a process: two handles with the same value can belong to different documents. It is up to the calling application to keep track of the association of a document with its handle.
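One way a calling application might keep track of that association is to pair each handle with the document that issued it. The sketch below is only an illustration: `NodeRef` is a hypothetical class, not part of this package, and a plain `Object` stands in for whichever DTM instance owns the handle.

```java
// Hypothetical helper, not part of DTM: pairs a handle with its owning document
// so that equal handle values from different documents are never confused.
final class NodeRef {
    final Object document; // stand-in for the DTM instance that issued the handle
    final int handle;

    NodeRef(Object document, int handle) {
        this.document = document;
        this.handle = handle;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof NodeRef)) {
            return false;
        }
        NodeRef other = (NodeRef) o;
        // Same node only if it is the same document AND the same handle.
        return document == other.document && handle == other.handle;
    }

    @Override
    public int hashCode() {
        return 31 * System.identityHashCode(document) + handle;
    }

    public static void main(String[] args) {
        Object docA = new Object();
        Object docB = new Object();
        System.out.println(new NodeRef(docA, 5).equals(new NodeRef(docA, 5))); // prints "true"
        System.out.println(new NodeRef(docA, 5).equals(new NodeRef(docB, 5))); // prints "false"
    }
}
```

Comparing documents by identity is a design choice of this sketch; an application with its own document IDs could compare those instead.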

Namespace URLs, local-names, and expanded-names can all be represented by integer ID values. An expanded name combines the URL ID, held in the high two bytes, with the local-name ID, held in the low two bytes. Thus two expanded names can be compared in a single operation. Also, the symbol space for URLs and local-names must be limited to 32K each (if only positive values are to be used for index lookup), which means that these should not be part of a general string pool mechanism. Note that the namespace URL ID can be 0, which should mean that the namespace is null. Zero should also not be used for a local-name index.
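The packing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the DTM implementation: the class `ExpandedNamePacking` and its method names are hypothetical, namespace IDs are assumed to lie in 0 to 32767, and local-name IDs in 1 to 32767, so that the packed value stays non-negative.

```java
// Hypothetical helper, not part of DTM: packs a namespace ID and a
// local-name ID into a single int.
final class ExpandedNamePacking {
    static final int MAX_ID = 0x7FFF; // assumed 32K symbol space per ID kind

    // Namespace ID goes in the high two bytes, local-name ID in the low two bytes.
    static int pack(int namespaceId, int localNameId) {
        if (namespaceId < 0 || namespaceId > MAX_ID) {
            throw new IllegalArgumentException("namespace ID out of range");
        }
        if (localNameId < 1 || localNameId > MAX_ID) {
            // Zero is reserved: it is not used as a local-name index.
            throw new IllegalArgumentException("local-name ID out of range");
        }
        return (namespaceId << 16) | localNameId;
    }

    static int namespaceId(int expandedNameId) {
        return expandedNameId >>> 16;
    }

    static int localNameId(int expandedNameId) {
        return expandedNameId & 0xFFFF;
    }

    public static void main(String[] args) {
        int a = pack(3, 7);
        int b = pack(3, 7);
        // Equal expanded names yield equal packed IDs, so one int comparison suffices.
        System.out.println(a == b); // prints "true"
        System.out.println(namespaceId(a) + " " + localNameId(a)); // prints "3 7"
    }
}
```

A namespace ID of 0 (the null namespace) leaves the high two bytes empty, so the packed value then equals the local-name ID.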

The model of the tree, as well as the general navigation model, is that of XPath 1.0, for the moment. The model will be adapted to match the XPath 2.0 data model, XML Schema, and InfoSet.

DTM does _not_ directly support the W3C's Document Object Model. However, it attempts to come close enough that an implementation of DTM can be created that wraps a DOM.

State: In progress!!


Field Summary
static short ATTRIBUTE_NODE
          The node is an Attr.
static short CDATA_SECTION_NODE
          The node is a CDATASection.
static short COMMENT_NODE
          The node is a Comment.
static short DOCUMENT_FRAGMENT_NODE
          The node is a DocumentFragment.
static short DOCUMENT_NODE
          The node is a Document.
static short DOCUMENT_TYPE_NODE
          The node is a DocumentType.
static short ELEMENT_NODE
          The node is an Element.
static short ENTITY_NODE
          The node is an Entity.
static short ENTITY_REFERENCE_NODE
          The node is an EntityReference.
static short NAMESPACE_NODE
          The node is a namespace node.
static short NOTATION_NODE
          The node is a Notation.
static short PROCESSING_INSTRUCTION_NODE
          The node is a ProcessingInstruction.
static short TEXT_NODE
          The node is a Text node.
 
Method Summary
 void dispatchCharactersEvents(int nodeHandle, ContentHandler ch)
          Directly call the characters method on the passed ContentHandler for the string-value of the given node (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value).
 void dispatchToEvents(int nodeHandle, ContentHandler ch)
          Directly create SAX parser events from a subtree.
 boolean getDocumentAllDeclarationsProcessed()
          Return an indication of whether the processor has read the complete DTD.
 java.lang.String getDocumentBaseURI(int nodeHandle)
          Return the base URI of the document entity.
 java.lang.String getDocumentEncoding(int nodeHandle)
          Return the name of the character encoding scheme in which the document entity is expressed.
 java.lang.String getDocumentStandalone(int nodeHandle)
          Return an indication of the standalone status of the document, either "yes" or "no".
 int getDocumentSystemIdentifier(int nodeHandle)
          Return the system identifier of the document entity.
 int getDocumentTypeDeclarationPublicIdentifier()
          Return the public identifier of the external subset, normalized as described in 4.2.2 External Entities [XML].
 java.lang.String getDocumentTypeDeclarationSystemIdentifier()
          Return the system identifier of the external subset, if it exists.
 java.lang.String getDocumentVersion(int documentHandle)
          Return a string representing the XML version of the document.
 int getExpandedNameID(int nodeHandle)
          Given a node handle, return an ID that represents the node's expanded name.
 int getExpandedNameID(java.lang.String namespace, java.lang.String localName)
          Given an expanded name, return an ID.
 int getFirstAttribute(int nodeHandle)
          Given a node handle, get the handle of the node's first attribute.
 int getFirstChild(int nodeHandle)
          Given a node handle, get the handle of the node's first child.
 int getFirstNamespaceNode(int nodeHandle, boolean inScope)
          Given a node handle, get the handle of the node's first namespace node.
 int getLastChild(int nodeHandle)
          Given a node handle, advance to its last child.
 short getLevel(int nodeHandle)
          **For internal use only** Get the depth level of this node in the tree (equals 1 for a parentless node).
 java.lang.String getLocalName(int nodeHandle)
          Given a node handle, return its DOM-style localname.
 java.lang.String getLocalNameFromExpandedNameID(int ExpandedNameID)
          Given an expanded-name ID, return the local name part.
 java.lang.String getNamespaceFromExpandedNameID(int ExpandedNameID)
          Given an expanded-name ID, return the namespace URI part.
 java.lang.String getNamespaceURI(int nodeHandle)
          Given a node handle, return its DOM-style namespace URI (As defined in Namespaces, this is the declared URI which this node's prefix -- or default in lieu thereof -- was mapped to.)
 int getNextAttribute(int nodeHandle)
          Given a node handle, advance to the next attribute.
 int getNextDescendant(int subtreeRootHandle, int nodeHandle)
          Given a node handle, advance to its next descendant.
 int getNextFollowing(int axisContextHandle, int nodeHandle)
          Given a node handle, advance to the next node on the following axis.
 int getNextNamespaceNode(int namespaceHandle, boolean inScope)
          Given a namespace handle, advance to the next namespace.
 int getNextPreceding(int axisContextHandle, int nodeHandle)
          Given a node handle, advance to the next node on the preceding axis.
 int getNextSibling(int nodeHandle)
          Given a node handle, advance to its next sibling.
 java.lang.String getNodeName(int nodeHandle)
          Given a node handle, return its DOM-style node name.
 int getNodeType(int nodeHandle)
          Given a node handle, return its DOM-style node type.
 java.lang.String getNodeValue(int nodeHandle)
          Given a node handle, return its node value.
 int getParent(int nodeHandle)
          Given a node handle, find its parent node.
 java.lang.String getPrefix(int nodeHandle)
          Given a node handle, return its DOM-style name prefix.
 int getPreviousSibling(int nodeHandle)
          Given a node handle, find its preceding sibling.
 java.lang.String getStringValue(int nodeHandle)
          Get the string-value of a node as a String object (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value).
 char[] getStringValueChunk(int nodeHandle, int chunkIndex, int[] startAndLen)
          Get a character array chunk in the string-value of a node.
 int getStringValueChunkCount(int nodeHandle)
          Get number of character array chunks in the string-value of a node.
 boolean isAttributeSpecified(int attributeHandle)
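          Return whether the attribute was actually specified in the start-tag of its element, or was defaulted from the DTD.
 boolean isCharacterElementContentWhitespace(int nodeHandle)
          Return whether the character data is white space appearing within element content.
 boolean isDocumentAllDeclarationsProcessed(int documentHandle)
          Return an indication of whether the processor has read the complete DTD.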
 boolean isNodeAfter(int nodeHandle1, int nodeHandle2)
          Figure out whether nodeHandle2 should be considered as being later in the document than nodeHandle1, in Document Order as defined by the XPath model.
 void setFeature(java.lang.String featureId, boolean state)
          Set an implementation dependent feature.
 void setParseBlockSize(int blockSizeSuggestion)
          Set a suggested parse block size for the parser.
 

Field Detail

ELEMENT_NODE

public static final short ELEMENT_NODE
The node is an Element.

ATTRIBUTE_NODE

public static final short ATTRIBUTE_NODE
The node is an Attr.

TEXT_NODE

public static final short TEXT_NODE
The node is a Text node.

CDATA_SECTION_NODE

public static final short CDATA_SECTION_NODE
The node is a CDATASection.

ENTITY_REFERENCE_NODE

public static final short ENTITY_REFERENCE_NODE
The node is an EntityReference.

ENTITY_NODE

public static final short ENTITY_NODE
The node is an Entity.

PROCESSING_INSTRUCTION_NODE

public static final short PROCESSING_INSTRUCTION_NODE
The node is a ProcessingInstruction.

COMMENT_NODE

public static final short COMMENT_NODE
The node is a Comment.

DOCUMENT_NODE

public static final short DOCUMENT_NODE
The node is a Document.

DOCUMENT_TYPE_NODE

public static final short DOCUMENT_TYPE_NODE
The node is a DocumentType.

DOCUMENT_FRAGMENT_NODE

public static final short DOCUMENT_FRAGMENT_NODE
The node is a DocumentFragment.

NOTATION_NODE

public static final short NOTATION_NODE
The node is a Notation.

NAMESPACE_NODE

public static final short NAMESPACE_NODE
The node is a namespace node.

Method Detail

setParseBlockSize

public void setParseBlockSize(int blockSizeSuggestion)
Set a suggested parse block size for the parser.
Parameters:
blockSizeSuggestion - Suggested size of the parse blocks, in bytes.

setFeature

public void setFeature(java.lang.String featureId,
                       boolean state)
Set an implementation dependent feature.
Parameters:
featureId - A feature URL.
state - true if this feature should be on, false otherwise.

getFirstChild

public int getFirstChild(int nodeHandle)
Given a node handle, get the handle of the node's first child. If not yet resolved, waits for more nodes to be added to the document and tries again.
Parameters:
nodeHandle - int Handle of the node.
Returns:
int DTM node-number of first child, or -1 to indicate none exists.

getLastChild

public int getLastChild(int nodeHandle)
Given a node handle, advance to its last child. If not yet resolved, waits for more nodes to be added to the document and tries again.
Parameters:
nodeHandle - int Handle of the node.
Returns:
int Node-number of last child, or -1 to indicate none exists.

getFirstAttribute

public int getFirstAttribute(int nodeHandle)
Given a node handle, get the handle of the node's first attribute.
Parameters:
nodeHandle - int Handle of the node.
Returns:
Handle of first attribute, or -1 to indicate none exists.

getFirstNamespaceNode

public int getFirstNamespaceNode(int nodeHandle,
                                 boolean inScope)
Given a node handle, get the handle of the node's first namespace node. If not yet resolved, waits for more nodes to be added to the document and tries again.
Parameters:
nodeHandle - handle to node, which should probably be an element node, but need not be.
inScope - true if all namespaces in scope should be returned, false if only the namespace declarations should be returned.
Returns:
handle of first namespace, or -1 to indicate none exists.

getNextSibling

public int getNextSibling(int nodeHandle)
Given a node handle, advance to its next sibling. If not yet resolved, waits for more nodes to be added to the document and tries again.
Parameters:
nodeHandle - int Handle of the node.
Returns:
int Node-number of next sibling, or -1 to indicate none exists.
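The first-child and next-sibling calls combine into the usual loop over a node's children, with -1 ending the iteration. The sketch below is illustrative only: `ChildNavigation` is a hypothetical stand-in that mirrors just the two method signatures documented above, and the table-backed sample tree is invented for the example.

```java
// Hypothetical stand-in exposing only the two navigation calls used here.
interface ChildNavigation {
    int getFirstChild(int nodeHandle);
    int getNextSibling(int nodeHandle);
}

final class ChildWalk {
    static final int NULL_HANDLE = -1; // the "none exists" value documented above

    // Counts children by following the first-child link, then next-sibling links.
    static int countChildren(ChildNavigation nav, int parentHandle) {
        int count = 0;
        for (int child = nav.getFirstChild(parentHandle);
             child != NULL_HANDLE;
             child = nav.getNextSibling(child)) {
            count++;
        }
        return count;
    }

    // Toy table-backed tree: node 0 has children 1, 2, 3; node 2 has child 4.
    static ChildNavigation sampleTree() {
        final int[] firstChild  = { 1, -1, 4, -1, -1 };
        final int[] nextSibling = { -1, 2, 3, -1, -1 };
        return new ChildNavigation() {
            public int getFirstChild(int h)  { return firstChild[h]; }
            public int getNextSibling(int h) { return nextSibling[h]; }
        };
    }

    public static void main(String[] args) {
        System.out.println(countChildren(sampleTree(), 0)); // prints "3"
    }
}
```

Because the links are plain integers in arrays, the loop allocates no node objects, which is the point of the table model.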

getPreviousSibling

public int getPreviousSibling(int nodeHandle)
Given a node handle, find its preceding sibling. WARNING: DTM is asymmetric; this operation is resolved by search, and is relatively expensive.
Parameters:
nodeHandle - the id of the node.
Returns:
int Node-number of the previous sib, or -1 to indicate none exists.

getNextAttribute

public int getNextAttribute(int nodeHandle)
Given a node handle, advance to the next attribute. If an element, we advance to its first attribute; if an attr, we advance to the next attr on the same node.
Parameters:
nodeHandle - int Handle of the node.
Returns:
int DTM node-number of the resolved attr, or -1 to indicate none exists.

getNextNamespaceNode

public int getNextNamespaceNode(int namespaceHandle,
                                boolean inScope)
Given a namespace handle, advance to the next namespace.
Parameters:
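namespaceHandle - handle to node which must be of type NAMESPACE_NODE.
inScope - true if all namespaces in scope should be returned, false if only the namespace declarations should be returned.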
Returns:
handle of next namespace, or -1 to indicate none exists.

getNextDescendant

public int getNextDescendant(int subtreeRootHandle,
                             int nodeHandle)
Given a node handle, advance to its next descendant. If not yet resolved, waits for more nodes to be added to the document and tries again.
Parameters:
subtreeRootHandle - handle of the root of the subtree being traversed.
nodeHandle - int Handle of the node.
Returns:
handle of next descendant, or -1 to indicate none exists.
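A caller can enumerate a subtree by starting at the subtree root and repeating the call until -1 comes back. The sketch below assumes that descendants are visited in document order and that the walk never leaves the subtree; the class `DescendantWalk`, its array tables, and the sample tree are all hypothetical and are not taken from any DTM implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical table-backed tree, for illustration only.
final class DescendantWalk {
    static final int NULL_HANDLE = -1;

    // Toy tree: node 0 has children 1, 2, 3; node 2 has child 4.
    private final int[] parent      = { -1, 0, 0, 0, 2 };
    private final int[] firstChild  = { 1, -1, 4, -1, -1 };
    private final int[] nextSibling = { -1, 2, 3, -1, -1 };

    // Assumed semantics: next node in document order that is still below the subtree root.
    int getNextDescendant(int subtreeRootHandle, int nodeHandle) {
        if (firstChild[nodeHandle] != NULL_HANDLE) {
            return firstChild[nodeHandle];
        }
        int node = nodeHandle;
        // Climb until a next sibling is found, stopping at the subtree root.
        while (node != subtreeRootHandle && node != NULL_HANDLE) {
            if (nextSibling[node] != NULL_HANDLE) {
                return nextSibling[node];
            }
            node = parent[node];
        }
        return NULL_HANDLE;
    }

    // Collects every descendant of the given root by repeated calls.
    List<Integer> descendants(int subtreeRootHandle) {
        List<Integer> out = new ArrayList<>();
        for (int n = getNextDescendant(subtreeRootHandle, subtreeRootHandle);
             n != NULL_HANDLE;
             n = getNextDescendant(subtreeRootHandle, n)) {
            out.add(n);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new DescendantWalk().descendants(0)); // prints "[1, 2, 4, 3]"
    }
}
```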

getNextFollowing

public int getNextFollowing(int axisContextHandle,
                            int nodeHandle)
Given a node handle, advance to the next node on the following axis.
Parameters:
axisContextHandle - the start of the axis that is being traversed.
nodeHandle - the id of the node.
Returns:
handle of the next node on the following axis, or -1 to indicate none exists.

getNextPreceding

public int getNextPreceding(int axisContextHandle,
                            int nodeHandle)
Given a node handle, advance to the next node on the preceding axis.
Parameters:
axisContextHandle - the start of the axis that is being traversed.
nodeHandle - the id of the node.
Returns:
int Node-number of the next node on the preceding axis, or -1 to indicate none exists.

getParent

public int getParent(int nodeHandle)
Given a node handle, find its parent node.
Parameters:
nodeHandle - the id of the node.
Returns:
int Node-number of parent, or -1 to indicate none exists.

getStringValue

public java.lang.String getStringValue(int nodeHandle)
Get the string-value of a node as a String object (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value).
Parameters:
nodeHandle - The node ID.
Returns:
A string object that represents the string-value of the given node.

getStringValueChunkCount

public int getStringValueChunkCount(int nodeHandle)
Get number of character array chunks in the string-value of a node. (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value). Note that a single text node may have multiple text chunks.
Parameters:
nodeHandle - The node ID.
Returns:
number of character array chunks in the string-value of a node.

getStringValueChunk

public char[] getStringValueChunk(int nodeHandle,
                                  int chunkIndex,
                                  int[] startAndLen)
Get a character array chunk in the string-value of a node. (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value). Note that a single text node may have multiple text chunks.
Parameters:
nodeHandle - The node ID.
chunkIndex - Which chunk to get.
startAndLen - An array of 2 where the start position and length of the chunk will be returned.
Returns:
The character array reference where the chunk occurs.
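The two chunk methods are meant to be used together: ask for the chunk count, then fetch each chunk and read the start and length that come back in `startAndLen`. The sketch below shows one way to reassemble a string-value this way. `ChunkSource` is a hypothetical stand-in that copies only the two signatures documented above, and the sample data (one shared buffer holding two chunks) is invented.

```java
// Hypothetical stand-in exposing only the two chunk methods.
interface ChunkSource {
    int getStringValueChunkCount(int nodeHandle);
    char[] getStringValueChunk(int nodeHandle, int chunkIndex, int[] startAndLen);
}

final class ChunkAssembly {
    // Concatenates all chunks of a node's string-value.
    static String assemble(ChunkSource src, int nodeHandle) {
        StringBuilder sb = new StringBuilder();
        int[] startAndLen = new int[2]; // [0] = start offset, [1] = length
        int count = src.getStringValueChunkCount(nodeHandle);
        for (int i = 0; i < count; i++) {
            char[] buffer = src.getStringValueChunk(nodeHandle, i, startAndLen);
            // The returned array may be larger than the chunk; copy only the reported slice.
            sb.append(buffer, startAndLen[0], startAndLen[1]);
        }
        return sb.toString();
    }

    // Invented sample: both chunks live inside one larger shared buffer.
    static ChunkSource sample() {
        final char[] shared = "xxHello, world!yy".toCharArray();
        final int[][] chunks = { { 2, 7 }, { 9, 6 } };
        return new ChunkSource() {
            public int getStringValueChunkCount(int nodeHandle) {
                return chunks.length;
            }
            public char[] getStringValueChunk(int nodeHandle, int chunkIndex, int[] startAndLen) {
                startAndLen[0] = chunks[chunkIndex][0];
                startAndLen[1] = chunks[chunkIndex][1];
                return shared;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(assemble(sample(), 0)); // prints "Hello, world!"
    }
}
```

Reusing one `startAndLen` array across calls keeps the loop free of per-chunk allocations.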

getExpandedNameID

public int getExpandedNameID(int nodeHandle)
Given a node handle, return an ID that represents the node's expanded name.
Parameters:
nodeHandle - The handle to the node in question.
Returns:
the expanded-name id of the node.

getExpandedNameID

public int getExpandedNameID(java.lang.String namespace,
                             java.lang.String localName)
Given an expanded name, return an ID. If the expanded-name does not exist in the internal tables, the entry will be created, and the ID will be returned. Any additional nodes that are created that have this expanded name will use this ID.
Parameters:
namespace - The namespace URI part of the expanded name.
localName - The local name part of the expanded name.
Returns:
the expanded-name id.

getLocalNameFromExpandedNameID

public java.lang.String getLocalNameFromExpandedNameID(int ExpandedNameID)
Given an expanded-name ID, return the local name part.
Parameters:
ExpandedNameID - an ID that represents an expanded-name.
Returns:
String Local name of this node.

getNamespaceFromExpandedNameID

public java.lang.String getNamespaceFromExpandedNameID(int ExpandedNameID)
Given an expanded-name ID, return the namespace URI part.
Parameters:
ExpandedNameID - an ID that represents an expanded-name.
Returns:
String URI value of this node's namespace, or null if no namespace was resolved.
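The create-on-demand behaviour of the name-based `getExpandedNameID`, together with the two reverse lookups, can be sketched with a small interning table. This is an assumption-laden illustration rather than the DTM's own tables: `ExpandedNameTable` is a hypothetical class, and it hands out sequential IDs instead of the packed two-byte layout described in the class overview.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interning table, for illustration only.
final class ExpandedNameTable {
    private final Map<List<String>, Integer> ids = new HashMap<>();
    private final List<String[]> names = new ArrayList<>();

    // Returns the existing ID for this expanded name, creating an entry if needed.
    int getExpandedNameID(String namespace, String localName) {
        // Arrays.asList permits a null namespace and compares element-wise.
        List<String> key = Arrays.asList(namespace, localName);
        Integer id = ids.get(key);
        if (id == null) {
            id = names.size();
            names.add(new String[] { namespace, localName });
            ids.put(key, id);
        }
        return id;
    }

    String getLocalNameFromExpandedNameID(int expandedNameID) {
        return names.get(expandedNameID)[1];
    }

    // Returns null when the name has no namespace.
    String getNamespaceFromExpandedNameID(int expandedNameID) {
        return names.get(expandedNameID)[0];
    }

    public static void main(String[] args) {
        ExpandedNameTable table = new ExpandedNameTable();
        int first = table.getExpandedNameID("http://example.com/ns", "item");
        int again = table.getExpandedNameID("http://example.com/ns", "item");
        System.out.println(first == again); // prints "true"
        System.out.println(table.getLocalNameFromExpandedNameID(first)); // prints "item"
    }
}
```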

getNodeName

public java.lang.String getNodeName(int nodeHandle)
Given a node handle, return its DOM-style node name.
Parameters:
nodeHandle - the id of the node.
Returns:
String Name of this node.

getLocalName

public java.lang.String getLocalName(int nodeHandle)
Given a node handle, return its DOM-style localname. (As defined in Namespaces, this is the portion of the name after any colon character)
Parameters:
nodeHandle - the id of the node.
Returns:
String Local name of this node.

getPrefix

public java.lang.String getPrefix(int nodeHandle)
Given a node handle, return its DOM-style name prefix. (As defined in Namespaces, this is the portion of the name before any colon character)
Parameters:
nodeHandle - the id of the node.
Returns:
String prefix of this node's name, or null if no explicit namespace prefix was given.

getNamespaceURI

public java.lang.String getNamespaceURI(int nodeHandle)
Given a node handle, return its DOM-style namespace URI (As defined in Namespaces, this is the declared URI which this node's prefix -- or default in lieu thereof -- was mapped to.)
Parameters:
nodeHandle - the id of the node.
Returns:
String URI value of this node's namespace, or null if no namespace was resolved.

getNodeValue

public java.lang.String getNodeValue(int nodeHandle)
Given a node handle, return its node value. This is mostly as defined by the DOM, but may ignore some conveniences.

Parameters:
nodeHandle - The node id.
Returns:
String Value of this node, or null if not meaningful for this node type.

getNodeType

public int getNodeType(int nodeHandle)
Given a node handle, return its DOM-style node type.
Parameters:
nodeHandle - The node id.
Returns:
int Node type, as per the DOM's Node._NODE constants.

getLevel

public short getLevel(int nodeHandle)
**For internal use only** Get the depth level of this node in the tree (equals 1 for a parentless node).
Parameters:
nodeHandle - The node id.
Returns:
the number of ancestors, plus one
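The "number of ancestors, plus one" definition can be expressed as a walk up the parent links. The sketch below is only an illustration of that definition, not how any implementation computes the level: `ParentNavigation` is a hypothetical stand-in for the documented `getParent` call, and the sample tree is invented.

```java
// Hypothetical stand-in exposing only the parent lookup.
interface ParentNavigation {
    int getParent(int nodeHandle);
}

final class LevelSketch {
    // Level = number of ancestors + 1, so a parentless node is at level 1.
    static short level(ParentNavigation nav, int nodeHandle) {
        short level = 1;
        for (int p = nav.getParent(nodeHandle); p != -1; p = nav.getParent(p)) {
            level++;
        }
        return level;
    }

    // Toy tree: node 0 is the root; 1, 2, 3 are its children; 4 is a child of 2.
    static ParentNavigation sampleTree() {
        final int[] parent = { -1, 0, 0, 0, 2 };
        return h -> parent[h];
    }

    public static void main(String[] args) {
        System.out.println(level(sampleTree(), 4)); // prints "3"
    }
}
```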

getDocumentBaseURI

public java.lang.String getDocumentBaseURI(int nodeHandle)
Return the base URI of the document entity. If it is not known (because the document was parsed from a socket connection or from standard input, for example), the value of this property is unknown.
Parameters:
nodeHandle - The node id, which can be any valid node handle.
Returns:
the document base URI String object or null if unknown.

getDocumentSystemIdentifier

public int getDocumentSystemIdentifier(int nodeHandle)
Return the system identifier of the document entity. If it is not known, the value of this property is unknown.
Parameters:
nodeHandle - The node id, which can be any valid node handle.
Returns:
the system identifier String object or null if unknown.

getDocumentEncoding

public java.lang.String getDocumentEncoding(int nodeHandle)
Return the name of the character encoding scheme in which the document entity is expressed.
Parameters:
nodeHandle - The node id, which can be any valid node handle.
Returns:
the document encoding String object.

getDocumentStandalone

public java.lang.String getDocumentStandalone(int nodeHandle)
Return an indication of the standalone status of the document, either "yes" or "no". This property is derived from the optional standalone document declaration in the XML declaration at the beginning of the document entity, and has no value if there is no standalone document declaration.
Parameters:
nodeHandle - The node id, which can be any valid node handle.
Returns:
the document standalone String object, either "yes", "no", or null.

getDocumentVersion

public java.lang.String getDocumentVersion(int documentHandle)
Return a string representing the XML version of the document. This property is derived from the XML declaration optionally present at the beginning of the document entity, and has no value if there is no XML declaration.
Parameters:
documentHandle - the document handle.
Returns:
the document version String object

getDocumentAllDeclarationsProcessed

public boolean getDocumentAllDeclarationsProcessed()
Return an indication of whether the processor has read the complete DTD. Its value is a boolean. If it is false, then certain properties (indicated in their descriptions below) may be unknown. If it is true, those properties are never unknown.
Returns:
true if all declarations were processed; false otherwise.

getDocumentTypeDeclarationSystemIdentifier

public java.lang.String getDocumentTypeDeclarationSystemIdentifier()
Return the [system identifier] property of the document type declaration information item: the system identifier of the external subset, if it exists. Otherwise this property has no value.
Returns:
the system identifier String object, or null if there is none.

getDocumentTypeDeclarationPublicIdentifier

public int getDocumentTypeDeclarationPublicIdentifier()
Return the public identifier of the external subset, normalized as described in 4.2.2 External Entities [XML]. If there is no external subset or if it has no public identifier, this property has no value.
Returns:
the public identifier String object, or null if there is none.

isNodeAfter

public boolean isNodeAfter(int nodeHandle1,
                           int nodeHandle2)
Figure out whether nodeHandle2 should be considered as being later in the document than nodeHandle1, in Document Order as defined by the XPath model. This may not agree with the ordering defined by other XML applications.

There are some cases where ordering isn't defined, and neither are the results of this function -- though we'll generally return true. TODO: Make sure this does the right thing with attribute nodes!!!

Parameters:
nodeHandle1 - handle of the first node to compare.
nodeHandle2 - handle of the second node to compare.
Returns:
false if nodeHandle2 comes before nodeHandle1, otherwise return true. You can think of this as (nodeHandle1.documentOrderPosition <= nodeHandle2.documentOrderPosition).

isCharacterElementContentWhitespace

public boolean isCharacterElementContentWhitespace(int nodeHandle)
Return the [element content whitespace] property of a character information item: a boolean indicating whether the character is white space appearing within element content (see [XML], 2.10 "White Space Handling"). Note that validating XML processors are required by XML 1.0 to provide this information. If there is no declaration for the containing element, this property has no value for white space characters. If no declaration has been read, but the [all declarations processed] property of the document information item is false (so there may be an unread declaration), then the value of this property is unknown for white space characters. It is always false for characters that are not white space.
Parameters:
nodeHandle - the node ID.
Returns:
true if the character data is whitespace; false otherwise.

isDocumentAllDeclarationsProcessed

public boolean isDocumentAllDeclarationsProcessed(int documentHandle)
Return the [all declarations processed] property of the document information item. This property is not strictly speaking part of the infoset of the document. Rather, it is an indication of whether the processor has read the complete DTD. Its value is a boolean. If it is false, then certain properties (indicated in their descriptions below) may be unknown. If it is true, those properties are never unknown.
Parameters:
documentHandle - A node handle that must identify a document.
Returns:
true if all declarations were processed; false otherwise.

isAttributeSpecified

public boolean isAttributeSpecified(int attributeHandle)
Return the [specified] property of an attribute: a flag indicating whether this attribute was actually specified in the start-tag of its element, or was defaulted from the DTD.
Parameters:
attributeHandle - the attribute handle.
Returns:
true if the attribute was specified; false if it was defaulted.

dispatchCharactersEvents

public void dispatchCharactersEvents(int nodeHandle,
                                     ContentHandler ch)
                              throws SAXException
Directly call the characters method on the passed ContentHandler for the string-value of the given node (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value). Multiple calls to the ContentHandler's characters methods may well occur for a single call to this method.
Parameters:
nodeHandle - The node ID.
ch - A non-null reference to a ContentHandler.
Throws:
SAXException -  
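Because a single call may deliver the string-value through several `characters` callbacks, a handler that wants the whole value has to accumulate the pieces. The sketch below shows such a handler built on the standard SAX `DefaultHandler`. `TextCollector` is a hypothetical class, and the two direct `characters` calls in `main` merely simulate what a DTM might send.

```java
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that gathers text delivered over multiple characters() calls.
final class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append only the reported slice; the array may hold unrelated data.
        text.append(ch, start, length);
    }

    String text() {
        return text.toString();
    }

    public static void main(String[] args) {
        TextCollector collector = new TextCollector();
        // Simulated delivery of one string-value in two pieces.
        collector.characters("Hello, ".toCharArray(), 0, 7);
        collector.characters("xxworld".toCharArray(), 2, 5);
        System.out.println(collector.text()); // prints "Hello, world"
    }
}
```

An instance of this class could be passed as the `ch` argument, since `DefaultHandler` implements `ContentHandler`.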

dispatchToEvents

public void dispatchToEvents(int nodeHandle,
                             ContentHandler ch)
                      throws SAXException
Directly create SAX parser events from a subtree.
Parameters:
nodeHandle - The node ID.
ch - A non-null reference to a ContentHandler.
Throws:
SAXException -  


Copyright © 2000 Apache XML Project. All Rights Reserved.