Don't use XML where it doesn't make sense. XML is not a panacea.
Transferring and parsing a large number of XML files will not give
you good performance: processing XML is memory, CPU, and network intensive.
|
| |
Avoid creating a new parser each time you parse; reuse parser
instances. A pool of reusable parser instances might be a good idea
if you have multiple threads parsing at the same time.
The parser configuration also affects the performance of the parser.
To evaluate parser performance with DTDs, use the
DTDConfiguration (note: you can build Xerces with DTD-only support
using the dtdjars build target).
To test the performance of XML Schema validation, turn on the schema validation feature and use
the StandardParserConfiguration (note: this is the default parser
configuration).
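The reuse advice above can be sketched with the standard JAXP API. This is a minimal illustration, not Xerces' own pooling mechanism: the class name ParserPool and the helper countElements are hypothetical, and the sketch assumes the underlying parser supports SAXParser.reset() (the parser bundled with the JDK does).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical fixed-size pool of reusable SAX parsers (illustrative only). */
class ParserPool {
    private final BlockingQueue<SAXParser> idle;

    ParserPool(int size) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.newSAXParser()); // parsers are created once, up front
        }
    }

    /** Borrows a parser, parses the input, then returns the parser to the pool. */
    void parse(InputStream in, DefaultHandler handler) throws Exception {
        SAXParser parser = idle.take(); // blocks while every parser is in use
        try {
            parser.parse(in, handler);
        } finally {
            parser.reset();   // assumption: the implementation supports reset()
            idle.add(parser); // make the instance available to other threads
        }
    }

    /** Example helper: counts the element start tags in an XML string. */
    int countElements(String xml) throws Exception {
        final int[] count = {0};
        parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
              new DefaultHandler() {
                  @Override
                  public void startElement(String uri, String localName,
                                           String qName, Attributes atts) {
                      count[0]++;
                  }
              });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        ParserPool pool = new ParserPool(2);
        for (int i = 0; i < 5; i++) {
            // Five parses are served by only two parser instances.
            System.out.println(pool.countElements("<a><b/><b/></a>"));
        }
    }
}
```

A blocking queue keeps the sketch short; a thread-local parser per worker thread would be another reasonable way to avoid creating a parser for every parse.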
|
XML Application Performance
| |
- Validation -- if you don't need validation (and infoset augmentation)
of XML documents,
don't include validators (DTD or XML Schema) in the pipeline.
Including validator components in the pipeline results in a performance
hit for your application: if a validator component is present in the pipeline,
Xerces will try to augment the infoset even if the validation feature is set to false.
If you only need to validate against DTDs, don't include
the XML Schema validator in the pipeline.
- DOCTYPE -- if you don't need validation,
avoid using a DOCTYPE line in your XML document.
The current version of the parser always reads the DTD if a DOCTYPE line
is specified, even when the validation feature is set to false.
- Deferred DOM --
by default, the DOM feature defer-node-expansion is true,
causing DOM nodes to be expanded as the tree is traversed.
Performance tests by Dennis Sosnoski showed that Xerces DOM with
deferred node expansion performs poorly and has a large memory footprint
for small documents (0K-10K). Thus, for best performance when using Xerces DOM
with smaller documents, you should disable the deferred node expansion feature.
For larger documents (~100K and higher), the deferred DOM offers
better performance than the non-deferred DOM but has a large memory footprint.
- SAX --
if the memory usage of DOM is a concern, consider SAX;
the SAX parser uses very little memory and notifies the
application as parts of the document are parsed.
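As a rough illustration of the Deferred DOM and SAX items above, the following sketch uses the standard JAXP API. It assumes the JAXP implementation in use is Xerces or Xerces-derived (as the JDK's built-in parser is), so that the defer-node-expansion feature URI is recognized; other implementations may reject it with a ParserConfigurationException. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helpers for parsing small documents (illustrative only). */
class SmallDocumentParsing {
    // Xerces feature identifier for deferred node expansion.
    static final String DEFER_NODE_EXPANSION =
            "http://apache.org/xml/features/dom/defer-node-expansion";

    /** Builds a fully expanded (non-deferred) DOM tree without validation. */
    static Document parseEagerDom(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        // Assumption: the underlying parser recognizes this Xerces feature.
        factory.setFeature(DEFER_NODE_EXPANSION, false);
        return factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Counts elements with SAX; no document tree is kept in memory. */
    static int countElementsWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++; // called as each start tag is parsed
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b/><b/></a>";
        Document doc = parseEagerDom(xml);
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(countElementsWithSax(xml));
    }
}
```

Both helpers create a new factory on every call to keep the sketch self-contained; in a real application the factory and parser instances would be reused.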