[ANNOUNCEMENT] VTD-XML releases under GPL
I am pleased to announce that version 0.5 of VTD-XML -- a new,
non-extractive, Java-based XML processing API licensed under the GPL
-- is now freely available on sourceforge.net. Source code,
documentation, a detailed description of the API and code examples
are on the project page there.
Capable of random access, VTD-XML attempts to be both memory
efficient and high performance. The starting point of this project is
the observation that, for XML documents that don't declare entities
in a DTD, tokenization can be done by recording only the starting
offset and length of each token. A discussion of this subject appeared
in a recent article on xml.com
(http://www.xml.com/pub/a/2004/05/19/parsing.html).
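To make the offset-and-length idea concrete, here is a minimal sketch in Java. It is not taken from the VTD-XML source; the class name OffsetLengthToken and the method firstElementName are hypothetical, and the scanner only handles the name of the first start tag. The point is that the token is described by two integers and its text is read from the original buffer only on demand.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of the VTD-XML API.
final class OffsetLengthToken {
    final int offset; // index of the token's first byte in the document
    final int length; // number of bytes the token spans

    OffsetLengthToken(int offset, int length) {
        this.offset = offset;
        this.length = length;
    }

    // Materialize the token text only when a caller actually asks for it.
    String text(byte[] doc) {
        return new String(doc, offset, length, StandardCharsets.UTF_8);
    }

    // Locate the name of the first start tag without copying any bytes.
    static OffsetLengthToken firstElementName(byte[] doc) {
        int i = 0;
        while (i < doc.length && doc[i] != '<') {
            i++;
        }
        if (i == doc.length) {
            throw new IllegalArgumentException("no start tag found");
        }
        int start = i + 1;
        int end = start;
        while (end < doc.length && doc[end] != '>' && doc[end] != ' ' && doc[end] != '/') {
            end++;
        }
        return new OffsetLengthToken(start, end - start);
    }
}
```

For the document `<root><a/></root>`, firstElementName records offset 1 and length 4, and text() reads "root" back out of the untouched buffer.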
The core technology of VTD-XML is a binary format specification called Virtual Token Descriptor (VTD). A VTD record is a 64-bit integer that encodes the starting offset, length, type and nesting depth of a token in an XML document. Because VTD records don't contain the actual token content, they work alongside the original XML document, which the processing model keeps intact in memory.
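As a rough sketch of how four fields can share one 64-bit integer, the Java class below packs and unpacks them with shifts and masks. The field widths (32-bit offset, 20-bit length, 8-bit depth, 4-bit type) and the class name PackedTokenRecord are assumptions chosen for illustration; they are not taken from the VTD specification, which defines its own layout.

```java
// Hypothetical illustration only; field widths are assumed, not the VTD layout.
final class PackedTokenRecord {
    static final int OFFSET_BITS = 32;
    static final int LENGTH_BITS = 20;
    static final int DEPTH_BITS = 8;
    static final int TYPE_BITS = 4; // 32 + 20 + 8 + 4 = 64 bits in total

    static final int LENGTH_SHIFT = OFFSET_BITS;
    static final int DEPTH_SHIFT = LENGTH_SHIFT + LENGTH_BITS;
    static final int TYPE_SHIFT = DEPTH_SHIFT + DEPTH_BITS;

    // Mask with the lowest `bits` bits set.
    static long mask(int bits) {
        return (1L << bits) - 1;
    }

    // Reject values that would not fit in their field.
    static void check(String name, long value, int bits) {
        if (value < 0 || value > mask(bits)) {
            throw new IllegalArgumentException(name + " out of range: " + value);
        }
    }

    // Combine the four fields into a single long.
    static long pack(int type, int depth, int length, int offset) {
        check("type", type, TYPE_BITS);
        check("depth", depth, DEPTH_BITS);
        check("length", length, LENGTH_BITS);
        check("offset", offset, OFFSET_BITS);
        return ((long) type << TYPE_SHIFT)
                | ((long) depth << DEPTH_SHIFT)
                | ((long) length << LENGTH_SHIFT)
                | (long) offset;
    }

    // Unsigned shift so the top field is read correctly even when bit 63 is set.
    static int type(long record) {
        return (int) (record >>> TYPE_SHIFT);
    }

    static int depth(long record) {
        return (int) ((record >>> DEPTH_SHIFT) & mask(DEPTH_BITS));
    }

    static int length(long record) {
        return (int) ((record >>> LENGTH_SHIFT) & mask(LENGTH_BITS));
    }

    static int offset(long record) {
        return (int) (record & mask(OFFSET_BITS));
    }
}
```

With records like this, a whole document's token table is just a long[] array sitting next to the original bytes, which is what keeps per-token overhead fixed at 8 bytes.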
VTD's memory-conserving features can be summarized as follows:
Our benchmark indicates that VTD-XML processes XML at a performance level similar to (and often better than) SAX with a NULL content handler. Total memory usage is typically between 1.3x and 1.6x the size of the document, where 1x is the document itself.
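For a back-of-the-envelope feel for that ratio, the sketch below assumes the only overhead beyond the document is one 8-byte record per token. That is a simplification: it ignores any other structures the processor keeps, and the class and method names are hypothetical.

```java
// Hypothetical illustration only; models memory as document + 8 bytes per token.
final class VtdMemoryEstimate {
    static final int BYTES_PER_RECORD = Long.BYTES; // one 64-bit record per token

    // Estimated total memory divided by document size (1.0 = document alone).
    static double ratio(long documentBytes, long tokenCount) {
        if (documentBytes <= 0 || tokenCount < 0) {
            throw new IllegalArgumentException("invalid sizes");
        }
        double recordBytes = (double) BYTES_PER_RECORD * tokenCount;
        return (documentBytes + recordBytes) / documentBytes;
    }
}
```

Under this model, a 1,000-byte document with 50 tokens comes out at 1.4, so the ratio depends mainly on how densely the document is tagged.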
Other features included in this release are:
In upcoming releases, we plan to add persistence support so that one can save VTD records to disk and load them back along with the XML documents, avoiding repetitive parsing in read-only situations. XPath support is also on the development roadmap. However, we would like to collect as many suggestions and bug reports as possible before taking the next step.
Your input and suggestions are very important in making VTD-XML a truly useful XML processor.
Thanks,
Jimmy Zhang
Received on Mon Jun 28 2004 - 13:27:57 CDT