Hi,

If you were to use XNI (or SAX for that matter) you're not really writing a parser. Your DOM builder would be an event handler which responds to callbacks to build the DOM.
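To make the "event handler" idea concrete, here is a minimal sketch using plain SAX from the JDK rather than XNI. The class name SaxDomBuilder and its build() helper are made up for illustration and are not part of Xerces. It builds the whole tree eagerly, so it only shows where a custom (lazy) builder would hook in, not how to make one lazy.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration: a DOM builder written as a SAX event handler.
// The parser drives it through callbacks; the handler never reads XML itself.
class SaxDomBuilder extends DefaultHandler {
    private final Document doc;
    // Stack of currently open nodes; the top is the parent of the next child.
    private final Deque<Node> open = new ArrayDeque<>();

    SaxDomBuilder(Document doc) {
        this.doc = doc;
        open.push(doc);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Element e = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        open.peek().appendChild(e);
        open.push(e);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks; each chunk becomes its own text node.
        open.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
    }

    // Convenience entry point (an assumption of this sketch): parse a string into a DOM.
    static Document build(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new SaxDomBuilder(doc));
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build("<order id=\"7\"><item>pen</item></order>");
        System.out.println(doc.getDocumentElement().getAttribute("id"));
    }
}
```

A lazy implementation would replace the createElement/appendChild calls with writes into whatever compact structure it wants to keep, and create real node objects only on access.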
The Xerces DOM builder isn't lazy. It reads the whole XML document into memory before returning control to the user. What is lazy is the materialization of nodes in the deferred DOM implementation. Instead of building the DOM using the standard API methods, Xerces calls methods specific to the deferred DOM to build table-like structures (stored within the Document node) which are more compact than actual DOM node objects. As the user walks the DOM tree, these tables are read to fill in the node objects in the tree. This does improve memory usage when only a fraction of the tree is accessed, but it's nowhere near as memory-efficient as an implementation which lazily loads data from the parser or is able to unload nodes as you suggest.

If you want your DOM implementation to pull data from the parser lazily you could try using an XMLPullParserConfiguration [1] or a standard API like StAX (not supported by Xerces) which can be used to incrementally parse the document.

Thanks.

[1] http://xerces.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/parser/XMLPullParserConfiguration.html

Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

Dominik Rauch <e0825...@student.tuwien.ac.at> wrote on 04/10/2012 03:13:38 PM:

> Hi Michael!
>
> XNI sounds neat, but do you really think we need to write our own parser as
> well? We want to parse XML just as everybody else does, we just want it to
> do that lazily - like the lazy Xerces parser. It would just be important to
> be able to reload some parts later on in case we've dismissed it from the
> memory in order to lower RAM usage.
>
> Is that not possible with the current Xerces parser? How does the current
> lazy implementation work then?
>
> Best regards,
> D.R.
>
> PS: Now posting from OldNabble instead of Outlook, hope this helps.
>
>
>
> Michael Glavassevich-3 wrote:
> >
> > Hi,
> >
> > Have you had a read through the XNI manual [1]? This is the framework on
> > which all parsers in Xerces are built.
> >
> > You should be able to reuse much of that infrastructure, in particular the
> > existing XMLParserConfigurations [2] which are the real guts of a parser
> > in Xerces.
> >
> > Thanks.
> >
> > [1] http://xerces.apache.org/xerces2-j/xni.html
> > [2] http://xerces.apache.org/xerces2-j/faq-xni.html#faq-3
> >
> > Michael Glavassevich
> > XML Technologies and WAS Development
> > IBM Toronto Lab
> > E-mail: mrgla...@ca.ibm.com
> > E-mail: mrgla...@apache.org
> >
> > "Dominik Rauch" <e0825...@student.tuwien.ac.at> wrote on 27/09/2012
> > 04:25:09 PM:
> >
> >> Hello Xerces-List!
> >>
> >> We’re currently thinking about writing an advanced lazy DOM
> >> implementation compliant with the W3C DOM specification.
> >> We know that there is already a Xerces lazy-loading-solution,
> >> however, it is never unloading nodes, which becomes a problem for
> >> very big DOM trees which do not fit into memory.
> >>
> >> There are some ideas and/or commercial products (like xDB), however,
> >> no open-source solution yet.
> >>
> >> We want to know if it is possible to replace the Xerces DOM parser
> >> with our own lazy implementation and reuse all the XPath/etc.
> >> features from Xerces or if we need to write everything from scratch.
> >>
> >> Hopefully you can give us a positive answer and maybe show us the
> >> main extension points where we would have to fit in our
> >> implementation (e.g. classes/packages we would have to re-implement
> >> / derive / etc.)
> >>
> >>
> >> Best regards,
> >> D.R.
> >> Technical University of Vienna
> >
> >
>
> --
> View this message in context: http://old.nabble.com/Lazy-DOM-with-regards-to-memory-usage-tp34488915p34513281.html
> Sent from the Xerces - J - Users mailing list archive at Nabble.com.
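P.S. As a footnote to the StAX suggestion in the reply at the top of this thread, here is a minimal sketch of what "pulling data from the parser lazily" can look like. It uses the StAX implementation bundled with the JDK, not Xerces, and the class LazyElementReader with its nextElementName() method is a hypothetical name invented for this example. The parser only advances when the caller asks for more, which is the property a lazily loading DOM would build on.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: hands out element names one at a time,
// consuming input from the underlying parser only on demand.
class LazyElementReader {
    private final XMLStreamReader reader;

    LazyElementReader(String xml) throws XMLStreamException {
        reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    }

    // Advances to the next start tag and returns its local name,
    // or null once the end of the document has been reached.
    String nextElementName() throws XMLStreamException {
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                return reader.getLocalName();
            }
        }
        return null;
    }

    public static void main(String[] args) throws XMLStreamException {
        LazyElementReader r = new LazyElementReader("<a><b/><c>text</c></a>");
        for (String name = r.nextElementName(); name != null; name = r.nextElementName()) {
            System.out.println(name);
        }
    }
}
```

Note that a forward-only reader like this cannot go back to re-read a part of the document that has been discarded; reloading unloaded nodes would need extra bookkeeping (for example, re-parsing from a remembered position), which this sketch does not attempt.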