JSR-173 fits very well with the cursor-style strategy on the store.
This is especially true of 173's XmlStreamReader interface - it
can avoid object creation when iterating through XML.  We've been
closely reviewing the JSR and providing feedback to the EG.
 
173 hasn't gone final yet, but we'll be working on implementing
the API both to read information out of the store and to connect
to fast parsers.
 
The difference between XmlCursor and XmlStreamReader is
that XmlCursor is random-access and read-write, while JSR-173 is
read-only and forward-only.  It is akin to the difference
between DOM and SAX usage.  So XmlCursor and XmlStreamReader
follow the same basic strategy, but apply it to different
use cases.
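
To make the read-only, forward-only side concrete, here is a minimal
sketch of a pull-style scan.  It assumes the javax.xml.stream names
(XMLStreamReader, XMLInputFactory) under which the API is available in
the JDK; names in the current JSR-173 draft may differ.  The class
ForwardOnlyScan and its countElements method are made up for
illustration and are not part of XMLBeans.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ForwardOnlyScan {

    // Counts start-element events by pulling them from a single reader.
    // The caller's code creates no per-element objects: the one reader
    // is simply advanced from event to event, and it cannot move back.
    static int countElements(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            // Wrapped so that callers need not declare a checked exception.
            throw new IllegalStateException("could not read XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<order><item/><item/></order>"));
    }
}
```

There is no way to step backwards or to change the document through
the reader, which is the part XmlCursor adds on top of the same
cursor idea.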

-----Original Message-----
From: Ted Leung [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 06, 2003 2:22 AM
To: [EMAIL PROTECTED]
Cc: Jakarta General List; [EMAIL PROTECTED]
Subject: Re: XMLBeans performance and source code status [Re: Proposal:
XMLBeans]


Eric,

What's the relationship between XmlCursor and the JSR-173 Streaming API 
for XML?

Ted

Eric Vasilik wrote:

>When working with XMLBeans in a strongly typed way (with a Schema), individual 
>objects are created for each piece of information, usually instances of simple and 
>complex Schema types.  However, you can also access and manipulate the XML in a 
>typeless manner.  What we've done with XMLBeans is provide access to the full XML 
>Infoset via the XmlCursor interface.
>
>XmlCursor provides functionality very similar to the DOM, but takes a very different 
>tack.  Instead of creating a DOM Node for each element, attribute, text, etc., one 
>may create a single XmlCursor and navigate that cursor about the XML instance, 
>interrogating the XML: element/attr names, child/parent elements, text, comments, 
>etc.  One may also modify the XML, for example by removing elements and attrs or 
>inserting text.  All of this can be done by either not creating objects or reusing 
>objects so that the number of objects needed to operate on the XML is constant, not 
>on the order of the size of the XML like a DOM would require.
>
>This kind of interface allows an implementer of an in-memory XML store more freedom 
>in choosing the internal structure which represents the XML in memory.  One could, 
>for example, simply store the XML as it was read in from disk and 
>implement a cursor as an index into that string, parsing or modifying the parts of 
>the string as necessary to satisfy the requests.  We don't go to quite this extreme.  
>In principle, we create one object for every leaf element or attribute and two 
>objects for every interior element.  All text for attribute values, comments, 
>processing instructions and text between element markup is stored in a single 
>character array.
>
>We have found that creating fewer objects and batching text leads to loading the XML 
>into memory faster as well as having a similar, if not slightly smaller, memory 
>footprint when compared to the DOM.  Also, working with cursors seems to be an easier 
>programming model than the DOM as it does not have text nodes and is more intuitive.
>
>With respect to the synchronized access, the strongly typed schema XMLBeans objects 
>cache values so that conversion to text does not occur until it is needed.  Likewise, 
>when modifications are made to the XML Infoset, the strongly typed data (ints, for 
>example) are not parsed from the text until requested.  In general the impact of 
>synchronization is quite low because of the lazy approach we have taken along with 
>the caching.  As I read your question again, I realize that you may have interpreted 
>synchronized to mean "managing data among several threads".  The synchronization 
>described refers to the fact that one may manipulate the XML via the XmlCursor or the 
>strongly typed XMLBean classes generated from the schema, each mechanism capable of 
>seeing the changes from the other in a tightly integrated way.
>
>With respect to building XMLBeans, we plan to remove any dependency upon the jars you 
>mentioned.  Indeed, there exists very little dependence on these.  Mostly just 
>interfaces, not any classes needed for the implementation.
>
>- Eric Vasilik
>
>  
>
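
The two ideas in the quoted message, batching all text into a single
character array and walking it with one reusable cursor, can be
sketched roughly as follows.  This is a toy illustration under my own
assumptions, not the XMLBeans store: the names FlatTextStore, Cursor,
toNext, textEquals and getText are hypothetical, and it holds only a
flat sequence of text values rather than a tree.

```java
public class FlatTextStore {
    private final char[] text;   // every value's characters, back to back
    private final int[] start;   // offset of value i within text
    private final int[] length;  // number of characters in value i

    public FlatTextStore(String... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        StringBuilder all = new StringBuilder();
        start = new int[values.length];
        length = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            start[i] = all.length();
            length[i] = values[i].length();
            all.append(values[i]);
        }
        text = all.toString().toCharArray();
    }

    public Cursor newCursor() {
        return new Cursor();
    }

    // A cursor is only an index into the store, so moving it allocates nothing.
    public final class Cursor {
        private int index = 0;

        // Advances to the next value; returns false when already at the last one.
        public boolean toNext() {
            if (index + 1 >= start.length) {
                return false;
            }
            index++;
            return true;
        }

        // Compares the current value in place, without building a String.
        public boolean textEquals(CharSequence expected) {
            if (expected.length() != length[index]) {
                return false;
            }
            for (int i = 0; i < expected.length(); i++) {
                if (text[start[index] + i] != expected.charAt(i)) {
                    return false;
                }
            }
            return true;
        }

        // Materializes the current value only when the caller asks for it.
        public String getText() {
            return new String(text, start[index], length[index]);
        }
    }
}
```

However many values the store holds, a traversal needs one Cursor, so
the number of objects created while reading stays constant.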



---------------------------------------------------------------------
In case of troubles, e-mail:     [EMAIL PROTECTED]
To unsubscribe, e-mail:          [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

