Yes, a thread-safe API for reads is all one would reasonably expect.  For writes 
to the DOM, sure, I would provide locking around the places where I do the 
writes; I wouldn't expect the library to handle that.  But reads...

"transaction locking for a group of related changes."

I'm not making any changes, just presenting a view over XML.  Can't you provide 
an extension that would lock the few methods that are read-only with respect to 
the DOM but still write to internal state?  Yes?
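
For what it's worth, the static-helper idea from the quoted reply below (Threadsafe.appendChild) could be stretched to cover reads as well. Here is a minimal sketch, assuming every caller, template code included, goes through the helper. The class name ThreadSafeDom, its method names, and the choice of the owner document as the lock are my own assumptions, not anything Xerces provides:

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Xerces): every access to a DOM tree goes
 * through one lock per owner document, including calls that look read-only.
 */
public final class ThreadSafeDom {

    private ThreadSafeDom() {
    }

    /** Picks the lock: the owner document, or the node itself when it has none. */
    private static Object lockFor(Node node) {
        Document owner = node.getOwnerDocument();
        return owner != null ? owner : node;
    }

    /** Copies the children into a private list while holding the lock. */
    public static List<Node> childNodes(Node parent) {
        synchronized (lockFor(parent)) {
            NodeList children = parent.getChildNodes();
            List<Node> copy = new ArrayList<>(children.getLength());
            for (int i = 0; i < children.getLength(); i++) {
                copy.add(children.item(i));
            }
            return copy;
        }
    }

    /** Write path, taking the same lock as the readers. */
    public static Node appendChild(Node parent, Node newChild) {
        synchronized (lockFor(parent)) {
            return parent.appendChild(newChild);
        }
    }
}
```

This only helps if nothing touches the nodes directly; a third-party library that calls getChildNodes() on its own bypasses the lock entirely, which is exactly the problem with doing it in the caller.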


From: gzun...@googlemail.com [mailto:gzun...@googlemail.com] On Behalf Of 
Alasdair Thomson
Sent: Tuesday, July 19, 2011 12:46 PM
To: j-users@xerces.apache.org
Subject: RE: DOM thread safety issues & disapearring children


I think the relevant complaint is that the DOM isn't thread safe for read-only 
operations, which is counter-intuitive unless you have knowledge of the 
underlying implementation. I don't think anyone expects it to be thread safe 
for updating.

I've tended to use JAXB to transform the XML to Java objects, which I can then 
make sure are thread safe for read-only operations.
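
A rough illustration of the same idea without JAXB: copy the DOM once, on a single thread, into immutable objects and share only those. This is a hand-rolled sketch rather than what JAXB generates; the ElementSnapshot class and its methods are hypothetical, and it deliberately drops text nodes to stay short:

```java
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical immutable copy of an element subtree; safe to share once built. */
public final class ElementSnapshot {
    private final String name;
    private final Map<String, String> attributes;
    private final List<ElementSnapshot> children;

    private ElementSnapshot(String name, Map<String, String> attributes,
                            List<ElementSnapshot> children) {
        this.name = name;
        this.attributes = Collections.unmodifiableMap(attributes);
        this.children = Collections.unmodifiableList(children);
    }

    /** Call from a single thread, before the DOM is visible to any other thread. */
    public static ElementSnapshot of(Element element) {
        Map<String, String> attrs = new LinkedHashMap<>();
        NamedNodeMap map = element.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            Node attr = map.item(i);
            attrs.put(attr.getNodeName(), attr.getNodeValue());
        }
        List<ElementSnapshot> kids = new ArrayList<>();
        // Walk siblings directly; only element children are copied.
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                kids.add(of((Element) n));
            }
        }
        return new ElementSnapshot(element.getNodeName(), attrs, kids);
    }

    public String name() { return name; }

    public Map<String, String> attributes() { return attributes; }

    public List<ElementSnapshot> children() { return children; }
}
```

Once the snapshot is built, the original Document can be discarded, so no reader ever touches the DOM's internal caches.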
On Jul 19, 2011 5:26 PM, <kesh...@us.ibm.com> wrote:
> In 99% of the use cases, locking the individual DOM objects/operations
> would be the wrong level of granularity -- what you really need to prevent
> unexpected results is transaction locking for a group of related changes.
> That really does have to be done at the application level.
>
> Locking every individual operation also can have significant performance
> impact, in these days of multicore/multiprocessor machines, due to the
> need to flush cache in order to make sure all the processors know the
> lock's state has changed. The days of "synchronize is free" really are
> over.
>
> Also, frankly, I would be reluctant to encourage people to rely on a
> protected DOM since if/when they change platforms their code will break
> unexpectedly.
>
> If you really want locks on every operation, you're free to build a
> "threadsafe DOM manipulation" library which provides threadsafety -- eg
> static Threadsafe.appendChild(Node parent, Node newChild). The code won't
> look exactly like a simple DOM call, but in most JVMs this kind of simple
> "tail call" is pretty efficient, it makes what you're doing explicit, and
> it's portable to any DOM you care to throw at it.
>
>
> ______________________________________
> "You build world of steel and stone
> I build worlds of words alone
> Skilled tradespeople, long years taught:
> You shape matter; I shape thought."
> (http://www.songworm.com/lyrics/songworm-parody/ShapesofShadow.html)
>
>
>
> From: "Newman, John W" <newma...@d3onc.com>
> To: "j-users@xerces.apache.org" <j-users@xerces.apache.org>
> Date: 07/19/2011 11:49 AM
> Subject: RE: DOM thread safety issues & disapearring children
>
>
>
> I just wanted to follow up and say that switching off deferred parsing did
> not add any stability. Still the same issues, same steps to reproduce
> every time. And yes, the setting is actually taking effect: my elements did
> change from DeferredElementImpl to ElementImpl. So no dice there.
>
> Also, setting the node to read-only didn't work either, but I didn't really
> expect it to, since that guards against modifying the DOM, not against
> clobbering the underlying unsafe state.
>
> Let me ask you this then .. since clearly there is an industry need for a
> thread safe document (like the one person said this comes up ALL the
> time), and you are pretty clear that the default implementation will
> always remain unsafe - "it's up to the caller to add the locks", why
> can't you provide a simple extension of the current implementation that
> properly takes care of the syncing at a higher level but in-between the
> caller and the unsafe impl? Effectively do what I'm scrambling to do
> correctly and offload this burden...
>
> class ThreadSafeDeferredElementImpl extends DeferredElementImpl {
>     @Override
>     public Object iAmNotTotallySureWhatMethodsNeedSynced() {
>         // lock around whatever internal state the read touches
>         synchronized (this.something) {
>             return this.something.whatever();
>         }
>     }
> }
>
> Isn't that easy to do, and a win-win? You can maintain your "it's not
> thread safe" stance, and users that don't require thread safety will still
> have the good performance, but users like us, who would otherwise have to
> ditch your library, can have the syncing properly taken care of and out of
> our code. Why not just do that?
>
> Thanks,
> John
>
>
>
> From: Michael Glavassevich [mailto:mrgla...@ca.ibm.com]
> Sent: Wednesday, June 08, 2011 12:06 AM
> To: j-users@xerces.apache.org
> Subject: Re: DOM thread safety issues & disapearring children
>
> Hi John,
>
> None of Xerces' DOM implementations are thread-safe, even the non-deferred
> ones. This is true even for read operations. In particular, the
> implementations of the NodeList methods (i.e. item() and getLength()) are
> not thread-safe. These methods do some internal writes to a cache which
> are necessary for good performance. There's a longer explanation in the
> JIRA issue you found.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrgla...@ca.ibm.com
> E-mail: mrgla...@apache.org
>
> "Newman, John W" <newma...@d3onc.com> wrote on 06/07/2011 04:17:22 PM:
>
>> All,
>>
>> My team has built a web application that has a few components that
>> rely heavily on the xerces DOM implementation. Unfortunately a lot
>> of this was developed before we even learned that the node
>> implementations are deliberately not thread safe. =) I've added a
>> few sync blocks where appropriate (I think), and everything is
>> functioning ok except for two remaining issues.
>>
>> 1) Under very high load, an element will somehow lose nearly
>> all of its children.
>> <root>
>> <ch0 />
>> <ch0 />
>> <ch0 />
>> <ch1 />
>> <ch1 />
>> <ch2><ch2.1 /></ch2>
>> <ch3><ch3.1><ch3.2 /></ch3.1></ch3>
>> .... Rather large document, many levels of nesting
>> </root>
>>
>> That will sit there and work fine for a few days, until something
>> (?) happens and most of the children will disappear. I cannot
>> isolate this problem at all, it has been very difficult to track
>> anything down so I'm asking for help. There are no exceptions in
>> the log or anything otherwise to indicate that something bad is
>> happening. One day I saw
>>
>> <root>
>> <ch0 />
>> <ch0 />
>> </root>
>>
>> And then a few days later
>>
>> <root>
>> <ch0 />
>> <ch0 />
>> <ch0 />
>> <ch1 />
>> </root>
>>
>> The fact that there doesn't seem to be any pattern to which children
>> stay vs. which disappear, and only under higher load has me
>> suspecting thread safety. In general it seems like the smaller
>> elements at the top are more likely to hang around, but again
>> there's no real pattern. We are not doing any modification on these
>> nodes, in fact I want to make them read only. I'm debating on
>> casting the org.w3c.dom.Node to org.apache.xerces.dom.NodeImpl and
>> calling setReadOnly(true, true) on it to freeze it - but the
>> javadoc says I probably shouldn't need that method? If I did that,
>> I'd at least get a stack trace when whatever it is decides to modify
>> it. Does that sound like a good approach? Is there anything
>> obvious that would cause this problem, e.g. has anyone run into this
>> before? Am I missing a sync? I'm about stumped.
>>
>> 2) Also under high load, I occasionally get this stack trace
>> (this is not the cause of or symptom of item 1, it is a separate
>> issue occurring at separate times)
>>
>> java.lang.NullPointerException:
>> (no message)
>> at org.apache.xerces.dom.ParentNode.nodeListItem(Unknown Source)
>> at org.apache.xerces.dom.ParentNode.item(Unknown Source)
>> at freemarker.ext.dom.NodeListModel.<init>(NodeListModel.java:89)
>> at freemarker.ext.dom.NodeModel.getChildNodes(NodeModel.java:302)
>> at freemarker.ext.dom.ElementModel.get(ElementModel.java:124)
>> at freemarker.core.Dot._getAsTemplateModel(Dot.java:76)
>>
>> Again I'm suspecting thread safety and a missing sync. Just
>> refreshing the page works ok. I raised the issue with freemarker
>> since it's only their stack frames calling the DOM, so I figured the
>> burden falls on them to sync. But they passed the buck back to me
>> and said 'we do not guarantee thread safety if your data model is
>> not thread safe to begin with.' They're not going to go and add
>> sync blocks all over their code due to an implementation artifact of
>> your library, and I would agree with that. Really the lack of
>> thread safety even for reads is a pretty poor fit for a web
>> application... How do I fix this problem, in general is there a way
>> to make this library more thread safe? The best suggestion I have
>> for that stack trace so far is to use CGLib to proxy the element and
>> inject sync blocks where they should be. Ugh...
>> https://issues.apache.org/jira/browse/XERCESJ-727 is relevant here.
>>
>> What about calling
>> documentBuilderFactory.setFeature("http://apache.org/xml/features/dom/defer-node-expansion", false);
>> to turn off lazy parsing? Does that guarantee thread safety since
>> everything is already parsed into ram and it's just read only?
>>
>>
>> Any input is very much appreciated, these issues are affecting
>> production.
>>
>> Thanks,
>> John
>
