DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25767>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25767 Performance & threading issues in JAXPUtils' use of SAXParser & SAXParserFactory Summary: Performance & threading issues in JAXPUtils' use of SAXParser & SAXParserFactory Product: Ant Version: 1.6.0 Platform: PC OS/Version: Linux Status: NEW Severity: Minor Priority: Other Component: Core AssignedTo: [EMAIL PROTECTED] ReportedBy: [EMAIL PROTECTED] Comments on when JAXPUtils creates new factory / parser / document builder objects and when it caches them. 1. Correctness: private static SAXParserFactory parserFactory = null; public static synchronized SAXParserFactory getParserFactory() { if (parserFactory == null) { parserFactory = newParserFactory(); } return parserFactory; } (and similar methods) is likely not correct. An instance of SAXParserFactory is not thread-safe and might not be reentrant, according to its Javadoc. If more than one thread in the JVM is calling this method and using the same parser, random corruption might result - this might happen to someone using <parallel>, for example; or someone running several builds at once using an IDE integration. Or if this method is called in the middle of a parse for some reason - to parse some referenced file? - there might be corruption; at least SAX1's Parser says it is not reentrant. 2. Performance: casual examination of the performance of running a tiny build script w/ several antlib.xml's over and over using a profiler shows some 20-30% of the CPU time being spent in JAXPUtils.getNamespaceXMLReader(), specifically calling Xerces' SAXParserImpl.<init>(). Not an especially cheap call since it sets up a lot of objects of various kinds; you are supposed to cache the result from call to call if you want good performance and might be parsing a lot of files. (E.g. a lot of small antlib.xml's, or many little build.xml's with <subant>, etc.) Note that this time is not spent actually parsing the XML or even opening the input stream to it - just getting a parser object. Suggested fix to both problems: 1. Rework the helper methods in JAXPUtils to cache not just the factory objects but the parsers & document builders themselves. I don't know how expensive getting the factory is - JAXPUtils already caches them, and my profiling only took place on a "warmed-up" JVM - but invoking the factories' new*() methods is too expensive to do every time an XML file is parsed. 2. Leave uncached impls as e.g. SAXParser createSAXParser() and so on - these would always create new ones. Or for compatibility, keep the current method names, but deprecate them in favor of properly cached versions (see #3). 3. For cached usage, use e.g. SAXParser obtainSAXParser() and void releaseSAXParser(SAXParser) methods. These should use java.lang.ThreadLocal: private static final ThreadLocal/*<SAXParser>*/ saxParser = new ThreadLocal(); public static SAXParser obtainSAXParser() { SAXParser p = (SAXParser)saxParser.get(); if (p != null) { saxParser.set(null); } else { p = createSAXParser(); } return p; } public static void releaseSAXParser(SAXParser p) { saxParser.set(p); } // Usage: SAXParser p = obtainSAXParser(); try { // parse... } finally { releaseSAXParser(p); } Forgetting to release a parser is not a big deal, it will just need to be recreated on the next obtain*() call. Using ThreadLocal ensures that different threads never share the same object. Setting the ref to null in obtain*() ensures that reentrant calls (i.e. calling obtain*() while the last-obtained parser is still in use) will be safe. One issue is that several utility methods do not return the original cacheable object, but rather something derived from it, e.g. return newSAXParser(getNSParserFactory()).getXMLReader(); For this usage you should probably have obtainNamespaceXMLReader + releaseNamespaceXMLReader, etc. - i.e. keep a separate cache for each kind of object - since you can't get back to the SAXParser from the XMLReader. Could try to provide a patch if there is interest - at least modifications to JAXPUtils, and usage of the new methods in e.g. ProjectHelper2.parse(). --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]