Hi all,

[ please CC me on replies as I'm not following j-users ]

I'm hitting what might be a Xerces bug while parsing and validating a rather small XML file that contains a CDATA section about 2000 bytes long. The schema for this element is defined like this:

<element name="shell">
  <complexType>
    <simpleContent>
      <extension base="cr:nonEmptyString">
        <attribute name="cmd" type="cr:nonEmptyString" use="required"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

where cr:nonEmptyString is

<simpleType name="nonEmptyString">
  <annotation>
    <documentation>String with min length = 1</documentation>
  </annotation>
  <restriction base="string">
    <minLength value="1"/>
    <pattern value="(.|\n)*\S(.|\n)*"/>
  </restriction>
</simpleType>

and the fragment that causes the problem is:

<shell cmd="/bin/ksh"><![CDATA[
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
]]></shell>

When Xerces dies with a StackOverflowError, the stack trace shows RegularExpression.matchString calling itself repeatedly until the stack overflows:

...
org.apache.xerces.impl.xpath.regex.RegularExpression.matchString
org.apache.xerces.impl.xpath.regex.RegularExpression.matchString
org.apache.xerces.impl.xpath.regex.RegularExpression.matchString
org.apache.xerces.impl.xpath.regex.RegularExpression.matchString
org.apache.xerces.impl.xpath.regex.RegularExpression.matchString
org.apache.xerces.impl.xpath.regex.RegularExpression.matchString
org.apache.xerces.impl.xpath.regex.RegularExpression.matchString
org.apache.xerces.impl.xpath.regex.RegularExpression.matchString
org.apache.xerces.impl.xpath.regex.RegularExpression.matches
org.apache.xerces.impl.xpath.regex.RegularExpression.matches
org.apache.xerces.impl.dv.xs.XSSimpleTypeDecl.getActualValue
org.apache.xerces.impl.dv.xs.XSSimpleTypeDecl.validate
org.apache.xerces.impl.xs.XMLSchemaValidator.elementLocallyValidComplexType
org.apache.xerces.impl.xs.XMLSchemaValidator.elementLocallyValidType
org.apache.xerces.impl.xs.XMLSchemaValidator.processElementContent
org.apache.xerces.impl.xs.XMLSchemaValidator.handleEndElement
org.apache.xerces.impl.xs.XMLSchemaValidator.endElement
org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument
org.apache.xerces.parsers.XML11Configuration.parse
org.apache.xerces.parsers.XML11Configuration.parse
org.apache.xerces.parsers.XMLParser.parse
org.apache.xerces.parsers.DOMParser.parse
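
A small harness along the following lines could be used to reproduce the failure outside the application. It is only a sketch under several assumptions: it validates through the JAXP API (javax.xml.validation, Java 6 or later) instead of calling DOMParser directly, the namespace urn:example:cr is a placeholder because the real target namespace is not shown above, and the class and method names are made up for illustration. Whether the 40-line case actually overflows depends on the parser version and on the thread stack size.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

class ShellCdataRepro {
    // Placeholder namespace (assumption); the real one is not shown in the post.
    static final String NS = "urn:example:cr";

    // The element and type declarations from the post, wrapped in a <schema> root.
    static final String XSD =
        "<schema xmlns='http://www.w3.org/2001/XMLSchema' xmlns:cr='" + NS + "'"
      + " targetNamespace='" + NS + "' elementFormDefault='qualified'>"
      + "<simpleType name='nonEmptyString'><restriction base='string'>"
      + "<minLength value='1'/><pattern value='(.|\\n)*\\S(.|\\n)*'/>"
      + "</restriction></simpleType>"
      + "<element name='shell'><complexType><simpleContent>"
      + "<extension base='cr:nonEmptyString'>"
      + "<attribute name='cmd' type='cr:nonEmptyString' use='required'/>"
      + "</extension></simpleContent></complexType></element>"
      + "</schema>";

    // Builds a <shell> element whose CDATA holds `lines` lines of 50 characters each.
    static String buildDocument(int lines) {
        StringBuilder sb = new StringBuilder();
        sb.append("<shell xmlns='").append(NS).append("' cmd='/bin/ksh'><![CDATA[\n");
        for (int i = 0; i < lines; i++) {
            for (int j = 0; j < 50; j++) {
                sb.append('a');
            }
            sb.append('\n');
        }
        return sb.append("]]></shell>").toString();
    }

    // Validates a generated document and reports the outcome as a string.
    static String validate(int lines) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator()
                    .validate(new StreamSource(new StringReader(buildDocument(lines))));
            return "valid";
        } catch (StackOverflowError e) {
            return "StackOverflowError";
        } catch (Exception e) {
            return "invalid: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 line:   " + validate(1));
        // 40 lines of 50 characters plus newlines is roughly the 2000 bytes from the post.
        System.out.println("40 lines: " + validate(40));
    }
}
```

Changing the argument to validate() makes it easy to look for the length at which the behaviour changes.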

To work around the problem, I tried simplifying the cr:nonEmptyString type by removing all of its restrictions (the regex pattern and minLength), but I still get the same stack trace.

The problem goes away (or at least the StackOverflowError no longer occurs) when I shorten the CDATA section to under 2000 bytes. The 2000-byte threshold is probably related to the thread stack size (-Xss) in effect when the JVM is started...
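
If the recursion depth really does grow with the length of the text, one workaround worth trying is to run the parse on a thread that requests a larger stack, which is similar to raising -Xss for that thread only. The sketch below is an assumption-laden illustration, not Xerces code: BigStackRunner, callWithStack and depth are hypothetical names, depth() is just a stand-in for a deeply recursive matcher, and the stackSize argument of the Thread constructor is a hint that the JVM is allowed to ignore on some platforms.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

class BigStackRunner {
    // Runs the task on a new thread that requests stackBytes of stack and returns its result.
    static <T> T callWithStack(Callable<T> task, long stackBytes) {
        FutureTask<T> future = new FutureTask<T>(task);
        Thread worker = new Thread(null, future, "big-stack-worker", stackBytes);
        worker.start();
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        } catch (ExecutionException e) {
            // The cause may still be a StackOverflowError if the stack was too small.
            throw new IllegalStateException("task failed on worker thread", e.getCause());
        }
    }

    // Stand-in for a recursive matcher: one stack frame per unit of n.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    public static void main(String[] args) {
        // In real use the Callable would wrap the parser.parse(...) call instead.
        Integer d = callWithStack(new Callable<Integer>() {
            public Integer call() {
                return depth(100000);
            }
        }, 256L * 1024 * 1024);
        System.out.println("recursion depth reached: " + d);
    }
}
```

This only moves the limit further out; a long enough CDATA section would presumably still overflow.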

Is this a known bug? Do you have any hints how to solve it?

thanx for your time,
Martin

P.S. I'm using Xerces 2.9.1 and Sun JDK 5.

--
There are no bad questions, only bad answers...
