Hi Michael,

I'm trying to follow your approach by creating an InputStream wrapper class that removes the doctype declaration. As far as I can tell, that means:

- write a custom EntityResolver that resolves file entities

- write a doctype-removing reader that creates a FileInputStream for the resolved file entity, analyzes the header (BOM, XML declaration) to set up an InputStreamReader with the correct encoding, and then skips a doctype declaration if one follows.
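The header analysis in the second step could be sketched roughly as follows. This is only a minimal illustration, not what Xerces does internally: the class name `EncodingSniffer` and the method `sniff` are hypothetical, only UTF-8 and UTF-16 BOMs are recognized (a UTF-32 BOM would be misread), and the fallback that reads the declaration as ISO-8859-1 assumes an ASCII-compatible encoding. The caller would still have to skip the BOM itself before handing the reader on.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: guesses the encoding of an XML entity from its first bytes.
final class EncodingSniffer {
    // Matches the encoding pseudo-attribute of an XML or text declaration.
    private static final Pattern DECL = Pattern.compile(
            "^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the charset indicated by a BOM or the declaration; defaults to UTF-8. */
    public static Charset sniff(byte[] head) {
        int b0 = head.length > 0 ? head[0] & 0xFF : -1;
        int b1 = head.length > 1 ? head[1] & 0xFF : -1;
        int b2 = head.length > 2 ? head[2] & 0xFF : -1;
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return StandardCharsets.UTF_8;
        if (b0 == 0xFE && b1 == 0xFF) return StandardCharsets.UTF_16BE;
        if (b0 == 0xFF && b1 == 0xFE) return StandardCharsets.UTF_16LE;
        // No BOM: decode the header bytes one-to-one and look for encoding="...".
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = DECL.matcher(text);
        if (m.find()) {
            // May throw UnsupportedCharsetException for names the JVM does not know.
            return Charset.forName(m.group(1));
        }
        return StandardCharsets.UTF_8; // XML default when nothing else is declared
    }

    public static void main(String[] args) {
        byte[] head = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(sniff(head));
    }
}
```

The result of `sniff` would then be passed to `new InputStreamReader(stream, charset)`. Duplicating this logic is exactly the parser functionality the mail is worried about, so the sketch is deliberately small.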

Is that correct? To avoid using XNI, as described in my last mail, I would have to duplicate parser functionality, hmmm.

Or is there a less invasive way to hook into the parser?

Thank you for your help!

Wulf



Michael Glavassevich wrote:
Hi Wulf,

Wulf Berschin <bersc...@dosco.de> wrote on 09/16/2009 02:48:52 AM:

 > Hi,
 >
 > for ease of editing we have a doctype declaration in each (file)
 > fragment. When I parse the full master (with resolving fragments) Xerces
 > throws a fatal error (Doctype not allowed in content) and goes into an
 > endless loop when the continue-after-fatal-error switch is set.
 >
 > How can I make Xerces ignore doctype declarations occurring in content
 > (alt. in the header of file entities)?

You can't. Xerces (or any conformant XML parser for that matter) will not ignore or skip over any malformed / misplaced constructs in the document. Parsers are required to report the fatal error. The "continue-after-fatal-error" feature which allows Xerces to keep going is unreliable and can lead to a catastrophic failure (e.g. NPE, infinite loop, stack overflow, out of memory, etc...) if you turn it on. It's to be used with extreme caution and should never be enabled in a finished component / product.

You either need to remove these DOCTYPEs from the files or filter them out at a lower level (e.g. a wrapper InputStream which doesn't return the DOCTYPE from read()).
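As an illustration of the filtering idea, here is a minimal sketch that removes a leading DOCTYPE from already decoded text. It is an assumption-laden example rather than a streaming InputStream wrapper: the class `DoctypeStripper` and its methods are hypothetical names, the whole entity is held in a String, and comments or processing instructions inside an internal subset that contain quotes or brackets are not handled.

```java
// Hypothetical helper: strips a DOCTYPE declaration from the prolog of an XML text.
final class DoctypeStripper {

    /** Returns xml without its DOCTYPE, or xml unchanged if the prolog has none. */
    public static String stripDoctype(String xml) {
        int i = 0;
        int n = xml.length();
        if (n > 0 && xml.charAt(0) == '\uFEFF') i++;       // skip a decoded BOM
        while (i < n) {
            if (Character.isWhitespace(xml.charAt(i))) {
                i++;
            } else if (xml.startsWith("<?", i)) {           // XML/text declaration or PI
                int end = xml.indexOf("?>", i + 2);
                if (end < 0) return xml;
                i = end + 2;
            } else if (xml.startsWith("<!--", i)) {         // comment in the prolog
                int end = xml.indexOf("-->", i + 4);
                if (end < 0) return xml;
                i = end + 3;
            } else if (xml.startsWith("<!DOCTYPE", i)) {
                int end = findDoctypeEnd(xml, i + "<!DOCTYPE".length());
                return end < 0 ? xml : xml.substring(0, i) + xml.substring(end);
            } else {
                return xml;                                 // content starts: nothing to strip
            }
        }
        return xml;
    }

    /** Returns the index just past the '>' closing the DOCTYPE, or -1 if unterminated. */
    static int findDoctypeEnd(String xml, int from) {
        char quote = 0;   // current quote character, 0 when outside a literal
        int depth = 0;    // greater than 0 while inside the [ ... ] internal subset
        for (int i = from; i < xml.length(); i++) {
            char c = xml.charAt(i);
            if (quote != 0) {
                if (c == quote) quote = 0;
            } else if (c == '"' || c == '\'') {
                quote = c;
            } else if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
            } else if (c == '>' && depth <= 0) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String fragment = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE chapter SYSTEM \"book.dtd\">\n<chapter/>";
        System.out.println(stripDoctype(fragment));
    }
}
```

The stripped text could then be handed to the parser through `new InputSource(new StringReader(text))`, for example from an EntityResolver. Only the prolog is scanned, so the string `<!DOCTYPE` appearing later in content is left alone.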

 > Wulf
 >
 > ---------------------------------------------------------------------
 > To unsubscribe, e-mail: j-users-unsubscr...@xerces.apache.org
 > For additional commands, e-mail: j-users-h...@xerces.apache.org

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org



