Hi Michael,
I'm trying to follow your approach and create an InputStream wrapper
class which removes the doctype declaration. AFAIK that means:
- write my own EntityResolver which resolves file entities
- write a doctype-removing reader which creates a FileInputStream for
the resolved file entity, analyzes the header (BOM, XML declaration) to
set up an InputStreamReader with the correct encoding, and then skips
a doctype declaration if one follows.
Is that correct? To avoid using XNI as described in my last mail, I
would have to duplicate parser functionality, hmmm.
Or is there a less invasive way to hook into the parser?
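For reference, the resolver half of this plan could be sketched roughly as
follows. It uses only the standard SAX EntityResolver interface. The class
name FilteringEntityResolver and the byte-array filter hook are hypothetical
names chosen for illustration; the filter is where the doctype removal
would be plugged in. Anything that is not a file: URL is left to the
parser's default resolution.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.function.UnaryOperator;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver: loads file entities and passes their bytes through a filter. */
class FilteringEntityResolver implements EntityResolver {

    // Assumed hook: receives the raw entity bytes, returns the bytes the parser should see.
    private final UnaryOperator<byte[]> filter;

    FilteringEntityResolver(UnaryOperator<byte[]> filter) {
        this.filter = filter;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        // Returning null tells the parser to resolve the entity itself.
        if (systemId == null || !systemId.startsWith("file:")) {
            return null;
        }
        // Buffer the whole entity so the filter can work on a byte array.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (InputStream in = new URL(systemId).openStream()) {
            byte[] chunk = new byte[8192];
            for (int n; (n = in.read(chunk)) != -1; ) {
                buf.write(chunk, 0, n);
            }
        }
        InputSource source = new InputSource(new ByteArrayInputStream(filter.apply(buf.toByteArray())));
        // Keep the identifiers so relative references inside the entity still resolve.
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }
}
```

Such a resolver would be registered through setEntityResolver on the
XMLReader or DocumentBuilder. Because it hands the parser a byte stream
rather than a Reader, the parser still performs its own encoding detection
on whatever the filter returns.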
Thank you for your help!
Wulf
Michael Glavassevich wrote:
Hi Wulf,
Wulf Berschin <bersc...@dosco.de> wrote on 09/16/2009 02:48:52 AM:
> Hi,
>
> for ease of editing we have a doctype declaration in each (file)
> fragment. When I parse the full master (with resolving fragments) Xerces
> throws a fatal error (Doctype not allowed in content) and goes into an
> endless loop when the continue-after-fatal-error switch is set.
>
> How can I make Xerces ignore doctype declarations occurring in content
> (or alternatively in the header of file entities)?
You can't. Xerces (or any conformant XML parser for that matter) will
not ignore or skip over any malformed / misplaced constructs in the
document. Parsers are required to report the fatal error. The
"continue-after-fatal-error" feature which allows Xerces to keep going
is unreliable and can lead to a catastrophic failure (e.g. NPE, infinite
loop, stack overflow, out of memory, etc...) if you turn it on. It's to
be used with extreme caution and should never be enabled in a finished
component / product.
You either need to remove these DOCTYPEs from the files or filter them
out at a lower level (e.g. a wrapper InputStream which doesn't return
the DOCTYPE from read()).
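A minimal sketch of such a lower-level filter follows, under some loudly
stated assumptions: the class DoctypeStripper is a hypothetical helper (not
a Xerces API), each fragment is small enough to buffer in memory, and the
encoding is ASCII-compatible (UTF-8, ISO-8859-x and similar, not UTF-16).
It only looks for a DOCTYPE in the prolog, i.e. before the first element,
and leaves any XML or text declaration untouched so that the parser can
still detect the encoding itself. Processing instructions inside an
internal subset are not handled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of Xerces. Assumes an ASCII-compatible encoding. */
class DoctypeStripper {

    /** Buffers the whole stream and returns a new stream without the prolog DOCTYPE. */
    static InputStream wrap(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        for (int n; (n = in.read(chunk)) != -1; ) buf.write(chunk, 0, n);
        return new ByteArrayInputStream(strip(buf.toByteArray()));
    }

    /** Returns the input without its prolog DOCTYPE, or the same array if there is none. */
    static byte[] strip(byte[] in) {
        int i = 0;
        while (i < in.length) {
            if (in[i] != '<') {
                i++;                                   // BOM bytes or whitespace
            } else if (startsWith(in, i, "<?")) {
                i = indexAfter(in, i + 2, "?>");       // XML/text declaration or PI
            } else if (startsWith(in, i, "<!--")) {
                i = indexAfter(in, i + 4, "-->");      // comment
            } else if (startsWith(in, i, "<!DOCTYPE")) {
                int end = endOfDoctype(in, i);
                if (end < 0) return in;                // unterminated: let the parser report it
                byte[] out = new byte[in.length - (end - i)];
                System.arraycopy(in, 0, out, 0, i);
                System.arraycopy(in, end, out, i, in.length - end);
                return out;
            } else {
                return in;                             // first element reached, no DOCTYPE
            }
        }
        return in;
    }

    /** Index just past the '>' that closes the DOCTYPE starting at start, or -1. */
    private static int endOfDoctype(byte[] b, int start) {
        int quote = 0;              // current quote character, 0 when outside a literal
        boolean inSubset = false;   // true between '[' and ']' of the internal subset
        for (int i = start + 9; i < b.length; i++) {
            int c = b[i];
            if (quote != 0) {
                if (c == quote) quote = 0;
            } else if (c == '"' || c == '\'') {
                quote = c;
            } else if (inSubset && startsWith(b, i, "<!--")) {
                i = indexAfter(b, i + 4, "-->") - 1;   // skip comments, they may contain quotes
            } else if (c == '[') {
                inSubset = true;
            } else if (c == ']') {
                inSubset = false;
            } else if (c == '>' && !inSubset) {
                return i + 1;
            }
        }
        return -1;
    }

    private static boolean startsWith(byte[] b, int at, String s) {
        if (at + s.length() > b.length) return false;
        for (int k = 0; k < s.length(); k++) {
            if (b[at + k] != (byte) s.charAt(k)) return false;
        }
        return true;
    }

    /** Index just past the next occurrence of s at or after from, or b.length if absent. */
    private static int indexAfter(byte[] b, int from, String s) {
        for (int i = from; i + s.length() <= b.length; i++) {
            if (startsWith(b, i, s)) return i + s.length();
        }
        return b.length;
    }
}
```

Working on bytes rather than characters is a deliberate simplification:
all DOCTYPE delimiters are ASCII, so for ASCII-compatible encodings no
decoding is needed before the parser sees the stream. For example,
DoctypeStripper.wrap(new FileInputStream(file)) could be handed to an
InputSource in place of the raw file stream.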
> Wulf
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-users-unsubscr...@xerces.apache.org
> For additional commands, e-mail: j-users-h...@xerces.apache.org
Thanks.
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org