Hi Wulf,

I didn't say it would be easy, just that you're on shaky ground if your solution involves hooking into or extending Xerces' internals.
There might be other ways to deal with this, for example using XInclude instead of entity references and/or removing the DOCTYPEs from the files and programmatically inserting them when appropriate through EntityResolver2.getExternalSubset() [1], though I don't know much about your scenario and how much flexibility you have with changing the data.

Thanks.

[1] http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ext/EntityResolver2.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

Wulf Berschin <bersc...@dosco.de> wrote on 10/01/2009 07:26:41 AM:

> Hi Michael,
>
> I'm trying to follow your approach and create an InputStream
> wrapper class which removes the doctype declaration. Afaik that means:
>
> - write a custom EntityResolver which resolves file entities
>
> - write a doctype-removing reader which creates a FileInputStream for
> the resolved file entity, analyzes the header (BOM, XML-PI) to set
> up an InputStreamReader with the correct encoding, and then skips a
> doctype declaration if one follows.
>
> Is that correct? For the sake of not using XNI as described in my last
> mail I would have to duplicate parser functionality, hmmm.
>
> Or is there a less invasive way to hook into the parser?
>
> Thank you for your help!
>
> Wulf
>
> Michael Glavassevich wrote:
> > Hi Wulf,
> >
> > Wulf Berschin <bersc...@dosco.de> wrote on 09/16/2009 02:48:52 AM:
> >
> > > Hi,
> > >
> > > for ease of editing we have a doctype declaration in each (file)
> > > fragment. When I parse the full master (with resolving fragments) Xerces
> > > throws a fatal error (Doctype not allowed in content) and goes into an
> > > endless loop when the continue-after-fatal-error switch is set.
> > >
> > > How can I make Xerces ignore doctype declarations occurring in content
> > > (or, alternatively, in the header of file entities)?
> >
> > You can't. Xerces (or any conformant XML parser for that matter) will
> > not ignore or skip over any malformed / misplaced constructs in the
> > document. Parsers are required to report the fatal error. The
> > "continue-after-fatal-error" feature which allows Xerces to keep going
> > is unreliable and can lead to a catastrophic failure (e.g. NPE, infinite
> > loop, stack overflow, out of memory, etc...) if you turn it on. It's to
> > be used with extreme caution and should never be enabled in a finished
> > component / product.
> >
> > You either need to remove these DOCTYPEs from the files or filter them
> > out at a lower level (e.g. a wrapper InputStream which doesn't return
> > the DOCTYPE from read()).
> >
> > > Wulf
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: j-users-unsubscr...@xerces.apache.org
> > > For additional commands, e-mail: j-users-h...@xerces.apache.org
> >
> > Thanks.
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: mrgla...@ca.ibm.com
> > E-mail: mrgla...@apache.org
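
For illustration only, here is a minimal sketch of the EntityResolver2.getExternalSubset() route mentioned at the top of this message. It assumes the DOCTYPEs have already been removed from the files, and that the parser obtained through JAXP honours EntityResolver2 (Xerces does by default, via its use-entity-resolver2 feature). The class name, the root element name "doc", the entity "greeting" and the system ID are all hypothetical and not taken from the scenario discussed in this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical sketch: supplies a DTD for documents that carry no DOCTYPE.
class ExternalSubsetSketch extends DefaultHandler2 {

    // Hypothetical DTD content; a real application would load its own DTD.
    static final String DTD = "<!ENTITY greeting \"hello\">";

    private final StringBuilder text = new StringBuilder();

    // Consulted by the parser when the document declares no external subset.
    // 'name' is the root element name; returning null means "no DTD".
    @Override
    public InputSource getExternalSubset(String name, String baseURI)
            throws SAXException, IOException {
        if (!"doc".equals(name)) {
            return null;
        }
        InputSource dtd = new InputSource(new StringReader(DTD));
        dtd.setSystemId("file:///virtual/doc.dtd"); // hypothetical identifier
        return dtd;
    }

    // Collects character data so the effect of the DTD can be inspected.
    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    // Parses a DOCTYPE-free document and returns its text content.
    public static String parseText(String xml) throws Exception {
        ExternalSubsetSketch handler = new ExternalSubsetSketch();
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setEntityResolver(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        // The document has no DOCTYPE; the entity comes from the supplied DTD.
        System.out.println(parseText("<doc>&greeting;</doc>"));
    }
}
```

The resolver is only asked for an external subset when the document does not declare one itself, so this does not help while the fragments still contain their own DOCTYPE declarations. Keying the decision on the root element name is just one possible policy.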