Your assumptions are correct. The nbsp entity is not declared in the DTD. I get

javax.xml.transform.TransformerException: org.xml.sax.SAXParseException: The entity "nbsp" was referenced, but not declared. at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:502)

I am using Xerces 2.9.1 and Xalan 2.7.1.

When I do

XMLReader r = XMLReaderFactory.createXMLReader();
URL u = r.getClass().getClassLoader().getResource(r.getClass().getName().replace('.', '/') + ".class");
System.out.println(u);

I get


jar:file:/home/dave/dev/tomailer/war/WEB-INF/lib/xercesImpl.jar!/org/apache/xerces/parsers/SAXParser.class

which is the jar I'm expecting to get the parser from.

The &nbsp; is in PCDATA.

Do I need to specify that the parser is validating in order to get this to work, perhaps?

thanks again.
dave



Michael Glavassevich wrote:

"dbros...@mebigfatguy.com" <dbros...@mebigfatguy.com> wrote on 11/08/2009 08:21:21 PM:

> Thanks for the response. If I understand you correctly, I should do this:
>
>             XMLReader r = XMLReaderFactory.createXMLReader();
>             MyXMLFilter filter = new MyXMLFilter();
>             filter.setParent(r);
>
>             DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
>             DocumentBuilder db = dbf.newDocumentBuilder();
>             Document d = db.newDocument();
>
>             TransformerFactory tf = TransformerFactory.newInstance();
>             Transformer t = tf.newTransformer();
>
>             t.transform(new SAXSource(filter, new InputSource(srcHtml)), new DOMResult(d));

Looks about right.
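
For illustration, a filter of the kind discussed here might look roughly like the sketch below. The class name EntityReplacingFilter and the nbsp to U+00A0 mapping are assumptions made up for this example, not the poster's actual MyXMLFilter. It can only help when the parser reports the reference through skippedEntity() rather than raising a fatal error.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

import java.util.HashMap;
import java.util.Map;

// Hypothetical filter: turns skipped entity references into literal text.
class EntityReplacingFilter extends XMLFilterImpl {

    // Assumed mapping; add whatever entity names the input really uses.
    private final Map<String, String> replacements = new HashMap<String, String>();

    EntityReplacingFilter() {
        replacements.put("nbsp", "\u00A0"); // no-break space
    }

    @Override
    public void skippedEntity(String name) throws SAXException {
        String text = replacements.get(name);
        if (text != null) {
            // Forward the replacement downstream as ordinary character data.
            char[] chars = text.toCharArray();
            characters(chars, 0, chars.length);
        } else {
            // Unknown entity name: pass the event on unchanged.
            super.skippedEntity(name);
        }
    }
}
```

It would be wired in the same way as the filter in the quoted snippet: call setParent(r) on it and hand it to the SAXSource.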

> and I should expect the
>
> skippedEntity(String entity)
>
> method of my filter to get called when a &nbsp; is found.
>
> However this method is never called for me. Am I missing something?

Is "nbsp" declared in your DTD? You said your goal was to "resolve unknown entities" so I'm assuming it's not declared.

Is this entity reference part of an attribute value? If it is, you're out of luck with SAX: skippedEntity() and the entity methods on LexicalHandler are never called for attributes.

Also, can you double-check that you're actually using a recent release of Xerces-J and not its JDK fork, NekoHTML, or something else?
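
One rough way to run that check is sketched below. The class ParserCheck and its method names are made up for this example. It relies on two things: the JDK's bundled fork lives under the com.sun.org.apache.xerces.internal package, and Apache Xerces-J provides org.apache.xerces.impl.Version.getVersion(). The version class is looked up reflectively so the sketch still compiles and runs when Xerces-J is not on the classpath.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical helper for checking which parser implementation is in use.
class ParserCheck {

    // The JDK's internal fork of Xerces uses this package prefix.
    static boolean isJdkFork(String className) {
        return className.startsWith("com.sun.org.apache.xerces.internal.");
    }

    // Returns the Apache Xerces-J version string, or null if it cannot be found.
    static String xercesVersion() {
        try {
            Class<?> v = Class.forName("org.apache.xerces.impl.Version");
            return (String) v.getMethod("getVersion").invoke(null);
        } catch (Exception e) {
            return null; // class missing or not callable
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader r = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        String name = r.getClass().getName();
        System.out.println("XMLReader class: " + name);
        System.out.println("JDK fork: " + isJdkFork(name));
        System.out.println("Xerces-J version: " + xercesVersion());
    }
}
```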

> Michael Glavassevich wrote:
> >
> > Hi,
> >
> > You're making assumptions about the implementation which aren't
> > required and certainly aren't true for Xerces. There is no underlying
> > SAX parser. The DOM is built from XNI events. You cannot plug SAX
> > handlers into it.
> >
> > If you want to build a SAX filter for replacing skipped entities [1]
> > and then build a DOM from that, you could try using the Transformer
> > API instead (i.e. javax.xml.transform.Transformer.transform(SAXSource,
> > DOMResult)) where the SAXSource contains your XMLFilter [2] which does
> > this resolution.
> >
> > Thanks.
> >
> > [1]
> > http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ContentHandler.html#skippedEntity(java.lang.String)
> > [2]
> > http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/XMLFilter.html
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: mrgla...@ca.ibm.com
> > E-mail: mrgla...@apache.org
> >
> > "dbros...@mebigfatguy.com" <dbros...@mebigfatguy.com> wrote on
> > 11/08/2009 12:52:53 PM:
> >
> > > I want to be able to resolve unknown entities while using
> > > DocumentBuilder.parse. I see sax has a LexicalHandler for this purpose,
> > > and i'd assume that there's some way to tell DOM to pass a handler into
> > > the underlying sax parser that builds the dom, but i haven't found it.
> > > Could someone point me in the right direction?
> > >
> > > thanks.
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: j-users-unsubscr...@xerces.apache.org
> > > For additional commands, e-mail: j-users-h...@xerces.apache.org
> >
>

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org


