[ https://issues.apache.org/jira/browse/TIKA-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090159#comment-13090159 ]
Jukka Zitting commented on TIKA-651:
------------------------------------

bq. Is it so bad to add a dependency on Xerces'/Xalan's serializer.jar?

Yes. I've had numerous battles with XML processing gone haywire in systems that have accidentally pulled in the wrong versions of the XML processing libraries.

BTW, serializing SAX events to XML, XHTML or HTML4 streams shouldn't be that difficult to implement directly. We could even copy relevant parts of the code from Xalan.

> Unescaped attribute value generated
> -----------------------------------
>
>                 Key: TIKA-651
>                 URL: https://issues.apache.org/jira/browse/TIKA-651
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.9
>            Reporter: Raimund Merkert
>            Assignee: Jukka Zitting
>         Attachments: XHTMLSerializer.java
>
>
> I've converted a Word document that contains hyperlinks with a complex query
> component. The & character is not escaped, and Mozilla complains about that
> when I write out the XHTML via a content handler that I wrote.
> It's not clear to me whether my content handler should assume that
> attributes are already properly escaped.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
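
As a rough illustration of the "implement it directly" idea in the comment above, here is a minimal sketch of a SAX handler that serializes events to XML and escapes attribute values itself. The class name EscapingXmlSerializer is hypothetical; this is not the attached XHTMLSerializer.java, nor code from Tika or Xalan. It assumes SAX delivers attribute values unescaped (so the serializer is the one place that escapes them), and it ignores namespaces, output encoding, empty-element syntax and whitespace normalization in attribute values.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical name, for illustration only; not part of Tika or Xalan.
class EscapingXmlSerializer extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        out.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            out.append(' ').append(atts.getQName(i)).append("=\"");
            escape(atts.getValue(i), true);   // attribute values arrive unescaped
            out.append('"');
        }
        out.append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        escape(new String(ch, start, length), false);
    }

    // Escapes &, < and > everywhere; the double quote only inside attributes.
    private void escape(String s, boolean inAttribute) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"':
                    if (inAttribute) out.append("&quot;"); else out.append(c);
                    break;
                default: out.append(c);
            }
        }
    }

    @Override
    public String toString() {
        return out.toString();
    }

    public static void main(String[] args) {
        EscapingXmlSerializer serializer = new EscapingXmlSerializer();
        AttributesImpl atts = new AttributesImpl();
        // Example URL is made up; it just needs a query string containing '&'.
        atts.addAttribute("", "href", "href", "CDATA", "http://example.com/?a=1&b=2");
        serializer.startElement("", "a", "a", atts);
        char[] text = "R&D".toCharArray();
        serializer.characters(text, 0, text.length);
        serializer.endElement("", "a", "a");
        // prints <a href="http://example.com/?a=1&amp;b=2">R&amp;D</a>
        System.out.println(serializer);
    }
}
```

Under that assumption, a content handler like the reporter's would not expect pre-escaped attribute values from the parser; escaping happens once, at serialization time.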