The SAXParseExceptions [1] reported to your ErrorHandler and thrown by the 
XMLReader contain location information. See getColumnNumber() and 
getLineNumber().

[1] 
http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/SAXParseException.html
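
As a minimal sketch of the above (not code from this thread): the class name ParseErrorLocation and the helper firstError are made up for illustration, and the parser is obtained through the standard JAXP SAXParserFactory rather than by instantiating a Xerces class directly. It registers an ErrorHandler, parses a string, and returns the first SAXParseException seen, whether it was reported to the handler or thrown by the XMLReader.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the name and the helper below are illustrative only.
class ParseErrorLocation {

    /** Returns the first error or fatal error found, or null if the input parses cleanly. */
    static SAXParseException firstError(String xml) throws Exception {
        final SAXParseException[] first = new SAXParseException[1];

        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler());
        reader.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // Warnings are ignored in this sketch.
            }
            public void error(SAXParseException e) {
                // Recoverable error: remember it and let parsing continue.
                if (first[0] == null) first[0] = e;
            }
            public void fatalError(SAXParseException e) throws SAXException {
                // Well-formedness error: remember it and stop parsing.
                if (first[0] == null) first[0] = e;
                throw e;
            }
        });

        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // The same exception the handler saw, or one thrown directly by the reader.
            if (first[0] == null) first[0] = e;
        }
        return first[0];
    }

    public static void main(String[] args) throws Exception {
        // <b> is never closed, so the mismatched </a> triggers a fatal error.
        SAXParseException e = firstError("<a>\n  <b>\n</a>");
        System.out.println("line " + e.getLineNumber()
                + ", column " + e.getColumnNumber()
                + ": " + e.getMessage());
    }
}
```

Both getters return -1 when the parser has no location to report. The position is a line/column pair rather than a character offset into the stream, and per the SAX documentation it refers to the end of the text that triggered the error, so treat it as an approximation when pointing a user at the problem.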

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

Julien Buchanan <[EMAIL PROTECTED]> wrote on 03/26/2007 05:17:46 PM:

> hi.
> i'd like to use SAXParserImpl.JAXPSAXParser from xerces2-j to parse and
> correct html/xhtml user input.
> - is there a way to know at which line/column (or, since there's probably
> no line counter, at which character index) an error happened?
> 
> i didn't see where to get that info in the api docs, and my attempt to
> hijack the input stream to count the position myself also failed since,
> after the first couple chars read individually, xerces reads 2045 at once
> (which makes sense, performance-wise).
> so how do i know at what char a parse error happens?
> 
> also, as a side-question, not necessarily xerces-related, but you guys
> might know and spare me some wheel-reinventing:
> i'm looking for a solution in java to sanitize untrusted user-input
> html/xhtml of not just malFORMED stuff (tagsoup or an own xerces-based
> parser solution would do that) but specifically MALICIOUS input, e.g. XSS
> attempts etc. for php, there's htmlPurifier ( http://hp.jpsband.org/ ),
> but for java, i found nothing equivalent.
> Do you know of some existing solution for that?
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

