On Thu, Mar 13, 2008 at 6:51 PM, Stanimir Stamenkov <[EMAIL PROTECTED]> wrote:
> Wed, 12 Mar 2008 16:25:59 -0300, /Daniel Yokomizo/:
>
>
>  > The only issue I still have
>  > is getting the xml declaration info (e.g. version, encoding) but right
>  > now I can just ignore it.
>
>  That info you should be able to obtain through the Locator2 [1]
>  interface.  For example, in your ContentHandler implementation:
>
>      Locator locator;
>
>      public void setDocumentLocator(Locator locator) {
>          this.locator = locator;
>      }
>
>      public void startDocument() {
>          if (locator instanceof Locator2) {
>              Locator2 loc = (Locator2) locator;
>              loc.getXMLVersion();
>              loc.getEncoding();
>          }
>      }
>
>  [1]
>  http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ext/Locator2.html
>
>  --
>  Stanimir

Thank you, that solved my problem. I did run into some odd behavior,
which I think is a bug, though I'm not certain. I created the
InputSource from a Reader, didn't set the InputSource's encoding
property, and parsed the document. Even though the document has an XML
declaration with an explicit encoding, locator.getEncoding() returned
null. Creating the InputSource from an InputStream worked, because the
parser then tried to detect the encoding from the first bytes of the
stream. I think this is a bug because the document carries the encoding
information and no other source of it (explicit, as in the InputSource
property, or implicit, as in the InputStream case) could conflict, so
the locator should report it. Should I open a bug report? I searched
JIRA and couldn't find anything, so I assume it isn't a known bug.
Either way, I switched to InputStream and everything works.
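
For anyone who wants to compare the two cases, here is a minimal sketch
rather than my actual code. The class and method names
(LocatorEncodingDemo, Probe, fromStream, fromReader) are made up for
illustration. It assumes the JAXP default parser hands the handler a
Locator2, and it queries the locator at the first startElement instead
of startDocument, on the assumption that the XML declaration has been
read by then. I left out expected output since it may vary by parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.ext.Locator2;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of SAX or Xerces.
class LocatorEncodingDemo {

    /** Records what Locator2 reports when the root element starts. */
    static final class Probe extends DefaultHandler {
        private Locator locator;
        private boolean seenRoot;
        String xmlVersion; // stays null if the locator is not a Locator2
        String encoding;   // may be null, e.g. in the Reader case described above

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            // Only sample once, at the root element.
            if (!seenRoot && locator instanceof Locator2) {
                Locator2 loc = (Locator2) locator;
                xmlVersion = loc.getXMLVersion();
                encoding = loc.getEncoding();
            }
            seenRoot = true;
        }
    }

    /** Parses the source with the JAXP default SAX parser. */
    static Probe parse(InputSource source) {
        try {
            Probe probe = new Probe();
            SAXParserFactory.newInstance().newSAXParser().parse(source, probe);
            return probe;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    /** Byte stream: the parser can sniff the encoding from the bytes. */
    static Probe fromStream(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return parse(new InputSource(new ByteArrayInputStream(bytes)));
    }

    /** Character stream, with no encoding set on the InputSource. */
    static Probe fromReader(String xml) {
        return parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>";
        System.out.println("stream: " + fromStream(xml).encoding);
        System.out.println("reader: " + fromReader(xml).encoding);
    }
}
```

Running main prints the encoding reported in each case, so the
difference between the Reader and InputStream paths is easy to see.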

Best regards,
Daniel Yokomizo.
