Stanimir Stamenkov <[EMAIL PROTECTED]> wrote on 07/28/2006 10:46:24 AM:
<snip/>
> > Could you be so kind as to tell me how to parse an ISO-8859-1 encoded
> > document with xerces, please????
>
> It seems you're trying one thing but asking about another. The
> things I've mentioned above still apply. If you don't want to, or can't,
> add an XML declaration to the document, you could feed the parser an
> already decoded character stream instead of a byte stream, like:
>
> InputStream byteStream;
> ...
> Reader charStream = new InputStreamReader(byteStream, "ISO-8859-1");
> InputSource source;
> DocumentBuilder parser; // it could be SAXParser as well
> ...
> source.setCharacterStream(charStream);
> parser.parse(source);

Or, if you're sure what the encoding is, keep the byte stream and set the
encoding on the InputSource. That gives the parser an opportunity to use
an optimized reader.
InputSource source;
...
source.setByteStream(byteStream);
source.setEncoding("ISO-8859-1");
parser.parse(source);
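
For completeness, here is a minimal, self-contained sketch of both approaches using the standard JAXP API. The class name Latin1ParseDemo, the two helper methods and the sample document are made up for illustration. I'm also assuming the JAXP implementation on your classpath honors the encoding set on the InputSource when the document has no XML declaration, which is how Xerces behaves.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of Xerces or JAXP.
class Latin1ParseDemo {

    // Approach 1: hand the parser raw bytes plus an explicit encoding,
    // so the parser can pick its own reader for that encoding.
    static Document parseWithEncoding(InputStream byteStream, String encoding)
            throws Exception {
        DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputSource source = new InputSource();
        source.setByteStream(byteStream);
        source.setEncoding(encoding);
        return parser.parse(source);
    }

    // Approach 2: decode the bytes ourselves and hand the parser characters.
    static Document parseDecoded(InputStream byteStream, String encoding)
            throws Exception {
        DocumentBuilder parser =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Reader charStream = new InputStreamReader(byteStream, encoding);
        InputSource source = new InputSource();
        source.setCharacterStream(charStream);
        return parser.parse(source);
    }

    public static void main(String[] args) throws Exception {
        // Sample document without an XML declaration; the e-acute is the
        // single byte 0xE9 in ISO-8859-1, which is not valid UTF-8 on its own.
        byte[] bytes = "<greeting>caf\u00e9</greeting>"
                .getBytes(StandardCharsets.ISO_8859_1);

        Document a = parseWithEncoding(new ByteArrayInputStream(bytes), "ISO-8859-1");
        Document b = parseDecoded(new ByteArrayInputStream(bytes), "ISO-8859-1");

        System.out.println(a.getDocumentElement().getTextContent());
        System.out.println(b.getDocumentElement().getTextContent());
    }
}
```

Both helpers should give you the same DOM. The difference is only in who does the decoding: the InputStreamReader in the second case, the parser's own reader in the first.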
> [1] http://www.w3.org/TR/REC-xml/#NT-XMLDecl
> [2] http://www.w3.org/TR/REC-xml/#sec-guessing-no-ext-info
>
> --
> Stanimir
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]