Hi all,

 

I have some problems with ISO-8859-1 and UTF-8 encodings in XML documents.
Concretely, I have this ISO-8859-1-encoded XML document:

 

<?xml version="1.0" encoding="ISO-8859-1"?>

<DOCUMENTO>

<PERFILES>Á</PERFILES>

<PERFILES>É</PERFILES>

<PERFILES>Í</PERFILES>

<PERFILES>Ó</PERFILES>

<PERFILES>Ú</PERFILES>

</DOCUMENTO> 

 

Then I convert it to UTF-8 with the following piece of code:

 

            Transformer transformer = TransformerFactory.newInstance().newTransformer();

            StreamSource ds = new StreamSource(new ByteArrayInputStream(xmliso88191.getBytes()));

            transformer.setOutputProperty(OutputKeys.ENCODING,"utf-8");

            ByteArrayOutputStream baos = new ByteArrayOutputStream();

            transformer.transform(ds,new StreamResult(baos));

            return baos.toString();

 

to obtain this XML document:

 

<?xml version="1.0" encoding="utf-8"?>

<DOCUMENTO>

<PERFILES>Ã?</PERFILES>

<PERFILES>É</PERFILES>

<PERFILES>Ã?</PERFILES>

<PERFILES>Ó</PERFILES>

<PERFILES>Ú</PERFILES>

</DOCUMENTO>
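
A possible explanation for the 'Ã?' above, offered only as a sketch: `baos.toString()` and `getBytes()` without arguments use the platform default charset. Assuming that default is windows-1252 (an assumption, it is not stated anywhere in this post), the UTF-8 bytes for 'Á' (C3 81) and 'Í' (C3 8D) contain a second byte that windows-1252 does not define, so it cannot survive a trip through a String. The class and method names below are hypothetical, made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Assumption: the platform default charset in the failing run was windows-1252.
    static final Charset ASSUMED_DEFAULT = Charset.forName("windows-1252");

    /** Encodes text as UTF-8, then pushes those bytes through a String using the assumed default charset. */
    static byte[] roundTripThroughWrongCharset(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        // Mirrors baos.toString() with no charset argument.
        String misdecoded = new String(utf8, ASSUMED_DEFAULT);
        // Mirrors xmlutf8.getBytes() with no charset argument.
        return misdecoded.getBytes(ASSUMED_DEFAULT);
    }

    /** Formats bytes as space-separated uppercase hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        String[] samples = {"\u00C1", "\u00C9", "\u00CD", "\u00D3", "\u00DA"};
        for (String s : samples) {
            System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))
                    + " -> " + hex(roundTripThroughWrongCharset(s)));
        }
    }
}
```

If the assumption holds, the second byte of 'Á' and 'Í' is replaced by '?' (3F) while 'É', 'Ó' and 'Ú' come back unchanged, which would match the pattern in the output above and would leave a byte sequence that is not valid UTF-8.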

 

Next, I try to convert this UTF-8 document back to ISO-8859-1:

 

            Transformer transformer = TransformerFactory.newInstance().newTransformer();

            StreamSource ds = new StreamSource(new ByteArrayInputStream(xmlutf8.getBytes()));

            transformer.setOutputProperty(OutputKeys.ENCODING,"iso-8859-1");

            ByteArrayOutputStream baos = new ByteArrayOutputStream();

            transformer.transform(ds,new StreamResult(baos));

            return baos.toString();

 

But this does not work. Instead, I get the following exception:

 

[Fatal Error] :8:11: Invalid byte 2 of 2-byte UTF-8 sequence.

javax.xml.transform.TransformerException: org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.

        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:449)

        at codificacion.PruebasCodificacion.encodeISO88891(PruebasCodificacion.java:302)

        at codificacion.PruebasCodificacion.prueba(PruebasCodificacion.java:73)

        at codificacion.PruebasCodificacion.main(PruebasCodificacion.java:356)

Caused by: org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.

        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)

        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:432)

 

 

Is this process correct? Assuming it is, the exception seems to be caused by the
‘Ã?’ characters (the UTF-8 encodings of ‘Á’ and ‘Í’), so I would like to know how
I can encode ‘Á’ and ‘Í’ as UTF-8 and then convert them back to ISO-8859-1.
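
For reference, here is a minimal sketch of the round trip I am after, written so that the document stays a byte array from start to finish and never goes through `getBytes()` or `toString()` without an explicit charset. The class name `EncodingRoundTrip` and the helper `transcode` are hypothetical names used only for this illustration. I am assuming that the parser picks up the source encoding from the XML declaration and that the serializer honours `OutputKeys.ENCODING`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class EncodingRoundTrip {

    /** Re-serializes an XML document (given as raw bytes) in the requested encoding. */
    static byte[] transcode(byte[] xml, String targetEncoding) throws TransformerException {
        // Identity transformer: copies the input document to the output unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, targetEncoding);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // The parser decodes the bytes according to the XML declaration.
        transformer.transform(new StreamSource(new ByteArrayInputStream(xml)),
                              new StreamResult(baos));
        // Return bytes, not baos.toString(), so no default charset is involved.
        return baos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Unicode escapes stand in for the accented letters from the document above.
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<DOCUMENTO><PERFILES>\u00C1</PERFILES>"
                + "<PERFILES>\u00CD</PERFILES></DOCUMENTO>";
        byte[] iso = doc.getBytes(StandardCharsets.ISO_8859_1);

        byte[] utf8 = transcode(iso, "UTF-8");
        byte[] back = transcode(utf8, "ISO-8859-1");

        // Decode with the matching charset only when text is needed, e.g. for display.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
        System.out.println(new String(back, StandardCharsets.ISO_8859_1));
    }
}
```

If a String really is needed between the two steps, the same idea would apply with explicit charsets, for example `baos.toString("UTF-8")` on the way out and `xmlutf8.getBytes("UTF-8")` on the way back in.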

 

Could anybody be so kind as to help me, please?

 

Thank you very much in advance.

Inma.

 
