Hi all,
I have some problems with ISO-8859-1 and UTF-8 encodings in XML documents. Concretely, I have this ISO-8859-1 encoded XML document:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <DOCUMENTO>
    <PERFILES>Á</PERFILES>
    <PERFILES>É</PERFILES>
    <PERFILES>Í</PERFILES>
    <PERFILES>Ó</PERFILES>
    <PERFILES>Ú</PERFILES>
    </DOCUMENTO>

Then I encode it as UTF-8 with the following piece of code:

    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    StreamSource ds = new StreamSource(new ByteArrayInputStream(xmliso88191.getBytes()));
    transformer.setOutputProperty(OutputKeys.ENCODING,"utf-8");
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    transformer.transform(ds,new StreamResult(baos));
    return baos.toString();

to obtain this XML document:

    <?xml version="1.0" encoding="utf-8"?>
    <DOCUMENTO>
    <PERFILES>Ã?</PERFILES>
    <PERFILES>É</PERFILES>
    <PERFILES>Ã?</PERFILES>
    <PERFILES>Ó</PERFILES>
    <PERFILES>Ú</PERFILES>
    </DOCUMENTO>

Next, I try to encode this UTF-8 document back to ISO-8859-1:

    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    StreamSource ds = new StreamSource(new ByteArrayInputStream(xmlutf8.getBytes()));
    transformer.setOutputProperty(OutputKeys.ENCODING,"iso-8859-1");
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    transformer.transform(ds,new StreamResult(baos));
    return baos.toString();

But this does not work. Instead, I get the following exception:

    [Fatal Error] :8:11: Invalid byte 2 of 2-byte UTF-8 sequence.
    javax.xml.transform.TransformerException: org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.
        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:449)
        at codificacion.PruebasCodificacion.encodeISO88891(PruebasCodificacion.java:302)
        at codificacion.PruebasCodificacion.prueba(PruebasCodificacion.java:73)
        at codificacion.PruebasCodificacion.main(PruebasCodificacion.java:356)
    Caused by: org.xml.sax.SAXParseException: Invalid byte 2 of 2-byte UTF-8 sequence.
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xalan.transformer.TransformerIdentityImpl.transform(TransformerIdentityImpl.java:432)

Is this process correct? Assuming it is, the exception seems to be caused by the 'Ã?' sequences (the UTF-8 encodings of 'Á' and 'Í'), so I would like to know how I can encode the 'Á' and 'Í' characters as UTF-8 and then convert them back to ISO-8859-1.

Could anybody be so kind as to help me, please?

Thank you very much in advance.

Inma.
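
A possible explanation, offered as a guess rather than a confirmed diagnosis: getBytes() and baos.toString() without arguments use the platform default charset. If that default is something like windows-1252, the UTF-8 bytes of 'Á' (C3 81) and 'Í' (C3 8D) have no mapping for their second byte, the String ends up with 'Ã?', and the next getBytes() produces a broken UTF-8 sequence. The sketch below shows one way to avoid that by keeping the XML as bytes between the two steps and naming the charset explicitly wherever a String is involved. The class name EncodingRoundTrip and the helper transcode are hypothetical names made up for this illustration, and the sample document is shortened to two elements.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical class name, used only for this illustration.
class EncodingRoundTrip {

    // Hypothetical helper: re-serializes an XML document, given as raw bytes,
    // in the requested output encoding. The parser takes the input encoding
    // from the XML declaration inside the bytes, so no String is needed here.
    static byte[] transcode(byte[] xml, String targetEncoding) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, targetEncoding);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        transformer.transform(new StreamSource(new ByteArrayInputStream(xml)),
                              new StreamResult(baos));
        // Return the bytes untouched instead of calling baos.toString().
        return baos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // \u00C1 is 'Á' and \u00CD is 'Í', the two characters from the question.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<DOCUMENTO><PERFILES>\u00C1</PERFILES>"
                + "<PERFILES>\u00CD</PERFILES></DOCUMENTO>";

        // Encode the String with the charset named in the XML declaration.
        byte[] iso = xml.getBytes(StandardCharsets.ISO_8859_1);

        byte[] utf8 = transcode(iso, "UTF-8");        // ISO-8859-1 -> UTF-8
        byte[] back = transcode(utf8, "ISO-8859-1");  // UTF-8 -> ISO-8859-1

        // Decode each result with its own charset, only for display.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
        System.out.println(new String(back, StandardCharsets.ISO_8859_1));
    }
}
```

If the XML really has to travel as a String between the two steps, the same idea should apply: decode with new String(bytes, StandardCharsets.UTF_8) (or baos.toString("UTF-8")) and encode with getBytes(StandardCharsets.UTF_8), so the charset always matches the encoding declared in the document.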