Hi David,

The encoding declared in your document (in the XML declaration) and the 
actual encoding of the file probably don't match once you've serialized 
it. You should pass a FileOutputStream to the transformer and let it 
handle the character encoding, instead of using a FileWriter, which 
assumes that the platform default encoding (which could be anything) is 
acceptable. Something similar was discussed on the j-dev list many months 
ago. The thread starts here [1] in the archives if you're interested.
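
Here is a minimal sketch of what I mean, not a drop-in replacement for 
your class. The class name WriteUTF16Stream and the helper write() are 
made up for illustration; everything else is standard JAXP. The only 
real change from your WriteUTF16.java is that the StreamResult wraps a 
FileOutputStream, so the transformer picks the byte encoding itself and 
it agrees with what it writes in the XML declaration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name, used only for this illustration.
public class WriteUTF16Stream {

    // Serialises doc to file as UTF-16. The transformer is handed a byte
    // stream, so it (not a Writer) decides how characters become bytes.
    static void write(Document doc, File file) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-16");
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        try (OutputStream out = new FileOutputStream(file)) {
            t.transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder db =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Same tiny document as in the original WriteUTF16.java.
        Document doc = db.newDocument();
        Element element = doc.createElement("tagname");
        element.setAttribute("test", "test-value");
        doc.appendChild(element);

        File f = new File("output.xml");
        write(doc, f);

        // Parse the file back and print the attribute as a round-trip check.
        Document parsed = db.parse(f);
        System.out.println(parsed.getDocumentElement().getAttribute("test"));
    }
}
```

If you really do need a Writer, wrap the stream in an OutputStreamWriter 
and name the charset explicitly (the same one you set as the output 
encoding) rather than relying on FileWriter's default.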

Thanks.

[1] http://mail-archives.apache.org/mod_mbox/xerces-j-dev/200504.mbox/[EMAIL PROTECTED]

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

"Jones, David [deljones]" <[EMAIL PROTECTED]> wrote on 10/30/2006 
12:50:39 PM:

> Hi
> I've been trying to write and read back a UTF-16 encoded XML 
> document, without success. I can seemingly write the document ok 
> (see  WriteUTF16.java ) and Firefox can open it without complaining 
> (except the fact it doesn't have any style information). When I try 
> and read back the document (using ReadUTF16.java) I get the 
> following exception; - 
> 
>  [Fatal Error] output.xml:1:40: Content is not allowed in prolog.
> Exception in thread "main" org.xml.sax.SAXParseException: Content is not allowed in prolog.
>         at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>         at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>         at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
>         at ReadUTF16.main(ReadUTF16.java:47)
> I'm not sure whether I'm missing something simple, doing something 
> really stupid or even if it is a platform dependent thing (I'm using
> Windows XP and Windows 2000).
> 
> many thanks
> 
> david 
> 
> 
> 
> ------- ReadUTF16.java -------
> import org.w3c.dom.*;
> import javax.xml.parsers.*;
> import java.io.*;
> import java.util.*;
> import javax.xml.transform.dom.*;
> import javax.xml.transform.*;
> import javax.xml.transform.stream.*;
> 
> 
> public class ReadUTF16
> {
>  public static void main ( String rags[] ) throws Exception 
>  {
>   System.out.println("read: main()");
>   File f = new File ( "output.xml");
> 
>   DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
>   DocumentBuilder db = dbf.newDocumentBuilder();
>   Document doc = db.parse( f.toURI().toString() );
> 
>   // serialise to console using jaxp 
>     TransformerFactory tf = TransformerFactory.newInstance(); 
>   Transformer t = tf.newTransformer();
>   t.setOutputProperty ( OutputKeys.INDENT , "yes" ) ; 
>   DOMSource ds = new DOMSource ( doc  ); 
>   StreamResult res = new StreamResult  ( System.out ) ;
>   t.transform ( ds , res ) ;
> 
>  }// end method
> }// end class
> 
> 
> ------- WriteUTF16.java ------- 
> 
> import org.w3c.dom.*;
> import javax.xml.parsers.*;
> import java.io.*;
> import java.util.*;
> import javax.xml.transform.dom.*;
> import javax.xml.transform.*;
> import javax.xml.transform.stream.*;
> 
> 
> public class WriteUTF16
> {
> 
>  public static void main ( String rags[] ) throws Exception 
>  {
>   System.out.println("main()");
>   DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
>   DocumentBuilder db = dbf.newDocumentBuilder();
>   Document doc = db.newDocument();
>   Element element = doc.createElement( "tagname") ;
>   element.setAttribute("test","test-value");
>   doc.appendChild ( element ) ; 
>   // serialise using jaxp 
>    File f = new File ( "output.xml");
>   TransformerFactory tf = TransformerFactory.newInstance();
> 
>   Transformer t = tf.newTransformer();
> 
>   Properties p = t.getOutputProperties();
>   // print properties to console 
>   p.list ( System.out ) ;
>   // set up the transformer 
>   // there are properties to set and error listeners which can be set 
> 
>   // does something when output to the console anyway... 
>   t.setOutputProperty ( "encoding" , "UTF-16" ) ;
>   // set indentation 
>   t.setOutputProperty ( OutputKeys.INDENT , "yes" ) ;
>   // new DOMSource instance 
>   DOMSource ds = new DOMSource ( doc  );
>   // Print Writer 
>   PrintWriter writer = new PrintWriter( new BufferedWriter ( new 
> FileWriter ( f) ) ); 
>   StreamResult res = new StreamResult  ( writer ) ;
>   // transform 
>   t.transform ( ds , res ) ; 
>  }// end method
> }// end class
