Hi Chad,

The encoding declared in your document (in the XML declaration) and the
actual encoding of the bytes on disk probably don't match once you've
serialized it to a file. You should pass a FileOutputStream to the
serializer and let it handle the character encoding, instead of using a
FileWriter, which assumes that the platform default encoding (which could
be anything) is acceptable. This was discussed on the j-dev list many
months ago. The thread starts here [1] in the archives if you're interested.
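
For what it's worth, here is a minimal sketch of what I mean, not a
drop-in fix. The class name XMLByteStreamWrite and the choice of UTF-8
are my own assumptions for illustration; the rest uses the standard DOM
Level 3 Load and Save interfaces (LSOutput, LSSerializer). As far as I
recall, writeToString() always produces UTF-16 text, so a declaration
that names UTF-16 followed by bytes in the platform default encoding
would explain the error you are seeing.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper class; the name is not from the original test case.
public class XMLByteStreamWrite {

    // Serialize a node straight to a byte stream. The serializer encodes
    // the bytes itself, so any XML declaration it emits names the same
    // encoding that was actually used to write the file.
    public static void writeXML(Node node, File output) throws Exception {
        // A Document node has no owner document, so handle it separately.
        Document doc = (node.getNodeType() == Node.DOCUMENT_NODE)
                ? (Document) node
                : node.getOwnerDocument();

        DOMImplementationLS domImplLS = (DOMImplementationLS)
                doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = domImplLS.createLSSerializer();

        LSOutput lsOutput = domImplLS.createLSOutput();
        lsOutput.setEncoding("UTF-8"); // assumption: UTF-8 is acceptable

        try (OutputStream out = new FileOutputStream(output)) {
            lsOutput.setByteStream(out);
            serializer.write(node, lsOutput);
        }
    }
}
```

In your test case this would stand in for the FileWriter-based
writeXML(); readXML() can keep using a FileInputStream, since the parser
detects the encoding from the bytes and the declaration.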

Thanks.

[1] http://mail-archives.apache.org/mod_mbox/xerces-j-dev/200504.mbox/[EMAIL PROTECTED]

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]

Chad La Joie <[EMAIL PROTECTED]> wrote on 09/06/2006 02:26:30 PM:

> I'm using Xerces-J 2.8.0 as my endorsed JAXP 1.3 parser.  My code needs
> to write out XML and later read it back in.  I'm using the LSSerializer
> to write out the root element node of my document to a string and then
> using a FileWriter to write it out to disk.  Then I use a FileInputStream
> to read it in and parse it with a DocumentBuilder.
> 
> When I do this I get a SAX exception indicating I have invalid content in
> the prolog.  I searched the archives and found a thread where someone
> else had this issue.  It was caused by non-visible characters in the
> prolog.  I validated, through an octal dump, that the file written by my
> code didn't have any extra characters.
> 
> Attached is a test case that reproduces the problem.  Any assistance on
> this would be greatly appreciated.  I'm sure I'm just messing up
> something with the Serializer but I'm not sure what.
> -- 
> Chad La Joie             2052-C Harris Bldg
> OIS-Middleware           202.687.0124
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileWriter;
> 
> import javax.xml.parsers.DocumentBuilder;
> import javax.xml.parsers.DocumentBuilderFactory;
> 
> import org.w3c.dom.DOMImplementation;
> import org.w3c.dom.Document;
> import org.w3c.dom.Element;
> import org.w3c.dom.Node;
> import org.w3c.dom.ls.DOMImplementationLS;
> import org.w3c.dom.ls.LSSerializer;
> 
> 
> public class XMLSerializeTest {
> 
>    private static DocumentBuilder docBuilder;
> 
>    public static void main(String[] args) throws Exception{
>       String xmlFilePath = "/tmp/example.xml";
> 
>       DocumentBuilderFactory builderFactory =
>             DocumentBuilderFactory.newInstance();
>       docBuilder = builderFactory.newDocumentBuilder();
> 
>       // Create an example element
>       Element exampleXML = createElement();
> 
>       // Write it out to a file and read it back in
>       // It fails when reading it in
>       File xmlFile = new File(xmlFilePath);
>       writeXML(exampleXML, xmlFile);
>       Element readXML = readXML(xmlFile);
> 
>       // Write it out to STDOUT
>       System.out.println(nodeToString(readXML));
> 
>    }
> 
>    public static Element createElement() throws Exception{
>       Document document = docBuilder.newDocument();
> 
>       Element rootElement = document.createElementNS(
>             "http://example.org", "example:foo");
>       document.appendChild(rootElement);
>       rootElement.setAttribute("attrib1", "somevalue");
>       rootElement.setAttribute("attrib2", "somevalue");
> 
>       return rootElement;
>    }
> 
>    public static void writeXML(Element element, File output)
>          throws Exception{
>       FileWriter out = new FileWriter(output);
>       out.write(nodeToString(element));
>       out.flush();
>       out.close();
>    }
> 
>     public static String nodeToString(Node node) {
>         DOMImplementation domImpl =
>               node.getOwnerDocument().getImplementation();
>         DOMImplementationLS domImplLS = (DOMImplementationLS)
>               domImpl.getFeature("LS", "3.0");
>         LSSerializer serializer = domImplLS.createLSSerializer();
>         return serializer.writeToString(node);
>     }
> 
>    public static Element readXML(File input) throws Exception{
>       FileInputStream in = new FileInputStream(input);
>       Document document = docBuilder.parse(in);
>       return document.getDocumentElement();
>    }
> }
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
