Seems like Xalan has already optimized that. Perhaps if you try a different serializer (e.g. DOM Level 3 LSSerializer [in serializer.jar] or Xerces' deprecated one) it will do what you were hoping for.
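
For reference, a minimal sketch of the DOM Level 3 LSSerializer route, assuming the parser's DOMImplementation supports the "LS" 3.0 feature. The class and method names (LsSerializerSketch, buildSample, serialize) are made up for illustration. Whether the empty text node comes out as <tag></tag> or <tag/> depends on the serializer implementation on the classpath, so check the output rather than relying on it:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper class, not part of Xerces or Xalan.
class LsSerializerSketch {

    // Builds a <root> element containing <tag>, with an empty text node attached to <tag>.
    static Document buildSample() {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = d.createElement("root");
            Element tag = d.createElement("tag");
            d.appendChild(root);
            root.appendChild(tag);
            tag.appendChild(d.createTextNode(""));
            return d;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes with DOM Level 3 Load and Save instead of a JAXP Transformer.
    static String serialize(Document doc) {
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        if (ls == null) {
            throw new IllegalStateException("DOM Level 3 Load and Save is not supported");
        }
        LSSerializer serializer = ls.createLSSerializer();
        return serializer.writeToString(doc);
    }

    public static void main(String[] args) {
        // Inspect the printed form of <tag>; it is implementation-dependent.
        System.out.println(serialize(buildSample()));
    }
}
```

If the result is still <tag/>, the serializer in use collapses empty elements as well, and the form of the tag remains an implementation detail either way.
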
Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

Ian Hummel <hum...@parityinc.net> wrote on 12/15/2008 03:42:50 PM:

> Hi Michael,
>
> Here is my formatXmlAsString method formatted for brevity:
>
> public static String formatXmlAsString(Document doc) {
>     StringWriter out = new StringWriter();
>     TransformerFactory factory = TransformerFactory.newInstance();
>     factory.setAttribute("indent-number", new Integer(2));
>     Transformer serializer;
>     serializer = factory.newTransformer();
>     serializer.setOutputProperty(OutputKeys.INDENT, "yes");
>     serializer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
>     serializer.transform(new DOMSource(doc), new StreamResult(out));
>     return out.toString();
> }
>
> Here I tried appending a blank text node, and I can see in the DOM
> that the text node _is_ there, but it still gets squashed upon output:
>
> DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> DocumentBuilder db = dbf.newDocumentBuilder();
> Document d = db.newDocument();
> Element root = d.createElement("root");
> Element tag = d.createElement("tag");
> d.appendChild(root);
> root.appendChild(tag);
>
> System.out.println("Child nodes? " + tag.getChildNodes().item(0));
> System.out.println("Child nodes l? " + tag.getChildNodes().getLength());
> Text text = d.createTextNode("");
> tag.appendChild(text);
> System.out.println("Child nodes? " + tag.getChildNodes().item(0));
> System.out.println("Child nodes l? " + tag.getChildNodes().getLength());
> System.out.println(formatXmlAsString(d));
>
> Child nodes? null
> Child nodes l? 0
> Child nodes? [#text: ]
> Child nodes l? 1
> <?xml version="1.0" encoding="UTF-8"?>
> <root>
> <tag/>
> </root>
>
> On Dec 15, 2008, at 3:29 PM, Michael Glavassevich wrote:
>
> To add to what I said ...
>
> I think it's likely the case that the Xerces/Xalan serializer will
> write <tag></tag> if you attach an empty text node to the element.
> You should keep in mind that this is an implementation detail that
> could change in the future. Perhaps one day it will write <tag/> instead.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrgla...@ca.ibm.com
> E-mail: mrgla...@apache.org
>
> Michael Glavassevich/Toronto/i...@ibmca wrote on 12/15/2008 03:20:57 PM:
>
> > Hi Ian,
> >
> > I've never heard of XmlUtils.formatXmlAsString. It's certainly not
> > distributed with Xerces. Have you tried one of the standard
> > serialization methods [1] from JAXP or DOM Level 3?
> >
> > Thanks.
> >
> > [1] http://xerces.apache.org/xerces2-j/faq-general.html#faq-6
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: mrgla...@ca.ibm.com
> > E-mail: mrgla...@apache.org
> >
> > Ian Hummel <hum...@parityinc.net> wrote on 12/15/2008 03:08:49 PM:
> >
> > > Hi Michael,
> > >
> > > I know <tag></tag> and <tag/> are the same, but unfortunately the
> > > buggy-parser-that-cannot-be-changed on the other end doesn't :)
> > >
> > > DocumentBuilder db = dbf.newDocumentBuilder();
> > > Document d = db.newDocument();
> > > Element root = d.createElement("root");
> > > Element tag = d.createElement("tag");
> > > tag.setTextContent("");
> > > d.appendChild(root);
> > > root.appendChild(tag);
> > > System.out.println(XmlUtils.formatXmlAsString(d));
> > >
> > > This always outputs <tag/> and never <tag></tag> like I need it to.
> > >
> > > - Ian.
> > >
> > > On Dec 12, 2008, at 10:33 PM, Michael Glavassevich wrote:
> > >
> > > Hi Ian,
> > >
> > > > I need to create XML that looks like this whenever the value of
> > > > "tag" is "" (the empty string):
> > > >
> > > > <root>
> > > > <tag></tag>
> > > > </root>
> > >
> > > Why? <tag/> and <tag></tag> have the same meaning. Whichever form is
> > > chosen by the serializer should have no significance.
> > >
> > > > I am more concerned in preserving the empty text node when I
> > > > serialize to e.g. a file... not so much the parsing.
> > > >
> > > > Any one else have any ideas?
> > >
> > > Would help if you showed your code for serializing the document.
> > >
> > > > Are blank text nodes like that invalid XML or something?
> > >
> > > In the snippet you posted you created a text node with the '\t'
> > > (tab) character in it. That isn't "blank" or empty.
> > >
> > > Thanks.
> > >
> > > Michael Glavassevich
> > > XML Parser Development
> > > IBM Toronto Lab
> > > E-mail: mrgla...@ca.ibm.com
> > > E-mail: mrgla...@apache.org
> > >
> > > Ian Hummel <hum...@parityinc.net> wrote on 12/12/2008 09:21:47 AM:
> > >
> > > > Hi, I didn't really understand how that's going to help.
> > > >
> > > > I am more concerned in preserving the empty text node when I
> > > > serialize to e.g. a file... not so much the parsing.
> > > >
> > > > Any one else have any ideas? Are blank text nodes like that invalid
> > > > XML or something?
> > > >
> > > > On Dec 11, 2008, at 11:53 AM, ravika...@gmail.com wrote:
> > > >
> > > > Hi Lan,
> > > >
> > > > I think we can Implement by LSParser Interface.
> > > > http://java.sun.com/j2se/1.5.0/docs/api/org/w3c/dom/ls/LSParser.html
> > > > this link may help you.
> > > >
> > > > Regards,
> > > > Ravikanth
> > > >
> > > > On Thu, Dec 11, 2008 at 7:36 PM, Ian Hummel <hum...@parityinc.net> wrote:
> > > > Hi everyone,
> > > >
> > > > I need to create XML that looks like this whenever the value of
> > > > "tag" is "" (the empty string):
> > > >
> > > > <root>
> > > > <tag></tag>
> > > > </root>
> > > >
> > > > I've tried the following:
> > > >
> > > > DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> > > > DocumentBuilder db = dbf.newDocumentBuilder();
> > > > Document d = db.newDocument();
> > > > Element root = d.createElement("root");
> > > > Element tag = d.createElement("tag");
> > > > d.appendChild(root);
> > > > root.appendChild(tag);
> > > > Text text = d.createTextNode("\t");
> > > > tag.appendChild(text);
> > > >
> > > > but I always end up with XML like this:
> > > >
> > > > <root>
> > > > <tag/>
> > > > </root>
> > > >
> > > > Is there a way to force empty text nodes to get "denormalized" ?
> > > >
> > > > Thanks,
> > > >
> > > > Ian.
> > > >
> > > > --
> > > > Ravikanth