Seems like Xalan has already optimized that. Perhaps if you try a different
serializer (e.g. DOM Level 3 LSSerializer [in serializer.jar] or Xerces'
deprecated one) it will do what you were hoping for.
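
For reference, a minimal sketch of the DOM Level 3 route might look like the
following. It uses only the standard org.w3c.dom.ls API; the class and method
names (LsSerializerSketch, buildDocument, serialize) are made up for
illustration. Whether the result contains <tag></tag> or <tag/> depends on the
serializer implementation on your classpath, so treat this as something to
try, not a guarantee.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper class, not part of Xerces or Xalan.
class LsSerializerSketch {

    // Builds the same document as in the thread: <root> with a <tag> child
    // that holds a single empty text node.
    static Document buildDocument() throws Exception {
        Document d = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = d.createElement("root");
        Element tag = d.createElement("tag");
        d.appendChild(root);
        root.appendChild(tag);
        tag.appendChild(d.createTextNode(""));
        return d;
    }

    // Serializes with the DOM Level 3 Load and Save API instead of a
    // JAXP Transformer.
    static String serialize(Document d) {
        DOMImplementationLS ls =
                (DOMImplementationLS) d.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        return serializer.writeToString(d);
    }

    public static void main(String[] args) throws Exception {
        // Inspect the output to see which form your implementation chooses.
        System.out.println(serialize(buildDocument()));
    }
}
```

If the output still shows <tag/>, that serializer collapses empty elements
too, and the behaviour remains an implementation detail either way.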

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrgla...@ca.ibm.com
E-mail: mrgla...@apache.org

Ian Hummel <hum...@parityinc.net> wrote on 12/15/2008 03:42:50 PM:

> Hi Michael,
>
> Here is my formatXmlAsString method, trimmed for brevity:
>
>     public static String formatXmlAsString(Document doc) {
>         StringWriter out = new StringWriter();
>         TransformerFactory factory = TransformerFactory.newInstance();
>         factory.setAttribute("indent-number", new Integer(2));
>         Transformer serializer;
>         serializer = factory.newTransformer();
>         serializer.setOutputProperty(OutputKeys.INDENT, "yes");
>         serializer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
>         serializer.transform(new DOMSource(doc), new StreamResult(out));
>         return out.toString();
>     }
>
> Here I tried appending a blank text node, and I can see in the DOM
> that the text node _is_ there, but it still gets squashed upon output:
>
> DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> DocumentBuilder db = dbf.newDocumentBuilder();
> Document d = db.newDocument();
> Element root = d.createElement("root");
> Element tag = d.createElement("tag");
> d.appendChild(root);
> root.appendChild(tag);
>
> System.out.println("Child nodes? " + tag.getChildNodes().item(0));
> System.out.println("Child nodes l? " + tag.getChildNodes().getLength());
> Text text = d.createTextNode("");
> tag.appendChild(text);
> System.out.println("Child nodes? " + tag.getChildNodes().item(0));
> System.out.println("Child nodes l? " + tag.getChildNodes().getLength());
> System.out.println(formatXmlAsString(d));
>
> Child nodes? null
> Child nodes l? 0
> Child nodes? [#text: ]
> Child nodes l? 1
> <?xml version="1.0" encoding="UTF-8"?>
> <root>
>     <tag/>
> </root>
>
> On Dec 15, 2008, at 3:29 PM, Michael Glavassevich wrote:
>
> To add to what I said ...
>
> I think it's likely the case that the Xerces/Xalan serializer will
> write <tag></tag> if you attach an empty text node to the element.
> You should keep in mind that this is an implementation detail that
> could change in the future. Perhaps one day it will write <tag/> instead.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrgla...@ca.ibm.com
> E-mail: mrgla...@apache.org
>
> Michael Glavassevich/Toronto/i...@ibmca wrote on 12/15/2008 03:20:57 PM:
>
> > Hi Ian,
> >
> > I've never heard of XmlUtils.formatXmlAsString. It's certainly not
> > distributed with Xerces. Have you tried one of the standard
> > serialization methods [1] from JAXP or DOM Level 3?
> >
> > Thanks.
> >
> > [1] http://xerces.apache.org/xerces2-j/faq-general.html#faq-6
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: mrgla...@ca.ibm.com
> > E-mail: mrgla...@apache.org
> >
> > Ian Hummel <hum...@parityinc.net> wrote on 12/15/2008 03:08:49 PM:
> >
> > > Hi Michael,
> > >
> > > I know <tag></tag> and <tag/> are the same, but unfortunately the
> > > buggy-parser-that-cannot-be-changed on the other end doesn't :)
> > >
> > > DocumentBuilder db = dbf.newDocumentBuilder();
> > > Document d = db.newDocument();
> > > Element root = d.createElement("root");
> > > Element tag = d.createElement("tag");
> > > tag.setTextContent("");
> > > d.appendChild(root);
> > > root.appendChild(tag);
> > > System.out.println(XmlUtils.formatXmlAsString(d));
> > >
> > > This always outputs <tag/> and never <tag></tag> like I need it to.
> > >
> > > - Ian.
> > >
> > > On Dec 12, 2008, at 10:33 PM, Michael Glavassevich wrote:
> > >
> > > Hi Ian,
> > >
> > > > I need to create XML that looks like this whenever the value of
> > > > "tag" is "" (the empty string):
> > > >
> > > > <root>
> > > > <tag></tag>
> > > > </root>
> > >
> > > Why? <tag/> and <tag></tag> have the same meaning. Whichever form is
> > > chosen by the serializer should have no significance.
> > >
> > > > I am more concerned in preserving the empty text node when I
> > > > serialize to e.g. a file... not so much the parsing.
> > > >
> > > > Any one else have any ideas?
> > >
> > > Would help if you showed your code for serializing the document.
> > >
> > > > Are blank text nodes like that invalid XML or something?
> > >
> > > In the snippet you posted you created a text node with the '\t'
> > > (tab) character in it. That isn't "blank" or empty.
> > >
> > > Thanks.
> > >
> > > Michael Glavassevich
> > > XML Parser Development
> > > IBM Toronto Lab
> > > E-mail: mrgla...@ca.ibm.com
> > > E-mail: mrgla...@apache.org
> > >
> > > Ian Hummel <hum...@parityinc.net> wrote on 12/12/2008 09:21:47 AM:
> > >
> > > > Hi, I didn't really understand how that's going to help.
> > > >
> > > > I am more concerned in preserving the empty text node when I
> > > > serialize to e.g. a file... not so much the parsing.
> > > >
> > > > > Any one else have any ideas?  Are blank text nodes like that invalid
> > > > > XML or something?
> > > >
> > > > On Dec 11, 2008, at 11:53 AM, ravika...@gmail.com wrote:
> > > >
> > > > > Hi Ian,
> > > >
> > > > > I think this can be implemented with the LSParser interface. This link
> > > > > may help you:
> > > > > http://java.sun.com/j2se/1.5.0/docs/api/org/w3c/dom/ls/LSParser.html
> > > >
> > > > Regards,
> > > > Ravikanth
> > >
> > > > > On Thu, Dec 11, 2008 at 7:36 PM, Ian Hummel <hum...@parityinc.net> wrote:
> > > > Hi everyone,
> > > >
> > > > I need to create XML that looks like this whenever the value of
> > > > "tag" is "" (the empty string):
> > > >
> > > > <root>
> > > > <tag></tag>
> > > > </root>
> > > >
> > > > I've tried the following:
> > > >
> > > > DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> > > > DocumentBuilder db = dbf.newDocumentBuilder();
> > > > Document d = db.newDocument();
> > > > Element root = d.createElement("root");
> > > > Element tag = d.createElement("tag");
> > > > d.appendChild(root);
> > > > root.appendChild(tag);
> > > > Text text = d.createTextNode("\t");
> > > > tag.appendChild(text);
> > > >
> > > > but I always end up with XML like this:
> > > >
> > > > <root>
> > > > <tag/>
> > > > </root>
> > > >
> > > > Is there a way to force empty text nodes to get "denormalized" ?
> > > >
> > > > Thanks,
> > > >
> > > > Ian.
> > > >
> > > > --
> > > > Ravikanth
