Hello Michael,

Thanks a lot for your answer. You're right (sorry for that). I use Castor-XML for unmarshalling, and that is where the problem was: I first tried an older Xerces version and everything was OK. But now I have tried a newer Castor and everything is OK, too.
-> It was an incompatibility between my old Castor and the new Xerces.

Thanks,
Thimo

--------------------
Thimo von Rauchhaupt
Technical Engineering Director / Sales Europe
EMPIC GmbH, Werner-von-Siemens-Str. 61    CEO: Joerg K. Kottenbrink
91052 Erlangen, Germany                   Reg. No: 2873 in Fuerth, Germany
Phone: +49/9131/877 276
Fax: +49/9131/877 265
Mobile: +49/172/23 43 189
Skype: thimo_von_rauchhaupt
eMail: [EMAIL PROTECTED]
http://www.empic.eu

-----Original Message-----
From: Michael Glavassevich [mailto:[EMAIL PROTECTED]
Sent: Friday, 30 November 2007 15:48
To: j-users@xerces.apache.org
Subject: Re: Xerces Unmarshaller bug removing single whitespace

Hi Thimo,

There's no such thing as a "Xerces Unmarshaller" so have no idea what library you're referring to but it certainly doesn't come from this project.

I doubt this is a problem with Xerces. I suspect the Unmarshaller classes you're using are the source of the odd behaviour possibly because it's not handling multiple calls to the SAX characters() callback [1] correctly. A ContentHandler written like:

    private StringBuffer buf;

    public void characters(char[] ch, int start, int length) throws SAXException {
        buf.append(new String(ch, start, length).trim());
    }

would cause whitespace to be dropped from seemingly random points in the document (like you're seeing).

Thanks.

[1] http://xerces.apache.org/xerces2-j/javadocs/api/org/xml/sax/ContentHandler.html#characters(char[],%20int,%20int)

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]

"Thimo von Rauchhaupt" <[EMAIL PROTECTED]> wrote on 11/30/2007 08:54:56 AM:

> Hello,
>
> When using Xerces (2.9.0 as well as 2.9.1) for unmarshalling it removes
> (from line 101:)
>
> <subjectmark><![CDATA[No specific subject]]></subjectmark>
>
> the single whitespace between "specific" and "subject". In the loaded object
> the String value " No specificsubject" can be found.
>
> The strange behavior is that if I enter some line breaks above the last
> object tag (question), changing
>
> </question>
> <question>
>
> to
>
> </question>
>
>
>
>
>
>
> <question>
>
> the bug does not occur. Also strange is that the same tag (subjectmark) with
> the same value occurs many times in the file, but only this one is parsed
> wrongly.
>
> My questions are:
> 1) Can anybody tell me if I did something wrong?
> 2) Is this a bug? Can anybody tell me how to report this bug / in which
> component? The bug reporting page is awfully complicated. I can only
> read old bug reports; no data entry page can be found.
>
> Many thanks in advance,
> Thimo
>
>
> P.S.: My java code is:
>
> FileInputStream fis = new FileInputStream(aFileToImport); // is attached file AnonymizedImport.xml
> InputStreamReader isr = new InputStreamReader(fis, Exporter.DEFAULT_ENCODING); // means UTF8
>
> Unmarshaller tempUnmarshaller = new Unmarshaller();
> Mapping tempMapping = new Mapping();
>
> tempMapping.loadMapping(Exporter.class.getClassLoader().getResource(Exporter.XML_MAPPING_FILE)); // see attached file import.xml
> tempUnmarshaller.setMapping(tempMapping);
> tempUnmarshaller.setDebug(stdlog.isDebugEnabled());
> ImportExportBean tempImportBean = (ImportExportBean) tempUnmarshaller.unmarshal(isr);
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
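
For anyone who finds this thread later: the effect Michael describes can be reproduced without a parser by feeding a handler the same text in two chunks, which is what a SAX parser is allowed to do at buffer boundaries. The following is only a sketch. The class name TextCollector, its trimEachChunk flag and the collect() helper are made up for illustration and are not taken from Castor or Xerces, and the split point between the chunks is an assumption.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers the text of one element across
// several characters() calls.
class TextCollector extends DefaultHandler {
    private final StringBuilder buf = new StringBuilder();
    private final boolean trimEachChunk; // true reproduces the faulty pattern
    private String lastText;

    TextCollector(boolean trimEachChunk) {
        this.trimEachChunk = trimEachChunk;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        buf.setLength(0); // start a fresh buffer for each element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String chunk = new String(ch, start, length);
        // Trimming here loses whitespace at chunk boundaries;
        // appending the chunk unchanged keeps it.
        buf.append(trimEachChunk ? chunk.trim() : chunk);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        lastText = buf.toString(); // the text is complete only at this point
    }

    // Simulates a parser delivering the element content in several chunks.
    static String collect(boolean trimEachChunk, String... chunks) {
        TextCollector handler = new TextCollector(trimEachChunk);
        handler.startElement("", "subjectmark", "subjectmark", new AttributesImpl());
        for (String c : chunks) {
            char[] ch = c.toCharArray();
            handler.characters(ch, 0, ch.length);
        }
        handler.endElement("", "subjectmark", "subjectmark");
        return handler.lastText;
    }

    public static void main(String[] args) {
        // Assumed split point: right before the space.
        System.out.println(collect(true, "No specific", " subject"));  // No specificsubject
        System.out.println(collect(false, "No specific", " subject")); // No specific subject
    }
}
```

Where the parser splits the text depends on its internal buffers, which would explain why adding line breaks earlier in the file made the problem go away and why only one of many identical elements was affected.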