Hi,

I had a lot of trouble with entity references in document fragments. I found a solution at last, but I would like to know whether there is a better approach.

My scenario is this: there is a main document which defines some internal entities but does not use them. Say:

    <!DOCTYPE article [ <!ENTITY foo "foo expanded"> ]>
    <article />

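To make the setup concrete, here is a minimal sketch of how the entities declared in such a main document could be listed through the standard DOM API. The class and method names (DeclaredEntities, parseMain, declaredEntityNames) are hypothetical and only for illustration. It assumes the JAXP parser that ships with the JDK; other parser configurations may behave differently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Xerces or of the original code.
class DeclaredEntities {

    // The main document from the scenario: it declares "foo" but never uses it.
    static final String MAIN_DOCUMENT =
        "<!DOCTYPE article [ <!ENTITY foo \"foo expanded\"> ]><article />";

    // Parses an XML string into a DOM Document with the default JAXP parser.
    static Document parseMain(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Collects the names of all general entities declared in the DOCTYPE.
    static List<String> declaredEntityNames(Document doc) {
        List<String> names = new ArrayList<>();
        if (doc.getDoctype() == null) {
            return names; // no DOCTYPE, so nothing is declared
        }
        NamedNodeMap entities = doc.getDoctype().getEntities();
        for (int i = 0; i < entities.getLength(); i++) {
            names.add(entities.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseMain(MAIN_DOCUMENT);
        System.out.println("Declared entities: " + declaredEntityNames(doc));
    }
}
```

The declaration is visible in the DocumentType even though the document body never references the entity, which is the situation described above.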
There is a separate file containing an XML document fragment. I would like to parse it as a fragment in the context of the main document. It makes use of the entity which is defined in the main document. Say:

    <?xml version='1.0' standalone='no'?>
    <para>Example only: &foo;</para>

My first approach was to use DOM Level 3 parseWithContext, but that is not supported by Xerces so far. So I had to do something like "parse in context" on my own. I set up a DOM Document from the main document using Xerces as LSParser. Then I tried to parse the fragment and to generate DOM nodes which are to be appended to the main document.

I tried a SAX parser for the fragment file. No way: it complains about the undeclared entity, because SAX knows nothing about the context.

I tried a StAX XMLStreamReader for parsing the fragment file. The difference from SAX is the ability to set javax.xml.stream.isReplacingEntityReferences=false. Then, on getting an EntityReference event, I created a matching EntityReference from the main document and appended it as a child node. E.g. (pseudocode for clarification):

    EntityReference er = mainDocument.createEntityReference(nameFromParsedFragment);
    currentParent.appendChild(er);  // currentParent: the node currently being built in mainDocument

This works without any error, but not as expected: serializing mainDocument shows the EntityReference empty (no value). Debugging the code, I found that the entity "foo" has a null value in the DocType of the main document, because it is not used there.

Then I found the DOM Level 3 normalizeDocument function, whose documentation says: "This method acts as if the document was going through a save and load cycle, putting the document in a "normal" form. As a consequence, this method updates the replacement tree of EntityReference nodes ...". However, it was of no use. After calling it, the foo EntityReference still shows up without any value in the normalized and then serialized document.

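The StAX attempt described above could look roughly like the following sketch. The names FragmentIntoContext, parseMain and appendFragment are hypothetical. The sketch assumes the StAX and DOM implementations bundled with the JDK, ignores attributes and namespaces for brevity, and relies on the reader reporting the undeclared "&foo;" as an ENTITY_REFERENCE event once entity replacement is switched off (other StAX implementations may reject it instead). It only builds the nodes; it does not solve the empty replacement value.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class illustrating the "parse in context" attempt.
class FragmentIntoContext {

    static final String MAIN_DOCUMENT =
        "<!DOCTYPE article [ <!ENTITY foo \"foo expanded\"> ]><article />";
    static final String FRAGMENT =
        "<?xml version='1.0' standalone='no'?>\n<para>Example only: &foo;</para>";

    static Document parseMain(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Reads fragmentXml with StAX and appends equivalent DOM nodes,
    // owned by mainDocument, below the given parent node.
    static void appendFragment(Document mainDocument, Node parent, String fragmentXml)
            throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Report entity references as events instead of expanding them.
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
        XMLStreamReader reader =
            factory.createXMLStreamReader(new StringReader(fragmentXml));

        Deque<Node> stack = new ArrayDeque<>();
        stack.push(parent);
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT: {
                    // Simplification: attributes and namespaces are dropped.
                    Node element = mainDocument.createElement(reader.getLocalName());
                    stack.peek().appendChild(element);
                    stack.push(element);
                    break;
                }
                case XMLStreamConstants.END_ELEMENT:
                    stack.pop();
                    break;
                case XMLStreamConstants.CHARACTERS:
                    stack.peek().appendChild(
                        mainDocument.createTextNode(reader.getText()));
                    break;
                case XMLStreamConstants.ENTITY_REFERENCE:
                    // For this event getLocalName() returns the entity name.
                    stack.peek().appendChild(
                        mainDocument.createEntityReference(reader.getLocalName()));
                    break;
                default:
                    break; // comments, processing instructions etc. are skipped
            }
        }
        reader.close();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseMain(MAIN_DOCUMENT);
        appendFragment(doc, doc.getDocumentElement(), FRAGMENT);
        Node para = doc.getDocumentElement().getFirstChild();
        System.out.println("Children of <para>: " + para.getChildNodes().getLength());
    }
}
```

The EntityReference node is created by the main document, so its replacement tree, if any, comes from the entity declared there rather than from the fragment file.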
The only solution that I found is an ugly hack:

- Immediately after parsing, add a new element to the main document, with a name that is hopefully unique.
- Iterate over all the entities that are defined in the main document, and add an EntityReference for each of them to the newly created element.
- Serialize the document to a byte stream. Setting the DOMConfiguration parameter "entities" to "true" keeps EntityReference nodes in the document, which means the reference will be serialized as "&foo;".
- Parse the content of the byte stream. This gives a new document which is essentially the same as mainDocument. The difference is that there is an extra element which has a reference to each entity, so that all entities are in use now.
- Remove the extra element.

From that moment on, the StAX approach described above works well. But this is a lot of work for a problem that sounds very simple. Is there a simpler solution which I haven't seen yet?

Also, I wonder whether the upcoming parseWithContext support in Apache Xerces will help me in this situation. Since the entity foo *IS* defined in my example, I would expect that adding an EntityReference within a fragment that is parsed in context will work as expected, whether or not the entity has been used in the context.

Thank you,
Frank Steimke