Hi everybody,

I have a question regarding the Java XML parser. I am seeing behavior
that baffles me concerning the replacement text of parameter entity
references within a conditional INCLUDE section, and I would be really
grateful if someone could explain it to me.

When I look into the XML specification (fifth edition), it states:

"Well-formedness constraint: PE Between Declarations
The replacement text of a parameter entity reference in a DeclSep MUST
match the production extSubsetDecl."

This prevents, for example, splitting a markup declaration across the
replacement text of two separate parameter entities. For instance, the
entity declaration

<!ENTITY copyright '(C)'>

cannot be split into two parameter entity references like this:

<!ENTITY % A "<!ENTITY ">
<!ENTITY % B "copyright '(C)'>">
%A;%B;

because in that case the replacement text for %A; would not match the
production extSubsetDecl (as required), because it is incomplete.

And the Java XML parser reports the above as a fatal error, in
validating and non-validating mode alike. So far so good.

But strangely, the situation changes when the expression %A;%B; is put
into a conditional INCLUDE section, like this:

<!ENTITY % A "<!ENTITY ">
<!ENTITY % B "copyright '(C)'>">
<![INCLUDE[%A;%B;]]>

In that case the Java xml-parser has no problem at all with the above in
non-validation-mode, and in validation-mode only two validation errors
are given, but no fatal error. In both cases the entity "copyright" is
declared and can be used within the document.
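
For anyone who wants to reproduce the comparison without files on disk,
here is a minimal sketch that feeds the document and three DTD variants
to the parser from in-memory strings. The class name SplitEntityRepro,
the nested Recorder class and the parse helper are names made up for
this sketch, and it assumes the JDK's default SAXParserFactory
implementation, which loads the external DTD even when not validating.
It only prints what the parser reports and does not assert any
particular outcome for the split variants.

```java
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class SplitEntityRepro {
    // Same document as splitEntity.xml, held in memory.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE Root SYSTEM \"splitEntity.dtd\" [\n"
      + "<!ELEMENT Root ANY>\n"
      + "]>\n"
      + "<Root>&copyright;</Root>\n";

    // Hypothetical helper: records callbacks and serves the DTD text.
    static class Recorder extends DefaultHandler {
        final String dtd;
        final StringBuilder text = new StringBuilder();
        final List<String> errors = new ArrayList<>();
        final List<String> fatalErrors = new ArrayList<>();

        Recorder(String dtd) { this.dtd = dtd; }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // Answer every external entity request with the DTD string.
            InputSource source = new InputSource(new StringReader(dtd));
            source.setSystemId(systemId);
            return source;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void error(SAXParseException e) { errors.add(e.getMessage()); }

        @Override
        public void fatalError(SAXParseException e) { fatalErrors.add(e.getMessage()); }
    }

    static Recorder parse(String dtd, boolean validating) {
        Recorder recorder = new Recorder(dtd);
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);
        InputSource document = new InputSource(new StringReader(XML));
        document.setSystemId("file:///repro/splitEntity.xml");
        try {
            factory.newSAXParser().parse(document, recorder);
        } catch (SAXException e) {
            // The parser may abort after a fatal error; keep the message.
            if (recorder.fatalErrors.isEmpty()) {
                recorder.fatalErrors.add(String.valueOf(e.getMessage()));
            }
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return recorder;
    }

    public static void main(String[] args) {
        String split = "<!ENTITY % A \"<!ENTITY \">\n"
                     + "<!ENTITY % B \"copyright '(C)'>\">\n";
        String[][] variants = {
            {"unsplit declaration", "<!ENTITY copyright '(C)'>"},
            {"split, between declarations", split + "%A;%B;"},
            {"split, inside INCLUDE", split + "<![INCLUDE[%A;%B;]]>"},
        };
        for (String[] variant : variants) {
            for (boolean validating : new boolean[] {false, true}) {
                Recorder r = parse(variant[1], validating);
                System.out.println(variant[0] + " (validating=" + validating + ")");
                System.out.println("  characters:   " + r.text);
                System.out.println("  errors:       " + r.errors);
                System.out.println("  fatal errors: " + r.fatalErrors);
            }
        }
    }
}
```

The unsplit variant serves as a control: if it does not yield the text
(C), the external DTD is not being read at all and the other two results
say nothing about the question.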

How can that be? I would really be grateful if someone could explain
that to me.

When I look at the grammatical definition of a conditional include-section

[62]       includeSect       ::=       '<![' S? 'INCLUDE' S? '[' extSubsetDecl ']]>'

the inner part should be an extSubsetDecl. Looking at

[31]       extSubsetDecl       ::=       ( markupdecl | conditionalSect | DeclSep)*

in our case, when processing the INCLUDE section, %A; can only match a
DeclSep, and given the well-formedness constraint mentioned above, I
would have assumed that the replacement text of %A; "MUST match the
production extSubsetDecl". But it does not, since the replacement text
of %A; is an incomplete declaration.
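
To make the mismatch concrete, here is a minimal sketch of what the two
references expand to. It assumes purely textual substitution, with the
single leading and trailing space that the specification attaches to a
parameter entity's replacement text when it is included in the DTD. The
class name PeExpansionSketch, the helper expand and the simplified name
pattern are made up for this sketch, and a real parser does not work
this way. It only illustrates that each replacement text is incomplete
on its own, while the concatenation of both forms one complete
declaration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PeExpansionSketch {
    // Simplified pattern for %Name; (assumption: ASCII names only).
    private static final Pattern PE_REF =
        Pattern.compile("%([A-Za-z_][A-Za-z0-9_.-]*);");

    // The two parameter entities from the example above.
    static Map<String, String> sampleEntities() {
        return Map.of("A", "<!ENTITY ", "B", "copyright '(C)'>");
    }

    // Replace each %Name; with its replacement text, padded with one
    // leading and one trailing space.
    static String expand(String text, Map<String, String> entities) {
        Matcher matcher = PE_REF.matcher(text);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String replacement = entities.get(matcher.group(1));
            if (replacement == null) {
                throw new IllegalArgumentException(
                    "undeclared parameter entity: " + matcher.group(1));
            }
            matcher.appendReplacement(
                out, Matcher.quoteReplacement(" " + replacement + " "));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> entities = sampleEntities();
        // Brackets make the padding spaces visible in the output.
        System.out.println("[" + expand("%A;", entities) + "]");
        System.out.println("[" + expand("%B;", entities) + "]");
        System.out.println("[" + expand("%A;%B;", entities) + "]");
    }
}
```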

I would be grateful for any hint.
Thank you so much for your work.
Apache is such a great project.

Bye everybody.
Stay healthy you all.

Stefan Bettner.


PS: I have attached my XML files and Java file for reference. I was using
the following Java version on Windows 10:
java version "12.0.2" 2019-07-16
Java(TM) SE Runtime Environment (build 12.0.2+10)

splitEntity.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Root SYSTEM "splitEntity.dtd" [
<!ELEMENT Root ANY>
] >
<Root>&copyright;</Root>


splitEntity.dtd

<!ENTITY % A "<!ENTITY ">
<!ENTITY % B "copyright '(C)'>">
<![INCLUDE[%A;%B;]]>


XMLSplitEntity.java

import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.File;
import java.io.IOException;

public class XMLSplitEntity
  {
  /*
   * custom DocumentHandler
   */
  public static class MyDocumentHandler extends DefaultHandler
    {
    @Override
    public void characters (char[] ch, int start, int length) throws SAXException
      {
      System.out.println ("characters: " + new String (ch, start, length));
      }

    @Override
    public void warning (SAXParseException e) throws SAXException
      {
      System.out.println ("warning: " + e.getMessage ());
      }

    @Override
    public void error (SAXParseException e) throws SAXException
      {
      System.out.println ("error: " + e.getMessage ());
      }

    @Override
    public void fatalError (SAXParseException e) throws SAXException
      {
      System.out.println ("fatalError: " + e.getMessage ());
      }
    }

  /*
   * parse splitEntity.xml
   * (with external splitEntity.dtd)
   */
  public static void main (String[] args)
    {
    // create parser
    SAXParserFactory factory = SAXParserFactory.newInstance ();
    // factory.setValidating (true);
    SAXParser saxParser;
    try
      {
      saxParser = factory.newSAXParser ();
      }
    catch (ParserConfigurationException | SAXException e)
      {
      e.printStackTrace ();
      return;
      }

    // parse
    File file = new File ("C:\\splitEntity.xml");
    try
      {
      saxParser.parse (file, new MyDocumentHandler ());
      }
    catch (SAXException | IOException e)
      {
      e.printStackTrace ();
      }
    }
  }

