Hi John,

If you really need to know the boundaries of character references, you should enable the 'notify-char-refs' [1] feature. Note that this only applies to the content of elements (i.e. not attribute values).
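For illustration, here is a minimal sketch of turning the feature on through SAX. It is not taken from the original message: the class name CharRefDemo and the Result holder are hypothetical, and it assumes the JAXP parser on the classpath is Xerces2-J (other parsers may reject the feature, which the sketch tolerates). As I understand the Xerces feature documentation, the boundaries are reported through the start/endEntity callbacks of a LexicalHandler, and the reported name is expected to begin with '#' (for example "#65"); please verify that against your Xerces version.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo class; not part of Xerces.
final class CharRefDemo {
    // Feature id from the Xerces2-J features page cited as [1].
    static final String NOTIFY_CHAR_REFS =
        "http://apache.org/xml/features/scanner/notify-char-refs";
    // Standard SAX2 property used to register a LexicalHandler.
    static final String LEXICAL_HANDLER =
        "http://xml.org/sax/properties/lexical-handler";

    // Hypothetical holder for what the handler observed.
    static final class Result {
        final StringBuilder text = new StringBuilder();
        final List<String> entityNames = new ArrayList<>();
        boolean featureAccepted;
    }

    static Result scan(String xml) {
        Result result = new Result();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Character data arrives already decoded.
                result.text.append(ch, start, length);
            }

            @Override
            public void startEntity(String name) {
                // With notify-char-refs enabled, character references in
                // element content are expected to show up here (assumption:
                // names start with '#').
                result.entityNames.add(name);
            }
        };
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            try {
                reader.setFeature(NOTIFY_CHAR_REFS, true);
                result.featureAccepted = true;
            } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
                // The underlying parser does not offer this Xerces feature.
                result.featureAccepted = false;
            }
            reader.setProperty(LEXICAL_HANDLER, handler);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = scan("<a>&#65;B</a>");
        System.out.println("feature accepted: " + r.featureAccepted);
        System.out.println("text: " + r.text);
        System.out.println("entity callbacks: " + r.entityNames);
    }
}
```

Any characters() call that arrives between a startEntity and the matching endEntity for such a name would then be the expansion of a character reference rather than literal text. If you work at the XNI level instead, the corresponding callbacks should be startGeneralEntity/endGeneralEntity on XMLDocumentHandler.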
Thanks.

[1] http://xerces.apache.org/xerces2-j/features.html#scanner.notify-char-refs

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]

John Byrne <[EMAIL PROTECTED]> wrote on 04/22/2008 04:58:15 PM:

> "The distinction is syntactic, not semantic. Nothing that's looking at
> the semantic content of XML documents should care about it... and
> nothing should be looking at the purely syntactic details of XML except
> the parser."
>
> True. And from that point of view, what I am working on is, in fact, a
> kind of parser, albeit a very specialized one. One of the things my
> parser needs to do is detect the presence of these character references.
> I need to distinguish between &#65; and a letter A character. Now I
> could go and write the code to do this, but I thought since Xerces must
> already have a way of doing this, I'd go ahead and use that instead.
>
> I imagine that there is a callback method somewhere in the XNI API that
> handles the translation of these references into their "normative"
> representation.
>
> As regards the correctness of my design, all I can say is that I have
> given it quite a lot of thought, and I'm confident that my solution is
> the best option available to me. Unfortunately I'm not in a position to
> go into a lot of detail. While I do appreciate any and all advice, be it
> theoretical or otherwise, what I really need is a practical solution!
>
> [EMAIL PROTECTED] wrote:
> >
> > > &amp; might be treated as being the same as &#38;, but these are both
> > > distinct from ordinary text
> >
> > As far as XML is concerned, neither is "distinct from ordinary text"
> > -- they're just representations of the & character.
> >
> > For comparison, consider &#65;. XML doesn't distinguish between this
> > and a simple capital-A character.
> >
> > The distinction is syntactic, not semantic. Nothing that's looking at
> > the semantic content of XML documents should care about it...
> > and nothing should be looking at the purely syntactic details of XML
> > except the parser.
> >
> > ______________________________________
> > "... Three things see no end: A loop with exit code done wrong,
> > A semaphore untested, And the change that comes along. ..."
> > -- "Threes" Rev 1.1 - Duane Elms / Leslie Fish
> > (http://www.ovff.org/pegasus/songs/threes-rev-11.html)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]