There are 5 entities predefined for XML: &lt;, &gt;, &amp;, &apos; and &quot; (so both " and ' are covered). XML also allows numeric character references of the form &#ddd; (decimal) or &#xhhh; (hexadecimal). Any other entity needs to be defined in a DTD. A schema (an XSD in the case of OSIS) does not allow entities to be defined. (I'm not familiar with other schema types.)
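A minimal sketch of the entity rules above, using Python's standard xml.etree.ElementTree parser (my choice for illustration, not anything OSIS or SWORD prescribes). The element name v and the helper parses() are made up for the example. It shows the predefined entities and numeric references being resolved by the parser, an undeclared entity being rejected, and the same entity accepted once it is declared in a DTD internal subset.

```python
import xml.etree.ElementTree as ET

# The five predefined entities plus a decimal and a hexadecimal character reference.
doc = "<v>&lt; &gt; &amp; &apos; &quot; &#169; &#xA9;</v>"
text = ET.fromstring(doc).text
print(text)  # the parser hands back plain characters, not entities


def parses(xml_text):
    """Hypothetical helper: True if the parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# &nbsp; is not predefined, so it is rejected unless a DTD declares it.
undeclared = "<v>&nbsp;</v>"
declared = '<!DOCTYPE v [<!ENTITY nbsp "&#160;">]><v>&nbsp;</v>'

print(parses(undeclared))
print(parses(declared))
```

Note that by the time application code (or a validator working from the parsed tree) sees the text, the references have already been replaced by the characters they stand for.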
Regarding parsing and validation: An XML document may be well-formed but not valid. Well-formedness is the responsibility of the parser; validity is the responsibility of a validator. A validator takes its content from the parser, which may be an in-memory tree, and compares it to a schema or DTD. As far as I know, what the validator receives already has its entities resolved.

-- DM

> On Dec 12, 2014, at 9:01 AM, Greg Hellings <[email protected]> wrote:
>
> If that's the case, how does it handle escaping <>? I believe entity
> replacement is after XML validation but before passing them to a transformer
> or such.
>
> On Dec 12, 2014 7:52 AM, "DM Smith" <[email protected]> wrote:
> Best I can recall:
> Nope. An entity is merely an alternate way of specifying a character. The XML
> parser is supposed to replace the entity with the corresponding code point
> before the value is evaluated against the schema.
>
>> On Dec 12, 2014, at 8:49 AM, Greg Hellings <[email protected]> wrote:
>>
>> It should be possible to escape any such characters with an XML entity, no?
>>
>> On Dec 12, 2014 7:44 AM, "DM Smith" <[email protected]> wrote:
>>
>> > On Dec 12, 2014, at 8:26 AM, Peter Von Kaehne <[email protected]> wrote:
>> >
>> > Sent: Friday, 12 December 2014 at 13:16
>> > From: "Troy A. Griffitts" <[email protected]>
>> >
>> >> Not sure, but I thought we used optional prefixes to specify the kind of
>> >> gloss if there are multiple, e.g.,
>> >> gloss="en_US:18 wheeler en_UK:articulated lorry"
>> >
>> > Should there be an option to escape colons?
>>
>> IMHO:
>> Yes.
>>
>> The definition of gloss in the schema is xs:string, not osisGenRegex.
>> The former places no semantic on the content and allows for an empty string.
>>
>> If gloss should have a semantic, then it should be changed in the OSIS spec.
>>
>> The latter is used by lemma and morph and is specified as:
>> ((((\p{L}|\p{N}|_)+)(\.(\p{L}|\p{N}|_))*:)?([^:\s])+)
>> which basically is work:value.
>> If I read this right it does not allow for : to be escaped. I know we allow
>> lemma="x:a y:b" but I don't see that this allows for the pattern to be
>> repeated, separated by spaces.
>>
>> The pattern would need to change ([^:\s])+ to (\\:|[^:\s])+ [ not tested ]
>>
>> In His Service,
>> DM
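The osisGenRegex pattern quoted above, and the proposed change to it, can be tried out with a rough sketch like the following. Two assumptions are mine, not from the thread: Python's re module has no \p{L} or \p{N}, so \w stands in for (\p{L}|\p{N}|_) as an approximation, and fullmatch is used because XSD patterns are implicitly anchored at both ends. The names CURRENT, PROPOSED and accepts() are made up for the example.

```python
import re

# Assumption: \w approximates (\p{L}|\p{N}|_), which Python's re cannot express.
WORD = r"\w"

# osisGenRegex as quoted: optional "work:" prefix, then a value
# containing no colon and no whitespace.
CURRENT = re.compile(rf"((({WORD}+)(\.{WORD})*:)?([^:\s])+)")

# Proposed change (marked "not tested" in the thread): the value may
# also contain a backslash-escaped colon.
PROPOSED = re.compile(rf"((({WORD}+)(\.{WORD})*:)?(\\:|[^:\s])+)")


def accepts(pattern, value):
    """Hypothetical helper: True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


for sample in ["x:a", "x:a y:b", r"x:a\:b"]:
    print(repr(sample), accepts(CURRENT, sample), accepts(PROPOSED, sample))
```

Under these assumptions, a single work:value passes both patterns, the space-separated form is rejected by both (neither pattern repeats), and only the proposed pattern accepts a value with an escaped colon.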
_______________________________________________
sword-devel mailing list: [email protected]
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
