Oleg replied that he will look into this bug when he has time (and that the patch looked reasonable), which sounded non-imminent :) Hopefully this means that there will be an upstream patch sometime in the future.
-- 
  Linus Björnstam

On Thu, 16 Jan 2020, at 13:00, Linus Björnstam wrote:
> Hello Guilers!
>
> RhodiumToad found an error in sxml where it would not properly parse
> CDATA: &gt; would be converted to > inside CDATA blocks. This is
> probably due to a misreading of the XML spec:
>
> "Within a CDATA section, only the CDEnd string is recognized as
> markup, so that left angle brackets and ampersands may occur in their
> literal form; they need not (and cannot) be escaped using ' &lt; ' and
> ' &amp; '.".
>
> Notice that it mentions that only CDEnd is recognized, but omits &gt;
> in the enumeration of things that need-not-and-cannot be escaped.
>
> No other XML libraries behave this way. Take for example Python's
> ElementTree:
>
> Python 2.7.17 (default, Dec 23 2019, 21:25:33)
> >>> import xml.etree.ElementTree as ET
> >>> root = ET.fromstring("<e><![CDATA[&gt;]]></e>")
> >>> root.text
> '&gt;'
>
> The same thing with the un-patched (sxml ssax) (or rather (sxml
> simple)) looks different:
>
> (xml->sxml "<e><![CDATA[&gt;]]></e>")
> ;; => (*TOP* (e ">"))
>
> The question is whether this patch should be sent upstream. Since there
> has been very little activity there, I suspect it is a lost cause.
>
> Failing tests have been looked through, verified and fixed. No
> unexpected errors were encountered. All SXML tests pass after this
> patch.
>
> Best regards
>   Linus Björnstam
> Attachments:
> * 0001-module-sxml-upstream-SSAX.scm-Fix-improper-handling-.patch
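
For anyone who wants to check the expected behaviour independently of Guile, here is a minimal Python sketch using the standard xml.etree.ElementTree module. It contrasts the same characters inside and outside a CDATA section. The helper name element_text is only an illustration and is not part of any library or of the patch.

```python
import xml.etree.ElementTree as ET


def element_text(xml_source):
    """Parse a small XML document and return the root element's text.

    Hypothetical helper, used here only to keep the comparison short.
    """
    return ET.fromstring(xml_source).text


# Inside a CDATA section only "]]>" is markup, so "&gt;" must stay literal.
inside_cdata = element_text("<e><![CDATA[&gt;]]></e>")

# Outside CDATA the same characters form an entity reference and are decoded.
outside_cdata = element_text("<e>&gt;</e>")

print(repr(inside_cdata))
print(repr(outside_cdata))
```

The first value keeps the four characters "&gt;" as they are, while the second is the single character ">". A parser that returns ">" in both cases is expanding entity references inside CDATA, which is the behaviour described in the report.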