On 26 August 2016 at 17:58, Joonas Liik <liik.joo...@gmail.com> wrote: > On 26 August 2016 at 16:10, Frank Millman <fr...@chagford.com> wrote: >> "Joonas Liik" wrote in message >> news:cab1gnpqnjdenaa-gzgt0tbcvwjakngd3yroixgyy+mim7fw...@mail.gmail.com... >> >>> On 26 August 2016 at 08:22, Frank Millman <fr...@chagford.com> wrote: >>> > >>> > So this is my conversion routine - >>> > >>> > lines = string.split('"') # split on attributes >>> > for pos, line in enumerate(lines): >>> > if pos%2: # every 2nd line is an attribute >>> > lines[pos] = line.replace('<', '<').replace('>', '>') >>> > return '"'.join(lines) >>> > >>> >>> or.. you could just escape all & as & before escaping the > and <, >>> and do the reverse on decode >>> >> >> Thanks, Joonas, but I have not quite grasped that. >> >> Would you mind explaining how it would work? >> >> Just to confirm that we are talking about the same thing - >> >> This is not allowed - '<root><fld name="<new>"/></root>' [A] >> >>>>> import xml.etree.ElementTree as etree >>>>> x = '<root><fld name="<new>"/></root>' >>>>> y = etree.fromstring(x) >> >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> File >> "C:\Users\User\AppData\Local\Programs\Python\Python35\lib\xml\etree\ElementTree.py", >> line 1320, in XML >> parser.feed(text) >> xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, >> column 17 >> >> You have to escape it like this - '<root><fld name="<new>"/></root>' >> [B] >> >>>>> x = '<root><fld name="<new>"/></root>' >>>>> y = etree.fromstring(x) >>>>> y.find('fld').get('name') >> >> '<new>' >>>>> >>>>> >> >> I want to convert the string from [B] to [A] for editing, and then back to >> [B] before saving. >> >> Thanks >> >> Frank >> >> >> -- >> https://mail.python.org/mailman/listinfo/python-list > > something like.. (untested) > > def escape(untrusted_string): > ''' Use on the user provided strings to render them inert for storage > escaping & ensures that the user cant type sth like '>' in > source and have it magically decode as '>' > ''' > return untrusted_string.replace("&","&").replace("<", > "<").replace(">", ">") > > def unescape(escaped_string): > '''Once the user string is retreived from storage use this > function to restore it to its original form''' > return escaped_string.replace("<","<").replace(">", > ">").replace("&", "&") > > i should note tho that this example is very ad-hoc, i'm no xml expert > just know a bit about xml entities. > if you decide to go this route there are probably some much better > tested functions out there to escape text for storage in xml > documents.
you might want to un-wrap that before testing tho.. no idea why my messages get mutilated like that :( (sent using gmail, maybe somebody can comment on that?) -- https://mail.python.org/mailman/listinfo/python-list