On May 29, 11:03 am, Zdenek Maxa <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I would like to perform regular expression replace (e.g. removing
> everything from within tags in a XML file) with multiple-line pattern.
> How can I do this?
>
> where = open("filename").read()
> multilinePattern = "^<tag> .... <\/tag>$"
> re.search(multilinePattern, where, re.MULTILINE)
>

If it helps, I have the following function:

8<-----------------------------------------------------------
def update_xml(infile, outfile, mapping, deep=False):
    from xml.etree import cElementTree as ET
    from utils.elementfilter import ElementFilter
    doc = ET.parse(infile)
    efilter = ElementFilter(doc.getroot())
    changes = 0
    for key, val in mapping.iteritems():
        pattern, repl = val
        efilter.filter = key
        changes += efilter.sub(pattern, repl, deep=deep)
    doc.write(outfile, encoding='UTF-8')
    return changes

mapping = {
        '/portal/[EMAIL PROTECTED]"page"]/@action': ('.*', 'ZZZZ'),
        '/portal/web-app/portlet-app/portlet/localedata/title':
('Portal', 'Gateway'),
        }

changes = update_xml('c:\\working\\tmp\\test.xml', 'c:\\working\\tmp\
\test2.xml', mapping, True)

print 'There were %s changes' % changes
8<-----------------------------------------------------------

where utils.elementfilter is this module:

    http://gflanagan.net/site/python/elementfilter/elementfilter.py

It doesn't support `re` flags, but you could change the sub method of
elementfilter.ElementFilter to do so, eg.(UNTESTED!):

    def sub(self, pattern, repl, count=0, deep=False, flags=None):
        changes = 0
        if flags:
            pattern = re.compile(pattern, flags)
        for elem in self.filtered:
              ...
              [rest of method unchanged]
              ...

Gerard





-- 
http://mail.python.org/mailman/listinfo/python-list

Reply via email to