Wow, this was harder than I thought (at least for a rusty Pythoneer like myself). Here's my stab at an implementation. Remember, the goal is to add a "match" method to Template which works like Template.substitute, but in reverse: given a string, if that string matches the template, then it should return a dictionary mapping each template field to the corresponding value in the given string.

Oh, and as one extra feature, I want to support a ".greedy" attribute on the Template object, which determines whether the matching of fields should be done in a greedy or non-greedy manner.
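
To make the greedy/non-greedy distinction concrete, here is a small sketch using hand-written regexes for a hypothetical template "$key=$value;" (the template and the sample string are made up for illustration, not taken from the code below). The two patterns differ only in the quantifier used for each field:

```python
import re

s = "path=a=b;"

# Same pattern twice; only the quantifier on the field groups differs.
non_greedy = re.compile(r"(.*?)=(.*?);")
greedy = re.compile(r"(.*)=(.*);")

ng_fields = non_greedy.match(s).groups()
g_fields = greedy.match(s).groups()

print(ng_fields)  # ('path', 'a=b')  first field stops at the first '='
print(g_fields)   # ('path=a', 'b')  first field runs to the last '='
```

When the text between fields can also occur inside a field value, the two modes split the string differently, which is why it seems worth making the choice configurable.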

------------------------------------------------------------
#!/usr/bin/python

from string import Template
import re

def templateMatch(self, s):
        # start by finding the fields in our template, and building a map
        # from field position (index) to field name.
        posToName = {}
        pos = 1
        for item in self.pattern.findall(self.template):
                # each item is a tuple (escaped, named, braced, invalid);
                # item[1] is the name of a plain $name field
                posToName[pos] = item[1]
                pos += 1
        
        # determine if we should match greedy or non-greedy
        # (defaults to non-greedy when no .greedy attribute has been set)
        greedy = getattr(self, 'greedy', False)

        # now, build a regex pattern to compare against s
        # (taking care to escape any characters in our template that
        # would have special meaning in regex)
        pat = self.template.replace('.', '\\.')
        pat = pat.replace('(', '\\(')
        pat = pat.replace(')', '\\)') # there must be a better way...
        
        if greedy:
                pat = self.pattern.sub('(.*)', pat)
        else:
                pat = self.pattern.sub('(.*?)', pat)
        p = re.compile(pat)
        
        # try to match this to the given string
        match = p.match(s)
        if match is None: return None
        out = {}
        for i in posToName.keys():
                out[posToName[i]] = match.group(i)
        return out


Template.match = templateMatch

t = Template("The $object in $location falls mainly in the $subloc.")
print(t.match("The rain in Spain falls mainly in the train."))
------------------------------------------------------------
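
As a quick check on what the findall loop in the listing actually sees, this sketch prints the tuples that Template.pattern produces for a made-up template containing all three placeholder forms ($$, $name and ${name}). The sample template is an assumption chosen for illustration:

```python
from string import Template

t = Template("$$5 for $item at ${place}")

# Each tuple is (escaped, named, braced, invalid); only one slot is filled.
fields = Template.pattern.findall(t.template)
for item in fields:
    print(item)
```

Only the $item placeholder puts its name in slot 1; the $$ escape and the ${place} form leave that slot empty.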

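This sort of works, but it doesn't properly handle $$ in the template: the escape is counted as just another field and replaced with a capture group. The ${fieldname} form isn't handled either, because I only read the "named" group (item[1]) and braced names land in a different group, so those fields end up under an empty key. Also, it only escapes '.', '(', and ')' in the template... there must be a better way of escaping all characters that have special meaning in a regex, except for '$' (which is why I can't use re.escape on the whole template).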
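
One possible way around the escaping problem, offered as a sketch rather than a finished implementation: instead of escaping the whole template, walk it with pattern.finditer and call re.escape only on the literal text between placeholders. The function name template_match, the use of named groups, the backreference for repeated fields, and the \Z and DOTALL choices are all assumptions of this sketch, not part of the code above.

```python
import re
from string import Template


def template_match(self, s):
    """Reverse of substitute(): return {field: value}, or None if no match."""
    # Honor an optional .greedy attribute; default to non-greedy.
    quant = '.*' if getattr(self, 'greedy', False) else '.*?'

    parts = []    # pieces of the regex being built
    seen = set()  # field names that already have a capture group
    last = 0
    for m in self.pattern.finditer(self.template):
        # Literal text between placeholders can be escaped wholesale.
        parts.append(re.escape(self.template[last:m.start()]))
        last = m.end()
        name = m.group('named') or m.group('braced')
        if m.group('escaped') is not None:
            # "$$" in the template stands for one literal "$".
            parts.append(re.escape(self.delimiter))
        elif name is None:
            raise ValueError('invalid placeholder at offset %d' % m.start())
        elif name in seen:
            # A repeated field must match the same text as its first use.
            parts.append('(?P=%s)' % name)
        else:
            seen.add(name)
            parts.append('(?P<%s>%s)' % (name, quant))
    parts.append(re.escape(self.template[last:]))

    # \Z anchors the end so a trailing non-greedy field is not matched as ''.
    # DOTALL lets a field value span lines (an assumption; drop if unwanted).
    match = re.match(''.join(parts) + r'\Z', s, re.DOTALL)
    return match.groupdict() if match else None


Template.match = template_match

t = Template("The $object in $location falls mainly in the $subloc.")
result = t.match("The rain in Spain falls mainly in the train.")
print(result)

t2 = Template("Cost: $$${amount} (${amount})")
result2 = t2.match("Cost: $5 (5)")
print(result2)
```

Because the literal pieces never contain a placeholder, re.escape is safe on them, and the "$" problem goes away without listing special characters by hand.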

Probably the rest of the code could be improved too. I'm eager to hear your feedback.

Thanks,
- Joe


--
http://mail.python.org/mailman/listinfo/python-list
