On Oct 9, 5:20 pm, Joe Strout <[EMAIL PROTECTED]> wrote:
> Wow, this was harder than I thought (at least for a rusty Pythoneer
> like myself).  Here's my stab at an implementation.  Remember, the
> goal is to add a "match" method to Template which works like
> Template.substitute, but in reverse: given a string, if that string
> matches the template, then it should return a dictionary mapping each
> template field to the corresponding value in the given string.
>
> Oh, and as one extra feature, I want to support a ".greedy" attribute
> on the Template object, which determines whether the matching of
> fields should be done in a greedy or non-greedy manner.
>
> ------------------------------------------------------------
> #!/usr/bin/python
>
> from string import Template
> import re
>
> def templateMatch(self, s):
>     # start by finding the fields in our template, and building a map
>     # from field position (index) to field name.
>     posToName = {}
>     pos = 1
>     for item in self.pattern.findall(self.template):
>         # each item is a tuple where item 1 is the field name
>         posToName[pos] = item[1]
>         pos += 1
>
>     # determine if we should match greedy or non-greedy
>     greedy = False
>     if self.__dict__.has_key('greedy'):
>         greedy = self.greedy
>
>     # now, build a regex pattern to compare against s
>     # (taking care to escape any characters in our template that
>     # would have special meaning in regex)
>     pat = self.template.replace('.', '\\.')
>     pat = pat.replace('(', '\\(')
>     pat = pat.replace(')', '\\)')  # there must be a better way...
>
>     if greedy:
>         pat = self.pattern.sub('(.*)', pat)
>     else:
>         pat = self.pattern.sub('(.*?)', pat)
>     p = re.compile(pat)
>
>     # try to match this to the given string
>     match = p.match(s)
>     if match is None: return None
>     out = {}
>     for i in posToName.keys():
>         out[posToName[i]] = match.group(i)
>     return out
>
> Template.match = templateMatch
>
> t = Template("The $object in $location falls mainly in the $subloc.")
> print t.match( "The rain in Spain falls mainly in the train." )
> ------------------------------------------------------------
>
> This sort-of works, but it won't properly handle $$ in the template,
> and I'm not too sure whether it handles the ${fieldname} form,
> either.  Also, it only escapes '.', '(', and ')' in the template...
> there must be a better way of escaping all characters that have
> special meaning to RegEx, except for '$' (which is why I can't use
> re.escape).
>
> Probably the rest of the code could be improved too.  I'm eager to
> hear your feedback.
>
> Thanks,
> - Joe

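On the ${fieldname} question: as far as I can tell, each tuple that Template.pattern.findall returns has four slots (escaped, named, braced, invalid), so item[1] is only filled in for the plain $name form and comes back empty for ${name} and for $$. A quick check, assuming the stock string.Template pattern and a made-up sample string:

```python
from string import Template

# Each findall() tuple is (escaped, named, braced, invalid).
fields = Template.pattern.findall("$$ $object ${location}")

for escaped, named, braced, invalid in fields:
    # Only one of the slots is non-empty for any given match.
    print(repr(escaped), repr(named), repr(braced))
```

So a field map built from item[1] alone would store an empty name for the braced form, and $$ would be counted as a field.
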
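How about something like:

import re

def placeholder(m):
    if m.group(1):
        # $name becomes a named capture group
        return "(?P<%s>.+)" % m.group(1)
    elif m.group(2):
        # $$ becomes a literal dollar sign
        return "\\$"
    else:
        # any other run of text is escaped so it matches literally
        return re.escape(m.group(3))

# The third alternative is what makes m.group(3) exist; it picks up the
# literal text between placeholders so that it can be escaped.
regex = re.compile(r"\$(\w+)|(\$\$)|([^$]+)")

t = "The $object in $location falls mainly in the $subloc."
print(regex.sub(placeholder, t))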
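
To get all the way to the dictionary you asked for, the same idea can be wrapped up as below. This is only a sketch under a few assumptions of mine: template_to_regex and template_match are hypothetical names, it walks the template with Template.pattern so that $$ and ${name} are covered, a field that appears twice has to match the same text both times, and the whole string has to match (fullmatch) rather than just a prefix.

```python
import re
from string import Template

def template_to_regex(template, greedy=False):
    """Turn a $-style template string into a compiled regex (hypothetical helper)."""
    quantifier = ".+" if greedy else ".+?"
    parts = []
    seen = set()
    pos = 0
    for m in Template.pattern.finditer(template):
        # Escape the literal text that sits before this placeholder.
        parts.append(re.escape(template[pos:m.start()]))
        name = m.group("named") or m.group("braced")
        if name:
            if name in seen:
                # A repeated field must match the same text again.
                parts.append("(?P=%s)" % name)
            else:
                seen.add(name)
                parts.append("(?P<%s>%s)" % (name, quantifier))
        elif m.group("escaped"):
            # $$ in the template stands for one literal dollar sign.
            parts.append(re.escape("$"))
        else:
            raise ValueError("invalid placeholder at index %d" % m.start())
        pos = m.end()
    # Escape whatever literal text follows the last placeholder.
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts))

def template_match(template, s, greedy=False):
    """Return a dict of field values, or None if s does not fit the template."""
    m = template_to_regex(template, greedy).fullmatch(s)
    return m.groupdict() if m else None
```

Calling template_match("$a-$b", "x-y-z") should put the split at the first dash, and passing greedy=True should move it to the last one. The rain-in-Spain example comes out the same either way, since only one split fits the whole string.
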