On 01/28/10 11:28, Brian D wrote:
> I've tackled this kind of problem before by looping through a patterns
> dictionary, but there must be a smarter approach.
>
> Two addresses. Note that the first has incorrectly transposed the
> direction and street name. The second has an extra space in it before
> the street type. Clearly done by someone who didn't know how to
> concatenate properly -- or didn't care.
>
> 1000 RAMPART S ST
>
> 100 JOHN CHURCHILL CHASE  ST
>
> I want to parse the elements into an array of values that can be
> inserted into new database fields.
>
> Anyone who loves solving these kinds of puzzles care to relieve my
> frazzled brain?
>
> The pattern I'm using doesn't keep the "CHASE" with the "JOHN
> CHURCHILL":

How does the following perform?

    import re

    pat = re.compile(r'(?P<streetnum>\d+)\s+(?P<streetname>[A-Z\s]+)\s+(?P<streetdir>N|S|W|E|)\s+(?P<streettype>ST|RD|AVE?|)$')

or, more legibly (this version also accepts compound directions such as
NE and SW, and requires a street type):

    pat = re.compile(
        r'''
        (?P<streetnum>   \d+ )                  #M series of digits
        \s+
        (?P<streetname>  [A-Z\s]+ )             #M one-or-more word
        \s+
        (?P<streetdir>   S?E|SW?|N?W|NE?| )     #O direction or nothing
        \s+
        (?P<streettype>  ST|RD|AVE? )           #M street type
        $                                       #M END
        ''', re.VERBOSE)

(#M = mandatory, #O = optional.)
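
A quick way to check the verbose pattern against the two sample addresses is sketched below. The helper name parse_address is only illustrative, and the second sample is assumed to carry the double space described in the quoted message. One thing worth noticing: because \s+ appears on both sides of the optional direction, an address with no direction matches only when at least two whitespace characters precede the street type, so a single-space "CHASE ST" is rejected.

```python
import re

# Same verbose pattern as in the reply: number, name, optional direction, type.
pat = re.compile(
    r'''
    (?P<streetnum>   \d+ )                  # series of digits
    \s+
    (?P<streetname>  [A-Z\s]+ )             # one or more words
    \s+
    (?P<streetdir>   S?E|SW?|N?W|NE?| )     # direction or nothing
    \s+
    (?P<streettype>  ST|RD|AVE? )           # street type
    $
    ''', re.VERBOSE)


def parse_address(text):
    """Return the named groups as a dict, or None if the pattern does not match.

    parse_address is a hypothetical helper name, not part of the original post.
    """
    m = pat.match(text)
    return m.groupdict() if m else None


# The second sample keeps its extra space before the street type.
for addr in ("1000 RAMPART S ST", "100 JOHN CHURCHILL CHASE  ST"):
    print(addr, "->", parse_address(addr))

# With a single space and no direction, the two \s+ cannot both be satisfied.
print(parse_address("100 JOHN CHURCHILL CHASE ST"))
```

If single-spaced addresses without a direction also need to match, one possible variation (not in the original reply) would be to make the direction and its trailing whitespace optional together, e.g. `(?: (?P<streetdir> ... ) \s+ )?`; in that case the group yields None rather than an empty string when the direction is absent.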