On Jan 24, 7:57 am, "Reedick, Andrew" <[EMAIL PROTECTED]> wrote:
>
> Why is it that so many Python people are regex adverse? Use the dashed
> line as a regex. Convert the dashes to dots. Wrap the dots in
> parentheses. Convert the whitespace chars to '\s'. Presto! Simpler,
> cleaner code.

Woo-hoo! Yesterday was HTML day, today is code review day. Yee-haa!

>
> import re
>
> state = 0
> header_line = ''
> pattern = ''
> f = open('a.txt', 'r')
> for line in f:
>     if line[-1:] == '\n':
>         line = line[:-1]
>
>     if state == 0:
>         header_line = line
>         state += 1

state = 1

>     elif state == 1:
>         pattern = re.sub(r'-', r'.', line)
>         pattern = re.sub(r'\s', r'\\s', pattern)
>         pattern = re.sub(r'([.]+)', r'(\1)', pattern)

Consider this:

    pattern = ' '.join('(.{%d})' % len(x) for x in line.split())

>         print pattern
>         state += 1

state = 2

>
>         headers = re.match(pattern, header_line)
>         if headers:
>             print headers.groups()
>     else:
>         state = 2

assert state == 2

>         m = re.match(pattern, line)
>         if m:
>             print m.groups()
>
> f.close()
>
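
To show where the one-liner leads, here is a minimal sketch of the whole job without the state counter. It is written for Python 3, and the names build_pattern, parse_table and the sample rows are made up for illustration; they are not from the original script. It assumes the columns in the dashed line are separated by exactly one space and that every data line is at least as wide as the dashed line.

```python
import re


def build_pattern(dash_line):
    """Turn a line such as '----- ---' into '(.{5}) (.{3})'."""
    # One capturing group per run of dashes, as wide as that run.
    # Groups are joined with a single literal space (an assumption
    # about the layout of the dashed line).
    return ' '.join('(.{%d})' % len(x) for x in dash_line.split())


def parse_table(lines):
    """Return (headers, rows) from a header line, a dashed line and data lines."""
    lines = [line.rstrip('\n') for line in lines]
    header_line, dash_line = lines[0], lines[1]
    pattern = re.compile(build_pattern(dash_line))

    m = pattern.match(header_line)
    headers = m.groups() if m else None

    rows = []
    for line in lines[2:]:
        m = pattern.match(line)
        if m:  # lines that do not fit the layout are skipped
            rows.append(m.groups())
    return headers, rows


# Hypothetical sample input standing in for the contents of a.txt.
sample = [
    "name  qty\n",
    "----- ---\n",
    "apple  12\n",
    "kiwi    3\n",
]

if __name__ == '__main__':
    headers, rows = parse_table(sample)
    print(headers)
    for row in rows:
        print(row)
```

The captured fields keep their padding, so call strip() on them if you want bare values. A line that is shorter than the dashed line will not match and is silently skipped, which may or may not be what you want.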