wo_shi_big_stomach wrote:
> Thanks for the great tip about fileinput.input(), and thanks to all who
> answered my query. I've pasted the working code below.
>
[snip]
>     # check first line only
>     elif fileinput.isfirstline():
>         if not re.search('^From ',line):

This "works", and in this case you are doing it on only the first line in each file, but for future reference:

1. Read the re docs section about when to use search and when to use match; the "^" anchor in your pattern means that search and match give the same result here. However, the time they take to do it can differ quite a bit :-0

C:\junk>\python25\python -mtimeit -s"import re;text='x'*100" "re.match('^From ', text)"
100000 loops, best of 3: 4.39 usec per loop

C:\junk>\python25\python -mtimeit -s"import re;text='x'*1000" "re.match('^From ' ,text)"
100000 loops, best of 3: 4.41 usec per loop

C:\junk>\python25\python -mtimeit -s"import re;text='x'*10000" "re.match('^From ',text)"
100000 loops, best of 3: 4.4 usec per loop

C:\junk>\python25\python -mtimeit -s"import re;text='x'*100" "re.search('^From ' ,text)"
100000 loops, best of 3: 6.54 usec per loop

C:\junk>\python25\python -mtimeit -s"import re;text='x'*1000" "re.search('^From ',text)"
10000 loops, best of 3: 26 usec per loop

C:\junk>\python25\python -mtimeit -s"import re;text='x'*10000" "re.search('^From ',text)"
1000 loops, best of 3: 219 usec per loop

The match timings stay flat as the text grows, while the search timings grow roughly in proportion to the length of the text.

Aside: I noticed this years ago but assumed that the simple optimisation of search was not done as a penalty on people who didn't RTFM, and so didn't report it :-)

2. Then realise that your test is equivalent to

    if not line.startswith('From '):

(no "^" here: startswith takes a literal prefix, not a pattern), which is much easier to understand without the benefit of comments, and (bonus!) is also much faster than re.match. The timings below were run with the "^" left in the prefix; the text starts with neither prefix, so that should not materially affect the numbers:

C:\junk>\python25\python -mtimeit -s"text='x'*100" "text.startswith('^From ')"
1000000 loops, best of 3: 0.584 usec per loop

C:\junk>\python25\python -mtimeit -s"text='x'*1000" "text.startswith('^From ')"
1000000 loops, best of 3: 0.583 usec per loop

C:\junk>\python25\python -mtimeit -s"text='x'*10000" "text.startswith('^From ')"
1000000 loops, best of 3: 0.612 usec per loop

HTH,
John
-- 
http://mail.python.org/mailman/listinfo/python-list
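
For anyone who wants to check the equivalence claimed in point 2, here is a minimal sketch. The helper name starts_with_from and the sample lines are made up for illustration and are not taken from the original script; the check only assumes the pattern is compiled without re.MULTILINE, as in the quoted code.

```python
import re


def starts_with_from(line):
    """Return True if line begins with the literal prefix 'From '.

    Hypothetical helper for illustration; note there is no '^' in the
    prefix, because str.startswith compares literal characters.
    """
    return line.startswith('From ')


# Made-up sample lines covering the obvious edge cases.
samples = [
    'From someone@example.com Sat Jan  1 00:00:00 2000\n',
    'from lowercase\n',
    ' From leading space\n',
    '^From literal caret\n',
    '',
]

for s in samples:
    by_search = re.search('^From ', s) is not None  # the original test
    by_match = re.match('From ', s) is not None     # match anchors at the start anyway
    # All three approaches should agree on every sample.
    assert starts_with_from(s) == by_search == by_match
```

If the assertions pass, the regex can be dropped for this particular test; a regex only earns its keep once the prefix stops being a fixed string.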