On 16/06/2006 7:28 PM, Holger wrote:
> Well, that was an excellent opportunity to get some python practice, so
> below is my first shot at the problem.
>
> Any feedback on what would be "the pythonic way" to do this would be
> much appreciated!
>
> #!/usr/bin/env python
> # Copyright 2006 Holger Lindeberg Bille
>
> import sys, re, os
> import popen2
>
> workingfile = re.compile("^Working file: *(.*)$")
> revision = re.compile("^revision *(.*)$")
> fileend = re.compile("^===========================================================================")
> details = re.compile("^date: *")
> entryend = re.compile("^----------------------------")
> branches = re.compile("^branches:( *(.*);)*")
>
> class LogEntry:
>     def __init__(self):
>         self.rev = 0
>         self.prevrev = 0
>         self.text = []
>
>     def setName(self, name):
>         self.name = name
>
>     def read(self, file):
>         done = 0
>         for line in file:
>             regx = details.search(line)
>             if regx:
>                 pass
>             else:
>                 if entryend.search(line):
>                     break
>                 else:
>                     if fileend.search(line):
>                         done = 1
>                         break
>                     else:
>                         self.text.append(line.strip())
>         return done

IMHO that flight of geese heading equatorwards for winter is not Xic for
any language X. Compare with:

|     def read(self, file):
|         done = 0
|         for line in file:
|             regx = details.search(line)
|             if regx:
|                 pass
|             elif entryend.search(line):
|                 break
|             elif fileend.search(line):
|                 done = 1
|                 break
|             else:
|                 self.text.append(line.strip())
|         return done

2nd comment: Make a habit of NOT using the names of built-ins like
"file" for your own names. Pretend they are reserved words. Doesn't
matter in this case, but will save you grief some day soon.

3rd comment: Read the section in the re manual that explains the
difference between search and match. Searching for "^foo" will give the
same results as using match() with "foo" or the redundantly anchored
"^foo". However some regex engines when presented with
re.search("^foo", "x" * 10000) will note that there is no joy at offset
0, there is no point (given the anchor "^") of looking at offset 1, and
return almost immediately. Others (cough, cough) will check at offset 1,
2, ...
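A quick way to convince yourself of the equivalence claimed in the 3rd comment is a small sketch like the one below. It is not part of the posted script: the names ANCHORED, PLAIN, MULTI and agree() and the sample strings are made up for illustration. It also shows one caveat, assuming you ever add the re.MULTILINE flag: with that flag "^" also matches after a newline, so search and match stop being interchangeable.

```python
import re

# Hypothetical patterns, used only to illustrate search() vs match().
ANCHORED = re.compile("^foo")
PLAIN = re.compile("foo")


def agree(text):
    """Return True if the three equivalent spellings give the same answer."""
    results = [
        ANCHORED.search(text) is not None,  # anchored pattern, search()
        PLAIN.match(text) is not None,      # plain pattern, match()
        ANCHORED.match(text) is not None,   # redundantly anchored match()
    ]
    return len(set(results)) == 1


# Made-up sample lines: a hit at offset 0, a hit later on, no hit at all.
samples = ["foo bar", "a foo", "", "x" * 100]
print(all(agree(s) for s in samples))

# An unanchored search() is the odd one out: it finds "foo" anywhere.
print(PLAIN.search("a foo") is not None, PLAIN.match("a foo") is not None)

# Caveat: with re.MULTILINE, "^" also matches just after a newline,
# so search() can succeed where match() (start of string only) fails.
MULTI = re.compile("^foo", re.MULTILINE)
print(MULTI.search("bar\nfoo") is not None,
      MULTI.match("bar\nfoo") is not None)
```

None of the patterns in the posted script use re.MULTILINE, so for them the two spellings should be interchangeable in result, if not in speed.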
Ponder these results:

python -mtimeit -s"import re;rx=re.compile('^foo');txt='x'*10000" "rx.match(txt)"
100000 loops, best of 3: 1.2 usec per loop

python -mtimeit -s"import re;rx=re.compile('foo');txt='x'*10000" "rx.search(txt)"
10000 loops, best of 3: 19.8 usec per loop

python -mtimeit -s"import re;rx=re.compile('^foo');txt='x'*10000" "rx.search(txt)"
1000 loops, best of 3: 201 usec per loop

4th comment: what you have called "regx" is a match object. "mobj" might
be a better choice. The term "regex" is applied to a pattern, or
sometimes to the compiled re object.

>     def GuessPrevRev(self):
>         pass
>
>     def filter(self, filter):

Ugh. THREE filters: the built-in, the argument, and the method. In any
case, this method doesn't perform a filtering operation, and the arg is
not a filter, it's an re pattern.

Suggestion:
def anyLinesMatch(self, pattern):

>         found = 0
>         for line in self.text:
>             if filter.search(line):
>                 found = 1
>                 break
>         return found
[snip]
> class FileLog:
>     def __init__(self):
>         self.revs = []
>
[snip]
>
>     def filter(self, filter):
>         found = 0
>         newrevs = []
>         for rev in self.revs:
>             if rev.filter(filter):

Waahhh! The filter count has now hit 4.

>                 found = 1
>                 newrevs.append(rev)
>         self.revs = newrevs
>         return found
>
[snip]
>
> class LogDB:
>     def __init__(self):
>         self.flogs = []
[snip]
>     def filter(self, filter):
>         newflogs = []
>         for flog in self.flogs:
>             if flog.filter(filter):

See above.

>                 newflogs.append(flog)
>         self.flogs = newflogs
>
[snip]

HTH,
John
--
http://mail.python.org/mailman/listinfo/python-list