"ProvoWallis" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Hi,
>
> I've always struggled with classes and this one is no exception.
>
> I'm working in an SGML file and I want to renumber a couple of elements
> in the hierarchy based on the previous level.
>
> E.g.,
>
> My document looks like this
>
> <level1>A. Title Text
> <level2>1. Title Text
> <level2>1. Title Text
> <level2>1. Title Text
> <level1>B. Title Text
> <level2>1. Title Text
> <level2>1. Title Text
>
> but I want to change the numbering of the second level to sequential
> numbers like 1, 2, 3, etc. so my output would look like this
>
> <level1>A. Title Text
> <level2>1. Title Text
> <level2>2. Title Text
> <level2>3. Title Text
> <level1>B. Title Text
> <level2>1. Title Text
> <level2>2. Title Text
>
> This is what I've come up with on my own but it doesn't work. I was
> hoping someone could critique this and point me in the right or better
> direction.
>
> Thanks,
>
> Greg
>
> ###
>
> def Fix(m):
>
>     new = m.group(1)
>
>     class ReplacePtSubNumber(object):
>
>         def __init__(self):
>             self._count = 0
>             self._ptsubtwo_re = re.compile(r'<pt-sub2 no=\"[0-9]\">', re.IGNORECASE| re.UNICODE)
>             # self._ptsubone_re = re.compile(r'<pt-sub1', re.IGNORECASE| re.UNICODE)
>
>         def sub(self, new):
>             return self._ptsubtwo_re.sub(self._ptsubNum, new)
>
>         def _ptsubNum(self, match):
>             self._count +=1
>             return '<pt-sub2 no="%s">' % (self._count)
>
>
>     new = ReplacePtSubNumber().sub(new)
>     return '<pt-sub1%s<pt-sub1' % (new)
>
> data = re.sub(r'(?i)(?m)(?s)<pt-sub1(.*?)<pt-sub1', Fix, data)
>

This may not be as elegant as your RE approach, but it seems more readable
to me. Using pyparsing, we can define search patterns and attach callbacks
that are invoked when a match is found; each callback can return modified
text to replace the original match.

Although the running code matches your text sample, I've also included
commented-out statements that match your source code sample.

Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul


testData = """<level1>A. Title Text
<level2>1. Title Text
<level2>1. Title Text
<level2>1. Title Text
<level1>B. Title Text
<level2>1. Title Text
<level2>1. Title Text
"""

from pyparsing import *

class Fix(object):
    def __init__(self):
        self.curItem = 0

    # parse actions receive (source string, match location, matched tokens)
    def resetCurItem(self,s,l,t):
        self.curItem = 0

    def nextCurItem(self,s,l,t):
        self.curItem += 1
        return "<level2>%d." % self.curItem
        # return '<pt-sub2 no="%d">' % self.curItem

    def fixText(self,data):
        # set up patterns for searching
        lev1 = Literal("<level1>")
        lev2 = Literal("<level2>") + Word(nums) + "."
        # lev1 = CaselessLiteral("<pt-sub1>")
        # lev2 = CaselessLiteral('<pt-sub2 no="') + Word(nums) + '">'

        # when level 1 encountered, reset the cur item counter
        lev1.setParseAction(self.resetCurItem)

        # when level 2 encountered, use next cur item counter value
        lev2.setParseAction(self.nextCurItem)

        patterns = (lev1 | lev2)
        return patterns.transformString( data )

f = Fix()
print(f.fixText( testData ))

returns:

<level1>A. Title Text
<level2>1. Title Text
<level2>2. Title Text
<level2>3. Title Text
<level1>B. Title Text
<level2>1. Title Text
<level2>2. Title Text
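
If you would rather stay with the standard library, the same "reset on level 1, count on level 2" idea can probably be done with a single re.sub() call and a callback that keeps the counter. This is only a rough sketch against the text sample above, not the pt-sub1/pt-sub2 markup; the names TAG_RE and renumber_levels are made up for illustration, and it assumes Python 3 (for nonlocal).

```python
import re

# Hypothetical pattern for the text sample: either a <level1> tag,
# or a <level2> tag followed by its current number and a period.
TAG_RE = re.compile(r"<level1>|<level2>\d+\.")


def renumber_levels(data):
    """Renumber <level2> items sequentially, restarting after each <level1>."""
    count = 0

    def replace(match):
        nonlocal count
        if match.group(0) == "<level1>":
            count = 0  # new parent section: restart the numbering
            return match.group(0)  # leave the <level1> tag unchanged
        count += 1
        return "<level2>%d." % count

    return TAG_RE.sub(replace, data)


sample = (
    "<level1>A. Title Text\n"
    "<level2>1. Title Text\n"
    "<level2>1. Title Text\n"
    "<level2>1. Title Text\n"
    "<level1>B. Title Text\n"
    "<level2>1. Title Text\n"
    "<level2>1. Title Text\n"
)

print(renumber_levels(sample))
```

Because one pattern matches both tags, the callback sees them in document order, so there is no need to carve the text into level-1 chunks first as the nested-class version tries to do.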