On 31 Oct, 20:38, netimen <[EMAIL PROTECTED]> wrote:
> there may be different levels of nesting:
>
> "a < b < c > d > here starts a new group: < 1 < e < f > g > 2 >
> another group: < 3 >"
>
> On 31 Oct, 21:57, netimen <[EMAIL PROTECTED]> wrote:
>
> > Thanks, but what if I have several top-level groups and want to match
> > them one by one:
> >
> > text = "a < b < c > d > here starts a new group: < e < f > g >"
> >
> > I want to match first " b < c > d " and then " e < f > g " but not
> > " b < c > d > here starts a new group: < e < f > g "
> >
> > On 31 Oct, 20:53, Matimus <[EMAIL PROTECTED]> wrote:
> >
> > > On Oct 31, 10:25 am, netimen <[EMAIL PROTECTED]> wrote:
> > >
> > > > I have a text containing brackets (or what is the correct term for
> > > > '>'?). I'd like to match text in the uppermost level of brackets.
> > > >
> > > > So, I have something like: 'aaaa 123 < 1 aaa < t bbb < a <tt > ff > > 2 >
> > > > bbbbb'. How to match text between the uppermost brackets ( 1 aaa < t
> > > > bbb < a <tt > ff > > 2 )?
> > > >
> > > > P.S. Sorry for my English.
> > >
> > > I think most people call them "angle brackets". Anyway, it should be
> > > easy to just match the outermost brackets:
> > >
> > > >>> import re
> > > >>> text = "aaaa 123 < 1 aaa < t bbb < a <tt > ff > > 2 >"
> > > >>> r = re.compile("<(.+)>")
> > > >>> m = r.search(text)
> > > >>> m.group(1)
> > > ' 1 aaa < t bbb < a <tt > ff > > 2 '
> > >
> > > In this case the regular expression is greedy by default, matching
> > > the largest area possible. Note however that it won't work if you have
> > > something like this: "<first> <second>".
> > >
> > > Matt
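To make the limitation discussed above concrete, here is a small sketch of what the greedy pattern and its non-greedy variant do with a text that has two top-level groups. The sample string is netimen's example with a plain "c" as the innermost token (an assumption), and the non-greedy pattern "<(.+?)>" is not from the thread; it is only shown for comparison.

```python
import re

# netimen's two-group example; "c" as the innermost token is an assumption.
text = "a < b < c > d > here starts a new group: < e < f > g >"

# Greedy: runs from the first "<" to the last ">", so both groups are merged.
greedy = re.search(r"<(.+)>", text).group(1)

# Non-greedy (shown for comparison only): stops at the first ">" it meets,
# so each top-level group is cut short at its first inner close.
lazy = re.findall(r"<(.+?)>", text)

print(repr(greedy))
print(lazy)
```

Neither pattern tracks nesting depth, so neither one returns " b < c > d " and " e < f > g " as separate matches.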
Hi,

Regular expressions or pyparsing might be overkill for this problem; you can use a simple algorithm that reads each character, increments a counter when it finds a < and decrements it when it finds a >. When the counter goes back to its initial value, you have reached the end of a top-level group.

Something like:

    def top_level(txt):
        level = 0      # current nesting depth
        start = None   # index of the "<" that opened the current top-level group
        groups = []
        for i, car in enumerate(txt):
            if car == "<":
                level += 1
                # compare with None so that a "<" at index 0 also works
                if start is None:
                    start = i
            elif car == ">":
                level -= 1
                if start is not None and level == 0:
                    # keep the text between the outer "<" and ">"
                    groups.append(txt[start+1:i])
                    start = None
        return groups

    print(top_level("a < b < 0 > d > < 1 < e < f > g > 2 > < 3 >"))

    >> [' b < 0 > d ', ' 1 < e < f > g > 2 ', ' 3 ']

Best,
Pierre
--
http://mail.python.org/mailman/listinfo/python-list