[EMAIL PROTECTED] wrote: >I have a string which I wish to match using RE, however when I run my > comparison (using the match method) on my machine it never returns, > using the CPU fully. > > The code is (very simply): > > ------------------------------ > import re > > buffer = r"#1 1 550 111 SYNC_PEER RES > <YP>syncpeers=(id=54325432;add=10." \ > > "0.0.1;port=89;up=89),(id=97899;add=10.0.0.1;port=543;up=543)," \ > "(id=54325432;add=10.0.0.1;port=89;up=8" > > p = re.compile(r'#(?P<tran>[0-9]+\s)(?P<sess>[0-9]+\s)' \ > + '(?P<ndto>[0-9]+\s)(?P<ndfr>[0-9]+\s)' \ > + '(?P<cmd>[a-zA-Z_]+\s)(?P<dir>[a-zA-Z_]+)' \ > + '(?P<parm>(\s<[a-zA-Z]+>(\s*[a-zA-Z]+=[a-zA' \ > + '-Z0-9.,=();]+\s*)+<[a-zA-Z]+>)*)#(?P<left>.*$)') > > print "starting the match" > rst = (p.match (buffer) != None) > print "finishing the match" > -------------------------------- > > Should this every actually happen? Regardless of whether the statment > matches or not, surely it should do this?
surely the following should never be slow? n = len(s) for a in range(n): for b in range(n): for c in range(n): for d in range(n): for e in range(n): ... etc ... (your RE contains several nested and/or related repeat operators, which forces the poor engine to test zillions of combinations before giving up. see Friedl's RE book for more information on this) I suggest using a real parser for this task (or using the RE engine as a tokenizer, and doing the rest in code; see http://effbot.org/zone/xml-scanner.htm for more on this) </F> -- http://mail.python.org/mailman/listinfo/python-list