On Nov 22, 8:46 am, harijay <[EMAIL PROTECTED]> wrote:
> Hi
> I am a few months new into python. I have used regexps before in perl
> and java but am a little confused with this problem.
>
> I want to parse a number of strings and extract only those that
> contain a 4 digit number anywhere inside a string
>
> However the regexp
> p = re.compile(r'\d{4}')
>
> Matches even sentences that have longer than 4 numbers inside
> strings ..for example it matches "I have 3324234 and more"

No it doesn't. When used with re.search on that string it matches 3324;
it doesn't "match" the whole sentence.

>
> I am very confused. Shouldnt the \d{4,} match exactly four digit
> numbers so a 5 digit number sentence should not be matched .

{4} does NOT mean the same as {4,}.
{4} is the same as {4,4}
{4,} means {4,INFINITY}

Ignoring {4,}:

You need to specify a regex that says "4 digits followed by (non-digit
or end-of-string)".

Have a try at that and come back here if you have any more problems.

Some test data:
xxx1234
xxx12345
xxx1234xxx
xxx12345xxx
xxx1234xxx1235xxx
xxx12345xxx1234xxx
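
For anyone reading this later, here is one possible sketch of such a pattern, not necessarily the only or best answer. It expresses "followed by (non-digit or end-of-string)" as the negative lookahead (?!\d). It also adds a negative lookbehind (?<!\d), which goes beyond the hint above: with the trailing check alone, re.search on xxx12345 would still find 2345 at the end of the string. The names FOUR_DIGITS, test_data and results are made up for illustration.

```python
import re

# Hypothetical name; the thread itself does not give a final pattern.
# (?<!\d)  the match must not be preceded by a digit
# \d{4}    exactly four digits
# (?!\d)   the match must not be followed by a digit
#          (this also succeeds at end-of-string)
FOUR_DIGITS = re.compile(r'(?<!\d)\d{4}(?!\d)')

# The test data from the post above.
test_data = [
    "xxx1234",
    "xxx12345",
    "xxx1234xxx",
    "xxx12345xxx",
    "xxx1234xxx1235xxx",
    "xxx12345xxx1234xxx",
]

# Map each string to the list of standalone 4-digit runs found in it.
results = {s: FOUR_DIGITS.findall(s) for s in test_data}

for s, found in results.items():
    print(s, found)
```

Strings with an empty list contain no standalone 4-digit number; to filter sentences rather than extract the numbers, FOUR_DIGITS.search(s) can be used as a true/false test instead of findall.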