python regex character group matches...group...gotcha

2008-09-17 Thread christopher taylor
My apologies to the respondents - I failed to screen my test cases before kicking them out to the global python-list. But yes, the 'X' character in my test case was a mistake on my part. I'll give group() a shot. ct

Re: python regex character group matches

2008-09-17 Thread Fredrik Lundh
Steven D'Aprano wrote: Assuming that you want to find runs of \u escapes, simply use non-capturing parentheses: pat = re.compile(u"(?:\\\u[0-9A-F]{4})") Doesn't work for me: pat = re.compile(u"(?:\\\u[0-9A-F]{4})") it helps if you cut and paste the right line... here's a better v

Re: python regex character group matches

2008-09-17 Thread Steven D'Aprano
On Wed, 17 Sep 2008 09:27:47 -0400, christopher taylor wrote: > hello python-list! > > the other day, i was trying to match unicode character sequences that > looked like this: > > \\uAD0X... It is not clear what this is supposed to be. Is that matching a literal pair of backslashes, or a sing

Re: python regex character group matches

2008-09-17 Thread Steven D'Aprano
On Wed, 17 Sep 2008 15:56:31 +0200, Fredrik Lundh wrote: > Assuming that you want to find runs of \u escapes, simply use > non-capturing parentheses: > > pat = re.compile(u"(?:\\\u[0-9A-F]{4})") Doesn't work for me: >>> pat = re.compile(u"(?:\\\u[0-9A-F]{4})") UnicodeDecodeError: 'unico
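A possible reading of the error above: in a non-raw literal, `\\` yields one backslash and the following `\u[0-9` is then parsed as a truncated \uXXXX escape. The sketch below assumes Python 3, where the same literal is rejected with a SyntaxError rather than the UnicodeDecodeError shown in the thread, and where a raw string sidesteps the problem. The names `literal_source`, `sample` and `whole_run` are made up for illustration, and the trailing `+` is added here to match runs.

```python
import ast
import re

# Source text of the literal quoted in the thread. It is held in a raw string,
# so nothing is interpreted here; it is only parsed inside the try block.
literal_source = r'"(?:\\\u[0-9A-F]{4})"'

try:
    ast.literal_eval(literal_source)
    literal_ok = True
except SyntaxError:
    # "\\" is one backslash, then "\u[0-9" is a truncated \uXXXX escape.
    literal_ok = False

# Raw string: the regex engine sees \\u, i.e. a literal backslash followed by 'u'.
pat = re.compile(r"(?:\\u[0-9A-F]{4})+")

# Hypothetical sample: two literal \uXXXX escapes back to back.
sample = r"\uAD0C\u1BF3"
whole_run = pat.fullmatch(sample)
```

With the raw-string pattern, `whole_run` should be a match object covering the full sample.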

Re: python regex character group matches

2008-09-17 Thread Fredrik Lundh
christopher taylor wrote: my issue, is that the pattern i used was returning: [ '\\uAD0X', '\\u1BF3', ... ] when i expected: [ '\\uAD0X\\u1BF3', ] the code looks something like this: pat = re.compile("(\\\u[0-9A-F]{4})+", re.UNICODE|re.LOCALE) #print pat.findall(txt_line) results = pat.find
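For readers skimming the thread, here is a minimal sketch of the behaviour under discussion, assuming Python 3 and raw-string patterns. The sample text and variable names are invented for illustration and are not the original poster's data. When a pattern contains a capturing group, findall() returns the group's text instead of the whole match, and a repeated group only keeps its last repetition.

```python
import re

# Hypothetical input: one run of two literal \uXXXX escapes, then a single escape.
text = r"foo \uAD0C\u1BF3 bar \u00E9"

# Capturing group with '+': findall() reports the group, which holds
# only the last repetition of each run.
capturing = re.compile(r"(\\u[0-9A-F]{4})+")
last_only = capturing.findall(text)

# Non-capturing group: findall() reports the whole match, so each run comes back intact.
non_capturing = re.compile(r"(?:\\u[0-9A-F]{4})+")
runs = non_capturing.findall(text)

# Alternative: keep the capturing group and ask each match object for
# the whole match with group().
runs_via_group = [m.group() for m in capturing.finditer(text)]
```

Either the non-capturing group or group() on the match object should give the full runs; which one reads better depends on whether the group is needed for anything else.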

Re: python regex character group matches

2008-09-17 Thread Marc 'BlackJack' Rintsch
On Wed, 17 Sep 2008 09:27:47 -0400, christopher taylor wrote: > the other day, i was trying to match unicode character sequences that > looked like this: > > \\uAD0X... > > my issue, is that the pattern i used was returning: > > [ '\\uAD0X', '\\u1BF3', ... ] > > when i expected: > > [ '\\uAD0X

python regex character group matches

2008-09-17 Thread christopher taylor
hello python-list! the other day, i was trying to match unicode character sequences that looked like this: \\uAD0X... my issue, is that the pattern i used was returning: [ '\\uAD0X', '\\u1BF3', ... ] when i expected: [ '\\uAD0X\\u1BF3', ] the code looks something like this: pat = re.compile