hello python-list!

the other day i was trying to match unicode character sequences that look like this:

    \\uAD0X...

my issue is that the pattern i used was returning:

    [ '\\uAD0X', '\\u1BF3', ... ]

when i expected:

    [ '\\uAD0X\\u1BF3', ]

the code looks something like this:

    import re

    # one or more consecutive \uXXXX escapes, four uppercase hex digits each
    pat = re.compile("(\\\u[0-9A-F]{4})+", re.UNICODE|re.LOCALE)
    #print pat.findall(txt_line)
    results = pat.finditer(txt_line)

i ran the pattern past a couple of my colleagues and they all agreed that it should have matched correctly. is this a simple case of a messed-up regex, or am i not using the regex api correctly?

cheers,
ct
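
p.s. in case it helps anyone reproduce this, here is a minimal, self-contained sketch. it assumes python 3 and drops the flags, and the sample line and the names sample_line, capturing and non_capturing are made up for illustration; they are not my real data. it only exercises two documented behaviours of the re module that may or may not be what is biting me: findall() returns the group contents instead of the whole match when the pattern contains a group, and a repeated group keeps only its last repetition.

```python
import re

# hypothetical sample line with literal backslash-u escapes (raw string,
# so these are six plain characters each, not real unicode characters)
sample_line = r"\uAD01\u1BF3 and \u00E9"

# same shape as the pattern above: a capturing group that is repeated
capturing = re.compile(r"(\\u[0-9A-F]{4})+")

# identical, except the group is non-capturing
non_capturing = re.compile(r"(?:\\u[0-9A-F]{4})+")

# with a capturing group, findall() yields the group, i.e. the last repetition
last_repetitions = capturing.findall(sample_line)

# without a capturing group, findall() yields the whole match
whole_runs = non_capturing.findall(sample_line)

# finditer() yields match objects; group(0) is always the whole match
whole_runs_via_finditer = [m.group(0) for m in capturing.finditer(sample_line)]

print(last_repetitions)
print(whole_runs)
print(whole_runs_via_finditer)
```

with this sample, the first list holds only the final escape of each run, while the other two hold each run in full.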