Re: regular expressions, unicode and XML

2006-01-27 Thread Justin Ezequiel
>> when I replace it end up with nothing: i.e., just a "" character in my
>> file.

How are you viewing the contents of your file? Are you printing it out to stdout? Are you opening your file in a non-Unicode-aware editor?

Try print repr(data) after re.sub so that you see what you actually have in
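To illustrate why repr() helps here, the sketch below shows one common cause of an "empty"-looking character after re.sub. This is only a guess at the problem: the original pattern is cut off in the post, so the sample text, pattern, and replacement strings are all hypothetical. If the replacement string is not a raw string, '\1' is the control character U+0001 rather than a backreference, and repr() makes that visible.

```python
import re

# Hypothetical sample data; the real XML and pattern are not shown in the thread.
text = u"<title>Sample text</title>"

# Non-raw replacement: "\1" is an octal escape for the control character
# U+0001, so no backreference ever reaches re.sub.
broken = re.sub(r"(?iu)(sample)", "[\1]", text)

# Raw replacement: r"\1" is a backslash followed by "1", which re.sub
# expands to the text captured by group 1.
fixed = re.sub(r"(?iu)(sample)", r"[\1]", text)

# repr() shows invisible characters as escapes instead of printing them raw.
print(repr(broken))
print(repr(fixed))
```

If the first repr() output shows \x01 where the captured text should be, switching the replacement to a raw string (or doubling the backslash) is the likely fix.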

Re: regular expressions, unicode and XML

2006-01-26 Thread ProvoWallis
Thanks for this but I'm still getting an "empty" character (I don't know what else to call it) rather than the text captured by my regular expression in my replaced text. I even added the utf encoding declaration to my input data but still no luck. Any suggestions? -- http://mail.python.org/mai

Re: regular expressions, unicode and XML

2006-01-25 Thread Justin Ezequiel
import codecs

f = codecs.open(pth, 'r', 'utf-8')
data = f.read()
f.close()

## data = re.sub ...

f = codecs.open(pth, 'w', 'utf-8')
f.write(data)
f.close()

-- 
http://mail.python.org/mailman/listinfo/python-list
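The read, substitute, write cycle above could be wrapped up roughly as follows. This is a sketch under assumptions, not the poster's code: the helper name rewrite_utf8, the sample XML, and the pattern are all made up for illustration, and it uses io.open with an explicit encoding in place of codecs.open (either should decode the file to Unicode text before re.sub sees it).

```python
import io
import os
import re
import tempfile


def rewrite_utf8(path, pattern, replacement):
    """Read a UTF-8 file, apply re.sub, and write the result back as UTF-8."""
    with io.open(path, "r", encoding="utf-8") as f:
        data = f.read()  # decoded Unicode text, not raw bytes
    data = re.sub(pattern, replacement, data)
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(data)
    return data


# Hypothetical demo file standing in for one of the XML files.
fd, pth = tempfile.mkstemp(suffix=".xml")
os.close(fd)
with io.open(pth, "w", encoding="utf-8") as f:
    f.write(u"<p>Sample caf\u00e9</p>")

# Raw strings keep the backreference \1 intact in the replacement.
result = rewrite_utf8(pth, r"(?iu)(sample)", r"<b>\1</b>")

# Read the file back to confirm what actually landed on disk.
with io.open(pth, "r", encoding="utf-8") as f:
    on_disk = f.read()
os.remove(pth)

print(repr(result))
```

Using with blocks closes the files even if re.sub raises, which the explicit f.close() calls would not.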

regular expressions, unicode and XML

2006-01-25 Thread ProvoWallis
Hi, I'm hoping someone can help me. I'm hopelessly lost. I'm trying to make a change in some XML files using a regular expression (re.sub). I can capture the text I want to replace OK, but when I replace it I end up with nothing: i.e., just a "" character in my file.

data = re.sub(r'(?i)(?u)Sample