Hello, I have this little grep-like program:
++++++++++snip++++++++++
#!/usr/bin/python
import sys
import re

pattern = sys.argv[1]
inputfile = file(sys.argv[2], 'r')
for line in inputfile:
    matches = re.findall(pattern, line)
    if matches:
        print matches
++++++++++snip++++++++++

Like this, the program prints some characters as strange escape sequences, which is due to the input file being encoded in utf-8. When I convert "re.findall..." to a string and wrap a "unicode()" around it, the matches get printed correctly. Is it possible to make "matches" unicode without saving it as a single string first? The function "unicode()" seems only to work for strings. Or is there a general way of telling Python to abandon the ancient and evil land of iso-8859 for good and use utf-8 only?

Regards,
Rehceb
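
A minimal sketch of one possible approach, assuming both the input file and the pattern are UTF-8 encoded: decode at the boundary, match on unicode objects, and print each match on its own. Printing a whole list shows the repr() of its items, which is where the escape sequences come from. The helper name find_matches and its encoding parameter are hypothetical and not part of the original program; the sketch also assumes the pattern contains at most one group, so that findall returns plain strings.

```python
import io
import re
import sys


def find_matches(pattern, path, encoding="utf-8"):
    """Return all matches of pattern in the file at path as unicode strings."""
    # Assumption: on Python 2, sys.argv holds byte strings, so decode the
    # pattern to make it the same type as the decoded file contents.
    if isinstance(pattern, bytes):
        pattern = pattern.decode(encoding)
    regex = re.compile(pattern, re.UNICODE)
    matches = []
    # io.open decodes every line while reading (available since Python 2.6).
    with io.open(path, "r", encoding=encoding) as inputfile:
        for line in inputfile:
            matches.extend(regex.findall(line))
    return matches


def main(argv):
    # Print the matches one by one instead of printing the list itself.
    for match in find_matches(argv[1], argv[2]):
        print(match)


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Saved under a hypothetical name such as ugrep.py, it would be called like the original program: python ugrep.py PATTERN FILE. Whether the printed matches display correctly still depends on the encoding of the terminal or pipe the output goes to.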