"david brochu jr" wrote:

> still having problems....i have the following in a txt file:
>
> Windows Registry Editor Version 5.00

if this is a regedit export, the data is encoded as UTF-16.  treating
that as plain ASCII doesn't really work.

> for line in new_line.readlines():
>     line = re.sub('"',"",line)
>     print line
>
> I get:
>
> i n d o w s R e g i s t r y E d i t o r V e r s i o n 5 . 0 0
> etc etc...Too much space...

it's NUL bytes (chr(0)), not space.

to open a UTF-16 file with automatic decoding, use codecs.open:

    import codecs
    infile = codecs.open("file", "r", "utf-16")

reading from "infile" will now give you properly decoded unicode
strings, which you can process as usual.

> this is killing me please help

sounds like you need to read up on what text encodings are, and how
you can let Python handle them for you.  start here:

    http://www.google.com/search?q=python+unicode

</F>
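
a minimal sketch of the codecs.open approach described above. the helper name read_utf16_lines and the sample registry content are made up for illustration, and the script writes its own small UTF-16 file so it can run without a real regedit export:

```python
import codecs
import os
import tempfile


def read_utf16_lines(path):
    """Return the lines of a UTF-16 text file as decoded strings.

    Hypothetical helper, not part of any library. The "utf-16" codec
    reads the byte order mark and picks the right byte order itself.
    """
    infile = codecs.open(path, "r", "utf-16")
    try:
        # codecs.open does no newline translation, so strip CR/LF here.
        return [line.rstrip("\r\n") for line in infile]
    finally:
        infile.close()


# Stand-in for a regedit export: UTF-16 little endian with a BOM.
# The key and value below are invented sample data.
sample = (
    u"Windows Registry Editor Version 5.00\r\n"
    u"\r\n"
    u"[HKEY_CURRENT_USER\\Software\\Example]\r\n"
    u'"Name"="value"\r\n'
)
fd, path = tempfile.mkstemp(suffix=".reg")
os.close(fd)
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF16_LE + sample.encode("utf-16-le"))

# The raw bytes contain the NUL bytes that showed up as "spaces".
with open(path, "rb") as f:
    raw = f.read()

lines = read_utf16_lines(path)
os.remove(path)

# Once decoded, ordinary string processing works, e.g. dropping quotes.
cleaned = [line.replace('"', "") for line in lines]

for line in cleaned:
    print(line)
```

the raw bytes still hold a chr(0) after every ASCII character, but the decoded lines do not, so stripping the quotes (or using re.sub as in the original loop) behaves as expected.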