Dotan Cohen wrote:
2008/10/14 <[EMAIL PROTECTED]>:
Dotan> Can Python go through a directory of files and replace each
Dotan> instance of "newline-space" with nothing?
Sure. Something like (*completely* untested, so caveat emptor):
import glob
import os
for f in glob.glob('*.vcf'):
# corrupt data
uncooked = open(f, 'rb').read()
# fix it
cooked = uncooked.replace('\n ', '')
# backup original file for safety
os.rename(f, '%s.orig' % f)
# and save it
open(f, 'wb').write(cooked)
Thanks, that's easier than I thought! I am sure with some googling I
will discover how to loop through all the files in a directory. One
question, though, is that code unicode-safe in the event that there
are unicode characters in there?
I believe that particular find/replace should be safe even if other
bytes represent encoded unicode.
In Python3, if you want to do more, and if you open in text mode, the
bytes with automatically be decoded, with UTF-8 the default, I believe.
Your sample said UTF-8, so that would be the right thing.
--
http://mail.python.org/mailman/listinfo/python-list