In a message of Thu, 03 Dec 2015 15:12:15 +0000, Adam Funk writes:
>I'm having trouble with some input files that are almost all proper
>UTF-8 but with a couple of troublesome characters mixed in, which I'd
>like to ignore instead of throwing ValueError. I've found the
>openhook for the encoding
>
>for line in fileinput.input(options.files,
>                            openhook=fileinput.hook_encoded("utf-8")):
>    do_stuff(line)
>
>which the documentation describes as "a hook which opens each file
>with codecs.open(), using the given encoding to read the file", but
>I'd like codecs.open() to also have the errors='ignore' or
>errors='replace' effect. Is it possible to do this?
>
>Thanks.

This should be both easy to add and useful. I happen to know that
Serhiy Storchaka is hacking on fileinput right now, and he agrees that
it would be easy. So, with his approval, I filed it in the tracker:

http://bugs.python.org/issue25788

Future Pythons may not have the problem.

Laura
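
In the meantime, one possible workaround is to pass your own hook
instead of fileinput.hook_encoded(). This is only a minimal sketch,
assuming Python 3 and the default text mode; the names lenient_hook and
read_lines are made up for illustration and are not part of fileinput.

```python
import fileinput
import io


def lenient_hook(encoding, errors="replace"):
    """Return an openhook that decodes with the given error handler.

    Hypothetical helper, not part of fileinput. fileinput calls the
    hook as openhook(filename, mode), so we return a closure with that
    signature.
    """
    def openhook(filename, mode):
        # errors="ignore" drops undecodable bytes;
        # errors="replace" substitutes U+FFFD for them.
        return io.open(filename, mode, encoding=encoding, errors=errors)
    return openhook


def read_lines(paths, encoding="utf-8", errors="replace"):
    """Collect every line from the given files (illustrative only)."""
    hook = lenient_hook(encoding, errors)
    with fileinput.input(files=paths, openhook=hook) as stream:
        return list(stream)
```

In the loop from the original question, this would be used as
fileinput.input(options.files, openhook=lenient_hook("utf-8", "ignore")).
Note that an openhook is applied only to named files, not to standard
input.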