On 2015-12-03 15:12, Adam Funk wrote:
> I'm having trouble with some input files that are almost all proper
> UTF-8 but with a couple of troublesome characters mixed in, which I'd
> like to ignore instead of throwing ValueError.  I've found the
> openhook for the encoding
>
> for line in fileinput.input(options.files, 
>
> which the documentation describes as "a hook which opens each file
> with codecs.open(), using the given encoding to read the file", but
> I'd like codecs.open() to also have the errors='ignore' or
> errors='replace' effect.  Is it possible to do this?

It looks like it's not possible with the standard "hook_encoded", but
you could write your own alternative:

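import codecs
import fileinput

def my_hook_encoded(encoding, errors):
    # Return an opener with the (path, mode) signature that fileinput
    # uses when it calls an openhook.
    def opener(path, mode):
        return codecs.open(path, mode, encoding=encoding, errors=errors)

    return opener

# my_hook_encoded is defined above; it is not part of the fileinput module.
for line in fileinput.input(options.files, openhook=my_hook_encoded("utf-8", "ignore")):
    pass  # process each line here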
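
To see the difference between errors="ignore" and errors="replace", here is a small self-contained sketch of the same custom-hook idea. The names hook_encoded_with_errors and read_lines, and the temporary demo file, are made up for illustration. It also swaps in io.open for codecs.open (an assumption on my part, chosen because it accepts the same encoding and errors arguments) and uses a private fileinput.FileInput instance rather than the module-level fileinput.input().

```python
import fileinput
import io
import os
import tempfile


def hook_encoded_with_errors(encoding, errors="strict"):
    """Return a fileinput openhook that decodes with the given error policy."""
    def opener(path, mode):
        # fileinput calls the hook with (filename, mode).
        return io.open(path, mode, encoding=encoding, errors=errors)
    return opener


def read_lines(paths, encoding="utf-8", errors="replace"):
    """Collect the decoded lines of all paths, tolerating undecodable bytes."""
    hook = hook_encoded_with_errors(encoding, errors)
    reader = fileinput.FileInput(files=paths, openhook=hook)
    lines = []
    try:
        for line in reader:
            lines.append(line)
    finally:
        reader.close()
    return lines


# Demo file: valid UTF-8 except for one stray 0xFF byte on the second line.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"caf\xc3\xa9\nbad\xffbyte\n")
    demo_path = tmp.name

ignored = read_lines([demo_path], errors="ignore")    # bad byte is dropped
replaced = read_lines([demo_path], errors="replace")  # bad byte becomes U+FFFD
os.remove(demo_path)
```

With "ignore" the undecodable byte silently disappears, while "replace" leaves a U+FFFD marker so you can still tell where the damage was. Note that from Python 3.6 onwards fileinput.hook_encoded() takes an optional errors argument itself, so the custom hook is only needed on older versions.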

