On Monday 28 October 2019 at 11:48:29 UTC+1, Peter J. Holzer wrote:
> On 2019-10-25 22:12:23 +0200, Pascal wrote:
> > for line in fileinput.input(source):
> >     print(line.strip())
> >
> > -----------------------
> >
> > python3.7.4 myscript.py myfile.log
> > Traceback (most recent call last):
> > ...
> > UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 799:
> > invalid continuation byte
> [...]
> > for line in fileinput.input(source,
> >         openhook=fileinput.hook_encoded("utf-8", "ignore")):
> >     print(line.strip())
>
> The file you were trying to read was obviously not encoded in UTF-8,
> since you got a decode error.
>
> So the first question you should ask is:
>
> Is it supposed to be encoded in UTF-8 (and just corrupted) or is it
> supposed to be encoded in something else (e.g. iso-8859-1 or win-1252)?
>
> If it is supposed to be in UTF-8 but may contain errors, ignoring errors
> may be reasonable.
>
> If it is supposed to be something else, determine what that "something
> else" actually is, and use that.
>
>         hp
>
> --
>    _  | Peter J. Holzer    | we build much bigger, better disasters now
> |_|_) |                    | because we have much more sophisticated
> | |   | h...@hjp.at        | management tools.
> __/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

You're right: the log file came from Windows and was encoded in iso-8859-1. But my question was about the difference in results between reading a file and reading from stdin.

--
https://mail.python.org/mailman/listinfo/python-list
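
For reference, here is a minimal sketch of reading such a file with the encoding it was actually written in, rather than decoding as UTF-8 and ignoring errors. The sample text and the names `SAMPLE`, `read_lines` and `demo` are made up for illustration; only `fileinput.input` and `fileinput.hook_encoded` come from the thread.

```python
import fileinput
import os
import tempfile

# Hypothetical sample data: "è" is the single byte 0xe8 in ISO-8859-1,
# the same byte the UTF-8 decoder rejected in the traceback above.
SAMPLE = "première ligne\nsecond line\n"


def read_lines(path, encoding="iso-8859-1"):
    """Return the stripped lines of `path`, decoded with `encoding`."""
    hook = fileinput.hook_encoded(encoding)
    with fileinput.input(path, openhook=hook) as lines:
        return [line.strip() for line in lines]


def demo():
    # Write the sample as ISO-8859-1 bytes to a temporary file.
    with tempfile.NamedTemporaryFile(suffix=".log", delete=False) as tmp:
        tmp.write(SAMPLE.encode("iso-8859-1"))
        path = tmp.name
    try:
        return read_lines(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

As far as I can tell, the openhook is only applied to files that fileinput opens by name. When input comes from stdin, the text is read through `sys.stdin`, whose encoding and error handler are set up by the interpreter, so that may be where the difference between the two cases comes from.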