To convert a string to UTF-8 you need to do two operations:
- decode the string to unicode (using the original file's codec)
- encode the unicode string using the utf-8 codec
This is what the decoder.decoder function is doing, but it is guessing the original codec. You need to either provide the right codec for decoding (if you know it is always the same) or guess it better (e.g. by catching the exception and trying different codecs in order).

    input_codec = "iso-8859-1"
    output = text.decode(input_codec).encode("utf-8")
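The "try different codecs in order" idea could look roughly like the sketch below. It is written for Python 3, where the input is a bytes object rather than a str, and the function name to_utf8 and the candidate list are only assumptions for illustration; put the codecs you actually expect first. Note that iso-8859-1 accepts any byte sequence, so it only makes sense as the last fallback.

```python
def to_utf8(raw, candidates=("utf-8", "cp1252", "iso-8859-1")):
    """Decode raw bytes with the first codec that works, then re-encode as UTF-8.

    Hypothetical helper: the candidate order is a guess, not a rule.
    Returns the UTF-8 bytes and the name of the codec that succeeded.
    """
    for codec in candidates:
        try:
            text = raw.decode(codec)
        except UnicodeDecodeError:
            # This codec cannot represent the input; try the next one.
            continue
        return text.encode("utf-8"), codec
    raise ValueError("none of the candidate codecs could decode the input")


if __name__ == "__main__":
    output, used = to_utf8(b"caf\xe9")
    print(used, output)
```

A successful decode does not prove the guess was right, only that the bytes were valid in that codec, so it is still better to pass the real codec when you know it.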