On Sun, 15 Nov 2009 13:49:54 +0100, Luca wrote:
> I was quite sure that this is not a very simple task. Right now,
> searching only within the ASCII encoding is not enough for me (my
> native language is outside this encoding :-)
> Checking every single byte can be a good solution...
>
> I can start by using the mimetypes module and, if the file has no
> extension, check the bytes one by one, much as the "file" command does.
> Better: I can use the "file" command if it is available.
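
The plan quoted above could be sketched roughly as follows. This is only a minimal sketch under a few assumptions: the helper name looks_like_text and the sample_size parameter are made up for illustration, the "file" command is assumed to accept --brief and --mime-type (as common implementations do), and the final byte check is a crude "no NUL bytes" heuristic rather than what "file" really does internally.

```python
import mimetypes
import shutil
import subprocess


def looks_like_text(path, sample_size=4096):
    """Best-effort guess at whether `path` is a text file (hypothetical helper)."""
    # 1. Trust the extension when the mimetypes module recognises it.
    mime, _ = mimetypes.guess_type(path)
    if mime is not None:
        return mime.startswith("text/")

    # 2. No usable extension: ask the "file" command, if it is installed.
    file_cmd = shutil.which("file")
    if file_cmd is not None:
        result = subprocess.run(
            [file_cmd, "--brief", "--mime-type", path],
            capture_output=True,
            text=True,
        )
        reported = result.stdout.strip()
        if result.returncode == 0 and reported:
            return reported.startswith("text/")

    # 3. Last resort: read a sample of raw bytes and apply a crude
    #    heuristic (assumption: text files contain no NUL bytes).
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    return b"\x00" not in sample
```

Note that step 1 only looks at the file name, so a misnamed file will fool it, and types such as application/json are reported as non-text even though they are readable.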

Another possible solution:

Universal Encoding Detector
Character encoding auto-detection in Python 2 and 3
http://chardet.feedparser.org/
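
For what it's worth, a minimal sketch of how the detector might be used. chardet.detect() is the library's documented entry point and returns a dict with "encoding" and "confidence" keys. The wrapper name guess_encoding and the fallback used when chardet is not installed are my own assumptions, not part of the library.

```python
def guess_encoding(data):
    """Return a likely encoding name for `data` (bytes), or None (hypothetical helper)."""
    try:
        import chardet  # third-party package: Universal Encoding Detector
    except ImportError:
        chardet = None

    if chardet is not None:
        # detect() returns e.g. {"encoding": ..., "confidence": ...};
        # "encoding" can be None when the detector has no guess.
        result = chardet.detect(data)
        if result["encoding"]:
            return result["encoding"]

    # Assumed fallback: try a couple of strict decodings in turn.
    for candidate in ("ascii", "utf-8"):
        try:
            data.decode(candidate)
            return candidate
        except UnicodeDecodeError:
            continue
    return None
```

Typical use would be to read the file in binary mode and pass the bytes in, for example guess_encoding(open(path, "rb").read()). The result is a statistical guess, so short inputs in particular may be misidentified.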