On 20/12/10 17:53, Alec Battles wrote:
I seem to remember that 'file' in Linux detects encodings, but it's
also a matter of calling it by the exact same name...

There is no foolproof way of detecting encoding unfortunately - you just
need to know what it is before you read the file.

That's interesting. I wonder if there's a mathematical proof of the
'undecidability' of text encodings.

Hofstadter describes the problem in Gödel, Escher, Bach as the "Envelope Problem" IIRC: you need to have some idea of how to decode any message you are sent, and you even need to recognise that it is a "message" in the first place.

UNIX manages the latter for us by providing a filename, but how to interpret the contents is entirely up to you. It might be UTF-8, it might be a JPEG, it might be encrypted using AES. You need to know what to expect before you can interpret the contents.
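
To make that concrete, here is a rough sketch in Python of why the bytes alone don't settle the question. The sample bytes and the candidate_decodings helper are made up for illustration, and the list of encodings tried is an arbitrary assumption, not anything 'file' actually does:

```python
# Five bytes with no label attached. Nothing in them says which encoding was used.
raw = b"caf\xc3\xa9"

# Both of these succeed, and they disagree about what the text is.
as_utf8 = raw.decode("utf-8")      # 'caf' followed by e-acute
as_latin1 = raw.decode("latin-1")  # 'caf' followed by two unrelated characters


def candidate_decodings(data, encodings=("utf-8", "latin-1", "cp1252")):
    """Hypothetical helper: return {encoding: text} for each encoding
    in the (arbitrarily chosen) list that decodes data without error."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # A failure rules this encoding out; a success proves nothing.
            pass
    return results


if __name__ == "__main__":
    for name, text in candidate_decodings(raw).items():
        print(name, repr(text))
```

All three candidates decode without complaint, so a failed decode can rule an encoding out, but a successful one doesn't tell you it was the right one. Anything beyond that is guesswork based on what the result looks like.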

I bet there is a name for this (although probably not a proof), but I don't know what it is ;)

Cheers,

Doug.

