Re: Subject Unicode

Hunkeler, Peter Fri, 10 Jan 2014 00:03:00 -0800

>Other than with a lot of inferential cleverness, there is no way to look at an 
>"ASCII-like" file and tell what the code page is.


The same applies to data encoded in EBCDIC. In fact, files are nothing but a 
series of bytes. You always need to know what those bytes represent in order to 
be able to work on them in a meaningful way. 
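
To illustrate the point, here is a small Python sketch that reads the same 
bytes under three different assumptions. The code page names (cp037, latin-1, 
cp437) are just the ones I picked for the example; nothing in the bytes 
themselves says which one is right.

```python
# The same five bytes, interpreted under three different code pages.
# The choice of code pages is an assumption made for this illustration.
data = "Hello".encode("cp037")        # EBCDIC (code page 037) bytes for "Hello"

print(data.hex())                     # the raw bytes, no interpretation at all
print(repr(data.decode("cp037")))     # read as EBCDIC 037: the intended text
print(repr(data.decode("latin-1")))   # read as ISO 8859-1: unrelated characters
print(repr(data.decode("cp437")))     # read as IBM PC code page 437: different again
```

All three decodes succeed without error, so the program cannot tell from the 
data alone which reading was intended.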

Especially in the distributed world, some conventions have been established 
that help programs guess what a file's content might be: the first few bytes 
contain a certain byte sequence that identifies the type of the file. Still, 
there is no guarantee that the rest of the file matches that indication. 
Unfortunately, no such convention exists for pure text data: there is neither 
a convention to indicate that the data is text nor one to tell which encoding 
/ code page was used.
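
A minimal sketch of that kind of guessing, in Python. The signature table and 
the function name sniff_type are made up for this example; real tools know 
far more signatures than these four.

```python
# Guess a file type from its leading bytes.
# MAGIC and sniff_type are hypothetical names used only for this illustration.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"GIF8", "GIF image"),
]

def sniff_type(data: bytes):
    """Return a type name if data starts with a known signature, else None."""
    for signature, name in MAGIC:
        if data.startswith(signature):
            return name
    return None

# A signature is only a hint: this is reported as a PDF although it is not one.
print(sniff_type(b"%PDF- but the rest is not a PDF at all"))

# Plain text carries no signature, whatever its code page.
print(sniff_type("Hello".encode("cp037")))
print(sniff_type("Hello".encode("latin-1")))
```

The first call shows why the leading bytes are no guarantee for the rest of 
the file; the last two show that text in any code page simply comes back 
unidentified.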

--
Peter Hunkeler

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
