David Scott wrote:
> I am a total dunce when it comes to encodings though. How do you find the 
> encoding of a file?
>   
You don't. Either you know it, or you are up the proverbial creek (or 
roof). The "8-bit ASCII" encodings are one of the greater computer crimes 
of the last century, precisely because the files contain no clue about 
which encoding they are in.

Well, not quite true. Those of us with non-ASCII letters in our 
language will know to look for certain tell-tale bytes or byte sequences 
(e.g. \xe6 is the Danish character 'æ' in Latin-1, whereas \xc3 is A-tilde 
in Latin-1 but more likely to be the lead byte of a two-byte UTF-8 
sequence), IF we have an idea about the language the file is written in.
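
To make the tell-tale-byte idea concrete, here is a rough sketch in 
Python (not R, and not anything official). The function name 
guess_encoding is made up for illustration, and the fallback to Latin-1 
is purely an assumption that stands in for "we have an idea about the 
language". It is a heuristic, not a detector.

```python
def guess_encoding(data: bytes) -> str:
    """Guess between ASCII, UTF-8 and Latin-1 from the raw bytes."""
    try:
        data.decode("ascii")      # only bytes below 0x80: nothing to decide
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")      # strict: rejects malformed multibyte sequences
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: the file is Western European, so fall back to Latin-1.
        # Latin-1 maps every byte to a character, so this can never be
        # refuted by the bytes alone.
        return "latin-1"


word = "\u00e6ble"  # Danish for "apple", starting with the letter ae
print(guess_encoding(word.encode("latin-1")))  # bytes start with \xe6
print(guess_encoding(word.encode("utf-8")))    # bytes start with \xc3 \xa6
```

A lone \xe6 followed by an ordinary letter is not valid UTF-8, so the 
first call falls through to the Latin-1 guess, while \xc3\xa6 decodes 
cleanly as UTF-8. A file in some other 8-bit encoding would also land in 
the Latin-1 branch, which is exactly why you need to know the language.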

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
