David Scott wrote:
> I am a total dunce when it comes to encodings though. How do you find the
> encoding of a file?
>

You don't. Either you know it, or you are up the proverbial creek (or roof). The "8-bit ASCII" encodings are one of the greater computer crimes of the last century, precisely because the files contain no clue about which encoding they are in.

Well, not quite true. Those of us with non-ASCII letters in our language know to look for certain tell-tale bytes or byte sequences, IF we have an idea about the language the file is written in. For example, \xe6 is the Danish character 'æ' in Latin-1, whereas \xc3 is 'Ã' (A-tilde) in Latin-1 but is more likely to be the lead byte of a UTF-8 multibyte sequence.

--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
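
The tell-tale-byte idea can be sketched in Python as follows. This is a rough heuristic under stated assumptions, not a real detector: the function name `guess_encoding` is made up for illustration, and it assumes the only candidates are ASCII, UTF-8 and Latin-1, which is exactly the kind of prior knowledge about the file's language that the heuristic depends on.

```python
def guess_encoding(raw: bytes) -> str:
    """Guess among ASCII, UTF-8 and Latin-1 (hypothetical helper)."""
    # Pure 7-bit data looks the same in all three candidates.
    if all(b < 0x80 for b in raw):
        return "ascii"
    # In UTF-8 a lead byte such as 0xC3 must be followed by a
    # continuation byte (0x80-0xBF); strict decoding checks this.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Latin-1 accepts any byte sequence, so this is only a fallback
        # justified by the assumed candidate set.
        return "latin-1"


print(guess_encoding(b"\xe6ble"))                # 'æble' stored as Latin-1
print(guess_encoding("æble".encode("utf-8")))    # same word stored as UTF-8
```

The guess remains a matter of likelihood rather than proof: the bytes \xc3\xa6 are valid UTF-8 for 'æ', but they are also valid Latin-1 for the two characters 'Ã¦'. The sketch simply treats the UTF-8 reading as the more plausible one.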