Re: [R] file reading /problems with encoding

T . Wunder Tue, 02 Mar 2010 00:32:53 -0800

Quoting Uwe Ligges <lig...@statistik.tu-dortmund.de>:

R is not able to re-encode the file to the native encoding. But if you
keep it in UTF-8, what is the problem to grep for the specific
characters (as grep and friends support the argument useBytes these
days)?



The problem with UTF-8 is that I'm not able to cat a valid XML file.

Using the encoding="UTF-8" option in either the file() or the readLines() command causes an error. If I leave out both, I cannot run a gsub command on the string because of the special characters, even with the useBytes option turned on:

grep("über 40%",xml,useBytes=TRUE)
returns integer(0). The cause is obvious:
when the file was read in, the "ü" was turned into "Ã¼".

However, I suspect that I did not use the useBytes option in the right way, did I?
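
For reference, here is a minimal sketch of how byte-wise matching could be made to work on a UTF-8 file. It is not taken from the thread: the temporary file, its one-line content, and the variable names (tmp, out, pattern, hits) are made up for illustration. The idea it assumes is that useBytes=TRUE compares raw bytes, so the pattern has to be converted to UTF-8 as well; a pattern typed in a Latin-1 session has different bytes than the "ü" stored in the file.

```r
# Create a small UTF-8 encoded file so the sketch is self-contained.
# (Hypothetical content; "\u00fc" is the letter "ü".)
tmp <- tempfile(fileext = ".xml")
con <- file(tmp, open = "wb")
writeBin(charToRaw(enc2utf8("<item>\u00fcber 40%</item>")), con)
writeBin(charToRaw("\n"), con)
close(con)

# Read the lines without re-encoding them; encoding = "UTF-8" in readLines()
# only marks the resulting strings as UTF-8, it does not convert anything.
xml <- readLines(tmp, encoding = "UTF-8", warn = FALSE)

# Convert the pattern to UTF-8 too, so that pattern and data share the
# same byte sequence for "ü" before matching byte-wise.
pattern <- enc2utf8("\u00fcber 40%")
hits <- grep(pattern, xml, fixed = TRUE, useBytes = TRUE)
print(hits)

# Write the lines back byte-for-byte, without translating them to the
# native encoding of the session.
out <- tempfile(fileext = ".xml")
writeLines(xml, out, useBytes = TRUE)
```

Whether this helps on a given system depends on how the file was read in the first place; if the strings have already been converted and the "Ã¼" form is what is actually stored, the file would have to be read again.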


Thanks a lot for your help!

Best regards, Tom

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
