On 13-11-09 12:07 PM, Sverre Stausland wrote:
> As recently discussed on Stack Overflow, R for Mac OS and Ubuntu (so probably all Unix systems) can correctly write files with UTF-8 encoding, but R for Windows cannot:
That's not an accurate description of the problem. Some functions in R convert values to the native encoding, but not all do.
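To illustrate the distinction, here is a minimal sketch of one way to get UTF-8 bytes onto disk without a trip through the native encoding. It assumes `writeLines()` with `useBytes = TRUE` on a binary-mode connection; the variable names and the temporary file are only for illustration, and this is not necessarily the approach recommended in the Stack Overflow thread.

```r
# A non-ASCII string, written with an escape so this script stays ASCII.
x <- enc2utf8("\u00e6")   # make sure the string is held as UTF-8

tmp <- tempfile(fileext = ".txt")   # illustrative scratch file

# Open in binary mode so no newline translation happens, and ask
# writeLines() to write the bytes as they are (useBytes = TRUE)
# rather than converting the string to the native encoding first.
con <- file(tmp, open = "wb")
writeLines(x, con, useBytes = TRUE)
close(con)

# Read the raw bytes back to check what actually landed on disk.
bytes <- readBin(tmp, what = "raw", n = 10)
print(bytes)
```

If the write went through unconverted, the file holds the two UTF-8 bytes for U+00E6 followed by a newline.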
> http://stackoverflow.com/questions/19877676/write-utf-8-files-from-r
>
> I strongly suggest that R for Windows should support this feature in upcoming versions.
It's not trivial to do. When R was written (and perhaps still on some obscure platforms), there wasn't any way to do that: Windows didn't support UTF-8 then, just Microsoft's version of UCS-2 and a variety of other, more limited encodings, and Unix platforms didn't support UCS-2. So internally R keeps many things in the native encoding.
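For anyone who wants to see what "native encoding" means on their own machine, the following is a small sketch using base R only. What `enc2native()` returns depends on the platform and locale, so no particular result is claimed for the native version of the string.

```r
# Ask R about the current locale's encoding.
info <- l10n_info()
print(info[["UTF-8"]])   # TRUE only when the native encoding is UTF-8

x <- "\u00e6"            # parsed from an escape, so stored as UTF-8

# Convert to whatever the native encoding is on this system.
x_native <- enc2native(x)

# Compare the declared encodings and the underlying bytes.
print(Encoding(x))
print(Encoding(x_native))
print(charToRaw(x))
print(charToRaw(x_native))
```

In a UTF-8 locale the two byte sequences should match; where the native encoding is something else, they can differ, which is the conversion being discussed here.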
If you decide to rewrite R from scratch now, I'd suggest that you handle things differently. If you'd rather not rewrite it yourself, then I don't know how you will convince someone else to take on that job.
You might find it easier to convince Microsoft to add a UTF-8 locale, so then the native encoding would be UTF-8, and the problem would go away.
Duncan Murdoch