Milan,
Answers to your queries:
-- But how do you read back the contents of the file? You need to specify the encoding when reading it too.
Answer: I read it back as shown in 'Case 2'.
-- Are you sure the notepad saved the text as UTF-8?
Answer: Yes.

Regards,
Sunny

On Mon, Mar 28, 2016 at 9:58 PM, Milan Bouchet-Valat <nalimi...@club.fr> wrote:
> On Monday, 28 March 2016 at 20:12 +0530, Sunny Singha wrote:
>> Milan,
>> Ok, let me take the case of Facebook. I used the Rfacebook package
>> to get posts (getPost()), which returns a list() of data frames (post,
>> comments, likes).
>>
>> Let me demonstrate 2 cases of read and write, just as you suggested.
>> Case 1:::::::::
>> Let's say one of the Facebook comments has the string value below, in
>> Japanese language-->
>> "世界餐福事工 - 餐廳員工沒精打采 老是打盤子"
>>
>> On the R console I now assign the above string to a variable as: x <- "世界餐福事工 -
>> 餐廳員工沒精打采 老是打盤子"
>> and write it as below:
>> write.csv(x, file='x.csv', row.names=F, fileEncoding='UTF-8')
>> I get this string in the file
>> "" -
>> "
> But how do you read back the contents of the file? You need to specify
> the encoding when reading it too.
>
>> Case 2::::::::::::::
>> I create a notepad file 'x.txt' and save the Japanese string "世界餐福事工 - 餐廳員工沒精打采 老是打盤子"
>> and read it as below:
>> read.table('x.txt', fileEncoding='UTF-8'), I get the output below:
>>
>> V1
>> 1 ?
>> Warning messages:
>> 1: In read.table("x.txt", fileEncoding = "UTF-8") :
>> invalid input found on input connection 'x.txt'
>> 2: In read.table("x.txt", fileEncoding = "UTF-8") :
>> incomplete final line found by readTableHeader on 'x.txt'
> Are you sure the notepad saved the text as UTF-8?
>
>> The above was for demonstration. I'm in fact reading extracted social
>> media data, which ultimately comes from somewhere using the httr package
>> and returning data frames.
>> I'm not sure how I should handle this on Windows, as I don't observe
>> this behavior on Mac, where the system locale is set to 'en_US.UTF-8'.
>>
>> Regards,
>> Sunny
>>
>> On Mon, Mar 28, 2016 at 7:39 PM, Milan Bouchet-Valat wrote:
>> >
>> > On Monday, 28 March 2016 at 19:16 +0530, Sunny Singha wrote:
>> > >
>> > > Hi,
>> > > I think I'm experiencing an issue regarding the system locale. I have
>> > > exported '.csv' formatted data frames gathered from various social
>> > > media platforms like facebook/twitter/G+, etc.
>> > >
>> > > I observe that many variables/columns consist of strings formatted
>> > > similar to below:
>> > > "
>> > > "
>> > >
>> > > As expected, and as I confirmed in the social media data, they are
>> > > strings in different languages.
>> > > Platform details are provided at the end of this mail. The OS locale is
>> > > set to English (United States), hence the 'R' locale is 'English_United
>> > > States.1252'
>> > >
>> > > I have attempted to change it to UTF-8 but receive the warning
>> > > message below:
>> > >
>> > > Warning message:
>> > > In Sys.setlocale("LC_ALL", "UTF-8") :
>> > > OS reports request to set locale to "UTF-8" cannot be honored
>> > You don't need to set the locale. Just pass an appropriate value (e.g.
>> > "UTF-8") to read.csv() or write.csv()'s fileEncoding argument.
>> >
>> > You also didn't tell us what program you used to read these files. Some
>> > might guess the encoding incorrectly, or require you to choose it
>> > manually.
>> >
>> >
>> > Regards
>> >
>> > >
>> > > I have gone through the forums below, but no resolution so far:
>> > > ---
>> > > http://stackoverflow.com/questions/20571147/how-to-set-unicode-locale-in-r
>> > > --- https://stat.ethz.ch/pipermail/r-devel/2013-November/067940.html
>> > > --- http://stackoverflow.com/questions/19877676/write-utf-8-files-from-r
>> > > --- https://tomizonor.wordpress.com/2013/04/17/file-utf8-windows/
>> > > ---
>> > > http://withr.me/configure-character-encoding-for-r-under-linux-and-windows/
>> > >
>> > > I'm not sure whether the issue arises while reading/extracting the data
>> > > from the media or while writing/exporting to a Windows directory, but I
>> > > don't experience a similar issue on my personal Mac machine. I need some
>> > > clarification here.
>> > >
>> > > How could I export the data just as I see it on the web? Please guide.
>> > >
>> > > Regards,
>> > > Sunny
>> > >
>> > > Platform I'm using::::::::::::::::::::::::::::
>> > > Operating System : Windows 7 Professional SP1
>> > > R version details:
>> > > platform x86_64-w64-mingw32
>> > > arch x86_64
>> > > os mingw32
>> > > system x86_64, mingw32
>> > > status
>> > > major 3
>> > > minor 2.3
>> > > year 2015
>> > > month 12
>> > > day 10
>> > > svn rev 69752
>> > > language R
>> > > version.string R version 3.2.3 (2015-12-10)
>> > > nickname Wooden Christmas-Tree
>> > >
>> > > ______________________________________________
>> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > PLEASE do read the posting guide
>> > > http://www.R-project.org/posting-guide.html
>> > > and provide commented, minimal, self-contained, reproducible code.
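
For reference, here is a minimal sketch of the write/read round trip discussed in the thread. It rests on some assumptions: the names path, con, bytes and has_bom are illustrative and not from the thread, and only the first six characters of the sample string are used, written as \u escapes so the script does not depend on the console's code page. Instead of fileEncoding, it writes the UTF-8 bytes directly with useBytes = TRUE and marks the lines as UTF-8 on reading. This is a commonly used workaround in non-UTF-8 locales, not something the thread confirms as the fix. The byte check may also help answer whether an editor really saved a file as UTF-8 (possibly with a byte-order mark).

```r
# First six characters of the sample string, as \u escapes (ASCII-only source).
x <- "\u4e16\u754c\u9910\u798f\u4e8b\u5de5"

# Hypothetical temporary file; the thread used 'x.csv' and 'x.txt'.
path <- tempfile(fileext = ".txt")

# Write the UTF-8 bytes as-is, without re-encoding through the native locale.
con <- file(path, open = "wb")
writeLines(enc2utf8(x), con, useBytes = TRUE)
close(con)

# Inspect the raw bytes: a UTF-8 byte-order mark would be EF BB BF.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
has_bom <- length(bytes) >= 3 &&
  identical(bytes[1:3], as.raw(c(0xEF, 0xBB, 0xBF)))

# Read back, declaring that the lines are UTF-8 rather than converting them.
y <- readLines(path, encoding = "UTF-8")

# TRUE if the string survived the round trip unchanged.
identical(y, x)
```

If has_bom turns out to be TRUE for a file saved from an editor, reading it with fileEncoding = "UTF-8-BOM" may be worth trying. If the first bytes are FF FE, the file is more likely UTF-16 than UTF-8.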