Hi everyone,

My `countrycode` package ships with two data frames containing character strings in several languages: `codelist` and `codelist_panel`.
I converted all strings to UTF-8 using the `enc2utf8` function, and I also tried several other approaches (the stringi package, etc.). As far as I can tell, the strings are all in UTF-8 format now:

    url <- 'https://github.com/vincentarelbundock/countrycode/raw/master/data/codelist.rda'
    temp <- tempfile()
    download.file(url, temp)
    load(temp)
    tmp <- codelist[, sapply(codelist, is.character)]
    library(stringi)
    all(unlist(lapply(tmp, function(x) stri_enc_isutf8(na.omit(x)))))
    [1] TRUE

After encoding, I saved the data frames with this command:

    save(codelist, file = 'data/codelist.rda', compress = 'xz', version = 2)

Yet, when I run R CMD check, I get the following warning:

    checking data for non-ASCII characters ... WARNING
      Warning: found non-ASCII strings
      'W<c3><bc>rtemberg' in object 'codelist'
      'S<c3><a3>o Tom<c3><a9> and Pr<c3><ad>ncipe' in object 'codelist'
      'W<c3><bc>rtemberg' in object 'codelist_panel'
      'S<c3><a3>o Tom<c3><a9> and Pr<c3><ad>ncipe' in object 'codelist_panel'

This warning disappears if I save the data frames using `save(version = 3)`. However, I would prefer to use version 2 to keep compatibility with older versions of R.

Does anyone have suggestions for how to handle this? What did I miss?

Thanks a lot for your time!

Vincent

--
Vincent Arel-Bundock
Professeur agrégé / Associate professor
http://arelbundock.com

Université de Montréal, Science politique
3150 rue Jean-Brillant, Pav. Lionel-Groulx, C-4020
Montréal, Québec, Canada, H3T 1N8
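
P.S. A possible complementary check, offered as a minimal sketch rather than a diagnosis: `stri_enc_isutf8` tests whether the bytes of a string form valid UTF-8, which is not the same thing as the encoding mark that R stores with each string and that `Encoding()` reports. The helper below, `non_ascii_encodings`, is a hypothetical name (it is not part of countrycode, stringi, or R CMD check); it lists every non-ASCII string in the character columns of a data frame together with its declared encoding. Whether the encoding mark is what R CMD check reacts to here is an assumption, not something confirmed in this post.

```r
# Hypothetical helper: report the declared encoding of each non-ASCII string
# found in the character columns of a data frame.
non_ascii_encodings <- function(df) {
  # Keep only character columns and flatten them into one vector.
  chr <- df[vapply(df, is.character, logical(1))]
  x <- as.character(unlist(chr, use.names = FALSE))
  x <- x[!is.na(x)]

  # A string is non-ASCII if any of its bytes is above 127.
  has_high_byte <- vapply(
    x,
    function(s) any(as.integer(charToRaw(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )

  hits <- x[has_high_byte]
  data.frame(
    string = hits,
    encoding = Encoding(hits),  # "UTF-8", "latin1", "bytes" or "unknown"
    stringsAsFactors = FALSE
  )
}

# Toy data standing in for codelist (assumption: the real object is loaded
# separately with load()).
marked <- "S\u00e3o Tom\u00e9"     # \u escapes yield a string marked as UTF-8
unmarked <- marked
Encoding(unmarked) <- "unknown"    # same bytes, encoding mark removed

demo <- data.frame(
  marked = c("Canada", marked, NA),
  unmarked = c("Aruba", unmarked, NA),
  n = 1:3,
  stringsAsFactors = FALSE
)

res <- non_ascii_encodings(demo)
print(res)
```

Running `non_ascii_encodings(codelist)` after `load()` would show whether any of the strings named in the warning carry an "unknown" mark despite containing valid UTF-8 bytes.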