[R-pkg-devel] "found non-ASCII strings" with save(version = 2)

Vincent Arel-Bundock Wed, 05 Feb 2020 05:54:13 -0800

Hi everyone,

My `countrycode` package ships with two data frames that contain character 
strings in several languages: codelist and codelist_panel.


I converted all strings to UTF-8 with the `enc2utf8` function, and I also 
tried several other approaches (the stringi package, etc.). As far as I can 
tell, every string is now valid UTF-8:

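url <- 'https://github.com/vincentarelbundock/countrycode/raw/master/data/codelist.rda'
temp <- tempfile()
# mode = 'wb' keeps the binary .rda file intact on Windows
download.file(url, temp, mode = 'wb')
load(temp)
# keep only the character columns
tmp <- codelist[, sapply(codelist, is.character)]
library(stringi)
# check that every non-missing string is valid UTF-8
all(unlist(lapply(tmp, function(x) stri_enc_isutf8(na.omit(x)))))
[1] TRUE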
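
One caveat about the check above: `stri_enc_isutf8` inspects the bytes of each string, not the encoding mark that R attaches to it (what `Encoding()` reports). Below is a minimal sketch of a complementary check. It assumes the warning may be related to non-ASCII strings whose declared encoding is "unknown"; that link is an assumption, not something confirmed here. The helper `unmarked_non_ascii` and the `toy` data frame are hypothetical names used only for illustration.

```r
# Hypothetical helper: return the non-ASCII strings in a data frame whose
# declared encoding (as reported by Encoding()) is "unknown".
unmarked_non_ascii <- function(df) {
  chr <- df[vapply(df, is.character, logical(1))]
  vals <- as.character(unlist(chr, use.names = FALSE))
  vals <- vals[!is.na(vals)]
  # a string is non-ASCII if any of its bytes is above 127
  has_high_byte <- vapply(
    vals,
    function(s) any(as.integer(charToRaw(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
  vals[has_high_byte & Encoding(vals) == "unknown"]
}

# Toy data (not the real codelist): the same strings with and without a mark.
marked <- c("Germany", "W\u00fcrtemberg", "S\u00e3o Tom\u00e9 and Pr\u00edncipe")
unmarked <- marked
Encoding(unmarked) <- "unknown"  # same bytes, encoding mark dropped
toy <- data.frame(marked = marked, unmarked = unmarked,
                  stringsAsFactors = FALSE)

Encoding(toy$marked)     # declared encodings of the marked column
Encoding(toy$unmarked)   # declared encodings of the unmarked column
unmarked_non_ascii(toy)  # strings that are non-ASCII but carry no mark
```

Running `unmarked_non_ascii(codelist)` on the real data would show whether any strings fall into this category.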

After encoding, I saved the data frames with this command:

save(codelist, file = 'data/codelist.rda', compress = 'xz', version = 2)

Yet, when I run R CMD check, I get the following warning:

checking data for non-ASCII characters ... WARNING
    Warning: found non-ASCII strings
    'W<c3><bc>rtemberg' in object 'codelist'
    'S<c3><a3>o Tom<c3><a9> and Pr<c3><ad>ncipe' in object 'codelist'
    'W<c3><bc>rtemberg' in object 'codelist_panel'
    'S<c3><a3>o Tom<c3><a9> and Pr<c3><ad>ncipe' in object 'codelist_panel'

This warning disappears if I save the data frames using `save(version = 3)`. 
However, I would prefer to use version 2 to keep compatibility with older 
versions of R.
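
If the bytes are already valid UTF-8 and only the encoding mark is missing, one thing to try is declaring the encoding explicitly before saving with version 2. The following is a sketch under that assumption, not a confirmed fix: `mark_utf8` and `toy` are hypothetical names, and whether this silences the R CMD check warning for the real data is untested here.

```r
# Hypothetical helper: declare every character column as UTF-8.
# Encoding<- only changes the mark; the bytes themselves are left untouched,
# so this is only appropriate when the bytes really are UTF-8.
mark_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  for (nm in names(df)[is_chr]) {
    Encoding(df[[nm]]) <- "UTF-8"
  }
  df
}

# Toy data (not the real codelist): UTF-8 bytes with the mark removed.
toy <- data.frame(name = c("Germany", "W\u00fcrtemberg"),
                  stringsAsFactors = FALSE)
Encoding(toy$name) <- "unknown"

toy <- mark_utf8(toy)

# Round trip through a version 2 file and inspect the marks after loading.
path <- tempfile(fileext = ".rda")
save(toy, file = path, compress = "xz", version = 2)
env <- new.env()
load(path, envir = env)
Encoding(env$toy$name)
```

Pure ASCII strings such as "Germany" are never marked, so they keep reporting "unknown"; only the non-ASCII entries should change.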

Does anyone have suggestions for how to handle this? What did I miss?

Thanks a lot for your time!

Vincent

--
Vincent Arel-Bundock

Professeur agrégé / Associate professor
http://arelbundock.com
Université de Montréal, Science politique
3150 rue Jean-Brillant, Pav. Lionel-Groulx, C-4020
Montréal, Québec, Canada, H3T 1N8

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
