You could also try to submit the package to CRAN with a comment about the NOTE. There is interesting information in https://discuss.ropensci.org/t/note-on-utf-8-strings-by-goodpractice-gp/2165/
Good luck!

Maëlle

On Friday, 17 September 2021 13:01:25 CEST, Enrico Schumann <e...@enricoschumann.net> wrote:

On Fri, 17 Sep 2021, Marc Girondot via R-package-devel writes:

> I first posted this question to r-h...@r-project.org, and Bert Gunter
> informed me that it is better suited to this discussion list, which I
> didn't know about.
>
> Hello everyone,
>
> I am a little bit stuck on the problem of including a database with
> UTF-8 strings in a package. When I submit it to CRAN, the checks report
> NOTEs on several Unix systems, and I am trying to find a solution (if
> one exists) to avoid these NOTEs.
>
> The database has references, and some names have non-ASCII characters.
>
> * First, I don't agree at all with the solution proposed here:
>
> https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Encoding-issues
>
> "First, consider carefully if you really need non-ASCII text."
>
> If a language has non-ASCII characters, it is not just to make the
> writing nicer or more complex; it is because they change the pronunciation.
>
> * Then I tried to find a solution to avoid these NOTEs.
>
> For example, here is a reference with UTF-8 characters:
>
> > DatabaseTSD$Reference[211]
>
> [1] Hernández-Montoya, V., Páez, V.P. & Ceballos, C.P. (2017) Effects of
> temperature on sex determination and embryonic development in the
> red-footed tortoise, Chelonoidis carbonarius. Chelonian Conservation and
> Biology 16, 164-171.
>
> When I convert the characters into Unicode code points, I do indeed get
> only ASCII characters. Perfect.
>
> > iconv(DatabaseTSD$Reference[211], "UTF-8", "ASCII", "Unicode")
>
> [1] "Hern<U+00E1>ndez-Montoya, V., P<U+00E1>ez, V.P. & Ceballos, C.P.
> (2017) Effects of temperature on sex determination and embryonic
> development in the red-footed tortoise, Chelonoidis carbonarius.
> Chelonian Conservation and Biology 16, 164-171."
>
> Then I get no NOTEs when I check the package with the database on Unix...
> but how can I print the reference back with the original characters?
>
> Thanks a lot for pointing me to best practices for including databases
> with non-ASCII characters without getting NOTEs when submitting a
> package to CRAN.
>
> Marc
>

WRE in section 1.1.5 says:

  "Any byte will be allowed in a quoted character string but
  ‘\uxxxx’ escapes should be used for non-ASCII characters.
  However, non-ASCII character strings may not be usable in
  some locales and may display incorrectly in others."

So you could try to use such escapes, e.g.

  stringi::stri_escape_unicode("Hernández-Montoya")
  ## [1] "Hern\\u00e1ndez-Montoya"

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
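
On the open question of how to print a reference back with its original characters once it has been stored in the "<U+XXXX>" form: a minimal sketch in base R, assuming the markers look exactly like the ones iconv() produced in the example above. The helper name restore_unicode() is hypothetical and not part of any package, and the test string is written with \u escapes so that the snippet itself stays ASCII.

```r
# Hypothetical helper: turn "<U+XXXX>" markers back into the characters
# they stand for. Uses base R only.
restore_unicode <- function(x) {
  pattern <- "<U\\+([0-9A-Fa-f]{4,8})>"
  m <- gregexpr(pattern, x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(tokens) {
    # Extract the hex code point from each marker and convert it
    # to a one-character UTF-8 string.
    hex <- sub(pattern, "\\1", tokens)
    vapply(strtoi(hex, base = 16L), intToUtf8, character(1))
  })
  x
}

# Shortened version of the reference from the thread, ASCII-only in source.
original <- "Hern\u00e1ndez-Montoya, V., P\u00e1ez, V.P."

# Same conversion as in the thread: non-ASCII characters become <U+XXXX>.
ascii <- iconv(original, "UTF-8", "ASCII", sub = "Unicode")

# Convert back for display.
restored <- restore_unicode(ascii)
print(ascii)
print(restored)
```

One possible use would be to keep the ASCII form in the packaged data and call such a helper only when a reference is displayed. If the data are instead stored with "\uxxxx" escapes via stringi::stri_escape_unicode(), the matching stringi::stri_unescape_unicode() does the reverse conversion.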