On 14/03/2015 11:07, Anthony Damico wrote:
hello, i am trying to replace non-ASCII characters in a character string
with a single space. the iconv() function works as i expect it to on
windows, but on unix, non-ASCII characters are getting replaced with two
spaces instead of one. i suppose i could write a workaround for my code,
but i'm wondering if i'm making some other mistake?
You are (not reading the help, not writing legible English) ...
sub: character string. If not ‘NA’ it is used to replace any
non-convertible bytes in the input.
Note *bytes*, not characters. In UTF-8 'ó' is two bytes; other non-ASCII
characters can take 2, 3 or 4 bytes (in the current Unicode standard;
originally, in principle, up to 6).
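
To make the byte/character distinction concrete, here is a minimal sketch. It assumes the string is UTF-8; the 'ó' is written as the escape `\u00f3` so the example does not depend on how the script file is encoded, and `from = "UTF-8"` is given explicitly instead of the `""` used in the original post (both are choices made for this illustration).

```r
# 'ó' written as an escape so the string is UTF-8 regardless of locale
a <- "cancelaci\u00f3n"

n_chars <- nchar(a, type = "chars")  # counts characters
n_bytes <- nchar(a, type = "bytes")  # counts bytes; 'ó' takes two in UTF-8

# 'sub' is applied once per non-convertible byte, so the two bytes of 'ó'
# each become a space
b <- iconv(a, from = "UTF-8", to = "ASCII", sub = " ")

print(c(chars = n_chars, bytes = n_bytes))
print(b)
```

The byte count exceeds the character count by one, and that extra byte is where the second space comes from.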
We do not know what locale you used on Windows, but in non-CJK locales
there the encoding is single-byte, so characters == bytes.
I guess chartr() will do what you want using a character range.
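
One way to get a single space per character is sketched below. Note that it swaps in gsub() with a negated ASCII range rather than chartr(), because gsub() does not need a replacement string as long as the range it covers. The pattern and the variable name `one_space` are illustrative assumptions, not something from the original thread.

```r
# 'ó' written as an escape so the string is marked as UTF-8
a <- "cancelaci\u00f3n"

# Replace every character outside the ASCII range \x01-\x7F with one space.
# With perl = TRUE and UTF-8 input, matching is done per character rather
# than per byte, so 'ó' is replaced once.
one_space <- gsub("[^\\x01-\\x7F]", " ", a, perl = TRUE)

print(one_space)
```

Unlike iconv(..., sub = " "), this does not convert the encoding; it only substitutes characters, so it should behave the same way on Windows and Unix as long as the input is correctly marked as UTF-8.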
in the output below, this is the result i'm getting:
[1] "cancelaci  n"
and this is the result i want:
[1] "cancelaci n"
thanks!!
=================
getOption( "encoding" )
[1] "windows-1252"
What is the relevance of that?
a <- "cancelación"
iconv(a,"","ASCII")
[1] NA
iconv(a,"","ASCII",sub=" ")
[1] "cancelaci  n"
=================
sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] R.utils_1.34.0 R.oo_1.18.0 R.methodsS3_1.6.1 descr_1.0.4
[5] SAScii_1.0 downloader_0.3 foreign_0.8-61 MonetDB.R_0.9.5
[9] digest_0.6.6 DBI_0.3.1
loaded via a namespace (and not attached):
[1] xtable_1.7-4
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK