Re: [R] replacing unicode characters

Adrian Dusa Fri, 30 Jun 2023 07:35:24 -0700

Right on point, Ivan, that was the issue. The output from l10n_info()
was:


$MBCS
[1] FALSE

$`UTF-8`
[1] FALSE

$`Latin-1`
[1] FALSE

$codeset
[1] "US-ASCII"

(and the locale was just "C")

I simply needed to write something like:
export LC_ALL='en_US.UTF-8'

before starting the child process, and everything looks good now.
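
For anyone launching R from a wrapper script, the same idea can be sketched as a small shell helper. This is only an illustration: run_with_utf8 is a hypothetical name, and it assumes that en_US.UTF-8 is actually installed on the machine (something like `locale -a` should list it). It leaves an existing UTF-8 locale alone and overrides LC_ALL only for the child command:

```shell
#!/bin/sh
# Hypothetical helper (not from R or the original setup): run a command with a
# UTF-8 locale unless the effective locale already names UTF-8.
# Assumption: the en_US.UTF-8 locale is installed on this system.
run_with_utf8() {
    # Effective character-type locale: LC_ALL overrides LC_CTYPE, which overrides LANG.
    case "${LC_ALL:-${LC_CTYPE:-${LANG:-}}}" in
        *[Uu][Tt][Ff]-8*|*[Uu][Tt][Ff]8*)
            # Already a UTF-8 locale: run the command unchanged.
            "$@"
            ;;
        *)
            # "C", empty, or another non-UTF-8 locale: override for this command only.
            LC_ALL='en_US.UTF-8' "$@"
            ;;
    esac
}

# Example call (requires R on the PATH, so it is left commented out):
# run_with_utf8 Rscript -e 'cat("\u00e7\n")'
```

Setting the variable only for the child command keeps the parent environment untouched, which may be preferable to a global export when other processes rely on the "C" locale.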

Thanks a lot, much obliged,
Adrian


On Fri, Jun 30, 2023 at 2:10 PM Ivan Krylov <krylov.r...@gmail.com> wrote:

> On Fri, 30 Jun 2023 11:33:34 +0300
> Adrian Dușa <dusa.adr...@unibuc.ro> wrote:
>
> > In a very simple test, I tried creating a text file from the Electron
> > app embedded R:
> > sink("test.txt")
> > cat("\u00e7")
> > sink()
> >
> > which resulted in:
> >
> > <U+00E7>
> >
> > I don't quite understand how this works. My best guess is that it
> > matters less how R interprets these characters than how they are
> > passed through the child process that started R.
>
> Something goes wrong with the locale setting when the R child process
> is being launched. For example,
>
> Rscript -e 'cat("\ue7\n")'
> # ç
>
> but:
> LC_ALL=C Rscript -e 'cat("\ue7\n")'
> # <U+00E7>
>
> When preparing \ue7 for output, R decides that it's not representable
> in the session encoding. What's the output of sessionInfo() and
> l10n_info() in the child process?
>
> --
> Best regards,
> Ivan
>


