On 26/01/2009 5:44 PM, Peter Jepsen wrote:
Hi,
I am writing a Sweave document and am using 'xtable' to make frequency tables of diagnoses of people
undergoing cholecystectomy. Some of these diagnoses contain Danish characters ("æ", "ø",
and "å"), and these characters are all garbled in the LaTeX document after I run Sweave. The odd
thing is that everything looks absolutely right in the R console, and if I enter the same Danish characters in a
new variable, the new variable produces no problems. Therefore, I cannot offer a reproducible example, but I
am hoping nonetheless that someone can point me towards a solution.
This looks like an encoding problem: there are several different
standards for encoding non-ASCII characters. All of your tools have to
agree on the encoding.
To my eye it looks as though in the first case R is writing out UTF-8,
and whatever you are using to look at your .tex file is assuming latin1.
(Some Windows programs say "ANSI", but I think that doesn't fully
specify the encoding: you also need a code page, which is set somewhere
in the Windows Control Panel.)
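
As a rough illustration of that kind of mismatch (a sketch only, not taken from the original data; the variable names are made up for the example): the two bytes below are the UTF-8 encoding of "æ". Declaring the same bytes as latin1 makes R read them as two separate characters, which is the sort of garbling a viewer assuming latin1 would show.

```r
# Hypothetical example: the UTF-8 encoding of "æ" is the two bytes c3 a6.
ae_bytes <- as.raw(c(0xc3, 0xa6))

# Declared as UTF-8, the two bytes are read as one character (U+00E6).
as_utf8 <- rawToChar(ae_bytes)
Encoding(as_utf8) <- "UTF-8"

# Declared as latin1, the same two bytes are read as two characters.
as_latin1 <- rawToChar(ae_bytes)
Encoding(as_latin1) <- "latin1"

# The bytes are identical; only the declared encoding differs.
identical(charToRaw(as_utf8), charToRaw(as_latin1))  # TRUE

utf8ToInt(as_utf8)              # 230: the single code point for "æ"
utf8ToInt(enc2utf8(as_latin1))  # 195 166: two code points, "Ã" and "¦"
```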
The functions related to encodings in R are:
  options(encoding="latin1") - set the default encoding
  iconv(x, from="latin1", to="UTF-8") - re-encode entries, mapping each character from one encoding to the other
  Encoding(x) - display the encoding of each entry ("unknown" means ASCII or the native encoding for your platform)
  Encoding(x) <- "latin1" - change the declared encoding, without changing the bytes
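
A minimal sketch of how these calls fit together, assuming the problem value really is UTF-8 and the rest of the toolchain expects latin1. The variable 'diagnosis' is a hypothetical stand-in for something like chol$nydiag[3]; it is built from explicit bytes so the example does not depend on the encoding of the script file.

```r
# Hypothetical stand-in for one diagnosis value: "Kræft" as UTF-8 bytes.
diagnosis <- rawToChar(as.raw(c(0x4b, 0x72, 0xc3, 0xa6, 0x66, 0x74)))

# Declare the encoding; this does not change the bytes.
Encoding(diagnosis) <- "UTF-8"
Encoding(diagnosis)           # "UTF-8"

# Re-encode to latin1; the two-byte "æ" (c3 a6) becomes the single byte e6.
diagnosis_latin1 <- iconv(diagnosis, from = "UTF-8", to = "latin1")
Encoding(diagnosis_latin1)    # "latin1"
charToRaw(diagnosis_latin1)   # 4b 72 e6 66 74
```

If the diagnoses are stored as a factor, the same iconv() call would presumably be applied to levels() of the factor rather than to the variable itself.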
Duncan Murdoch
To illustrate:
library(xtable)
library(Hmisc)
rm(list=ls())
load("u:/kirurgi/cholecystit/Chol_oprenset.Rdata")
test2 <- chol$nydiag[3] # This 3rd observation contains a diagnosis with Danish characters ("Kræft i fordøjelsessystemet", meaning gastrointestinal cancer).
print(xtable(table(test2)))
% latex table generated in R 2.8.1 by xtable 1.5-4 package
% Mon Jan 26 23:31:37 2009
\begin{table}[ht]
\begin{center}
\begin{tabular}{rr}
\hline
& test2 \\
\hline
Kræft i fordøjelsessystemet & 1 \\ # It looks right here, but in the .tex-file
it says "Kræft i fordøjelsessystemet"
\hline
\end{tabular}
\end{center}
\end{table}
print(xtable(table("Kræft i fordøjelsessystemet"))) # This, on the other hand, works like a charm.
% latex table generated in R 2.8.1 by xtable 1.5-4 package
% Mon Jan 26 23:36:53 2009
\begin{table}[ht]
\begin{center}
\begin{tabular}{rr}
\hline
& V1 \\
\hline
Kræft i fordøjelsessystemet & 1 \\ # See, no problems here!
\hline
\end{tabular}
\end{center}
\end{table}
I am using Windows Vista 64-bit and MiKTeX 2.7.
Best regards,
Peter.
sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32
locale:
LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Hmisc_3.4-4 foreign_0.8-30 xtable_1.5-4
loaded via a namespace (and not attached):
[1] cluster_1.11.12 grid_2.8.1 lattice_0.17-20 tools_2.8.1
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.