On 26/01/2009 5:44 PM, Peter Jepsen wrote:
Hi,

I am writing an Sweave document and am using 'xtable' to make frequency tables of diagnoses of people 
undergoing cholecystectomy. Some of these diagnoses contain Danish characters ("æ", "ø", 
and "å"), and these characters are all garbled in the LaTeX document after I run Sweave. The odd 
thing is, everything looks absolutely right in the R console, and if I enter the same Danish characters in a 
new variable, the new variable produces no problems?! Therefore, I cannot offer a reproducible example, but I 
am hoping nonetheless that someone can point me towards a solution.

This looks like an encoding problem: there are several different standards for encoding non-ASCII characters. All of your tools have to agree on the encoding.

To my eye it looks as though in the first case R is writing out UTF-8, and whatever you are using to look at your .tex file is assuming latin1. (Some Windows programs say "ANSI", but I think that doesn't fully specify the encoding: you also need a code page, which is set somewhere in the Windows Control Panel.)
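
As a rough illustration of that kind of mismatch, the sketch below takes the UTF-8 bytes of a Danish word and reinterprets them as latin1, which is what a latin1 viewer would do. It is not based on the original data: the word is typed in with a Unicode escape, and the variable names are made up for the example.

```r
# "ae" ligature written as a Unicode escape, so the example does not
# depend on the encoding of the file this code is saved in.
x <- "Kr\u00e6ft"

# UTF-8 representation: the Danish letter takes two bytes (c3 a6).
utf8_bytes <- charToRaw(enc2utf8(x))
print(utf8_bytes)

# latin1 representation: the same letter is a single byte (e6).
latin1_bytes <- charToRaw(iconv(enc2utf8(x), from = "UTF-8", to = "latin1"))
print(latin1_bytes)

# Treat the UTF-8 bytes as if they were latin1 and convert the result
# to UTF-8 for display. Each of the two bytes becomes its own character.
misread <- iconv(rawToChar(utf8_bytes), from = "latin1", to = "UTF-8")
print(misread)
```

The misread string has one character more than the original, because the single Danish letter has turned into two unrelated latin1 characters.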

The functions related to encodings in R are:

options(encoding="latin1") - set the default encoding assumed for file connections

iconv(x, from="latin1", to="UTF-8") - re-encode entries, converting the bytes of each character from one encoding to the other

Encoding(x) - display the declared encoding of each entry ("unknown" means ASCII or the native encoding for your platform)

Encoding(x) <- "latin1" - change the declared encoding, without changing the bytes.
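
A minimal sketch of how declaring an encoding differs from re-encoding, using a string built from raw bytes so that it behaves the same in any locale. The variables x and y are only for illustration and have nothing to do with the data set in the question.

```r
# Build "Kraeft" (with the Danish ligature) from latin1 bytes; 0xe6 is
# the ligature in latin1. At this point R does not know the encoding.
x <- rawToChar(as.raw(c(0x4b, 0x72, 0xe6, 0x66, 0x74)))

# Declare the encoding. The bytes are left exactly as they were.
Encoding(x) <- "latin1"
print(Encoding(x))
print(charToRaw(x))

# Re-encode. The characters stay the same, but the bytes change:
# the single byte e6 becomes the two bytes c3 a6.
y <- iconv(x, from = "latin1", to = "UTF-8")
print(Encoding(y))
print(charToRaw(y))
```

Applied to the data in this thread, one would first inspect Encoding() of the diagnosis strings (or of levels() if the column is a factor), and only then decide whether the declaration is wrong (fix with Encoding<-) or the bytes need converting (fix with iconv). Which of the two applies depends on what the bytes actually are.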

Duncan Murdoch

To illustrate:

library(xtable)
library(Hmisc)
rm(list=ls())
load("u:/kirurgi/cholecystit/Chol_oprenset.Rdata")
        
test2 <- chol$nydiag[3]      # This 3rd observation contains a diagnosis with Danish characters ("Kræft i fordøjelsessystemet", meaning gastrointestinal cancer).

print(xtable(table(test2)))
% latex table generated in R 2.8.1 by xtable 1.5-4 package
% Mon Jan 26 23:31:37 2009
\begin{table}[ht]
\begin{center}
\begin{tabular}{rr}
  \hline
 & test2 \\
  \hline
Kræft i fordøjelsessystemet &   1 \\        # It looks right here, but in the .tex-file it says "Kræft i fordøjelsessystemet"
   \hline
\end{tabular}
\end{center}
\end{table}

print(xtable(table("Kræft i fordøjelsessystemet")))   # This, on the other hand, works like a charm.
% latex table generated in R 2.8.1 by xtable 1.5-4 package
% Mon Jan 26 23:36:53 2009
\begin{table}[ht]
\begin{center}
\begin{tabular}{rr}
  \hline
 & V1 \\
  \hline
Kræft i fordøjelsessystemet &   1 \\        # See, no problems here!
   \hline
\end{tabular}
\end{center}
\end{table}


I am using Windows Vista 64-bit and MiKTeX 2.7.
Best regards,
Peter.

sessionInfo()
R version 2.8.1 (2008-12-22) i386-pc-mingw32
locale:
LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Hmisc_3.4-4 foreign_0.8-30 xtable_1.5-4
loaded via a namespace (and not attached):
[1] cluster_1.11.12 grid_2.8.1      lattice_0.17-20 tools_2.8.1

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
