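Hello again,

Just for your information, I think I found a way to work around the problem described below. I don’t know if it’s the most elegant way, but it seems to work. The loop that follows the quoted message replaces each blank level label with its position among the levels, so that the levels are unique again.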
On Wednesday, 26.08.2009, at 11:55 +0200, Frederik Elwert wrote:
> Hello!
>
> I imported a DJI survey[1] from an SPSS file. When looking at some of
> the variables, I noticed problems with the `table` function and similar.
> It seems to be caused by duplicate levels which are generated from the
> value labels. Not all values have labels, so those that don’t get an
> empty string as the level, which leads to duplicates.
>
> I hope the code and output below illustrate the problem. Is it possible
> to prevent this? I’d still like to use the labels, so using numeric
> vectors instead of factors is not the best solution.
>
> Regards,
> Frederik
>
>
> > library(foreign)
> > Data <- read.spss("js2003_16_29_db.sav", to.data.frame=TRUE,
>                     reencode="latin1")
> > table(Data$J203_A)
>
> überhaupt nicht wichtig
>                      35                    2256                       0
>
>                       0                       0                       0
>            sehr wichtig         Mehrfachnennung
>                    4660                       0
> > table(as.numeric(Data$J203_A))
>
>    1    2    3    4    5    6    7
>   35   39   84  227  626 1280 4660
> > is.factor(Data$J203_A)
> [1] TRUE
> > levels(Data$J203_A)
> [1] "überhaupt nicht wichtig" " "
> [3] " "                       " "
> [5] " "                       " "
> [7] "sehr wichtig"            "Mehrfachnennung"

# Replace every blank level label with its position among the levels,
# so that each factor has unique levels again.
for (i in seq_len(ncol(Data))){
  if (is.factor(Data[,i])){
    lvl <- levels(Data[,i])
    if (" " %in% lvl){
      empty <- lvl == " "                          # which labels are blank
      lvl[empty] <- (1:length(lvl))[empty]         # use the index as label
      levels(Data[,i]) <- lvl
    }
  }
}

> table(Data$J203_A)

überhaupt nicht wichtig                       2                       3
                     35                      39                      84
                      4                       5                       6
                    227                     626                    1280
           sehr wichtig         Mehrfachnennung
                   4660                       0
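In case it is useful to anyone, the same idea can also be written as two small functions instead of a loop. This is only a sketch: `fill_blank_levels` and `fix_factors` are names I made up for illustration (they are not part of `foreign` or base R), and the toy data frame stands in for the real survey data. I also treat any whitespace-only label as blank, which is slightly broader than the `" "` comparison above.

```r
# Hypothetical helper: replace blank (empty or whitespace-only) level
# labels with their position among the levels.
fill_blank_levels <- function(lvl) {
  blank <- !nzchar(trimws(lvl))
  lvl[blank] <- as.character(seq_along(lvl))[blank]
  lvl
}

# Hypothetical helper: apply fill_blank_levels() to every factor column
# of a data frame and leave all other columns untouched.
fix_factors <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  df[is_fac] <- lapply(df[is_fac], function(f) {
    levels(f) <- fill_blank_levels(levels(f))
    f
  })
  df
}

# Toy data (an assumption, not the survey file): one factor whose middle
# value has no label, plus a numeric column that must stay as it is.
demo <- data.frame(
  answer = factor(c(1, 2, 2, 3), levels = 1:3,
                  labels = c("low", " ", "high")),
  id = 1:4
)

fixed <- fix_factors(demo)
levels(fixed$answer)
table(fixed$answer)
```

With the real data the call would simply be `Data <- fix_factors(Data)`. Working on the character vector of levels first keeps the relabelling step easy to check on its own.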