Hello! I imported a DJI survey[1] from an SPSS file. When looking at some of the variables, I noticed problems with the `table` function and similar functions. They seem to be caused by duplicate levels that are generated from the value labels. Not all values have labels, so those that don't get an empty string as the level, which leads to duplicates.
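The mechanism can be sketched without the data file. The codes, labels, and responses below are made up for illustration (only the three label texts are borrowed from the variable in question), and the fallback to the numeric code is just one possible way to keep levels unique, not something `read.spss` is known to do here:

```r
# Hypothetical codes and value labels; not read from the survey file.
codes <- 1:8
value_labels <- c("überhaupt nicht wichtig" = 1,
                  "sehr wichtig"            = 7,
                  "Mehrfachnennung"         = 8)

# Look up a label for every code; codes without a label get "".
labs <- names(value_labels)[match(codes, value_labels)]
labs[is.na(labs)] <- ""
anyDuplicated(labs) > 0   # the empty label occurs more than once

# Fall back to the code itself where no label exists, so levels stay unique.
labs_unique <- ifelse(labs == "", as.character(codes), labs)

# Made-up responses, turned into a factor with unique levels.
x <- c(1, 2, 2, 3, 7, 7, 7, 4)
f <- factor(x, levels = codes, labels = labs_unique)
table(f)
```

With unique levels, `table` counts each unlabelled value separately instead of lumping them together.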
I hope the code and output below illustrate the problem. Is it possible to prevent this? I'd still like to use the labels, so using numeric vectors instead of factors is not the best solution.

Regards,
Frederik

> library(foreign)
> Data <- read.spss("js2003_16_29_db.sav", to.data.frame=TRUE, reencode="latin1")
> table(Data$J203_A)

überhaupt nicht wichtig
                     35            2256               0               0               0               0
           sehr wichtig Mehrfachnennung
                   4660               0
> table(as.numeric(Data$J203_A))

   1    2    3    4    5    6    7
  35   39   84  227  626 1280 4660
> is.factor(Data$J203_A)
[1] TRUE
> levels(Data$J203_A)
[1] "überhaupt nicht wichtig" " "
[3] " "                       " "
[5] " "                       " "
[7] "sehr wichtig"            "Mehrfachnennung"

[1] http://213.133.108.158/surveys/index.php?m=msw,0&sID=54
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.