Hello!

I imported a DJI survey[1] from an SPSS file. When looking at some of
the variables, I noticed problems with the `table` function and similar
functions. They seem to be caused by duplicate levels that are generated
from the value labels. Not all values have labels, so those that have
none get a blank string (" ") as their level, which leads to duplicates.

I hope the code and output below illustrate the problem. Is it possible
to prevent this? I’d still like to use the labels, so using numeric
vectors instead of factors is not the best solution.

Regards,
Frederik


> library(foreign)
> Data <- read.spss("js2003_16_29_db.sav", to.data.frame=TRUE,
+                   reencode="latin1")
> table(Data$J203_A)

überhaupt nicht wichtig                                                 
                     35                    2256                       0 
                                                                        
                      0                       0                       0 
           sehr wichtig         Mehrfachnennung 
                   4660                       0 
> table(as.numeric(Data$J203_A))

   1    2    3    4    5    6    7 
  35   39   84  227  626 1280 4660 
> is.factor(Data$J203_A)
[1] TRUE
> levels(Data$J203_A)
[1] "überhaupt nicht wichtig" " "                      
[3] " "                       " "                      
[5] " "                       " "                      
[7] "sehr wichtig"            "Mehrfachnennung"        
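
One workaround I can think of, sketched here on made-up data rather than
the survey file: import the codes as numbers (I assume that
`use.value.labels=FALSE` in `read.spss` does this and keeps the labels in
a `value.labels` attribute) and build the factor by hand, falling back to
the numeric code wherever a label is missing. The vectors `codes` and
`value_labels` below are invented for illustration and are not taken from
the data set.

```r
# Made-up stand-in for the imported variable: numeric codes plus value
# labels for only some of the codes (all names here are hypothetical).
codes <- c(1, 2, 2, 3, 7, 7, 7, 4, 5, 6)
value_labels <- c("überhaupt nicht wichtig" = 1,
                  "sehr wichtig"            = 7,
                  "Mehrfachnennung"         = 8)

# All codes that should become levels: the labelled ones plus those observed.
all_codes <- sort(unique(c(codes, value_labels)))

# Use the label where one exists, otherwise fall back to the code itself,
# so that every level name is unique.
idx <- match(all_codes, value_labels)
level_names <- ifelse(is.na(idx),
                      as.character(all_codes),
                      names(value_labels)[idx])

f <- factor(codes, levels = all_codes, labels = level_names)

# Sanity check: no duplicated level names remain.
stopifnot(!anyDuplicated(levels(f)))
table(f)
```

This keeps the existing labels and only fills the gaps, but I do not know
whether it is the recommended way to handle partially labelled variables.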




[1] http://213.133.108.158/surveys/index.php?m=msw,0&sID=54


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.