Hello again,

Just for the record: I think I found a way to work around the problem
described below, by replacing each blank level with its numeric position
so that all levels are unique. I don’t know if it’s the most elegant way,
but it seems to work.

On Wednesday, 26.08.2009, at 11:55 +0200, Frederik Elwert wrote:
> Hello!
> 
> I imported a DJI survey[1] from an SPSS file. When looking at some of
> the variables, I noticed problems with the `table` function and similar.
> It seems to be caused by duplicate levels which are generated from the
> value labels. Not all values have labels, so those that don’t get a
> blank string (" ") as the level, which leads to duplicates.
> 
> I hope the code and output below illustrate the problem. Is it possible
> to prevent this? I’d still like to use the labels, so using numeric
> vectors instead of factors is not the best solution.
> 
> Regards,
> Frederik
> 
> 
> > library(foreign)
> > Data <- read.spss("js2003_16_29_db.sav", to.data.frame=TRUE, reencode="latin1")
> > table(Data$J203_A)
> 
> überhaupt nicht wichtig                                                 
>                      35                    2256                       0 
>                                                                         
>                       0                       0                       0 
>            sehr wichtig         Mehrfachnennung 
>                    4660                       0 
> > table(as.numeric(Data$J203_A))
> 
>    1    2    3    4    5    6    7 
>   35   39   84  227  626 1280 4660 
> > is.factor(Data$J203_A)
> [1] TRUE
> > levels(Data$J203_A)
> [1] "überhaupt nicht wichtig" " "                      
> [3] " "                       " "                      
> [5] " "                       " "                      
> [7] "sehr wichtig"            "Mehrfachnennung"        

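        # Replace blank (" ") factor levels with their position so levels are unique
        for (i in seq_len(ncol(Data))) {
            if (is.factor(Data[[i]])) {
                lvl <- levels(Data[[i]])
                if (" " %in% lvl) {
                    empty <- lvl == " "
                    # e.g. the second level " " becomes "2"
                    lvl[empty] <- seq_along(lvl)[empty]
                    levels(Data[[i]]) <- lvl
                }
            }
        }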

> table(Data$J203_A)

überhaupt nicht wichtig                       2                       3 
                     35                      39                      84 
                      4                       5                       6 
                    227                     626                    1280 
           sehr wichtig         Mehrfachnennung 
                   4660                       0
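
For anyone who wants to try the idea without the SPSS file, here is a
minimal sketch of the same relabelling step on a plain character vector.
The helper name fill_blank_levels and the small vector of codes are made
up for illustration; only the level strings are taken from the output
above. It treats any whitespace-only label as blank, which is slightly
broader than the loop above.

```r
# Hypothetical helper: replace blank labels with their position number
fill_blank_levels <- function(lvl) {
  blank <- trimws(lvl) == ""                     # TRUE for "" or whitespace-only labels
  lvl[blank] <- as.character(seq_along(lvl))[blank]
  lvl
}

# Level strings as shown by levels(Data$J203_A) above
lvl <- c("überhaupt nicht wichtig", " ", " ", " ", " ", " ",
         "sehr wichtig", "Mehrfachnennung")
fixed <- fill_blank_levels(lvl)

# Made-up numeric codes, only to show that the labels are now usable
codes <- c(1, 7, 7, 3)
x <- factor(codes, levels = 1:8, labels = fixed)
table(x)
```

Because every label is unique after the replacement, table() counts each
code separately instead of collapsing the unlabelled ones.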

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
