There is probably an easier way to do this, but the following handles the
missing values, scoring an unranked color as 0:
> set.seed(42)
> # stringsAsFactors = TRUE keeps the colors as factors in R >= 4.0.0
> mydf <- data.frame(t(replicate(100, sample(c("red", "blue",
+ "green", "yellow", NA), 4))), stringsAsFactors = TRUE)
> colnames(mydf) <- c("rank1", "rank2", "rank3", "rank4")
> head(mydf)
   rank1  rank2  rank3  rank4
1   <NA> yellow    red   blue
2 yellow  green   <NA>    red
3 yellow  green   blue   <NA>
4   <NA>   blue yellow  green
5   <NA>    red   blue  green
6   <NA>    red  green   blue
> lvls <- levels(mydf$rank1)
> # convert color factors to numeric
> for (i in seq_along(mydf)) mydf[,i] <- as.numeric(mydf[,i])
> # stack the columns
> mydf2 <- stack(mydf)
> # convert rank factor to numeric
> mydf2$ind <- as.numeric(mydf2$ind)
> # add row numbers
> mydf2 <- data.frame(rows=1:100, mydf2)
> # Create table
> mytbl <- xtabs(ind~rows+values, mydf2)
> # convert to data frame
> mydf3 <- data.frame(unclass(mytbl))
> colnames(mydf3) <- lvls
> head(mydf3)
  blue green red yellow
1    4     0   3      2
2    0     2   4      1
3    3     2   0      1
4    2     4   0      3
5    3     4   2      0
6    4     3   2      0
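
For comparison, here is a minimal sketch of a shorter route; it is not the approach used above. match() looks up each color within a respondent's row, so the position it returns is that color's rank, and a color the respondent left out comes back as NA. The names `colours`, `rank_df`, and `scores` are made up for this illustration, and coding an unranked color as 0 simply mirrors the table above.

```r
# Hypothetical names throughout; the data mimic the simulated survey above.
set.seed(42)
colours <- c("red", "blue", "green", "yellow")
rank_df <- data.frame(t(replicate(100, sample(c(colours, NA), 4))),
                      stringsAsFactors = FALSE)
colnames(rank_df) <- paste0("rank", 1:4)

lvls <- sort(colours)  # output columns in alphabetical order

# For each respondent (row), find the position of every color in that row.
# match() returns NA for a color that the respondent did not rank.
scores <- t(apply(as.matrix(rank_df), 1, function(r) match(lvls, r)))
colnames(scores) <- lvls

# Code unranked colors as 0, as in the table above (skip this to keep NA).
scores[is.na(scores)] <- 0
scores <- as.data.frame(scores)
head(scores)
```

Because the random number generator behind sample() has changed across R versions, the rows printed by head() may not match the ones shown above even with the same seed.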
David C
-----Original Message-----
From: [email protected] [mailto:[email protected]] On
Behalf Of Simon Kiss
Sent: Friday, August 15, 2014 3:58 PM
To: [email protected]
Subject: Re: [R] Turn Rank Ordering Into Numerical Scores By Transposing A Data
Frame
Both of the suggestions I got work very well, but what I didn't realize is that
NA values would cause serious problems. Where there is a missing value, passing
na.last=NA to order() just returns the order of the remaining factor levels and
drops the missing values, so I have no idea where they occur, or rather which
of the variables were actually missing.
Have I explained this problem sufficiently?
I didn't think it would cause such a problem, so I didn't include it in the
original problem definition.
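
To make the difficulty concrete, here is a small sketch using a single made-up respondent (the vector `resp` and the name `lvls` are illustrative only, not taken from the survey data). With na.last = NA, order() returns one position fewer, and nothing in the result says which colour went unranked:

```r
# One hypothetical respondent who left rank1 blank and never ranked green.
resp <- c(rank1 = NA, rank2 = "yellow", rank3 = "red", rank4 = "blue")
lvls <- c("blue", "green", "red", "yellow")

# Default: the NA is placed last, so four positions come back.
order(resp)

# na.last = NA removes the NA, so only three positions come back. Lining
# them up against the four colour names would mislabel them.
short <- order(resp, na.last = NA)
length(short)

# The unranked colour has to be recovered separately, e.g. with setdiff().
setdiff(lvls, resp)
```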
Yours, Simon
On Jul 25, 2014, at 4:58 PM, David L Carlson <[email protected]> wrote:
> I think this gets what you want. But your data are not reproducible since
> they are randomly drawn without setting a seed and the two data sets have no
> relationship to one another.
>
>> set.seed(42)
>> mydf <- data.frame(t(replicate(100, sample(c("red", "blue",
> + "green", "yellow")))), stringsAsFactors = TRUE)
>> colnames(mydf) <- c("rank1", "rank2", "rank3", "rank4")
>> mydf2 <- data.frame(t(apply(mydf, 1, order)))
>> colnames(mydf2) <- levels(mydf$rank1)
>> head(mydf)
>    rank1  rank2  rank3  rank4
> 1 yellow  green    red   blue
> 2  green   blue yellow    red
> 3  green yellow    red   blue
> 4 yellow    red  green   blue
> 5 yellow    red  green   blue
> 6 yellow    red   blue  green
>> head(mydf2)
>   blue green red yellow
> 1    4     2   3      1
> 2    2     1   4      3
> 3    4     1   3      2
> 4    4     3   2      1
> 5    4     3   2      1
> 6    3     4   2      1
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On
> Behalf Of Simon Kiss
> Sent: Friday, July 25, 2014 2:34 PM
> To: [email protected]
> Subject: [R] Turn Rank Ordering Into Numerical Scores By Transposing A Data
> Frame
>
> Hello:
> I have data that looks like mydf, below. It is the results of a survey where
> participants were to put a number of statements (in this case colours) in
> their order of preference. In this case, the rank number is the variable, and
> the factor level for each respondent is which colour they assigned to that
> rank. I would like to find a way to effectively transpose the data frame so
> that it looks like mydf2, also below, where the colours the participants were
> able to choose are the variables and the variable score is what that person
> ranked that variable.
>
> Ultimately what I would like to do is a factor analysis on these items, so
> I'd like to be able to see if people ranked red and yellow higher together
> but ranked green and blue together lower, that sort of thing.
> I have played around with different variations of t(), melt(), ifelse() and
> if() but can't find a solution.
> Thank you
> Simon
> #Reproducible code
> mydf<-data.frame(rank1=sample(c('red', 'blue', 'green', 'yellow'),
> replace=TRUE, size=100), rank2=sample(c('red', 'blue', 'green', 'yellow'),
> replace=TRUE, size=100), rank3=sample(c('red', 'blue', 'green', 'yellow'),
> replace=TRUE, size=100), rank4=sample(c('red', 'blue', 'green', 'yellow'),
> replace=TRUE, size=100))
>
> mydf2<-data.frame(red=sample(c(1,2,3,4),
> replace=TRUE,size=100),blue=sample(c(1,2,3,4),
> replace=TRUE,size=100),green=sample(c(1,2,3,4), replace=TRUE,size=100)
> ,yellow=sample(c(1,2,3,4), replace=TRUE,size=100))
> *********************************
> Simon J. Kiss, PhD
> Assistant Professor, Wilfrid Laurier University
> 73 George Street
> Brantford, Ontario, Canada
> N3T 2C9
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
*********************************
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606