Re: [R] Multiple lines for each record: how do I handle that

Luca Meyer Wed, 22 Feb 2012 14:16:44 -0800

Hi Jorge,

The method you suggest works fine on the small sample data set. 
When I apply it to a larger dataset (714 rows by 160 columns), it converts some 
variables from "factor" to "list". How can I change them back to their original 
class automatically?
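
A possible explanation, offered as an assumption rather than something confirmed in this thread: the apply()-based approach returns one value per column only when every id has exactly one usable value in that column. If some id has zero or several non-missing, non-empty values, the per-id results have unequal lengths and can end up stored as list columns. The sketch below counts usable values per id and variable so that such cases can be located; `count_valid` and `d_bad` are hypothetical names introduced for illustration.

```r
# Sample data from the thread (strings kept as character for clarity).
d0 <- data.frame(
  id = c(1, 1, 2, 2, 2, 3),
  v1 = c(NA, 1, NA, 1, NA, 1),
  v2 = c("yes", "", "no", "", "", "yes"),
  v3 = factor(c(NA, 1, NA, NA, 3, 2)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of usable (non-NA, non-empty) values
# per id (rows) and per variable (columns).
count_valid <- function(d, id = "id") {
  vars <- setdiff(names(d), id)
  valid <- vapply(d[vars],
                  function(x) !is.na(x) & as.character(x) != "",
                  logical(nrow(d)))
  rowsum(valid + 0L, group = d[[id]])
}

# In the sample data every id has exactly one usable value per variable.
counts <- count_valid(d0)

# Illustrative variant: id 1 now carries two usable values in v1.
d_bad <- d0
d_bad$v1[1] <- 2
counts_bad <- count_valid(d_bad)

# Rows (ids) and columns (variables) where the count differs from 1.
which(counts_bad != 1, arr.ind = TRUE)
```

Running `count_valid()` on the larger dataset should show which ids and variables deviate from the one-value-per-id pattern, which may be the columns that turn into lists.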


Thanks,
Luca
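
One way to sidestep the conversion altogether, sketched here under the assumption that the first non-missing, non-empty value per id is the one to keep: pick row indices per id and subset each column directly, so that every column keeps its own class (numeric, character, factor). This is not the method proposed in the thread; `collapse_by_id` is a hypothetical helper name.

```r
# Sample data from the thread (strings kept as character for clarity).
d0 <- data.frame(
  id = c(1, 1, 2, 2, 2, 3),
  v1 = c(NA, 1, NA, 1, NA, 1),
  v2 = c("yes", "", "no", "", "", "yes"),
  v3 = factor(c(NA, 1, NA, NA, 3, 2)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per id, keeping for each variable the
# first value that is neither NA nor "" (NA if the id has none).
collapse_by_id <- function(d, id = "id") {
  # Row indices belonging to each id.
  groups <- split(seq_len(nrow(d)), d[[id]], drop = TRUE)
  # Start from the id column, one row per group.
  first_rows <- vapply(groups, function(rows) rows[1], integer(1))
  out <- d[first_rows, id, drop = FALSE]
  for (v in setdiff(names(d), id)) {
    x <- d[[v]]
    valid <- !is.na(x) & as.character(x) != ""
    idx <- vapply(groups, function(rows) {
      hit <- rows[valid[rows]]
      if (length(hit) > 0) hit[1] else NA_integer_
    }, integer(1))
    # Subsetting by index keeps the class (and factor levels) of x.
    out[[v]] <- x[idx]
  }
  rownames(out) <- NULL
  out
}

d1 <- collapse_by_id(d0)
str(d1)
```

Because the columns are never passed through apply() or coerced to a matrix, no class restoration step is needed afterwards. If an id can legitimately hold several values for one variable, a different rule than "first usable value" would have to be chosen.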

On 22 Feb 2012, at 21:05, Jorge I Velez wrote:

> Hi Luca,
> 
> Thank you for the example.  Here is one way of doing what you want (of course 
> there are many of them!):
> 
> # data
> d0 <- structure(list(id = c(1, 1, 2, 2, 2, 3), v1 = c(NA, 1, NA, 1, 
> NA, 1), v2 = structure(c(3L, 1L, 2L, 1L, 1L, 3L), .Label = c("", 
> "no", "yes"), class = "factor"), v3 = structure(c(NA, 1L, NA, 
> NA, 3L, 2L), .Label = c("1", "2", "3"), class = "factor")), .Names = c("id", 
> "v1", "v2", "v3"), row.names = c(NA, -6L), class = "data.frame")
> 
> # processing
> out <- lapply(split(d0, d0$id), function(l) apply(l[,-1], 2, function(x) 
> x[!is.na(x) & x != ""]))
> out <- data.frame(do.call(rbind, out))
> 
> # output
> cbind(id = unique(d0$id), out)
> 
> Perhaps plyr would be a better way ;-)
> 
> HTH,
> Jorge.-
> 
> 
> On Wed, Feb 22, 2012 at 2:49 PM, Luca Meyer <> wrote:
> Sure, I am sorry I have not done that in the first place.
> 
> The datasets I have look like:
> 
> id <- c(1,1,2,2,2,3)
> v1 <- c(NA,1,NA,1,NA,1)
> v2 <- as.character(c("yes","","no","","","yes"))
> v3 <- as.factor(c(NA,1,NA,NA,3,2))
> d0 <- data.frame(id,v1,v2,v3)
> d0
> 
> What I would need is to derive a dataset that looks like:
> 
> id <- c(1,2,3)
> v1 <- c(1,1,1)
> v2 <- as.character(c("yes","no","yes"))
> v3 <- as.factor(c(1,3,2))
> d1 <- data.frame(id,v1,v2,v3)
> d1
> 
> I need an automated procedure that reads in the different variable types 
> and aggregates them accordingly, because every dataset will differ from 
> the previous one in the number of variables and records involved.
> 
> Thank you,
> Luca
> 
> On 22 Feb 2012, at 20:26, Sarah Goslee wrote:
> 
> > If you provide a small reproducible example of your data format and
> > expected output, I'm sure someone here can offer a useful solution.
> >
> > Without knowing what your data look like, not so easy.
> >
> > Sarah
> >
> > On Wed, Feb 22, 2012 at 2:22 PM, Luca Meyer <> wrote:
> >> Hi Folks,
> >>
> >> I just discovered that my dataset (coming from the QuestionPro platform) has 
> >> multiple lines for each respondent id, but what I really need is 
> >> a "regular" data matrix where each respondent's data is shown on a single 
> >> line.
> >>
> >> Has anyone already developed a procedure that automatically takes the 
> >> multiple lines and aggregates them into a single line?
> >>
> >> Thank you in advance,
> >> Luca
> >>
> >> Mr. Luca Meyer
> >> www.lucameyer.com
> >> R version 2.14.1 (2011-12-22)
> >> Mac OS X 10.6.8
> >>
> >>
> > --
> > Sarah Goslee
> > http://www.functionaldiversity.org
> 
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



