Re: [R] Multiple lines for each record: how do I handle that

Jorge I Velez Wed, 22 Feb 2012 12:09:30 -0800

Hi Luca,

Thank you for the example.  Here is one way of doing what you want (of
course there are many of them!):


# data
d0 <- structure(list(id = c(1, 1, 2, 2, 2, 3),
    v1 = c(NA, 1, NA, 1, NA, 1),
    v2 = structure(c(3L, 1L, 2L, 1L, 1L, 3L), .Label = c("", "no", "yes"), class = "factor"),
    v3 = structure(c(NA, 1L, NA, NA, 3L, 2L), .Label = c("1", "2", "3"), class = "factor")),
    .Names = c("id", "v1", "v2", "v3"), row.names = c(NA, -6L), class = "data.frame")

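# processing
# For each id, keep the non-missing, non-empty entries of every variable.
# Note: apply() works on a character matrix, so all values come back as text.
out <- lapply(split(d0, d0$id), function(l)
    apply(l[, -1], 2, function(x) x[!is.na(x) & x != ""]))
out <- data.frame(do.call(rbind, out))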

# output
cbind(id = unique(d0$id), out)

Perhaps plyr would be a better way ;-)
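
One caveat with the code above: apply() goes through a character matrix,
so v1 and v3 come back as text rather than as numeric and factor. If the
original types matter, here is a minimal base-R sketch (not plyr) that
picks, for every id, the first value in each column that is neither NA
nor "". The helper name collapse_by_id and its key argument are made up
for illustration, and the sketch assumes at most one meaningful value per
id and variable (if there are several, only the first is kept).

```r
# Example data as in Luca's message (v2 kept as character here)
d0 <- data.frame(
  id = c(1, 1, 2, 2, 2, 3),
  v1 = c(NA, 1, NA, 1, NA, 1),
  v2 = c("yes", "", "no", "", "", "yes"),
  v3 = factor(c(NA, 1, NA, NA, 3, 2)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per id, every column keeps its own class.
collapse_by_id <- function(d, key = "id") {
  ids <- unique(d[[key]])
  # start from the key column, one row per distinct id
  out <- d[match(ids, d[[key]]), key, drop = FALSE]
  for (nm in setdiff(names(d), key)) {
    x  <- d[[nm]]
    # usable entries: not NA and not the empty string
    ok <- !is.na(x) & as.character(x) != ""
    # position of the first usable entry for each id (NA if there is none)
    idx <- match(ids, d[[key]][ok])
    # subsetting the column itself preserves numeric/character/factor
    out[[nm]] <- x[ok][idx]
  }
  rownames(out) <- NULL
  out
}

d1 <- collapse_by_id(d0)
d1
sapply(d1, class)
```

Because each column is subset directly instead of being passed through
as.matrix(), sapply(d1, class) should report numeric, numeric, character
and factor for this example. An id with no usable value in some column
gets NA there.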

HTH,
Jorge.-


On Wed, Feb 22, 2012 at 2:49 PM, Luca Meyer <> wrote:

> Sure, I am sorry I have not done that in the first place.
>
> The dataset I have looks like:
>
> id <- c(1,1,2,2,2,3)
> v1 <- c(NA,1,NA,1,NA,1)
> v2 <- as.character(c("yes","","no","","","yes"))
> v3 <- as.factor(c(NA,1,NA,NA,3,2))
> d0 <- data.frame(id,v1,v2,v3)
> d0
>
> What I would need is to derive a dataset that looks like:
>
> id <- c(1,2,3)
> v1 <- c(1,1,1)
> v2 <- as.character(c("yes","no","yes"))
> v3 <- as.factor(c(1,3,2))
> d1 <- data.frame(id,v1,v2,v3)
> d1
>
> The issue is related to the need to have an automated procedure that reads
> in the different variable types and aggregates them accordingly as every
> dataset will be different from the previous in terms of number of variables
> and records involved.
>
> Thank you,
> Luca
>
> On 22 Feb 2012, at 20:26, Sarah Goslee wrote:
>
> > If you provide a small reproducible example of your data format and
> > expected output, I'm sure someone here can offer a useful solution.
> >
> > Without knowing what your data look like, not so easy.
> >
> > Sarah
> >
> > On Wed, Feb 22, 2012 at 2:22 PM, Luca Meyer <> wrote:
> >> Hi Folks,
> >>
> >> I just discovered that my dataset (coming from the QuestionPro platform)
> >> has multiple lines for each respondent id, but what I really need is a
> >> "regular" data matrix where each respondent's data is shown on a single
> >> line.
> >>
> >> Has anyone already developed a procedure that automatically takes the
> >> multiple lines and aggregates them into a single line?
> >>
> >> Thank you in advance,
> >> Luca
> >>
> >> Mr. Luca Meyer
> >> www.lucameyer.com
> >> R version 2.14.1 (2011-12-22)
> >> Mac OS X 10.6.8
> >>
> >>
> > --
> > Sarah Goslee
> > http://www.functionaldiversity.org
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


