Hi:

Here's one approach:

# date typo fixed in record 5 - changed 35 to 5
tC <- textConnection("
Subject Date parameter1
bob 3/2/99 10
bob 4/2/99 10
bob 5/5/99 10
bob 6/27/99 NA
bob 8/5/01 10
bob 3/2/02 10
steve 1/2/99 4
steve 2/2/00 7
steve 3/2/01 10
steve 4/2/02 NA
steve 5/2/03 16
kevin 6/5/04 24
")
dat <- read.table(tC, header=TRUE, stringsAsFactors = FALSE)
close.connection(tC)
rm(tC)

# Convert Date to an object of class Date
dat <- transform(dat, date = as.Date(Date, format = '%m/%d/%y'))

# You could do this with transform() and the by() function, but
# here is another way to use the min date per person as time 0
# using package plyr; mutate is a faster alternative to transform
# and can be used for groupwise operations inside of ddply():
library('plyr')
dat <- ddply(dat, .(Subject), mutate, days = as.numeric(date - min(date)))

# Since Kevin has one record, want to return NAs for his coefficients
# The function f returns NA if there are less than three observations
# per subgroup; you can change 3 to 2 if you like. Otherwise, it returns
# the coefficients of the least squares line as a data frame.
f <- function(d) {
    if(nrow(d) < 3) {
        return(data.frame(intercept = NA, slope = NA))
    } else {
        p <- coef(lm(parameter1 ~ days, data = d))
        data.frame(intercept = p[1], slope = p[2])
    }
}

# Apply the function to each person's sub-data frame
ddply(dat, .(Subject), f)

  Subject intercept       slope
1     bob 10.000000 0.000000000
2   kevin        NA          NA
3   steve  3.998485 0.007591638

Another option is to use the lmList() function in the nlme package.
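
The question also asked how to add the resulting slopes to the data set as a new column. The following is only a minimal base R sketch of one way to do that, along the lines of the lapply(split(...)) idea in the original post rather than the plyr code above. The names fit_one, fits and dat2 are made up for illustration, and unlike f() above, fit_one() counts non-missing values of parameter1 rather than rows, which is an assumption about what "observations" should mean here.

```r
# Same data as above, with the date typo in record 5 already fixed
dat <- read.table(text = "
Subject Date parameter1
bob 3/2/99 10
bob 4/2/99 10
bob 5/5/99 10
bob 6/27/99 NA
bob 8/5/01 10
bob 3/2/02 10
steve 1/2/99 4
steve 2/2/00 7
steve 3/2/01 10
steve 4/2/02 NA
steve 5/2/03 16
kevin 6/5/04 24
", header = TRUE, stringsAsFactors = FALSE)

dat$date <- as.Date(dat$Date, format = "%m/%d/%y")

# Days since each subject's first date, so that time 0 is per person
dat$days <- ave(as.numeric(dat$date), dat$Subject,
                FUN = function(x) x - min(x))

# Hypothetical helper: one least squares line per subject, or NA when
# there are fewer than three non-missing responses (assumed threshold)
fit_one <- function(d) {
  if (sum(!is.na(d$parameter1)) < 3) {
    return(data.frame(Subject = d$Subject[1],
                      intercept = NA_real_, slope = NA_real_))
  }
  p <- coef(lm(parameter1 ~ days, data = d))
  data.frame(Subject = d$Subject[1],
             intercept = unname(p[1]), slope = unname(p[2]))
}

# One row of coefficients per subject
fits <- do.call(rbind, lapply(split(dat, dat$Subject), fit_one))

# Attach each subject's slope to every one of that subject's rows
dat2 <- merge(dat, fits[, c("Subject", "slope")],
              by = "Subject", all.x = TRUE)
```

Note that merge() returns the rows sorted by Subject, so dat2 may not be in the original row order; the slope column repeats the same value on every row belonging to a subject.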

HTH,
Dennis

On Mon, Sep 12, 2011 at 12:42 AM, marcel <marcelcur...@gmail.com> wrote:
> I have data of the form
>
> tC <- textConnection("
> Subject Date parameter1
> bob 3/2/99 10
> bob 4/2/99 10
> bob 5/5/99 10
> bob 6/27/99 NA
> bob 8/35/01 10
> bob 3/2/02 10
> steve 1/2/99 4
> steve 2/2/00 7
> steve 3/2/01 10
> steve 4/2/02 NA
> steve 5/2/03 16
> kevin 6/5/04 24
> ")
> data <- read.table(header=TRUE, tC)
> close.connection(tC)
> rm(tC)
>
> I am trying to calculate rate of change of parameter1 in units/day for each
> person. I think I need something like:
> "lapply(split(mydata, mydata$ppt), function(x) lm(parameter1 ~ day,
> data=x))"
>
> I am not sure how to handle the dates in order to have the first day for
> each person be time = 0, and the remaining dates to be handled as days since
> time 0. Also, is there a way to add the resulting slopes to the data set as
> a new column?
>
> Thanks,
> Marcel
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/regression-on-data-subsets-in-datafile-tp3806743p3806743.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>