This should do it:

> x <- read.table(textConnection("subject hospital date_enrollment hospital_beds
+ 1 hospitalA 1/3/2002 300
+ 2 hospitalA 1/6/2002 300
+ 3 hospitalB 2/4/2002 150
+ 4 hospitalC 3/2/2002 200"), header=TRUE)
> closeAllConnections()
> y <- as.Date(x$date_enrollment, "%m/%d/%Y")
> z <- cbind(x, year=format(y, "%Y"), month=format(y, "%m"))
> # partition the data by year, month and hospital
> z.s <- split(z, list(z$year, z$month, z$hospital), drop=TRUE)
> # now aggregate: one row per hospital per month
> do.call(rbind, lapply(z.s, function(a) data.frame(hospital=a$hospital[1], cases=nrow(a),
+     year=a$year[1], month=a$month[1], beds=a$hospital_beds[1])))
                   hospital cases year month beds
2002.01.hospitalA hospitalA     2 2002    01  300
2002.02.hospitalB hospitalB     1 2002    02  150
2002.03.hospitalC hospitalC     1 2002    03  200
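As a sketch of a more compact alternative to the split/lapply approach, the same counts can probably be obtained with aggregate(). This assumes hospital_beds is constant within each hospital, so it can be used as a grouping variable alongside month, year and hospital. The read.table(text=...) form assumes a reasonably recent version of R, and the column name number_enrolled_subjects is simply taken from the layout requested in the question.

```r
# Same sample data as above, read from an inline string
x <- read.table(text = "subject hospital date_enrollment hospital_beds
1 hospitalA 1/3/2002 300
2 hospitalA 1/6/2002 300
3 hospitalB 2/4/2002 150
4 hospitalC 3/2/2002 200", header = TRUE)

# Derive year and month from the enrollment date
d <- as.Date(x$date_enrollment, "%m/%d/%Y")
x$year <- format(d, "%Y")
x$month <- format(d, "%m")

# Count subjects per month/year/hospital; hospital_beds is carried along
# as a grouping variable (assumed constant within a hospital)
counts <- aggregate(subject ~ month + year + hospital + hospital_beds,
                    data = x, FUN = length)

# The count lands in the 'subject' column; rename it to the requested name
names(counts)[names(counts) == "subject"] <- "number_enrolled_subjects"
print(counts)
```

Note that only hospital/month combinations with at least one enrolled subject appear in the result; months with zero enrollment would have to be filled in separately before fitting a time series model.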

On Mon, Jun 9, 2008 at 1:51 PM, Ricardo Pietrobon <[EMAIL PROTECTED]> wrote:
> Jim, thanks a lot. This does the trick for dates, but what I have
> been struggling the most with is actually the conversion from having
> one subject per row to having one month per row. I didn't explain
> that well at all in my previous email and so let me try again. The
> idea is that the current data set is displayed with one subject per
> row. I would like to have it displayed having one hospital per month
> per row. For example, the new data set would look like this:
>
> month year site      number_enrolled_subjects hospital_beds
> 1     2002 hospitalA 22                       300
>
> meaning that hospital A enrolled 22 subjects in 01/2002, and hospital
> A has 300 beds -- the beds variable is one variable in a vector that
> would display all the covariates for my ARIMA model
>
> your suggestion solved the problem for the dates, but the command I am
> looking for now is something that would count the number of subjects
> per site per month of a year and then display it in the format
> above. any thoughts?
>
> I really appreciate your help
>
> On Mon, Jun 9, 2008 at 1:04 PM, jim holtman <[EMAIL PROTECTED]> wrote:
> > Will something like this work for you:
> >
> >> x <- read.table(textConnection("subject hospital date_enrollment hospital_beds
> > + 1 hospitalA 1/3/2002 300
> > + 2 hospitalA 1/6/2002 300
> > + 3 hospitalB 2/4/2002 150
> > + 4 hospitalC 3/2/2002 200"), header=TRUE)
> >> closeAllConnections()
> >> y <- as.Date(x$date_enrollment, "%m/%d/%Y")
> >> cbind(x, year=format(y, "%Y"), month=format(y, "%m"))
> >   subject  hospital date_enrollment hospital_beds year month
> > 1       1 hospitalA        1/3/2002           300 2002    01
> > 2       2 hospitalA        1/6/2002           300 2002    01
> > 3       3 hospitalB        2/4/2002           150 2002    02
> > 4       4 hospitalC        3/2/2002           200 2002    03
> >
> > On Mon, Jun 9, 2008 at 12:45 PM, Ricardo Pietrobon <[EMAIL PROTECTED]> wrote:
> >>
> >> I currently have a data set describing human subjects enrolled into an
> >> international clinical trial, the name of the hospital enrolling this
> >> human subject, the date when the subject was enrolled, and a vector
> >> with variables representing characteristics of the site (e.g., number
> >> of beds in a hospital). my data set looks like this:
> >>
> >> subject hospital  date_enrollment hospital_beds
> >> 1       hospitalA 1/3/2002        300
> >> 2       hospitalA 1/6/2002        300
> >> 3       hospitalB 2/4/2002        150
> >> 4       hospitalC 3/2/2002        200
> >>
> >> to perform a time series analysis I am now trying to get to a format
> >> that would give me the following variables:
> >>
> >> month year site number_enrolled_subjects hospital_beds
> >>
> >> the data would be displayed on one-month intervals, and number of
> >> subjects clustered around sites.
> >>
> >> any help would be greatly appreciated
> >>
> >> thanks
> >>
> >> Ricardo
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Jim Holtman
> > Cincinnati, OH
> > +1 513 646 9390
> >
> > What is the problem you are trying to solve?
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?