Jim, thanks a lot. This does the trick for the dates, but what I have been struggling with most is the conversion from one subject per row to one month per row. I didn't explain that well in my previous email, so let me try again. The current data set has one subject per row. I would like it to have one row per hospital per month. For example, the new data set would look like this:

month year site      number_enrolled_subjects hospital_beds
1     2002 hospitalA 22                       300

meaning that hospital A enrolled 22 subjects in 01/2002, and hospital A has 300 beds. The beds variable is one variable in a vector that would hold all the covariates for my ARIMA model.

Your suggestion solved the problem for the dates, but the command I am looking for now is something that would count the number of subjects per site per month of a year and then display the result in the format above.

Any thoughts? I really appreciate your help.

On Mon, Jun 9, 2008 at 1:04 PM, jim holtman <[EMAIL PROTECTED]> wrote:
> Will something like this work for you:
>
> > x <- read.table(textConnection("subject hospital date_enrollment hospital_beds
> + 1 hospitalA 1/3/2002 300
> + 2 hospitalA 1/6/2002 300
> + 3 hospitalB 2/4/2002 150
> + 4 hospitalC 3/2/2002 200"), header=TRUE)
> > closeAllConnections()
> > y <- as.Date(x$date_enrollment, "%m/%d/%Y")
> > cbind(x, year=format(y, "%Y"), month=format(y, "%m"))
>   subject  hospital date_enrollment hospital_beds year month
> 1       1 hospitalA        1/3/2002           300 2002    01
> 2       2 hospitalA        1/6/2002           300 2002    01
> 3       3 hospitalB        2/4/2002           150 2002    02
> 4       4 hospitalC        3/2/2002           200 2002    03
>
>
> On Mon, Jun 9, 2008 at 12:45 PM, Ricardo Pietrobon <[EMAIL PROTECTED]> wrote:
>>
>> I currently have a data set describing human subjects enrolled into an
>> international clinical trial, the name of the hospital enrolling this
>> human subject, the date when the subject was enrolled, and a vector
>> with variables representing characteristics of the site (e.g., number
>> of beds in a hospital). My data set looks like this:
>>
>> subject hospital  date_enrollment hospital_beds
>> 1       hospitalA 1/3/2002        300
>> 2       hospitalA 1/6/2002        300
>> 3       hospitalB 2/4/2002        150
>> 4       hospitalC 3/2/2002        200
>>
>> To perform a time series analysis I am now trying to get to a format
>> that would give me the following variables:
>>
>> month year site number_enrolled_subjects hospital_beds
>>
>> The data would be displayed on one-month intervals, and number of
>> subjects clustered around sites.
>>
>> Any help would be greatly appreciated.
>>
>> Thanks,
>>
>> Ricardo
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
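
One possible way to get the per-site, per-month counts asked about above is base R's aggregate(). What follows is only a minimal sketch, not necessarily the approach this thread settled on. It reuses the four sample rows from the thread and assumes that hospital_beds is constant within a hospital, so that it can be used as a grouping variable and carried through to the result. The names d and counts are made up for illustration.

```r
# Sample data from the thread (four subjects, three hospitals).
x <- read.table(text = "subject hospital date_enrollment hospital_beds
1 hospitalA 1/3/2002 300
2 hospitalA 1/6/2002 300
3 hospitalB 2/4/2002 150
4 hospitalC 3/2/2002 200", header = TRUE)

# Parse the enrollment dates and derive numeric year and month columns.
d <- as.Date(x$date_enrollment, "%m/%d/%Y")
x$year  <- as.integer(format(d, "%Y"))
x$month <- as.integer(format(d, "%m"))

# Count subjects per hospital per month. Grouping on hospital_beds as well
# carries the site-level covariate into the result (assumption: it does not
# vary within a hospital).
counts <- aggregate(subject ~ month + year + hospital + hospital_beds,
                    data = x, FUN = length)

# Rename and reorder the columns to match the requested layout.
names(counts)[names(counts) == "hospital"] <- "site"
names(counts)[names(counts) == "subject"]  <- "number_enrolled_subjects"
counts <- counts[, c("month", "year", "site",
                     "number_enrolled_subjects", "hospital_beds")]
print(counts)
```

Note that aggregate() only returns site and month combinations that actually occur in the data. For a regularly spaced series, months with no enrollments would still need to be added with a count of zero.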