On Dec 28, 10:27 pm, <bill.venab...@csiro.au> wrote:
> Dear 'analyst41' (it would be a courtesy to know who you are)
>
> Here is a low-level way to do it.
>
> First create some dummy data
>
> > allDates <- seq(as.Date("2010-01-01"), by = 1, length.out = 50)
> > client_ID <- sample(LETTERS[1:5], 50, replace = TRUE)
> > value <- 1:50
> > date <- sample(allDates)
> > clientData <- data.frame(client_ID, date, value)
>
> At this point clientData has 50 rows, with 5 clients, each with a sample of
> dates. Everything is in random order except "value".
>
> Now write a little function to fill out a subset of the data consisting of
> one client's data only:
>
> > fixClient <- function(cData) {
> +   dateRange <- range(cData$date)
> +   dates <- seq(dateRange[1], dateRange[2], by = 1)
> +   fullSet <- data.frame(client_ID = as.character(cData$client_ID[1]),
> +                         date = dates, value = NA)
> +
> +   fullSet$value[match(cData$date, dates)] <- cData$value
> +   fullSet
> + }
>
> Now split up the data, apply the fixClient function to each section and
> re-combine them again:
>
> > allData <- do.call(rbind,
> +                    lapply(split(clientData, clientData$client_ID),
> +                           fixClient))
>
> Check:
>
> > head(allData)
>     client_ID       date value
> A.1         A 2010-01-04    36
> A.2         A 2010-01-05    18
> A.3         A 2010-01-06    NA
> A.4         A 2010-01-07    NA
> A.5         A 2010-01-08    NA
> A.6         A 2010-01-09    49
>
> Seems OK. At this point the data are in sorted order by client and date, but
> that should not matter.
>
> Bill Venables.
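For comparison, here is a minimal sketch of the same fill done with merge()
instead of match(). It is only an illustration, not part of the quoted
message: it assumes the three columns shown above (client_ID, date, value)
and at most one row per client per date, and fillClient and filled are names
made up for this example.

```r
# Illustrative data with the same shape as clientData in the quoted message.
set.seed(1)
allDates <- seq(as.Date("2010-01-01"), by = 1, length.out = 50)
clientData <- data.frame(
  client_ID = sample(LETTERS[1:5], 50, replace = TRUE),
  date      = sample(allDates),
  value     = 1:50,
  stringsAsFactors = FALSE
)

# fillClient is a hypothetical helper: build the full daily grid for one
# client, then left-join that client's observed rows onto the grid.
fillClient <- function(cData) {
  grid <- data.frame(
    client_ID = as.character(cData$client_ID[1]),
    date      = seq(min(cData$date), max(cData$date), by = 1),
    stringsAsFactors = FALSE
  )
  # all.x = TRUE keeps every grid row; days with no observation get value NA.
  merge(grid, cData, by = c("client_ID", "date"), all.x = TRUE)
}

# Same split / apply / recombine pattern as in the quoted message.
filled <- do.call(rbind,
                  lapply(split(clientData, clientData$client_ID), fillClient))

head(filled)
```

merge() sorts by the key columns by default, so each client's rows come out
in date order, as in the quoted solution.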
It is of course a great honor to receive a reply from you (but please
allow me to continue to be an anonymous source of bits and bytes over
the net).

This is a neat solution, but please watch this space to see my dumber
version (the code might need to be changed to a procedural language
eventually).

Thank you.

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of analys...@hotmail.com
> Sent: Wednesday, 29 December 2010 10:45 AM
> To: r-h...@r-project.org
> Subject: [R] filling up holes
>
> I have a data frame with three columns
>
> client ID | date | value
>
> For each client ID I want to determine Min date and Max date and for
> any dates in between that are missing I want to insert a row
>
> Client ID | date | NA
>
> Any help would be appreciated.
>
> ______________________________________________
> r-h...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
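Since a later port to a procedural language is mentioned in this thread,
here is a rough sketch of the same fill written with explicit loops, so that
each step maps onto a plain for/while construct. It is only an illustration
and not the version promised above; fillHoles and the toy data are made up
for this example, and it assumes at most one row per client per date.

```r
# Toy input: client "A" is missing 2010-01-02 and 2010-01-03.
clientData <- data.frame(
  client_ID = c("A", "A", "B", "B"),
  date      = as.Date(c("2010-01-01", "2010-01-04",
                        "2010-01-10", "2010-01-11")),
  value     = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# fillHoles is a hypothetical name. For each client it walks from the
# minimum to the maximum date one day at a time and looks up that day's
# value, recording NA when the day is absent from the input.
fillHoles <- function(df) {
  ids    <- character(0)
  dates  <- as.Date(character(0))
  values <- numeric(0)
  for (id in unique(df$client_ID)) {
    rows <- df[df$client_ID == id, ]
    day  <- min(rows$date)
    last <- max(rows$date)
    while (day <= last) {
      hit    <- rows$value[rows$date == day]
      ids    <- c(ids, id)
      dates  <- c(dates, day)
      values <- c(values, if (length(hit) > 0) hit[1] else NA)
      day    <- day + 1   # adding 1 to a Date advances it by one day
    }
  }
  data.frame(client_ID = ids, date = dates, value = values,
             stringsAsFactors = FALSE)
}

filled <- fillHoles(clientData)
print(filled)
```

Growing the vectors inside the loop is slow for large inputs; it is kept
here only because it reads like the procedural code it is meant to mirror.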