Thank you very much Rui

On 19 August 2012 13:49, Rui Barradas <ruipbarra...@sapo.pt> wrote:
> Hello,
>
> Yes, you can. If you run into memory problems, say so and we'll look at
> it then.
> In the meantime, there's something you should change, to allow for several
> minima but to only return one per combination of TYPE and DATE.
>
> Replace this
>
> x[which(min(a) == a), ]
>
> by this
>
> x[which.min(a), ]
>
> Rui Barradas
>
> On 19-08-2012 12:00, Francesco wrote:
>
>> Dear Rui, Many thanks for your suggestion
>>
>> However these are just simplified examples... in reality the dataset A
>> contains millions of observations and B several thousands of rows...
>> Could I still use a modified form of your suggestion?
>>
>> Thanks
>>
>> On 19 August 2012 12:51, Rui Barradas <ruipbarra...@sapo.pt> wrote:
>>>
>>> Hello,
>>>
>>> Try the following.
>>>
>>>
>>> A <- read.table(text="
>>> TYPE DATE
>>> A 2
>>> A 5
>>> A 20
>>> B 10
>>> B 2
>>> ", header = TRUE)
>>>
>>>
>>> B <- read.table(text="
>>> TYPE Special_Date
>>> A 2
>>> A 6
>>> A 20
>>> A 22
>>> B 5
>>> B 6
>>> ", header = TRUE)
>>>
>>> # merge on the common column TYPE: every DATE paired with every
>>> # Special_Date of the same TYPE
>>> m <- merge(A, B)
>>> result <- do.call(rbind, lapply(split(m, list(m$DATE, m$TYPE)),
>>>     function(x){
>>>         a <- abs(x$DATE - x$Special_Date)
>>>         if(nrow(x)) x[which(min(a) == a), ] }))
>>> result$Difference <- result$DATE - result$Special_Date
>>> result$Special_Date <- NULL
>>> rownames(result) <- seq_len(nrow(result))
>>> result
>>>
>>>
>>> Also, it's good practice to post data examples using dput(). For
>>> instance,
>>>
>>> dput(A)
>>> structure(list(TYPE = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("A",
>>> "B"), class = "factor"), DATE = c(2L, 5L, 20L, 10L, 2L)), .Names =
>>> c("TYPE",
>>> "DATE"), class = "data.frame", row.names = c(NA, -5L))
>>>
>>> Now all we have to do is run the statement A <- structure(... etc...) to
>>> have an exact copy of the data example.
>>> Anyway, your example with input and the wanted result was very welcome.
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> On 19-08-2012 11:10, Francesco wrote:
>>>>
>>>> Dear R-help
>>>>
>>>> I would like to know if there is a short solution in R for this
>>>> merging problem...
>>>>
>>>> Let's say I have a dataset A as:
>>>>
>>>> TYPE DATE
>>>> A    2
>>>> A    5
>>>> A    20
>>>> B    10
>>>> B    2
>>>>
>>>> (there can be duplicates for the same type and date)
>>>>
>>>> and I have another dataset B as:
>>>>
>>>> TYPE Special_Date
>>>> A    2
>>>> A    6
>>>> A    20
>>>> A    22
>>>> B    5
>>>> B    6
>>>>
>>>> The question is: I would like to obtain the difference between the
>>>> date of each observation in A and the closest special date in B with
>>>> the same type. In case of ties I would take the latest date of the
>>>> two.
>>>>
>>>> For example I would obtain here
>>>>
>>>> TYPE DATE Difference
>>>> A    2     0 = 2 - 2
>>>> A    5    -1 = 5 - 6
>>>> A    20    0 = 20 - 20
>>>> B    10   +4 = 10 - 6
>>>> B    2    -3 = 2 - 5
>>>>
>>>> Do you know how to (simply?) obtain this in R?
>>>>
>>>> Many thanks!
>>>> Best Regards
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
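A note on the which.min() replacement discussed in the thread: which(min(a) == a) returns the index of every minimum, while which.min(a) returns only the first one in row order. The row kept on a tie therefore depends on how the merged rows happen to be ordered, which may not match the "latest date on ties" rule from the original question. The following is only a small sketch on made-up values (the vectors `special` and `a` and the target date 4 are illustrative, not taken from the thread's data). It shows the difference and one possible way to prefer the later date by ordering before calling which.min().

```r
# Illustrative values only: two special dates equally far from the date 4
special <- c(2, 6)
a <- abs(4 - special)          # distances: 2 2

all_minima <- which(min(a) == a)   # every position holding the minimum
first_min  <- which.min(a)         # only the first such position

# One way to honour "latest date on ties": look at the special dates in
# decreasing order, so the first minimum found is the latest date.
ord <- order(special, decreasing = TRUE)
latest <- special[ord][which.min(a[ord])]

all_minima
first_min
latest
```

Here which() reports both positions, which.min() reports position 1 (special date 2), and the reordered version picks special date 6.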
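On the memory question: merge(A, B) builds every pair of an A row and a B row within a TYPE, which can get large with millions of rows in A. One possible alternative, sketched here under assumptions, is to sort the special dates per TYPE and use findInterval() to locate the two neighbours of each date, so no pairwise table is built. The helper name closest_diff() is hypothetical (it does not come from the thread), the data are rebuilt from the example in the question, and dates are assumed to be plain numbers as in that example.

```r
# Example data rebuilt from the question
A <- data.frame(TYPE = c("A", "A", "A", "B", "B"),
                DATE = c(2L, 5L, 20L, 10L, 2L),
                stringsAsFactors = FALSE)
B <- data.frame(TYPE = c("A", "A", "A", "A", "B", "B"),
                Special_Date = c(2L, 6L, 20L, 22L, 5L, 6L),
                stringsAsFactors = FALSE)

# closest_diff() is a hypothetical helper, not part of the thread.
# For each element of `dates` it returns date minus the nearest special
# date; when two special dates are equally near, the later one is used.
closest_diff <- function(dates, special) {
  special <- sort(unique(special))
  if (length(special) == 0L) return(rep(NA_integer_, length(dates)))
  lo <- findInterval(dates, special)      # last special date <= date (0 if none)
  hi <- pmin(lo + 1L, length(special))    # next special date, capped at the end
  lo <- pmax(lo, 1L)                      # fall back to the first special date
  d_lo <- abs(dates - special[lo])
  d_hi <- abs(special[hi] - dates)
  # strict '<' sends ties to the later special date
  nearest <- ifelse(d_lo < d_hi, special[lo], special[hi])
  dates - nearest
}

# Apply per TYPE; as.character() guards against TYPE being a factor
A$Difference <- NA_integer_
a_type <- as.character(A$TYPE)
b_type <- as.character(B$TYPE)
for (tp in unique(a_type)) {
  rows <- a_type == tp
  A$Difference[rows] <- closest_diff(A$DATE[rows],
                                     B$Special_Date[b_type == tp])
}
A
```

On the example data this gives Difference = 0, -1, 0, 4, -3, which matches the result asked for in the original question. Whether it is fast enough for the real data would need to be checked there.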