I have two dataframes, each with a column for timestamp. I want to merge the two dataframes such that each row from first dataframe is matched with the row in the second dataframe with most recent but preceding timestamp. Here is an example.
option.trades <- data.frame(timestamp = as.POSIXct(c("2023-08-07 10:23:22", "2023-08-07 10:25:33", "2023-08-07 10:28:41")), option.price = c(2.5, 2.7, 1.8)) stock.trades <- data.frame(timestamp = as.POSIXct(c("2023-08-07 10:23:21", "2023-08-07 10:23:34", "2023-08-07 10:24:57", "2023-08-07 10:28:37", "2023-08-07 10:29:01")), stock.price = c(102.2, 102.9, 103.1, 101.8, 101.7)) stock.trades <- stock.trades[order(stock.trades$timestamp),] library(plyr) mystock.prices <- ldply(option.trades$timestamp, function(tstamp) tail(subset(stock.trades, timestamp <= tstamp), 1)) names(mystock.prices)[1] <- "stock.timestamp" myres <- cbind(option.trades, mystock.prices) This method works. But for large dataframes, it is very slow. Is there a way to speed up the merge? Thanks, Naresh ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.