Hello,

Missing value interpolation is a rather tricky business; which method is appropriate depends on the underlying process.
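To give a feeling for how much the choice matters, here is a small sketch on made-up numbers (the values and the names h_obs, ws_obs, h_new are invented for illustration, not taken from the data in this thread). The same gaps are filled three ways in base R: last observation carried forward, linear interpolation, and a cubic spline.

```r
# Made-up 3-hourly readings (illustration only)
h_obs  <- c(3, 6, 9, 12)
ws_obs <- c(20, 23, 19, 21)
h_new  <- 3:12                       # hourly grid over the same span

# Last observation carried forward: step function, f = 0 takes the left value
locf   <- approx(h_obs, ws_obs, xout = h_new, method = "constant", f = 0)$y

# Straight line between neighbouring observations
linear <- approx(h_obs, ws_obs, xout = h_new)$y

# Cubic spline through the observations (can overshoot the observed range)
cubic  <- spline(h_obs, ws_obs, xout = h_new, method = "fmm")$y

print(round(cbind(h = h_new, locf, linear, cubic), 2))
```

All three agree at the measured hours and differ in between, so the decision really rests on what is plausible for the measured quantity.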
Maybe na.locf from the zoo package? Or approxfun, splinefun?

Cheers
Petr

> -----Original Message-----
> From: R-help <r-help-boun...@r-project.org> On Behalf Of javad bayat
> Sent: Wednesday, August 31, 2022 8:09 AM
> To: r-help@r-project.org
> Subject: [R] Combine two dataframe with different row number and
> interpolation between values
>
> Dear all,
> I am trying to combine two large dataframe in order to make a dataframe with
> exactly the dimension of the second dataframe.
> The first df is as follows:
>
> df1 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 2920), d =
> rep(c(1:365,1:365,1:365,1:365,1:365),each=8),
> h = rep(c(seq(3,24, by = 3),seq(3,24, by = 3),seq(3,24, by = 3),seq(3,24, by =
> 3),seq(3,24, by = 3)),365),
> ws = rnorm(1:14600, mean=20))
> > head(df1)
>      y d  h       ws
> 1 2010 1  3 20.71488
> 2 2010 1  6 19.70125
> 3 2010 1  9 21.00180
> 4 2010 1 12 20.29236
> 5 2010 1 15 20.12317
> 6 2010 1 18 19.47782
>
> The data in the "ws" column were measured with 3 hours frequency and I need
> data with one hour frequency. I have made a second df as follows with one hour
> frequency for the "ws" column.
>
> df2 = data.frame(y = rep(c(2010,2011,2012,2013,2014), each = 8760), d =
> rep(c(1:365,1:365,1:365,1:365,1:365),each=24),
> h = rep(c(1:24,1:24,1:24,1:24,1:24),365), ws = "NA")
> > head(df2)
>      y d h ws
> 1 2010 1 1 NA
> 2 2010 1 2 NA
> 3 2010 1 3 NA
> 4 2010 1 4 NA
> 5 2010 1 5 NA
> 6 2010 1 6 NA
>
> What I am trying to do is combine these two dataframes so as to the rows in
> df1 (based on the values of "y", "d", "h" columns) that have values exactly
> similar to df2's rows copied in its place in the new df (df3).
> For example, in the first dataframe the first row was measured at 3 o'clock on
> the first day of 2010 and this row must be placed on the third row of the second
> dataframe which has a similar value (2010, 1, 3). Like the below
> table:
>      y d h       ws
> 1 2010 1 1       NA
> 2 2010 1 2       NA
> 3 2010 1 3 20.71488
> 4 2010 1 4       NA
> 5 2010 1 5       NA
> 6 2010 1 6 19.70125
>
> But regarding the values of the "ws" column for df2 that do not have value (at 4
> and 5 o'clock), I need to interpolate between the before and after values to fill in
> the missing data of the "ws".
> I have tried the following codes but they did not work correctly.
>
> > df3 = merge(df1, df2, by = "y")
> Error: cannot allocate vector of size 487.9 Mb
> or
> > library(dplyr)
> > df3<- df1%>% full_join(df2)
>
> Is there any way to do this?
> Sincerely
>
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: bayat...@yahoo.com
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
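On the merging part: joining only by "y" pairs every row of a year in df1 with every row of the same year in df2, which is most likely why the memory ran out. Matching on all three key columns keeps one row per hour. Below is a minimal sketch of that idea, not a tested solution for your real data. It assumes a cut-down, single-year version of your example, 365-day years as in your df1/df2, and that linear interpolation is acceptable for "ws"; the names t and ws_filled are just for illustration. I left the placeholder "ws" column out of df2 so that it cannot take part in the join.

```r
# Stand-in for df1: one 365-day year of simulated 3-hourly values
set.seed(1)
df1 <- data.frame(y = 2010,
                  d = rep(1:365, each = 8),
                  h = rep(seq(3, 24, by = 3), times = 365),
                  ws = rnorm(365 * 8, mean = 20))

# Hourly grid for the same year (key columns only, no placeholder ws)
df2 <- data.frame(y = 2010,
                  d = rep(1:365, each = 24),
                  h = rep(1:24, times = 365))

# Left join on all three keys; hours without a measurement get NA
df3 <- merge(df2, df1, by = c("y", "d", "h"), all.x = TRUE)
df3 <- df3[order(df3$y, df3$d, df3$h), ]

# Continuous time index in hours so that interpolation runs across midnight
# (assumption: every year has 365 days, as in the example data)
df3$t <- ((df3$y - min(df3$y)) * 365 + (df3$d - 1)) * 24 + df3$h

# Linear interpolation between neighbouring measurements; hours before the
# first measurement are outside the observed range and stay NA
obs <- !is.na(df3$ws)
df3$ws_filled <- approx(df3$t[obs], df3$ws[obs], xout = df3$t)$y

head(df3)
```

If a smoother curve or a step function suits the process better, only the approx() line needs to change, e.g. to spline() or to approx(..., method = "constant").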