The date you get from as.Date on a POSIXct value may not be the calendar date in that value's own timezone. By default, as.Date looks only at the underlying UTC seconds-since-epoch value and ignores the timezone, which most people do not expect.
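A small sketch of the difference, using a made-up timestamp. The value of dtm and the America/Los_Angeles zone are my assumptions for illustration, not something from the thread, and the sketch relies on the default tz = "UTC" of the POSIXct method of as.Date:

```r
# Hypothetical instant: late evening Pacific time, already the next day in UTC.
dtm <- as.POSIXct("2021-04-13 22:55:04", tz = "America/Los_Angeles")

# The POSIXct method of as.Date defaults to tz = "UTC", so the date rolls over.
utc_date <- as.Date(dtm)
print(utc_date)        # 2021-04-14

# Passing the timezone explicitly gives the local calendar date.
local_date <- as.Date(dtm, tz = "America/Los_Angeles")
print(local_date)      # 2021-04-13

# trunc() works in the value's own timezone, so this is local midnight.
local_midnight <- as.POSIXct(trunc(dtm, units = "days"))
print(local_midnight)  # midnight at the start of 2021-04-13, Pacific time
```

If the local calendar date is what you want for sorting or grouping, passing tz explicitly or going through trunc() is probably the safer route.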
TL;DR: as.Date is not the same as as.POSIXct(trunc(dtm, units = "days")) unless you are using GMT.

On April 13, 2021 10:55:04 AM PDT, Bert Gunter <bgunter.4...@gmail.com> wrote:
>(Revealing my ignorance):
>
>Simpler still than the as.POSIXct() idiom is just to use the as.Date
>version:
>
>out <- with(out, out [order(Group, id, as.Date(Date)),])
>
>## all else the same...
>
>Bert Gunter
>
>"The trouble with having an open mind is that people keep coming along
>and sticking things into it."
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
>On Tue, Apr 13, 2021 at 10:47 AM Bert Gunter <bgunter.4...@gmail.com>
>wrote:
>
>> It may not be necessary to insert the rows in that order -- R can identify
>> and use the information from the rows in in most cases without it.
>> So to combine the results as you described (the code you sent got garbled
>> a bit btw -- you should proofread more carefully in future), all you would
>> need to do is:
>>
>> ## with train and test your train and test data frames of course
>> out <- na.omit(rbind(train, cbind(test[,c(1,3,4)], Value =
>> test[,"value"])))
>> ## Note that the cbind() stuff is needed to create the correct "Value"
>> column for rbind(). See ?rbind for details
>>
>> If you insist that you need the row ordering as you specified, then follow
>> this by:
>>
>> out <- with(out, out[order(Group, id, as.POSIXct(Date,format = "%D%")), ])
>>
>> What this does is to first convert your text data column to POSIXct ("See
>> ?DateTimeClasses for details) which gives them the desired calendar
>> ordering. The order() function (see ?order for details) then gives the
>> permutation ordering them from early to late within groups and id's, which
>> are then used as the row subscripts to reorder the rows in the data frame.
>>
>> DO NOTE: For this to work reliably, your Date column must be consistent
>> and correct in its formatting!
>>
>> Other note: It probably makes more sense to convert your Date column to a
>> POSIXct or POSIXlt dates from the beginning, as this will make things like
>> plotting in date order straightforward. There are also date-time packages
>> (in the "tidyverse" suite, I think, as well as others) that simplify such
>> things. I am pretty ignorant about date-time stuff, so I can't really be
>> more specific. https://cran.r-project.org/web/views/TimeSeries.html will
>> have lots of info on this if you need it. As well as searching, of course.
>>
>> HTH
>>
>> Bert Gunter
>>
>>
>> On Tue, Apr 13, 2021 at 3:26 AM Elahe chalabi via R-help <r-help@r-project.org> wrote:
>>
>>> Hi all,
>>>
>>> I have the prediction for my test set which are forecasted Value for
>>> "4/1/2020" for each match of "id" and "Group". I would like to add a fourth
>>> row to each group by (Group,id) in my train set and the values for this row
>>> should come from test set :
>>>
>>> my train set:
>>>
>>> structure(list(Date = c("1/1/2020", "2/1/2020", "3/1/2020",
>>> "1/1/2020",
>>> "2/1/2020", "3/1/2020", "1/1/2020", "2/1/2020", "3/1/2020", ""
>>> ), Value = c(3.5, 2.7, 4, 2.5, 3.7, 0, 3, 0, 1, NA), Group = c("A",
>>> "A", "A", "B", "B", "B", "C", "C", "C", ""), id = c(1L, 1L, 1L,
>>> 101L, 101L, 101L, 100L, 100L, 100L, NA)), class = "data.frame",
>>> row.names = c(NA,
>>> -10L))
>>>
>>> test set:
>>>
>>> structure(list(Date = c("4/1/2020", "4/1/2020", "4/1/2020", ""
>>> ), Value = c(3.5, 2.5, 3, NA), Group = c("A", "B", "C", ""),
>>> id = c(1L, 101L, 100L, NA), value = c(0.2, 0.7, 0.9, NA)), class =
>>> "data.frame", row.names = c(NA,
>>> -4L))structure(list(Date = c("4/1/2020", "4/1/2020", "4/1/2020", ""
>>> ), Value = c(3.5, 2.5, 3, NA), Group = c("A", "B", "C", ""),
>>> id = c(1L, 101L, 100L, NA)), class = "data.frame", row.names = c(NA,
>>> -4L))
>>>
>>> desired output:
>>>
>>> structure(list(Date = c("1/1/2020", "2/1/2020", "3/1/2020",
>>> "4/1/2020",
>>> "1/1/2020", "2/1/2020", "3/1/2020", "4/1/2020", "1/1/2020",
>>> "2/1/2020",
>>> "3/1/2020", "4/1/2020"), Value = c(3.5, 2.7, 4, 0.2, 2.5, 3.7,
>>> 0, 0.7, 3, 0, 1, 0.9), Group = c("A", "A", "A", "A", "B", "B",
>>> "B", "B", "C", "C", "C", "C"), id = c(1L, 1L, 1L, 1L, 101L, 101L,
>>> 101L, 101L, 100L, 100L, 100L, 100L)), class = "data.frame", row.names
>>> = c(NA,
>>> -12L))
>>>
>>> Data is dummy and I have milions of records in original data set.
>>>
>>> Thanks for any help.
>>> Elahe
>>>
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
> [[alternative HTML version deleted]]
>
--
Sent from my phone. Please excuse my brevity.