On Mon, 1 Oct 2012, Vindoggy ! wrote:
Sorry for the lack of reproducible data, but this seems to be a problem
inherent to my dataset and I can't figure out where the issue is.
I have several data frames set up as a time series with identical POSIXct date
formats. If I keep the original data in data frame format and merge them using
base merge, everything is perfect and everyone is happy.
If I transform the data frames to zoo objects and then do a merge.zoo, the
data seem to become uncoupled from the original data. Even more unusual is that
some dates in the new merged data set are earlier than any date in the original
data sets. I've attempted below to show what this looks like, and I hope someone
has a suggestion as to what may be causing the problem.
Here is one set of data in data.frame format
head(Vup)
Date Velocity_m/s
1 2010-01-21 07:42:00 1.217943
2 2010-01-21 07:43:00 1.624395
3 2010-01-21 07:44:00 1.526379
4 2010-01-21 07:45:00 1.456831
5 2010-01-21 07:46:00 1.245390
6 2010-01-21 07:47:00 1.374330
str(Vup)
'data.frame': 7168 obs. of 2 variables:
$ Date : POSIXct, format: "2010-01-21 07:42:00" "2010-01-21 07:43:00" ...
$ Velocity_m/s: num 1.22 1.62 1.53 1.46 1.25 ...
And here is a second in data.frame format:
head(PAS)
Date PAS
1 2010-01-21 05:01:00 0.0013938
2 2010-01-21 05:02:00 0.0015331
3 2010-01-21 05:03:00 0.0016725
4 2010-01-21 05:04:00 0.0016725
5 2010-01-21 05:05:00 0.0012265
6 2010-01-21 05:06:00 0.0015889
str(PAS)
'data.frame': 5520 obs. of 2 variables:
$ Date : POSIXct, format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" ...
$ PAS: num 0.00139 0.00153 0.00167 0.00167 0.00123 ...
Using zoo:
PASmin <- zoo(as.matrix(PAS[,2]),
  as.POSIXct(PAS[,1], format="%Y-%m-%d %H:%M:%S", tz="UTC"))
str(PASmin)
'zoo' series from 2010-01-21 05:01:00 to 2010-01-27 13:01:00
Data: num [1:5520, 1] 0.00139 0.00153 0.00167 0.00167 0.00123 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "PAS"
Index: POSIXct[1:5520], format: "2010-01-21 05:01:00" "2010-01-21 05:02:00"
"2010-01-21 05:03:00" ...
ADP_UPmin <- zoo(as.matrix(Vup[,2]),
  as.POSIXct(Vup[,1], format="%Y-%m-%d %H:%M", tz="UTC"))
str(ADP_UPmin)
'zoo' series from 2010-01-21 07:42:00 to 2010-01-26 20:12:00
Data: num [1:7168, 1] 1.22 1.62 1.53 1.46 1.25 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "UP_Velocity_m/s"
Index: POSIXct[1:7168], format: "2010-01-21 07:42:00" "2010-01-21 07:43:00"
"2010-01-21 07:44:00" ...
And if I merge the two zoo objects I get this:
M<-merge(ADP_UPmin,PASmin)
head(M)
UP_Velocity_m/s PAS
2010-01-20 21:01:00 NA 0.0013938
2010-01-20 21:02:00 NA 0.0015331
2010-01-20 21:03:00 NA 0.0016725
2010-01-20 21:04:00 NA 0.0016725
2010-01-20 21:05:00 NA 0.0012265
2010-01-20 21:06:00 NA 0.0015889
str(M)
'zoo' series from 2010-01-20 21:01:00 to 2010-01-27 05:01:00
Data: num [1:8499, 1:2] NA NA NA NA NA NA NA NA NA NA ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "UP_Velocity_m/s" "PAR"
Index: POSIXct[1:8499], format: "2010-01-20 21:01:00" "2010-01-20 21:02:00"
"2010-01-20 21:03:00" ...
For some reason I cannot figure out, even though both the PAS data frame and
the PAS zoo object start at 2010-01-21 05:01:00, once merged the PAS data start
on the previous day, at 2010-01-20 21:01:00. The actual numeric data look good,
but both variables have now come uncoupled from the time series dates (the
Velocity data are similarly uncoupled). And as stated before, doing a non-zoo
merge on the data.frame data works fine.
Anyone got any ideas what's going on?
My guess is that you create both zoo series with time zone UTC but that
the TZ attribute gets lost upon the merge. Then the times are displayed in
your system's time zone (which you haven't told us), which apparently is
eight hours behind UTC: 05:01 turns into 21:01 on the previous day.
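To illustrate the point with base R only, here is a minimal sketch showing that a single instant prints differently depending on the time zone used for display, while the underlying number of seconds stays the same. The zone "America/Los_Angeles" is purely an assumption for illustration (it is 8 hours behind UTC in January); the poster's actual zone is not known.

```r
# One instant in time, created with an explicit UTC time zone.
t_utc <- as.POSIXct("2010-01-21 05:01:00", tz = "UTC")

# Hypothetical session time zone (an assumption, not known from the post).
local_tz <- "America/Los_Angeles"

# Same instant, formatted for display in two different zones.
fmt <- "%Y-%m-%d %H:%M:%S"
shown_utc   <- format(t_utc, format = fmt, tz = "UTC")
shown_local <- format(t_utc, format = fmt, tz = local_tz)

# Changing only the "tzone" attribute does not change the instant itself.
t_local <- t_utc
attr(t_local, "tzone") <- local_tz
same_instant <- as.numeric(t_utc) == as.numeric(t_local)

print(shown_utc)
print(shown_local)
print(same_instant)

# The zone your own session uses for display by default (may be NA).
print(Sys.timezone())
```

If the local string matches what you see after the merge, the data are still attached to the right instants and only the display zone has changed.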
On my system (which is in CET) I can create a series with UTC times
R> x <- zoo(1:2, as.POSIXct(c("2012-01-01 00:00:00",
+ "2012-01-01 01:00:00"), format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))
R> x
2012-01-01 00:00:00 2012-01-01 01:00:00
1 2
The times are in UTC as requested, but applying the c() method drops the
time zone attribute, so they are shown in local time. See ?c.POSIXct.
R> time(x)
[1] "2012-01-01 00:00:00 UTC" "2012-01-01 01:00:00 UTC"
R> c(time(x))
[1] "2012-01-01 01:00:00 CET" "2012-01-01 02:00:00 CET"
Hence:
R> merge(x, x)
x x
2012-01-01 01:00:00 1 1
2012-01-01 02:00:00 2 2
But you can set the time zone of your R session to UTC, which gives the
desired result:
R> Sys.setenv(TZ = "UTC")
R> merge(x, x)
x x
2012-01-01 00:00:00 1 1
2012-01-01 01:00:00 2 2
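If changing the time zone for the whole session is not wanted, another option (a sketch, not something from the reply above) is to put the "tzone" attribute back on the index of the merged series. It assumes the zoo package is installed and uses its index() accessor and replacement function. Newer versions of R and zoo may already keep the attribute, in which case this step does no harm.

```r
library(zoo)

# Small series with a UTC index, as in the example above.
x <- zoo(1:2, as.POSIXct(c("2012-01-01 00:00:00", "2012-01-01 01:00:00"),
                         format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))

m <- merge(x, x)

# Re-attach the time zone to the merged index; the instants are unchanged,
# only the zone used for printing is set explicitly.
idx <- index(m)
attr(idx, "tzone") <- "UTC"
index(m) <- idx

print(m)
```

This leaves the TZ environment variable alone, so other date-time output in the session keeps using the local zone.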
hth,
Z