I'm having problems with timezones using lubridate, but it's not clear to me 
that the difficulty is in lubridate.
---------------------------------
> r2 <- parse_date_time("1970-01-01 00:01:00", "ymd HMS", tz="PST")
> r2
[1] "1970-01-01 08:01:00 PST"  ## Oops: 00:01 has turned into 08:01
> as.numeric(r2)
[1] 28860
> 8*3600 # seconds in 8 hours
[1] 28800
------------------------------------
lubridate accepts PST as the time zone, and the result prints "PST" for 
timezone.  Further, lubridate seems to be using the tz properly since it gets 
the 8 hour offset from UTC correct.

The problem is that the printed value is the UTC time, 08:01, despite 
having the PST suffix.  So the time appears to have jumped 8 hours ahead of 
the value parsed.
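
A quick base-R check of that reading, as a sketch with no lubridate involved: 
the stored number 28860 is taken as seconds since the epoch and formatted in 
two zones I know to be valid.  The variable names are only for illustration.

```r
# 28860 seconds after 1970-01-01 00:00:00 UTC, i.e. 8 hours and 1 minute.
instant <- as.POSIXct(28860, origin = "1970-01-01", tz = "UTC")

# The same instant rendered in UTC and in Los Angeles local time.
in_utc <- format(instant, tz = "UTC", usetz = TRUE)
in_la  <- format(instant, tz = "America/Los_Angeles", usetz = TRUE)

print(in_utc)  # "1970-01-01 08:01:00 UTC"
print(in_la)   # "1970-01-01 00:01:00 PST"
```

So the number stored in r2 is the one I wanted; only the printed clock time is 
the UTC one.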

PST appears not to be a legal timezone (in spite of lubridate inferring the 
correct offset from it):
---------------------------------------------------
> Sys.timezone()
[1] "America/Los_Angeles"

> (grep("PST", OlsonNames(), value=TRUE))
[1] "PST8PDT"         "SystemV/PST8"    "SystemV/PST8PDT"
-------------------------------------
https://www.r-bloggers.com/2018/07/a-tour-of-timezones-troubles-in-r/ says 
lubridate will complain if given an invalid tz, though I don't see that 
explicitly in the current man page 
https://lubridate.tidyverse.org/reference/parse_date_time.html.  As shown 
above, parse_date_time() does not complain about the timezone, and does use it 
to get the correct offset.

Using America/Los_Angeles produces the expected results:
---------------------------------------
> r4 <- parse_date_time("1970-01-01 00:01:00", "ymd HMS", tz=Sys.timezone())
> r4
[1] "1970-01-01 00:01:00 PST"  # still prints PST.  This time it's true!
> as.numeric(r4)
[1] 28860
----------------------------------------------------

I suppose I can just use "America/Los_Angeles" as the time zone; this would 
have the advantage of making all my timezones the same, which is apparently 
what R requires for a vector of datetimes.  But the behavior seems odd, and the 
"fix" also requires me to ignore the time zone specified in my inputs, which 
look like "2022-03-01 15:54:30 PST" or PDT, depending on time of year.
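
A minimal sketch of that workaround in base R, assuming every input ends in 
either PST or PDT: strip the abbreviation and parse the rest as 
America/Los_Angeles local time.  The sample strings and variable names are made 
up for illustration.

```r
# Hypothetical inputs in the same shape as my data.
x <- c("2022-03-01 15:54:30 PST", "2022-07-01 15:54:30 PDT")

# Drop the trailing abbreviation; the zone name supplies the offset instead.
stamp <- sub("[[:space:]]+(PST|PDT)$", "", x)

parsed <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S",
                     tz = "America/Los_Angeles")

# Formatting with %Z shows which abbreviation R picked for each date.
out <- format(parsed, "%Y-%m-%d %H:%M:%S %Z")
print(out)  # "2022-03-01 15:54:30 PST" "2022-07-01 15:54:30 PDT"
```

One caveat: discarding the abbreviation loses the only thing that 
distinguishes the two passes through the repeated hour when clocks go back in 
the fall, so inputs in that hour would need separate handling.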

1. Why this strange behavior in which PST or PDT is used to construct the 
proper offset from UTC, and then kind of forgotten on output?
2. Is this a bug in lubridate or base POSIXct, particularly its print routine?

My theory on 1 is that lubridate understands PST and constructs an appropriate 
UTC time.  POSIXct time does not understand a tz of "PST" and so prints out the 
UTC value for the time, "decorating" it with the not understood tz value.  

For 2, on one hand, lubridate is constructing POSIXct dates with invalid tz 
values; lubridate probably shouldn't.  On the other hand, POSIXct is printing a 
UTC time but labeling it with a tz it doesn't understand, so it looks as if 
it's in that local time even though it isn't.  In the context above that seems 
like a bug, but it's possible that a lot of code depends on it.

Under these theories, the problems only arise because the set of tz values 
understood by lubridate differs from the set of tz values understood by 
POSIXct.
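
If that is right, one way to guard against it would be to check a tz string 
against OlsonNames() before handing it to any parser.  This is only a sketch; 
tz_is_known() is a helper name I made up, not a function from base R or 
lubridate.

```r
# Hypothetical helper: TRUE if the name is in R's list of Olson zone names.
tz_is_known <- function(tz) tz %in% OlsonNames()

known <- tz_is_known(c("PST", "PST8PDT", "America/Los_Angeles"))
print(known)
```

On my system I would expect FALSE for "PST" and TRUE for the other two, going 
by the grep output above.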

Versions:
R 3.5.2
lubridate 1.7.4
Debian GNU/Linux 10 aka buster (amd64 flavor)

Thanks.
Ross Boylan