On Wed, 18 Oct 2017 18:09:41 +0200 Martin Maechler <maech...@stat.math.ethz.ch> wrote:
>>>>>> Martin Maechler <maech...@stat.math.ethz.ch>
>>>>>>     on Mon, 16 Oct 2017 19:13:31 +0200 writes:

(I also included a reply to part of this response of yours below.)

>>>>>> Stephen Berman <stephen.ber...@gmx.net>
>>>>>>     on Sun, 15 Oct 2017 01:53:12 +0200 writes:
>
>> > (I reported the test failure mentioned below to R-help but was advised
>> > that this list is the right one to address the issue; in the meantime I
>> > investigated the matter somewhat more closely, including searching
>> > recent R-devel postings, since I haven't been following this list.)
>>
>> > Last May there were two reports here of problems with Sys.timezone, one
>> > where the zoneinfo directory is in a nonstandard location
>> > (https://stat.ethz.ch/pipermail/r-devel/2017-May/074267.html) and the
>> > other where the system lacks the file /etc/localtime
>> > (https://stat.ethz.ch/pipermail/r-devel/2017-May/074275.html).  My
>> > system exhibits a third case: it lacks /etc/timezone and does not set TZ
>> > systemwide, but it does have /etc/localtime, which is a copy of, rather
>> > than a symlink to, a file under zoneinfo.  On this system Sys.timezone()
>> > returns NA and the Sys.timezone test in reg-tests-1d fails.  However, on
>> > my system I can get the (abbreviated) timezone in R by using as.POSIXlt,
>> > e.g. as.POSIXlt(Sys.time())$zone.  If Sys.timezone took advantage of
>> > this, e.g. as below, it would be useful on such systems as mine and the
>> > regression test would pass.
>>
>> > my.Sys.timezone <-
>> > function (location = TRUE)
>> > {
>> >     tz <- Sys.getenv("TZ", names = FALSE)
>> >     if (!location || nzchar(tz))
>> >         return(Sys.getenv("TZ", unset = NA_character_))
>> >     lt <- normalizePath("/etc/localtime")
>> >     if (grepl(pat <- "^/usr/share/zoneinfo/", lt) ||
>> >         grepl(pat <- "^/usr/share/zoneinfo.default/", lt))
>> >         sub(pat, "", lt)
>> >     else if (lt == "/etc/localtime")
>> >         if (!file.exists("/etc/timezone"))
>> >             return(as.POSIXlt(Sys.time())$zone)
>> >         else if (dir.exists("/usr/share/zoneinfo") && {
>> >             info <- file.info(normalizePath("/etc/timezone"),
>> >                               extra_cols = FALSE)
>> >             (!info$isdir && info$size <= 200L)
>> >         } && {
>> >             tz1 <- tryCatch(readBin("/etc/timezone", "raw", 200L),
>> >                             error = function(e) raw(0L))
>> >             length(tz1) > 0L &&
>> >                 all(tz1 %in% as.raw(c(9:10, 13L, 32:126)))
>> >         } && {
>> >             tz2 <- gsub("^[[:space:]]+|[[:space:]]+$", "",
>> >                         rawToChar(tz1))
>> >             tzp <- file.path("/usr/share/zoneinfo", tz2)
>> >             file.exists(tzp) && !dir.exists(tzp) &&
>> >                 identical(file.size(normalizePath(tzp)),
>> >                           file.size(lt))
>> >         })
>> >             tz2
>> >         else NA_character_
>> > }
>>
>> > One problem with this is that the zone component of as.POSIXlt only
>> > holds the abbreviated timezone, not the Olson name.
>>
>> Yes, indeed.  So, really only for Sys.timezone(location = FALSE) this
>> should be given, for the default location = TRUE it should
>> still give NA (i.e. NA_character_) in your setup.
>>
>> Interestingly, the Windows version of Sys.timezone(location = FALSE)
>> uses something like your proposal, and I tend to think that
>> -- again only for location=FALSE -- this should be used on
>> non-Windows as well, at least instead of returning NA then.
>>
>> Also for me on 3 different Linuxen (Fedora 24, F. 26, and ubuntu
>> 14.04 LTS), I get
>>
>>     > Sys.timezone()
>>     [1] "Europe/Zurich"
>>     > Sys.timezone(FALSE)
>>     [1] NA
>>
>> whereas on Windows I get Europe/Berlin for the first (why on
>> earth - I'm really in Zurich) and get "CEST" ("Central European
>> Summer Time") for the 2nd one instead of NA ... simply using a
>> smarter version of your proposal.  The windows source is
>> in R's source at src/library/base/R/windows/system.R :
>>
>>   Sys.timezone <- function(location = TRUE)
>>   {
>>       tz <- Sys.getenv("TZ", names = FALSE)
>>       if(nzchar(tz)) return(tz)
>>       if(location) return(.Internal(tzone_name()))
>>       z <- as.POSIXlt(Sys.time())
>>       zz <- attr(z, "tzone")
>>       if(length(zz) == 3L) zz[2L + z$isdst] else zz[1L]
>>   }
>>
>> From what I read, the last three lines also work in your setup
>> where it seems zz would be of length 1, right ?

Those lines do indeed work here, but zz has three elements:

  > attributes(as.POSIXlt(Sys.time()))$tzone
  [1] ""     "CET"  "CEST"

>> I'd really propose to use these 3 lines in the non-Windows
>> version of Sys.timezone .. at the end *instead* of NA_character_
>> (or a slightly safer version which gives NA_character_ if zz is
>> of length 0 {e.g. if there is no "tzone" attribute}).
>>
>> > I don't know how to
>> > get the Olson name using only R functions, but maybe it would be good
>> > enough to return the abbreviated timezone where possible, e.g. as above.
>> > (On my system I can get the Olson name of the timezone in R with a shell
>> > pipeline, e.g.:
>> >   system("find /usr/share/zoneinfo/ -type f | xargs md5sum | grep $(md5sum /etc/localtime | cut -d ' ' -f 1) | head -n 1 | cut -d '/' -f 5,6"),
>> > but the last part of this is tailored to my configuration
>> > and the whole thing is not OS-neutral, so it isn't suitable for
>> > Sys.timezone.)
>>
>> > Steve Berman
>>
>> Definitely not.  I still recommend you think of a more portable
>> solution for the `location = TRUE` (default) case in Sys.timezone().
>> Returning the non-location form (e.g. "CEST") when something like
>> "Europe/Zurich" is expected is really not a good idea,
>> and you are lucky that the regression test passes "accidentally" ...
>>
>> Martin
>
> In the meantime, I have committed a common version (Windows and
> non-Windows) of Sys.timezone() to the R development sources
> (aka "R-devel").
>
> That now uses as.POSIXlt(Sys.time()) very similarly to the
> above "Windows only" case, but __only__ for 'location=FALSE'
> which is not the default.

Thanks, I think that's definitely better than returning NA when
`location' is false...

> The most current development source is always available (via
> 'svn' or alternatively for browsing via your web browser) from
>
>    https://svn.r-project.org/R/trunk/src/library/base/R/datetime.R

...however, I tried the test that failed for me during `make check' now
with this new definition of Sys.timezone() by pasting the definition (as
new.Sys.timezone()) and the two lines of the test code into the R
console, and this is what happened:

  > new.Sys.timezone()
  > new.Sys.timezone(FALSE)
  [1] "CEST"
  > (S.t <- new.Sys.timezone())
  NULL
  > if(is.na(S.t) || !nzchar(S.t)) stop("could not get timezone")
  Error in if (is.na(S.t) || !nzchar(S.t)) stop("could not get timezone") :
    missing value where TRUE/FALSE needed
  In addition: Warning message:
  In is.na(S.t) : is.na() applied to non-(list or vector) of type 'NULL'

This is because `location' is true but all the if-clauses in the body
following `if(location)' are false, so the function falls through and
returns NULL.  If you add the line `else NA_character_' below the line
`tz2', then NA is returned and the test fails as before instead of as
above.

> As you say yourself, the above system("... xargs md5sum ...")
> using workaround is really too platform specific but I'd guess
> there should be a less error prone way to get the long timezone
> name on your system ...

If I understand the zic(8) man page, the files in /usr/share/zoneinfo
should contain this information, but I don't know how to extract it,
since these are compiled files.  And since on my system /etc/localtime
is a copy of one of these compiled files, I don't know of any other way
to recover the location name without comparing it to those files.

> If that remains "contained" (i.e. small) and works with files
> and R's files tools -- e.g. file.*() ones [but not system()],
> I'd consider a patch to the above source file
> (sent by you to the R-devel mailing list --- or after having
> gotten an account there by asking, via bug report & patch
> attachment at https://bugs.r-project.org/ )

If comparing file size sufficed, that would be easy to do in R;
unfortunately, it is not sufficient, since some files designating
different time zones in /usr/share/zoneinfo do have the same size.  So
the only alternative I can think of is to compare bytes, e.g. with
md5sum or with cmp.  Is there some way to do this in R without using
system()?

Steve Berman

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
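
On the closing question, one possible approach is sketched below.  It is
not taken from the thread or from R's sources: find_olson_names() is a
hypothetical helper, and the default paths are just the ones mentioned
above.  It uses only R's own file tools (list.files(), file.size(),
readBin()), filtering candidates by size first and then comparing the
bytes, much as cmp would.  tools::md5sum() could likely be used for the
comparison step instead.

```r
# Hypothetical helper (not part of R): list the zoneinfo entries whose
# contents are byte-identical to 'localtime', without calling system().
find_olson_names <- function(localtime = "/etc/localtime",
                             zoneinfo = "/usr/share/zoneinfo")
{
    if (!file.exists(localtime) || !dir.exists(zoneinfo))
        return(character(0))
    target <- file.size(localtime)
    ref <- readBin(localtime, "raw", n = target)
    ## Names relative to 'zoneinfo', e.g. "Europe/Berlin"
    rel <- list.files(zoneinfo, recursive = TRUE)
    paths <- file.path(zoneinfo, rel)
    ## Cheap filter: only files of the same size can be identical
    sz <- file.size(paths)
    keep <- !is.na(sz) & sz == target
    rel <- rel[keep]
    paths <- paths[keep]
    ## Exact comparison of the raw bytes of each remaining candidate
    same <- vapply(paths, function(p) {
        bytes <- tryCatch(readBin(p, "raw", n = target),
                          error = function(e) raw(0L))
        identical(bytes, ref)
    }, logical(1), USE.NAMES = FALSE)
    rel[same]
}

## Usage on a system like the one described above:
## find_olson_names()
```

Note that a zoneinfo tree may contain several byte-identical files, so
this can return more than one name; choosing among them would be a
separate decision that the sketch does not attempt.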