On 6/22/2011 2:28 PM, Alexander Shenkin wrote:
On 6/22/2011 4:09 PM, Brian Diggs wrote:
On 6/22/2011 1:37 PM, Alexander Shenkin wrote:
On 6/22/2011 3:34 PM, Brian Diggs wrote:
On 6/22/2011 12:09 PM, Luke Miller wrote:

For what it's worth, I cannot reproduce this problem under a nearly identical instance of R (R 2.12.1, Win 7 Pro 64-bit). I also can't reproduce the problem with R 2.13.0. You've got something truly weird going on with your particular instance of R.

is.na(strptime("5/3/1992", format="%m/%d/%Y"))
[1] FALSE
is.na(strptime("5/2/1992", format="%m/%d/%Y"))
[1] FALSE
sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rj_0.5.2-1 lattice_0.19-17

loaded via a namespace (and not attached):
[1] grid_2.12.1 rJava_0.8-8 tools_2.12.1

Like Luke, I can not reproduce what you see in (an old installation of) R 2.12.1 (and it also didn't have rj, lattice, grid, rJava, or tools attached or loaded in any way).

My vague gut feeling is it might be a timezone/daylight savings time related issue (though usually times have to be involved). At least, that is a common problem with weird things happening with dates.

What do you get as output for the following?
Sys.timezone()
Sys.info()
conflicts()
dput(strptime("5/3/1992", format="%m/%d/%Y"))
dput(as.POSIXct(strptime("5/3/1992", format="%m/%d/%Y")))
dput(strptime("5/2/1992", format="%m/%d/%Y"))
dput(as.POSIXct(strptime("5/2/1992", format="%m/%d/%Y")))

Sys.timezone()
[1] "COT"
Sys.info()
sysname release version nodename machine
"Windows" "7 x64" "build 7601, Service Pack 1" "machine_name" "x86"
login user
"username" "username"
conflicts()
[1] "untangle.specials" "body<-" "format.pval" "round.POSIXt" "trunc.POSIXt" "units"
dput(strptime("5/3/1992", format="%m/%d/%Y"))
structure(list(sec = 0, min = 0L, hour = 0L, mday = 3L, mon = 4L,
year = 92L, wday = 0L, yday = 123L, isdst = -1L), .Names = c("sec",
"min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"
), class = c("POSIXlt", "POSIXt"))
dput(as.POSIXct(strptime("5/3/1992", format="%m/%d/%Y")))
structure(NA_real_, class = c("POSIXct", "POSIXt"), tzone = "")
dput(strptime("5/2/1992", format="%m/%d/%Y"))
structure(list(sec = 0, min = 0L, hour = 0L, mday = 2L, mon = 4L,
year = 92L, wday = 6L, yday = 122L, isdst = 0L), .Names = c("sec",
"min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"
), class = c("POSIXlt", "POSIXt"))
dput(as.POSIXct(strptime("5/2/1992", format="%m/%d/%Y")))
structure(704782800, class = c("POSIXct", "POSIXt"), tzone = "")

Fun :) So, not being familiar with COT, I looked it up to see what/when the daylight savings times switch overs are/were.

http://www.timeanddate.com/worldclock/timezone.html?n=41&syear=1990

Daylight savings time started (in 1992 only) on "Midnight between Saturday, May 2 and Sunday, May 3" and ended (in 1993) on "Midnight between Saturday, April 3 and Sunday, April 4". In particular, it went from Saturday, May 2, 1992 11:59:59 PM to Sunday, May 3 1992 1:00:00 AM. So there was no midnight on May 3.

So when strptime converts the date, it, by default, sets the time to midnight. Except that is not valid according to the DST rules (which is why isdst gets set to -1).
When converting to a POSIXct, it becomes NA. There are probably a lot of places in R that assume midnight is a valid time, and so I don't know what all will or will not work in that timezone (you probably will also have problems with seq and cut on POSIXct/POSIXlt's in that timezone at least).

I'd recommend using a different timezone. Or, if you don't need times, using Date (which doesn't have timezones and so avoids this):

as.Date("5/3/1992", format="%m/%d/%Y")

Thanks for your detective work, Brian! Nice one. I am now using "date", and so _my_ problem is solved. However, it must be the case that others have and will continue to run across this problem (and perhaps won't even realize it, thus tossing away data). Indeed, it seems like there are quite a number of places that have DST switching at midnight: http://www.google.com/search?q=Midnight+site%3Ahttp%3A%2F%2Fwww.timeanddate.com%2Fworldclock%2Ftimezone.html . I assume all these timezones would come across a similar problem as mine? What would be the best route to try to get this smoothed over in R-core?
No one else has chimed in, so I filed a bug report https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14615
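For anyone landing on this thread later, here is a rough sketch of the two workarounds suggested above (Date, or a timezone without DST), plus a check for values lost in conversion. The vector x, the variable names, and the zone name "America/Bogota" (used here as a stand-in for COT) are assumptions for illustration, not something from the original report. Whether the last step actually yields NA depends on the R version and platform.

```r
# Sketch only: `x`, `fmt`, and the zone name "America/Bogota" are
# illustrative assumptions, not taken from the original report.
x <- c("5/2/1992", "5/3/1992")
fmt <- "%m/%d/%Y"

# Option 1: Date carries no time of day and no timezone, so a skipped
# local midnight cannot affect it.
d <- as.Date(x, format = fmt)

# Option 2: parse in UTC, which has no DST transitions, so midnight
# always exists.
ct <- as.POSIXct(strptime(x, format = fmt, tz = "UTC"))

# Diagnostic: flag inputs that are valid dates but turn into NA when
# converted to POSIXct in a zone with a DST switch at midnight.
lt <- strptime(x, format = fmt, tz = "America/Bogota")
lost <- !is.na(d) & is.na(as.POSIXct(lt))
x[lost]
```

Running the diagnostic on a real column of date strings before converting is a cheap way to notice rows that would otherwise be dropped without warning.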
On Wed, Jun 22, 2011 at 2:40 PM, Alexander Shenkin <ashen...@ufl.edu> wrote:
On 6/22/2011 1:34 PM, Sarah Goslee wrote:
On Wed, Jun 22, 2011 at 2:28 PM, David Winsemius <dwinsem...@comcast.net> wrote:
On Jun 22, 2011, at 2:03 PM, Sarah Goslee wrote:

Hi,

On Wed, Jun 22, 2011 at 11:40 AM, Alexander Shenkin <ashen...@ufl.edu> wrote:

is.na(strptime("5/2/1992", format="%m/%d/%Y"))
[1] FALSE
is.na(strptime("5/3/1992", format="%m/%d/%Y"))
[1] TRUE

I can't reproduce your problem on R 2.13.0 on linux:

I also cannot reproduce it on a Mac with 2.13.0 beta

Which strongly suggests that you should start by upgrading your R installation if at all possible.

I'd also recommend trying it on a default R session, with no extra packages loaded, and no items in your workspace. It's possible that something else is interfering.

On linux, that's achieved by typing R --vanilla at the command line. I'm afraid I don't know how to do it for Windows, but should be similarly straightforward.

Thanks Sarah. Still getting the problem. I should surely upgrade, but still, not a bad idea to get to the bottom of this, or at least have it documented as a known issue. BTW, I'm on Windows 7 Pro x64.

(running Rgui.exe --vanilla):

is.na(strptime("5/3/1992", format="%m/%d/%Y"))
[1] TRUE
is.na(strptime("5/2/1992", format="%m/%d/%Y"))
[1] FALSE
sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University