Re: [R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
On Fri, 20 Jul 2018, William Dunlap wrote: To find the lines in the file, tfile, with bogus dates, try readLines(tfile)[ is.na(dataFrame$DateTime) ] Bill, Thanks for another lesson. Regards, Rich

Re: [R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
On Fri, 20 Jul 2018, Rich Shepard wrote: Thank you all. I found the typos which covered a single day toward the end of the dataframe. FWIW, all these data came from PDF reports and had to be manually highlighted and pasted into a text file. Given 29 years of hourly (and sometimes half-hourl

Re: [R] Locating data source error in large file

2018-07-20 Thread William Dunlap via R-help
To find the lines in the file, tfile, with bogus dates, try readLines(tfile)[ is.na(dataFrame$DateTime) ] Bill Dunlap TIBCO Software wdunlap tibco.com On Fri, Jul 20, 2018 at 1:30 PM, Rich Shepard wrote: > On Fri, 20 Jul 2018, David Winsemius wrote: > > I don't think you read Bill's message
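
A minimal sketch of the suggestion above, under stated assumptions: the sample file contents, the deliberate month-13 typo, and the column names date, time and elev are hypothetical, and the file is assumed to have no header line so that row i of the data frame corresponds to line i of the file.

```r
# Hypothetical sample file: four lines in "year-month-day,hour:min,numericValue"
# form, with a deliberate typo (month 13) on line 3.
tfile <- tempfile(fileext = ".txt")
writeLines(c("2015-10-01,00:00,90.6689",
             "2015-10-01,01:00,90.6506",
             "2015-13-01,02:00,90.6719",
             "2015-10-01,03:00,90.6506"), tfile)

# No header line, so data frame rows and file lines share the same index.
dataFrame <- read.csv(tfile, header = FALSE,
                      col.names = c("date", "time", "elev"),
                      colClasses = c("character", "character", "numeric"))

# An explicit format turns unparseable date-times into NA instead of an error.
dataFrame$DateTime <- as.POSIXct(paste(dataFrame$date, dataFrame$time),
                                 format = "%Y-%m-%d %H:%M", tz = "UTC")

# Pull the raw file lines whose date-time failed to parse.
badLines <- readLines(tfile)[is.na(dataFrame$DateTime)]
badLines
```

If the real file has a header line or skipped lines, the logical index would need to be shifted accordingly before subsetting the output of readLines().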

Re: [R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
On Fri, 20 Jul 2018, William Dunlap wrote: You mean each line in the file, not row in data.frame, has the form "year-month-day,hour:min,numericValue". Try the following, where tfile names your file: Bill, Yes, I was looking at the data file in one emacs buffer and my R session in another on

Re: [R] Locating data source error in large file

2018-07-20 Thread William Dunlap via R-help
> And each dataframe row has this format: >2015-10-01,00:00,90.6689 >2015-10-01,01:00,90.6506 >2015-10-01,02:00,90.6719 >2015-10-01,03:00,90.6506 You mean each line in the file, not row in data.frame, has the form "year-month-day,hour:min,numericValue". Try the following, where tfile names your

Re: [R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
On Fri, 20 Jul 2018, David Winsemius wrote: wy2016$dt_time <- with( wy2016, as.POSIXct( paste( date, time ) , format= "%Y-%m-%d %H:%M") ) David/Bill/Eric: Thank you all. I found the typos which covered a single day toward the end of the dataframe. Carpe weekend, Rich

Re: [R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
On Fri, 20 Jul 2018, Eric Berger wrote: This may not be the most efficient but it will identify the offenders. foo <- paste(wy2016$date, wy2016$time) uu <- sapply(1:length(foo), function(i) { a <- try(as.POSIXct(foo[i]),silent=TRUE) "POSIXct" %in% class(a) }) which

Re: [R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
On Fri, 20 Jul 2018, David Winsemius wrote: I don't think you read Bill's message properly. David, Obviously not. He was not saying that there were NA's; he was telling you to use a format specification in your as.POSIXct call and the result of that call would have NA's. wy2016$dt_ti

Re: [R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
On Fri, 20 Jul 2018, William Dunlap wrote: Which format did you use when you used is.na on the output of as.POSIXlt(strings, format=someFormat) and found none? Did the resulting dates look OK? Perhaps all is well. Bill, All dates here are kept as yyyy-mm-dd. And each dataframe row ha

Re: [R] Locating data source error in large file

2018-07-20 Thread David Winsemius
> On Jul 20, 2018, at 11:58 AM, Rich Shepard wrote: > > On Fri, 20 Jul 2018, William Dunlap wrote: > >> The problem occurs because no commonly used format works on all your date >> strings. If you give as.POSIXlt the format you want to use then items that >> don't match the format will be trea

Re: [R] Locating data source error in large file

2018-07-20 Thread William Dunlap via R-help
Which format did you use when you used is.na on the output of as.POSIXlt(strings, format=someFormat) and found none? Did the resulting dates look OK? Perhaps all is well. Note that the common American format month/day/year is not one that is tested when you don't supply a format - xx/yy/ i

Re: [R] Locating data source error in large file

2018-07-20 Thread Eric Berger
Hi Rich, This may not be the most efficient but it will identify the offenders. > foo <- paste(wy2016$date, wy2016$time) > uu <- sapply(1:length(foo), function(i) { a <- try(as.POSIXct(foo[i]),silent=TRUE) "POSIXct" %in% class(a) }) > which(!uu) HTH, Eric On Fri, J
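
The element-by-element check above can be sketched as follows. This is a variation, not Eric's exact code: it swaps sapply() for vapply() and the class test for inherits(), and the three sample strings (the third with month 13) are hypothetical.

```r
# Hypothetical date-time strings; the third has an impossible month.
foo <- c("2015-10-01 00:00", "2015-10-01 01:00", "2015-13-01 02:00")

# TRUE where as.POSIXct() succeeds on the single string, FALSE where it
# raises an error (try() then returns a "try-error" object instead).
ok <- vapply(foo, function(s) {
  a <- try(as.POSIXct(s, tz = "UTC"), silent = TRUE)
  inherits(a, "POSIXct")
}, logical(1), USE.NAMES = FALSE)

# Indices of the strings that could not be converted.
offenders <- which(!ok)
offenders
```

Because no format is supplied, as.POSIXct() tries several default formats in turn, so a string with a valid date but a malformed time may still be accepted as a date only. Supplying an explicit format and checking for NA is likely the stricter test.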

Re: [R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
On Fri, 20 Jul 2018, William Dunlap wrote: The problem occurs because no commonly used format works on all your date strings. If you give as.POSIXlt the format you want to use then items that don't match the format will be treated as NA's. Use is.na() to find them. Bill, No NAs found using

Re: [R] Locating data source error in large file

2018-07-20 Thread William Dunlap via R-help
The problem occurs because no commonly used format works on all your date strings. If you give as.POSIXlt the format you want to use then items that don't match the format will be treated as NA's. Use is.na() to find them. > d <- c("2017-12-25", "2018-01-01", "10/31/2018") > as.POSIXlt(d) Error i
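
A minimal sketch of the advice above, reusing the three strings from the example. The format "%Y-%m-%d" and the UTC time zone are assumptions chosen for illustration.

```r
# Two year-month-day strings and one month/day/year string.
d <- c("2017-12-25", "2018-01-01", "10/31/2018")

# With an explicit format, strings that do not match become NA
# instead of stopping the whole call with an error.
parsed <- as.POSIXlt(d, format = "%Y-%m-%d", tz = "UTC")

# Positions and values of the entries that failed to parse.
bad <- which(is.na(parsed))
bad
d[bad]
```

The same pattern applies to a data frame column: parse with the intended format, then use is.na() on the result to index back into the original strings.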

[R] Locating data source error in large file

2018-07-20 Thread Rich Shepard
The structure of the dataframe is str(wy2016) 'data.frame': 8784 obs. of 4 variables: $ date : chr "2015-10-01" "2015-10-01" "2015-10-01" "2015-10-01" ... $ time : chr "00:00" "01:00" "02:00" "03:00" ... $ elev : num 90.7 90.7 90.7 90.7 90.7 ... $ myDate: Date, format: "2015-10-01"