> On Oct 19, 2016, at 4:12 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>
>
>> On Oct 19, 2016, at 1:54 PM, Rich Shepard <rshep...@appl-ecosys.com> wrote:
>>
>> The file, daily_records.dat, contains these data:
>>
>> "station","date","amount"
>> "0.3E",2014-01-01,
>> "0.3E",2014-01-02,
>> "0.3E",2014-01-03,0.01
>> "0.3E",2014-01-04,0.00
>> "0.3E",2014-01-05,0.00
>> "0.3E",2014-01-06,0.00
>> "0.3E",2014-01-07,0.10
>> "0.3E",2014-01-08,0.22
>> "0.3E",2014-01-09,0.49
>>
>> Using read.table("daily_records.dat", header = TRUE, sep = ",", quote =
>> "\"\"") the data are assigned to a data.frame named 'rain.'
>>
>> I expect the structure to show station and date as factors with amount as
>> numeric, but they're all factors:
>
> I got both station and amounts as numeric:
>
> dat <- read.table(text='"station","date","amount"
> "0.3E",2014-01-01,
> "0.3E",2014-01-02,
> "0.3E",2014-01-03,0.01
> "0.3E",2014-01-04,0.00
> "0.3E",2014-01-05,0.00
> "0.3E",2014-01-06,0.00
> "0.3E",2014-01-07,0.10
> "0.3E",2014-01-08,0.22
> "0.3E",2014-01-09,0.49', header = TRUE, sep = ",",quote =
> "\"\"")
> str(dat)
> 'data.frame': 9 obs. of 3 variables:
> $ station: num 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3
> $ date : Factor w/ 9 levels "2014-01-01","2014-01-02",..: 1 2 3 4 5 6 7 8 9
> $ amount : num NA NA 0.01 0 0 0 0.1 0.22 0.49
>
>
> Why aren't you using colClasses?
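
For reference, a minimal sketch of what David's colClasses suggestion might look like. The first three data rows are copied from Rich's post; the "0.6W" row is invented for illustration (only the station level appears in Rich's str() output), and the choice of "factor", "Date" and "numeric" is an assumption about the classes Rich wants:

```r
# Sample rows from the thread; the "0.6W" row is a made-up addition
txt <- '"station","date","amount"
"0.3E",2014-01-01,
"0.3E",2014-01-02,
"0.3E",2014-01-03,0.01
"0.6W",2014-01-04,0.22'

# Declare the column classes up front instead of letting read.table()
# guess them; empty numeric fields are read as NA
rain <- read.table(text = txt, header = TRUE, sep = ",", quote = "\"",
                   colClasses = c("factor", "Date", "numeric"))
str(rain)
```

With the classes declared, "0.3E" is no longer mistaken for a number in scientific notation, and a non-numeric entry in 'amount' should stop the read with an error rather than silently turning the column into a factor.
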
'station' comes over as numeric because the 'E' is presumed to be for
scientific notation in the limited data copied here. It appears that the
actual data file has a 'W' suffix, presumably for a directional designation
(East versus West), as seen below in Rich's str() output.

> str(type.convert("0.3E"))
 num 0.3

> str(type.convert("0.3W"))
 Factor w/ 1 level "0.3W": 1

> str(type.convert(c("0.3E", "0.3W")))
 Factor w/ 2 levels "0.3E","0.3W": 1 2

As David and Duncan experienced, 'amount' came over as numeric for me as
well, again with the limited data here. So as Duncan noted, there is likely a
value somewhere in that column that results in the coercion to factor when
?type.convert is applied to the column, because the value is not a proper
number.

Regards,

Marc Schwartz

>
>
>>
>> str(rain)
>> 'data.frame': 341 obs. of 3 variables:
>> $ station: Factor w/ 6 levels "0.3E","0.6W",..: 1 1 ...
>> $ date : Factor w/ 62 levels "2013-12-01","2013-12-02",..: 32 33 34 ...
>> $ amount : Factor w/ 48 levels "","0.00","0.01",..: 1 1 3 2 ...
>>
>> Why is amount taken as a factor rather than numeric? I do not recall
>> having numbers read as factors before this.
>>
>> I expect to need to convert dates using as.Date() but not to convert
>> numbers.
>>
>> TIA,
>>
>> Rich
>

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
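
One possible way to track down the value that forces 'amount' to a factor, sketched on a stand-in vector. The thread does not say what the offending entry actually is, so the "T" below is purely an invented example; on the real data one would use rain$amount in place of the stand-in:

```r
# Hypothetical stand-in for rain$amount; "T" is an invented bad value
amount <- factor(c("", "", "0.01", "0.00", "T", "0.22"))

# Go through character first: as.numeric() applied directly to a factor
# returns the integer level codes, not the values
num <- suppressWarnings(as.numeric(as.character(amount)))

# Entries that are not empty but still failed to convert to a number
bad <- is.na(num) & as.character(amount) != ""
unique(as.character(amount[bad]))
```

Whatever this returns on the real column would be the entries that are not proper numbers; which(bad) gives their row positions.
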