The data file begins this way:
year,month,day,hour,min,fps
2016,03,03,12,00,1.74
2016,03,03,12,10,1.75
2016,03,03,12,20,1.76
2016,03,03,12,30,1.81
2016,03,03,12,40,1.79
2016,03,03,12,50,1.75
2016,03,03,13,00,1.78
2016,03,03,13,10,1.81

The script to process it:
library('tidyverse')
vel <- read.csv('../data/water/vel.dat', header = TRUE, sep = ',', 
stringsAsFactors = FALSE)
vel$year <- as.integer(vel$year)
vel$month <- as.integer(vel$month)
vel$day <- as.integer(vel$day)
vel$hour <- as.integer(vel$hour)
vel$min <- as.integer(vel$min)
vel$fps <- as.double(vel$fps)

# use dplyr to group_by() year and month; summarize() to get monthly
# means
vel_by_month <- vel %>%
    group_by(year, month) %>%
    summarize(flow = mean(fps, na.rm = TRUE))

R's display after running the script:
source('vel.R')
`summarise()` has grouped output by 'year'. You can override using the 
`.groups` argument.
Warning messages:
1: In eval(ei, envir) : NAs introduced by coercion
2: In eval(ei, envir) : NAs introduced by coercion
3: In eval(ei, envir) : NAs introduced by coercion
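
The three coercion warnings suggest that some fields could not be converted to numbers. A minimal sketch for locating such records, assuming the cause is non-numeric text somewhere in the file (not confirmed here); the sample data and the names csv_text, fails, and bad_rows are made up for illustration:

```r
# Hypothetical sample: the third record is deliberately malformed.
csv_text <- "year,month,day,hour,min,fps
2016,03,03,12,00,1.74
2016,03,03,12,10,1.75
2016,xx,03,12,20,n/a
2016,03,03,12,30,1.81"

# Read every column as character so nothing is coerced on input.
raw <- read.csv(text = csv_text, colClasses = "character")

# TRUE where a non-missing string fails numeric conversion.
fails <- function(x) !is.na(x) & is.na(suppressWarnings(as.numeric(x)))

# Combine the per-column checks into one flag per record.
bad <- Reduce(`|`, lapply(raw, fails))
bad_rows <- which(bad)

# Show the offending records as they appear in the input.
print(raw[bad_rows, ])
```

To try this on the real data, the text = csv_text argument would be replaced by the path to vel.dat.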

The dataframe created by the read.csv() command:
head(vel)
  year month day hour min  fps
1 2016     3   3   12   0 1.74
2 2016     3   3   12  10 1.75
3 2016     3   3   12  20 1.76
4 2016     3   3   12  30 1.81
5 2016     3   3   12  40 1.79
6 2016     3   3   12  50 1.75

and the resulting grouping:
vel_by_month
# A tibble: 67 × 3
# Groups:   year [8]
    year month   flow
   <int> <int>  <dbl>
 1     0    NA NaN
 2  2016     3   2.40
 3  2016     4   3.00
 4  2016     5   2.86
 5  2016     6   2.51
 6  2016     7   2.18
 7  2016     8   1.89
 8  2016     9   1.38
 9  2016    10   1.73
10  2016    11   2.01
# … with 57 more rows

I cannot find why row 1 of the result (year 0, month NA, flow NaN) is
there. Other data sets don't produce this result.
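
For what it's worth, a row of this shape can be reproduced with made-up data. The sketch below assumes a single record whose year is 0 and whose month and fps are NA; whether vel.dat actually contains such a record is not established here, and the names toy and toy_by_month are hypothetical:

```r
library(dplyr)

# Hypothetical data: two valid readings plus one malformed record.
toy <- data.frame(
  year  = c(2016L, 2016L, 0L),
  month = c(3L, 3L, NA_integer_),
  fps   = c(1.74, 1.75, NA_real_)
)

# group_by() keeps NA as its own group; the mean of a group with no
# non-missing values (after na.rm = TRUE) is NaN.
toy_by_month <- toy %>%
  group_by(year, month) %>%
  summarize(flow = mean(fps, na.rm = TRUE), .groups = "drop")

print(toy_by_month)
```

The first row of toy_by_month has year 0, month NA, and flow NaN, matching the shape of the unexpected row above.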

TIA,

Rich
