I have a long file with hourly precipitation from 2000 to 2018. I would
like to select only one year, or even half a year, and plot its
cumulative precipitation in order to compare it with the simulation
data that I have.
So far I have only been able to read the whole file:
dati <- read.csv(file="116.txt", header=FALSE, sep=",", na.strings="-999", skip=6)
and to plot the cumulative sum over the entire record:
P <- cumsum(dati$PREC)
plot(dati$DATAORA, P)
How can I select only 2013, for example, and compute P for that year?
On Mon, 28 Jan 2019 at 02:36, Jeff Newmiller wrote:
I have no idea what you mean when you say "select starting date and
date properly form [sic] datai$DATA". For one thing there is no column
called DATA, and for another I don't know what starting dates and
dates you might be interested in. If you need help to subset by time,
perhaps you should ask a question about that instead.
Here is a reproducible example of making monthly data and manipulating
it using artificial data:
library(zoo) # provides as.yearmon
Sys.setenv( TZ = "GMT" )
# three years of hourly timestamps
dati <- data.frame( DATAORA = as.POSIXct( "2012-01-01" )
                    + as.difftime( seq( 0, 365*3*24 ), units="hours" ) )
# terrible simulation of precipitation
dati$PREC <- 0.1 * trunc( 50 * rbeta( nrow( dati ), 1, 80 ) )
dati$ym <- as.yearmon( dati$DATAORA )
# aggregate usually reduces the number of rows given to it
datim <- aggregate( list( PREC = dati$PREC ) # data to summarize
                  , dati[ , "ym", drop=FALSE ] # columns to group on
                  , FUN = sum # calculation on data
                  )
plot(PREC ~ ym, data=datim) # This is how I would usually look at it
as.year <- function(x) floor( as.numeric( x ) ) # from help file on
datim$y <- as.year( datim$ym )
# ave typically does not change the number of rows given to it
datim$PMES <- ave( datim$PREC, datim$y, FUN = cumsum)
plot(PMES ~ ym, data=datim) # My guess as to what you asked for?
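
For the original question (only 2013, or only half of a year), a minimal sketch in base R might look like the following. It assumes the same column names as above (DATAORA as POSIXct, PREC as numeric) and uses artificial data again; the names d2013, dhalf, start and end are made up for illustration, and treating missing values as zero in the cumulative sum is an assumption that may not suit every data set.

```r
# Artificial hourly data with the same layout as above (DATAORA, PREC)
set.seed(1)
dati <- data.frame(
  DATAORA = seq(as.POSIXct("2012-01-01 00:00:00", tz = "GMT"),
                as.POSIXct("2014-12-31 23:00:00", tz = "GMT"),
                by = "hour")
)
dati$PREC <- 0.1 * trunc(50 * rbeta(nrow(dati), 1, 80))

# Whole year: keep rows whose year part is 2013
d2013 <- dati[format(dati$DATAORA, "%Y", tz = "GMT") == "2013", ]
# Assumption: missing values contribute nothing to the running total
d2013$P <- cumsum(ifelse(is.na(d2013$PREC), 0, d2013$PREC))

# Half a year: an explicit half-open window [start, end)
start <- as.POSIXct("2013-01-01 00:00:00", tz = "GMT")
end   <- as.POSIXct("2013-07-01 00:00:00", tz = "GMT")
dhalf <- dati[dati$DATAORA >= start & dati$DATAORA < end, ]
dhalf$P <- cumsum(ifelse(is.na(dhalf$PREC), 0, dhalf$PREC))

plot(P ~ DATAORA, data = d2013, type = "l")
```

The half-open window avoids counting the first hour of July twice if the second half of the year is selected the same way.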
> Very succinct, Rui!
> One warning to Diego: automatic data recorders tend to use the local
> standard timezone year-round. By default, R converts timestamps from
> character to POSIXct using the current timezone on your computer, which
> may not be the zone the logger was in; even more commonly, the computer
> follows daylight saving time. This leads to NAs showing up in your
> converted timestamps in spring and duplicated values in autumn as the
> data are misinterpreted. The easiest solution can be to use
> Sys.setenv( TZ="GMT" )
> though if you need the actual timezone you can use a zone name of the
> form "Etc/GMT+5" (5 hrs west of GMT).
> Note that Rui's solution will only work correctly near the month
> transition if you pretend the data timezone is GMT or UTC. (Technically
> these are different so your mileage may vary, but most implementations
> treat them as identical and I have not encountered any cases where they
> differ.)
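
As a small illustration of the timezone advice quoted above, the sketch below parses character timestamps with an explicit fixed-offset zone instead of relying on the computer's setting. The timestamps are artificial, and "Etc/GMT+5" is only the example zone from the message; the zone that actually matches a given logger is something to check against its documentation.

```r
# Artificial character timestamps spanning a spring daylight-saving
# change in parts of North America (10 March 2013)
stamps <- format(seq(as.POSIXct("2013-03-10 00:00:00", tz = "UTC"),
                     by = "hour", length.out = 6),
                 "%Y-%m-%d %H:%M:%S")

# Parse as local standard time, 5 hours west of GMT, with no DST rules
x <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M:%S", tz = "Etc/GMT+5")

# A fixed-offset zone keeps the hourly spacing intact: no NAs, no repeats
stopifnot(!anyNA(x), !anyDuplicated(x))
stopifnot(all(diff(as.numeric(x)) == 3600))

# The same instants expressed in UTC are 5 hours later on the clock
format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Passing tz explicitly to as.POSIXct keeps the conversion independent of the session-wide TZ setting, which can be convenient when several loggers in different zones are read in one script.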
