Às 01:49 de 03/11/2023, roslinazairimah zakaria escreveu:
Hi all,
This is the data:
dput(head(dt1,20))structure(list(StationName = c("PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
"PALO ALTO CA / CAMBRIDGE #1"), date = c("1/14/2016", "1/14/2016",
"1/14/2016", "1/15/2016", "1/15/2016", "1/15/2016", "1/15/2016",
"1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016",
"1/16/2016", "1/16/2016", "1/17/2016", "1/17/2016", "1/17/2016",
"1/17/2016", "1/17/2016", "1/18/2016"), time = c("12:09", "19:50",
"20:22", "8:25", "14:23", "18:17", "21:46", "10:19", "12:12",
"14:12", "16:22", "19:16", "19:19", "20:24", "9:54", "12:16",
"13:53", "19:03", "22:00", "8:58"), EnergykWh = c(4.680496, 6.272414,
1.032782, 11.004884, 10.096824, 6.658797, 4.808874, 1.469384,
2.996239, 0.303222, 4.988339, 8.131804, 0.117156, 3.285669, 1.175608,
3.677487, 1.068393, 8.820755, 8.138583, 9.0575)), row.names = c(NA,
20L), class = "data.frame")
I would like to sum EnergykW data by the date. E.g. all values for
EnergykWh on 1/14/2016
On Fri, Nov 3, 2023 at 8:10 AM jim holtman <[email protected]> wrote:
How about send a 'dput' of some sample data. My guess is that your date
is 'character' and not 'Date'.
Thanks
Jim Holtman
*Data Munger Guru*
*What is the problem that you are trying to solve?Tell me what you want to
do, not how you want to do it.*
On Thu, Nov 2, 2023 at 4:24 PM roslinazairimah zakaria <
[email protected]> wrote:
Dear all,
I have this set of data. I would like to sum the EnergykWh according date
sequences.
head(dt1,20) StationName date time EnergykWh
1 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09 4.680496
2 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50 6.272414
3 PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22 1.032782
4 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 8:25 11.004884
5 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824
6 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17 6.658797
7 PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46 4.808874
8 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19 1.469384
9 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12 2.996239
10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12 0.303222
11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22 4.988339
12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16 8.131804
13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19 0.117156
14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24 3.285669
15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 9:54 1.175608
16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16 3.677487
17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53 1.068393
18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03 8.820755
19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00 8.138583
20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016 8:58 9.057500
I have tried this:
library(dplyr)
sums <- dt1 %>%
group_by(date) %>%
summarise(EnergykWh = sum(EnergykWh))
head(sums,20)
The date is not by daily sequence but by year sequence.
head(sums,20)# A tibble: 20 × 2
date EnergykWh
<chr> <dbl> 1 1/1/2017 25.3 2 1/1/2018 61.0 3
1/1/2019 0.627 4 1/1/2020 10.7 5 1/10/2017 69.4 6
1/10/2018 54.5 7 1/10/2019 49.1 8 1/10/2020 45.9 9
1/11/2017 73.9 10 1/11/2018 53.3 11 1/11/2019 93.5 12
1/11/2020 66.7 13 1/12/2017 78.6 14 1/12/2018 42.2 15
1/12/2019 22.7 16 1/12/2020 80.9 17 1/13/2017 85.6 18
1/13/2018 46.4 19 1/13/2019 40.0 20 1/13/2020 121.
Thank you very much for any help given.
--
*Roslinazairimah Zakaria*
*Tel: +609-5492370; Fax. No.+609-5492766*
*Email: [email protected] <[email protected]>;
[email protected] <[email protected]>*
Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
[[alternative HTML version deleted]]
______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hello,
Here are two solutions.
1. Base R
Though I don't coerce the date column to class "Date", it seems to work.
aggregate(EnergykWh ~ date, dt1, sum)
#> date EnergykWh
#> 1 1/14/2016 11.98569
#> 2 1/15/2016 32.56938
#> 3 1/16/2016 21.29181
#> 4 1/17/2016 22.88083
#> 5 1/18/2016 9.05750
2. Package dplyr.
First column date is coerced from class "character" to class "Date".
Then the grouped sums are computed.
suppressPackageStartupMessages(
library(dplyr)
)
dt1 %>%
mutate(date = as.Date(date, "%m/%d/%Y")) %>%
summarise(EnergykWh = sum(EnergykWh), .by = date)
#> date EnergykWh
#> 1 2016-01-14 11.98569
#> 2 2016-01-15 32.56938
#> 3 2016-01-16 21.29181
#> 4 2016-01-17 22.88083
#> 5 2016-01-18 9.05750
As you can see, the results are the same.
Also, this exact problem is one of the most asked on StackOverflow.
Maybe you could try searching there for a solution. My code above is
also exactly the code in [1], though I had already this answer written.
I only checked after :(.
[1]
https://stackoverflow.com/questions/61548758/r-how-sum-values-by-group-by-date
Hope this helps,
Rui Barradas
--
Este e-mail foi analisado pelo software antivírus AVG para verificar a presença
de vírus.
www.avg.com
______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.