[R] Rolling 7 day incidence

2021-08-17 Thread Dr Eberhard Lisse

Hi,

I am loading the coronavirus dataset every day, which looks something like:


 as_tibble(coronavirus) %>%
 filter(country=="Namibia" & type=="confirmed") %>%
 arrange(desc(date)) %>%
 print(n=10)

 # A tibble: 573 × 7
    date       province country   lat  long type      cases
  1 2021-08-16 ""       Namibia -23.0  18.5 confirmed    76
  2 2021-08-15 ""       Namibia -23.0  18.5 confirmed   242
  3 2021-08-14 ""       Namibia -23.0  18.5 confirmed   130
  4 2021-08-13 ""       Namibia -23.0  18.5 confirmed   280
  5 2021-08-12 ""       Namibia -23.0  18.5 confirmed   214
  6 2021-08-11 ""       Namibia -23.0  18.5 confirmed    96
  7 2021-08-10 ""       Namibia -23.0  18.5 confirmed   304
  8 2021-08-09 ""       Namibia -23.0  18.5 confirmed   160
  9 2021-08-08 ""       Namibia -23.0  18.5 confirmed   229
 10 2021-08-07 ""       Namibia -23.0  18.5 confirmed   319
 # … with 563 more rows

How do I do a rolling 7 day incidence (i.e. sum the cases over 7 days) but
rolling, i.e. from the last day to 7 (or 6?) days before the end of the
dataset, so I get pairs of date/7-Day-Incidence?

I know it's probably re-inventing the plot as it were but I can't find
R code to do that.

I want to plot it per 100,000 but that I can do.

greetings, el


--
To email me replace 'nospam' with 'el'

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rolling 7 day incidence

2021-08-17 Thread PIKAL Petr
Hi.

There are several ways to do it. You can find them easily with Google,
e.g.

https://stackoverflow.com/questions/19200841/consecutive-rolling-sums-in-a-vector-in-r

where you find several options.
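
Two of the base R options of that kind can be sketched as follows. This is a minimal sketch, not code from the linked page: the `cases` values are made up for illustration, and `roll7` and `roll7_cs` are hypothetical names.

```r
# Daily counts, oldest first (hypothetical example values)
cases <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)

# Trailing 7-value sum: element i is sum(cases[(i - 6):i]).
# sides = 1 uses only the current and earlier values, so the
# first 6 results are NA (no complete window yet).
roll7 <- as.numeric(stats::filter(cases, rep(1, 7), sides = 1))

# The same sums from differences of the cumulative sum
cs <- cumsum(cases)
roll7_cs <- cs - c(rep(NA, 6), 0, head(cs, -7))

roll7
roll7_cs
```

Both vectors line up with `cases`, so they can be bound to the date column directly.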

Cheers
Petr





Re: [R] Including percentage values inside columns of a histogram

2021-08-17 Thread Rui Barradas

Hello,

I had forgotten about plot.histogram, it does make everything simpler.
To have percentages on the bars, in the code below I use package scales.

Note that it seems to me that you do not want densities, to have 
percentages,  the proportions of counts are given by any of


h$counts/sum(h$counts)
h$density*diff(h$breaks)
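
A minimal check that the two expressions agree, using the built-in `cars` data purely as a stand-in (an assumption; it is not the poster's data):

```r
# Built-in data set, used only for illustration
h <- hist(cars$speed, plot = FALSE)

# Proportion of observations per bin, computed two ways
prop_counts  <- h$counts / sum(h$counts)
prop_density <- h$density * diff(h$breaks)

all.equal(prop_counts, prop_density)

# The densities alone sum to 1 only when every bin has width 1
sum(h$density)
sum(prop_density)
```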



# One histogram for all dates
h <- hist(datasetregs$Amount, plot = FALSE)
plot(h, labels = scales::percent(h$counts/sum(h$counts)),
 ylim = c(0, 1.1*max(h$counts)))



# Histograms by date
sp <- split(datasetregs, datasetregs$Date)
old_par <- par(mfrow = c(1, 3))
h_list <- lapply(seq_along(sp), function(i){
  hist_title <- paste("Histogram of", names(sp)[i])
  h <- hist(sp[[i]]$Amount, plot = FALSE)
  plot(h, main = hist_title, xlab = "Amount",
   labels = scales::percent(h$counts/sum(h$counts)),
   ylim = c(0, 1.1*max(h$counts)))
})
par(old_par)


Hope this helps,

Rui Barradas

At 01:49 on 17/08/21, Bert Gunter wrote:

I may well misunderstand, but proffered solutions seem more complicated
than necessary.
Note that the return of hist() can be saved as a list of class "histogram"
and then plotted with  plot.histogram(), which already has a "labels"
argument that seems to be what you want. A simple example is:

dat <- runif(50, 0, 10)
myhist <- hist(dat, freq = TRUE, breaks ="Sturges")

plot(myhist, col = "darkgray",
  labels = as.character(round(myhist$density*100,1) ),
  ylim = c(0, 1.1*max(myhist$counts)))
## note that this is plot.histogram because myhist has class "histogram"

Note that I expanded the y axis a bit to be sure to include the labels. You
can, of course, plot your separate years as Rui has indicated or via e.g.
?layout.

Apologies if I have misunderstood. Just ignore this in that case.
Otherwise, I leave it to you to fill in details.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Aug 16, 2021 at 4:14 PM Paul Bernal wrote:


Dear Jim,

Thank you so much for your kind reply. Yes, this is what I am looking for,
however, I can't see clearly how the bars correspond to the bins in the
x-axis. Maybe there is a way to align the amounts so that they match the
columns, sorry if I sound picky, but just want to learn if there is a way
to accomplish this.

Best regards,

Paul

On Mon, 16 Aug 2021 at 17:57, Jim Lemon wrote:


Hi Paul,
I just worked out your first request:

datasetregs <- structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"), class =
"factor"),
 Amount = c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
 15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
 15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
 15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
 16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
 15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
 15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
 15000, 15000, 15000, 15000)), row.names = c(NA, -74L), class =
"data.frame")
histval<-with(datasetregs, hist(Amount, groups=Date, scale="frequency",
  breaks="Sturges", col="darkgray"))
library(plotrix)
histpcts<-paste0(round(100*histval$counts/sum(histval$counts),1),"%")
barlabels(histval$mids,histval$counts,histpcts)

I think that's what you asked for:

Jim

On Tue, Aug 17, 2021 at 8:44 AM Paul Bernal wrote:


This is way better, now, how could I put the frequency labels in the
columns as a percentage, instead of presenting them as counts?

Thank you so much.

Paul

On Mon, 16 Aug 2021 at 17:33, Rui Barradas wrote:


Hello,

You forgot to cc the list.

Here are two ways; both apply hist() and text() to Amount split by Date.
The return value of hist() is saved because it is a list whose members
include the midpoints of the histogram bars and the counts. Those are
used to know where to put the text labels.
A vector lbls is created to get rid of counts of zero.

The main difference between the two ways is the histogram's titles.


old_par <- par(mfrow = c(1, 3))
h_list <- with(datasetregs, tapply(Amount, Date, function(x){
h <- hist(x)
lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
text(h$mids, h$counts/2, labels = lbls)
}))
par(old_par)



old_par <- par(mfrow = c(1, 3))
sp <- split(datasetregs, datasetregs$Date)
h_list <- lapply(seq_along(sp), function(i){
hist_title <- paste("Histogram of", names(sp)[i])
h <- hist(sp[[i]]$Amount, main = hist_title)
lbls <- ife

Re: [R] Rolling 7 day incidence

2021-08-17 Thread Dr Eberhard Lisse

Petr,

thank you very much, this pointed me in the right direction (to refine
my Google search :-)-O):

 library(tidyverse)
 library(coronavirus)
 library(zoo)

 as_tibble(coronavirus) %>%
 filter(country=='Namibia' & type=="confirmed") %>%
 mutate(rollsum = rollapplyr(cases, 7, sum, partial=TRUE)) %>%
 arrange(desc(date)) %>%
 mutate(R7=rollsum / 25.4 )  %>%
 select(date,R7)

gives me something like

 # A tibble: 573 × 2
    date          R7
  1 2021-08-16  52.8
  2 2021-08-15  56.1
  3 2021-08-14  55.6
  4 2021-08-13  63.1
  5 2021-08-12  62.8
  6 2021-08-11  63.7
  7 2021-08-10  67.3
  8 2021-08-09  69.3
  9 2021-08-08  69.2
 10 2021-08-07  74.5
 # … with 563 more rows

which seems to be correct :-)-O so I can now play with ggplot2 over the
weekend :-)-O
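
As a cross-check without zoo, the same numbers can be reproduced in base R from the ten daily counts printed in the first message. This is only a sketch: `roll7` and `r7` are hypothetical names, and reading the divisor 25.4 as the population in units of 100,000 is an assumption.

```r
# The ten most recent daily counts from the first message, oldest first
dates <- seq(as.Date("2021-08-07"), as.Date("2021-08-16"), by = "day")
cases <- c(319, 229, 160, 304, 96, 214, 280, 130, 242, 76)

# Trailing 7-day sum; windows at the start are shorter (partial),
# which mirrors rollapplyr(..., partial = TRUE)
roll7 <- vapply(seq_along(cases),
                function(i) sum(cases[max(1, i - 6):i]),
                numeric(1))

# Scale with the divisor 25.4 used in the pipeline above
r7 <- data.frame(date = dates, R7 = round(roll7 / 25.4, 1))

# Only the last four rows have a full 7-day window in this excerpt
tail(r7, 4)
```

The last four rows agree with the values for 2021-08-13 to 2021-08-16 shown above; earlier rows differ because this excerpt lacks the preceding days.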

greetings, el

On 17/08/2021 12:46, PIKAL Petr wrote:

Hi.

There are several ways how to do it.  You could find them easily using
Google.  e.g.

https://stackoverflow.com/questions/19200841/consecutive-rolling-sums-in-a-vector-in-r

where you find several options.

Cheers
Petr

[...]


--
To email me replace 'nospam' with 'el'



Re: [R] Rolling 7 day incidence

2021-08-17 Thread PIKAL Petr
Hi

You're welcome. You probably know

https://www.repidemicsconsortium.org/projects/

as a collection of tools for epidemic evaluation.

Cheers
Petr



[R] Cars2

2021-08-17 Thread George Bellas
Hi R Core team,

I was looking at this link for data sets,

https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html

and wanted to use cars2. I found cars2 earlier this week, but now when I look 
at the list, it's not there!

Can you tell me what happened to cars2? I have tried calling it on my version 
of R4.0.5 (backdated so Shiny works), but I can't find it anymore.

Thanks,
George


[[alternative HTML version deleted]]



Re: [R] Cars2

2021-08-17 Thread Ivan Krylov
On Tue, 17 Aug 2021 09:50:54 +
George Bellas  wrote:

> I found cars2 earlier this week, but now when I look at the list,
> it's not there!

That sounds like a dataset provided by a contributed package. Does
https://search.r-project.org/ help find it again?
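
A local check along the same lines can be sketched with data(), which lists the data sets of installed packages. The names `base_items` and `all_data` are hypothetical; whether anything matching "cars" turns up beyond the base data sets depends on what is installed.

```r
# Data sets that ship with base R's "datasets" package
base_items <- data(package = "datasets")$results[, "Item"]
"cars"  %in% base_items
"cars2" %in% base_items

# Data sets in every installed package (may take a moment to scan)
all_data <- data(package = .packages(all.available = TRUE))$results
all_data[grepl("^cars", all_data[, "Item"]), c("Package", "Item"), drop = FALSE]
```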

-- 
Best regards,
Ivan

>   [[alternative HTML version deleted]]

P.S. When you compose hybrid HTML + plain text e-mail, the HTML version
gets stripped by the mailing list software and everyone gets the plain
text version, which may look wildly different from the HTML original.
Please stick to plain text e-mail to avoid nasty surprises.



Re: [R] Including percentage values inside columns of a histogram

2021-08-17 Thread Bert Gunter
Inline below.



On Tue, Aug 17, 2021 at 4:09 AM Rui Barradas  wrote:
>
> Hello,
>
> I had forgotten about plot.histogram, it does make everything simpler.
> To have percentages on the bars, in the code below I use package scales.
>
> Note that it seems to me that you do not want densities, to have
> percentages,  the proportions of counts are given by any of

Under the default of equal width bins -- which is what Sturges gives
if I read the docs correctly -- since the densities sum to 1, they are
already the proportion of counts in each histogram bin, no?

-- Bert



Re: [R] Cars2

2021-08-17 Thread Rui Barradas

Hello,

That seems to be a subset of dataset cars.
Are you sure that it wasn't created before by code?
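
One way to check this, sketched under the assumption that cars2 was a subset made in an earlier session (the filter `speed > 10` is purely hypothetical):

```r
# Is there any object called cars2 on the search path right now?
exists("cars2")

# Where does cars itself come from?
find("cars")

# Hypothetical: earlier code may have created a subset like this,
# which then lived in the workspace rather than in any package
cars2 <- subset(cars, speed > 10)
find("cars2")
```

If find("cars2") reports the workspace rather than a package, the object was created by code and will vanish with a fresh session.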


Hope this helps,

Rui Barradas



Re: [R] Including percentage values inside columns of a histogram

2021-08-17 Thread Rui Barradas

Hello,



At 19:28 on 17/08/21, Bert Gunter wrote:
> Inline below.
>
> On Tue, Aug 17, 2021 at 4:09 AM Rui Barradas wrote:
>>
>> Hello,
>>
>> I had forgotten about plot.histogram, it does make everything simpler.
>> To have percentages on the bars, in the code below I use package scales.
>>
>> Note that it seems to me that you do not want densities, to have
>> percentages,  the proportions of counts are given by any of
>
> Under the default of equal width bins -- which is what Sturges gives

Right.

> if I read the docs correctly -- since the densities sum to 1,


The "densities" do not sum to 1. From ?hist, section Value:

density 
values f^(x[i]), as estimated density values. If all(diff(breaks) == 1), 
they are the relative frequencies counts/n and in general satisfy

sum[i; f^(x[i]) (b[i+1]-b[i])] = 1, where b[i] = breaks[i].


If all(diff(breaks) == 1) is FALSE, the density list member must be 
multiplied by diff(.$breaks)



h <- hist(datasetregs$Amount, plot = FALSE)
sum(h$density)
#[1] 1e-04
diff(h$breaks)
#[1] 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000
sum(h$density*diff(h$breaks))
#[1] 1


Hope this helps,

Rui Barradas

> they are
> already the proportion of counts in each histogram bin, no?
>
> -- Bert





[R] Package for "design graphs"

2021-08-17 Thread madsmh
Hi,

in our course literature a "design graph" of two factors R and S with
associated maps s : I -> S and f : I -> S where I is some finite index
set, is a graph with factor labels as vertices and lines f(i) to s(i)
for all observations i in I. Is there a package on CRAN that can draw
graphs like this automatically?

I haven't been able to find anything by searching.

Regards, Mads



Re: [R] Including percentage values inside columns of a histogram

2021-08-17 Thread Bert Gunter
Ah yes. Duhhh...  Thanks Rui.

So h$density *diff(h$breaks) *100 will give the percentages. No need
for arithmetic beyond that.
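
A minimal sketch of that, again with the built-in `cars` data as a stand-in (an assumption; `pct` and `lbl` are hypothetical names) and base sprintf() instead of package scales:

```r
# Built-in data set, used only for illustration
h <- hist(cars$dist, plot = FALSE)

# Percentage of observations in each bin
pct <- h$density * diff(h$breaks) * 100
lbl <- sprintf("%.1f%%", pct)

# plot.histogram() accepts a character vector of bar labels
plot(h, labels = lbl, ylim = c(0, 1.1 * max(h$counts)))
```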

Bert


Re: [R] Package for "design graphs"

2021-08-17 Thread Bert Gunter
Have you looked at the gR Task View on CRAN:
https://cran.r-project.org/web/views/gR.html

(I have no idea whether it's relevant to your query, though).

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: [R] Package for "design graphs"

2021-08-17 Thread Jeff Newmiller
Perhaps igraph or DiagrammeR?
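
For either package the input would be an edge list. A rough sketch of building one in base R, assuming each observation i carries one level of R and one level of S (the data and the names `obs`, `edges` and `vertices` are made up for illustration):

```r
# Hypothetical observations: the level of R and of S for each index i
obs <- data.frame(
  R = factor(c("r1", "r1", "r2", "r2", "r3")),
  S = factor(c("s1", "s2", "s2", "s2", "s3"))
)

# One edge per distinct (R, S) combination that occurs in the data
edges <- unique(obs)

# Vertices are the factor labels of both factors
vertices <- c(levels(obs$R), levels(obs$S))

edges
vertices

# If igraph is installed, the edge list could then be drawn with e.g.
# g <- igraph::graph_from_data_frame(edges, directed = FALSE)
# plot(g)
```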


-- 
Sent from my phone. Please excuse my brevity.



Re: [R] [External] Package for "design graphs"

2021-08-17 Thread Richard M. Heiberger
Can you post an example of the graph?

