This seems to work. A couple of fine points, including handling duplicated Pct
values right, which is easier if you do the reversed cumsum.
> dd2 <- dummydata[order(dummydata$Pct),]
> dd2$Cum <- rev(cumsum(rev(dd2$Totpop)))
> use <- !duplicated(dd2$Pct)
> approx(dd2$Pct[use], dd2$Cum[use], ctof,
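
The call above is cut off in this listing, so the following is only a sketch of how the reversed-cumsum idea might be completed, not necessarily the original code. The `approx()` arguments (`method`, `f`, `yleft`, `yright`) and the `cutoff` vector are assumptions; the data frame is the ten-tract example posted elsewhere in the thread.

```r
# Example data as posted elsewhere in the thread
dat <- data.frame(
  Tract  = 1:10,
  Pct    = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
  Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800)
)

# Hypothetical cutoffs, written as literals so that ties with Pct compare exactly
cutoff <- c(0, 0.03, 0.05, 0.10, 0.25)

# Sort by Pct, then accumulate Totpop from the largest Pct downwards,
# so Cum[i] is the population of all tracts with Pct >= dd2$Pct[i]
dd2 <- dat[order(dat$Pct), ]
dd2$Cum <- rev(cumsum(rev(dd2$Totpop)))

# Keep the first row of each run of duplicated Pct values;
# its Cum already includes every tract sharing that Pct
use <- !duplicated(dd2$Pct)

# Step-function lookup: f = 1 takes the value at the next Pct at or above
# the cutoff; cutoffs below min(Pct) get the total, above max(Pct) get 0
pop <- approx(dd2$Pct[use], dd2$Cum[use], xout = cutoff,
              method = "constant", f = 1,
              yleft = sum(dd2$Totpop), yright = 0)$y

result <- data.frame(Cutoff = cutoff, Pop = pop)
print(result)
```

Setting `yright = 0` matters here: with `rule = 2` alone, a cutoff above the largest Pct would return the population of the top tract rather than zero.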
Sorry, misstatements. It should (of course) read:
If one makes the reasonable assumption that Pct is much larger than
Cutoff, sorting Pct is the expensive part, e.g., O(n log2(n)) for
Quicksort (n = length of Pct). I believe looping is O(n^2).
etc.
On Mon, Oct 16, 2023 at 7:48 AM Bert Gunter wrote:
>
>
If one makes the reasonable assumption that Pct is much larger than
Cutoff, sorting Cutoff is the expensive part, e.g., O(n log2(n)) for
Quicksort (n = length of Cutoff). I believe looping is O(n^2). Jeff's
approach using findInterval may be faster. Of course implementation
details matter.
-- Bert
On Mo
Dear Jason,
The code could look something like:
dummyData = data.frame(Tract=seq(1, 10, by=1),
Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
# Define the cutoffs
# - allow for duplicate entries;
by = 0.03; # by
Dear Jason,
I do not think that the solution based on aggregate offered by GPT was
correct. That quasi-solution only aggregates for every individual level.
As I understand, you want the cumulative sum. The idea was proposed by
Bert; you need only to sort first based on the cutoff (e.g. usin
the result
result
So thanks to all for considering this query; we're in a brave new world of
AI-generated coding.
Message: 3
Date: Fri, 13 Oct 2023 20:13:56 +
From: "Jason Stout, M.D."
To: "r-help@r-project.org"
Subject: [R] Create new data frame with conditi
Pre-compute the per-interval answers and use findInterval to look up the
per-row answers...
dat <- read.table( text=
"Tract Pct Totpop
1 0.05 4000
2 0.03 3500
3 0.01 4500
4 0.12 4100
5 0.
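
The message is truncated here, so what follows is one possible reading of the `findInterval` idea rather than the original code. It looks up, for each cutoff, how many sorted Pct values fall strictly below it and subtracts their running total; the `cutoff` vector and the variable names are assumptions, and the data are the ten-tract example from the thread.

```r
# Example data as posted elsewhere in the thread
dat <- data.frame(
  Tract  = 1:10,
  Pct    = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
  Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800)
)

# Hypothetical cutoffs, written as literals so that ties with Pct compare exactly
cutoff <- c(0, 0.03, 0.05, 0.10, 0.25)

# Sort once; findInterval() requires a non-decreasing vector
ord <- order(dat$Pct)
pct_sorted <- dat$Pct[ord]

# below[k + 1] is the population of the k tracts with the smallest Pct
below <- c(0, cumsum(dat$Totpop[ord]))

# With left.open = TRUE, findInterval() returns the number of sorted
# Pct values strictly less than each cutoff
n_below <- findInterval(cutoff, pct_sorted, left.open = TRUE)

# Population at or above the cutoff = total minus population below it
pop <- sum(dat$Totpop) - below[n_below + 1]

result <- data.frame(Cutoff = cutoff, Pop = pop)
print(result)
```

Because the lookup counts values rather than matching them, duplicated Pct values need no special handling in this variant.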
Well, here's one way to do it:
(dat is your example data frame)
Cutoff <- seq(0, .15, .01)
Pop <- with(dat, sapply(Cutoff, \(p)sum(Totpop[Pct >= p])))
I think there must be a more efficient way to do it with cumsum(), though.
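
For reference, here is a self-contained run of the direct approach on the ten-tract example data from the thread. The `Cutoff` values are a shortened, hypothetical set, and `vapply()` with `function(p)` stands in for `sapply()` with the `\(p)` shorthand, which requires R 4.1 or later.

```r
# Example data as posted elsewhere in the thread
dat <- data.frame(
  Tract  = 1:10,
  Pct    = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
  Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800)
)

# Hypothetical cutoffs, written as literals so that ties with Pct compare exactly
Cutoff <- c(0, 0.03, 0.05, 0.10, 0.25)

# For each cutoff, sum the population of every tract with Pct at or above it
Pop <- with(dat, vapply(Cutoff, function(p) sum(Totpop[Pct >= p]), numeric(1)))

print(data.frame(Cutoff, Pop))
```

Each cutoff rescans every row, so the work grows with length(Cutoff) times nrow(dat); that is usually fine for a few dozen cutoffs.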
Cheers,
Bert
On Sat, Oct 14, 2023 at 12:53 AM Jason Stout, M.D. wrot
This seems like it should be simple but I can't get it to work properly. I'm
starting with a data frame like this:
Tract Pct Totpop
1 0.05 4000
2 0.03 3500
3 0.01 4500
4 0.12 4100
5 0.21