Dear Jason,
The code could look something like:
dummyData = data.frame(Tract = seq(1, 10, by = 1),
    Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
    Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
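# Define the cutoffs
# - a coarse step (by = 0.03) places several tracts in the same
#   interval; use by = 0.01 for finer intervals;
# - values above the last cutoff (here Pct = 0.21) become NA;
by = 0.03; # by = 0.01;
cutoffs <- seq(0, 0.20, by = by)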
# Create a new column with cutoffs
dummyData$Cutoff <- cut(dummyData$Pct, breaks = cutoffs,
    labels = cutoffs[-1], ordered_result = TRUE)
# Sort data
# - we could actually order only the columns:
# Totpop & Cutoff;
dummyData = dummyData[order(dummyData$Cutoff), ]
# Result
cs = cumsum(dummyData$Totpop)
# Only last entry:
# - keep the last row of each cutoff level (fromLast = TRUE);
isLast = ! duplicated(dummyData$Cutoff, fromLast = TRUE)
data.frame(Total = cs[isLast],
    Cutoff = dummyData$Cutoff[isLast])
Sincerely,
Leonard
On 10/15/2023 7:41 PM, Leonard Mada wrote:
Dear Jason,
I do not think that the solution based on aggregate offered by GPT was
correct. That quasi-solution only sums the population within each
individual level; it does not accumulate across levels.
As I understand, you want the cumulative sum. The idea was proposed by
Bert: you need only to sort first based on the cutoff (e.g. using an
ordered factor), and then extract only the last value for each level.
If Pct is unique, then you can skip this last step and use cumsum
directly (but on the sorted data set).
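A minimal sketch of that shortcut, assuming every Pct value is unique. The data frame dat and the column CumPop are hypothetical names chosen only for this illustration:

```r
# Hypothetical data in which every Pct value is unique
dat <- data.frame(
  Pct    = c(0.05, 0.01, 0.12, 0.07),
  Totpop = c(4000, 4500, 4100, 5100))

# Sort by Pct, then accumulate the population in that order
dat <- dat[order(dat$Pct), ]
dat$CumPop <- cumsum(dat$Totpop)
dat
```

Each row then holds the total population of all tracts with a Pct at or below that row's value.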
Alternatives: see the solutions with loops or with sapply.
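One possible form of the sapply variant is sketched below. It is not necessarily the version posted earlier in the thread; the names dat, cutoffs and totals are assumptions made for this example, and the cutoffs are taken as upper bounds in steps of 0.03:

```r
# Hypothetical example data (same values as the dummy data set)
dat <- data.frame(
  Pct    = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
  Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800))

# Upper bounds of the intervals (assumed); built from integers so that
# the boundaries compare exactly with the decimal values in Pct
cutoffs <- seq(3, 21, by = 3) / 100

# For each cutoff, sum the population of all tracts with Pct <= cutoff
totals <- sapply(cutoffs, function(x) sum(dat$Totpop[dat$Pct <= x]))

data.frame(Cutoff = cutoffs, Total = totals)
```

This needs no sorting and returns one row for every cutoff, including cutoffs at which no new tract is added.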
Sincerely,
Leonard