Dear Jason,

The code could look something like this:

dummyData = data.frame(Tract=seq(1, 10, by=1),
    Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
    Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))

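# Define the cutoffs
# - a coarse step (e.g. 0.03) puts several tracts into the same bin,
#   so duplicate Cutoff values occur; use by = 0.01 for finer bins;
by = 0.03; # by = 0.01;
cutoffs <- seq(0, 0.20, by = by)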

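# Create a new column with cutoffs
# - note: values of Pct above the last break (here 0.21) become NA;
#   they are sorted last and show up as a final row with Cutoff = NA;
dummyData$Cutoff <- cut(dummyData$Pct, breaks = cutoffs,
    labels = cutoffs[-1], ordered_result = TRUE)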

# Sort data
# - we could actually order only the columns:
#   Totpop & Cutoff;
dummyData = dummyData[order(dummyData$Cutoff), ]

# Result
cs = cumsum(dummyData$Totpop)

# Only last entry:
# - I do not have a nice one-liner, but this should do it:
isLast = rev(! duplicated(rev(dummyData$Cutoff)))

data.frame(Total = cs[isLast],
    Cutoff = dummyData$Cutoff[isLast])
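
A more compact variant of the same idea, only as a sketch: the last cumulative value within each level equals the cumulative sum of the per-level totals, so one can sum within each bin first and then accumulate. The cutoffs below are illustrative literals (not the ones from the code above), chosen so that the last one covers max(Pct); bin, perBin and cumTotal are names used only for this example, and the default argument of tapply assumes a reasonably recent version of R.

```r
dummyData <- data.frame(
    Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
    Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))

# Illustrative cutoffs, written as literals; the last one covers max(Pct)
cutoffs <- c(0, 0.05, 0.10, 0.25)
bin <- cut(dummyData$Pct, breaks = cutoffs, labels = cutoffs[-1])

# Sum Totpop within each bin (empty bins count as 0), then accumulate
perBin <- tapply(dummyData$Totpop, bin, sum, default = 0)
cumTotal <- cumsum(perBin)
cumTotal
```

No sorting is needed here, and empty bins simply repeat the previous cumulative total.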


Sincerely,

Leonard


On 10/15/2023 7:41 PM, Leonard Mada wrote:
Dear Jason,


I do not think that the solution based on aggregate offered by GPT is correct. That quasi-solution only aggregates within each individual level; it does not accumulate across levels.


As I understand it, you want the cumulative sum. The idea was proposed by Bert: you only need to sort first by the cutoff (e.g. using an ordered factor), compute the cumulative sum, and then extract the last value for each level. If Pct is unique, then you can skip this last step and use cumsum directly (but on the sorted data set).


Alternatives: see the solutions with loops or with sapply.
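
For illustration, the sapply variant might look roughly like the sketch below. It assumes that "cumulative" means all tracts with Pct at or below each cutoff; the cutoffs are illustrative literals, and cumTotal is a name used only here.

```r
dummyData <- data.frame(Tract = 1:10,
    Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
    Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))

# Illustrative cutoffs (assumption); written as literals so that the
# comparison with Pct is not affected by rounding in seq()
cutoffs <- c(0.05, 0.10, 0.25)

# For each cutoff, add up the population of all tracts at or below it
cumTotal <- sapply(cutoffs, function(x) {
    sum(dummyData$Totpop[dummyData$Pct <= x])
})

data.frame(Cutoff = cutoffs, Total = cumTotal)
```

This scans the data once per cutoff, which is fine for a handful of cutoffs; the sort + cumsum approach does the work in a single pass.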


Sincerely,


Leonard



