If one makes the reasonable assumption that Pct is much larger than Cutoff, sorting Cutoff is the expensive part, e.g. O(n log2(n)) on average for Quicksort (n = length of Cutoff). I believe looping is O(n^2). Jeff's approach using findInterval may be faster. Of course, implementation details matter.
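A minimal sketch of the findInterval idea, for illustration only: it is not Jeff's actual code. It assumes the goal is the total population of all tracts with Pct less than or equal to each cutoff; the three cutoff values are made up.

```r
# Data taken from the example in this thread; the cutoffs are hypothetical.
Pct     <- c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03)
Totpop  <- c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800)
cutoffs <- c(0.05, 0.10, 0.25)

# Sort once by Pct, then take running totals of the population.
ord       <- order(Pct)
sortedPct <- Pct[ord]
cs        <- cumsum(Totpop[ord])

# For each cutoff, findInterval() returns how many elements of the
# sorted vector are <= that cutoff (0 if none are).
idx <- findInterval(cutoffs, sortedPct)

# Prepend 0 so that idx == 0 maps to a total of 0.
total <- c(0, cs)[idx + 1]

res <- data.frame(Cutoff = cutoffs, Total = total)
print(res)
```

The sort is done once, and each cutoff is then located by binary search, so no explicit loop over the cutoffs is needed.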
-- Bert

On Mon, Oct 16, 2023 at 4:41 AM Leonard Mada <leo.m...@syonic.eu> wrote:
>
> Dear Jason,
>
> The code could look something like:
>
> dummyData = data.frame(Tract=seq(1, 10, by=1),
>     Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
>     Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
>
> # Define the cutoffs
> # - allow for duplicate entries;
> by = 0.03; # by = 0.01;
> cutoffs <- seq(0, 0.20, by = by)
>
> # Create a new column with cutoffs
> dummyData$Cutoff <- cut(dummyData$Pct, breaks = cutoffs,
>     labels = cutoffs[-1], ordered_result = TRUE)
>
> # Sort data
> # - we could actually order only the columns:
> #   Totpop & Cutoff;
> dummyData = dummyData[order(dummyData$Cutoff), ]
>
> # Result
> cs = cumsum(dummyData$Totpop)
>
> # Only last entry:
> # - I do not have a nice one-liner, but this should do it:
> isLast = rev(! duplicated(rev(dummyData$Cutoff)))
>
> data.frame(Total = cs[isLast],
>     Cutoff = dummyData$Cutoff[isLast])
>
>
> Sincerely,
>
> Leonard
>
>
> On 10/15/2023 7:41 PM, Leonard Mada wrote:
> > Dear Jason,
> >
> > I do not think that the solution based on aggregate offered by GPT was
> > correct. That quasi-solution only aggregates for every individual level.
> >
> > As I understand it, you want the cumulative sum. The idea was proposed by
> > Bert; you need only to sort first based on the cutoff (e.g. using an
> > ordered factor), and then extract only the last value for each level.
> > If Pct is unique, then you can skip this last step and use the cumsum
> > directly (but on the sorted data set).
> >
> > Alternatives: see the solutions with loops or with sapply.
> >
> > Sincerely,
> >
> > Leonard
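For comparison, the sapply alternative mentioned in the quoted message could look roughly like the sketch below. It is not the code posted earlier in the thread; it assumes that "cumulative" means all tracts with Pct less than or equal to the cutoff, and the cutoff values are hypothetical.

```r
# Data taken from the example in this thread; the cutoffs are hypothetical.
Pct     <- c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03)
Totpop  <- c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800)
cutoffs <- c(0.05, 0.10, 0.25)

# For each cutoff, rescan the whole Pct vector and sum the matching rows.
# No sorting is needed, but the work grows as
# length(cutoffs) * length(Pct).
totalLoop <- sapply(cutoffs, function(k) sum(Totpop[Pct <= k]))

resLoop <- data.frame(Cutoff = cutoffs, Total = totalLoop)
print(resLoop)
```

This version is the simplest to read, at the cost of one full pass over the data per cutoff.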