Just a comment: You wish to draw subsets of size n with or without replacement -- and I suspect without replacement is simpler than with -- from a set of positive integers that sum to a fixed value T. This sounds related to the so-called subset sum problem in computational complexity theory: Given a set S of positive integers and positive total, T, is there *any* subset of S (i.e. a sample without replacement) whose sum is T? This is known to be NP-complete. So this means that listing all such subsets must be NP-complete. I don't know whether specifying that, as you have, the size of the subsets must be a fixed number (obviously?) simplifies the problem sufficiently to make it computationally tractable; computational wizards or suitable web searches might tell you this. But for these reasons, your query about how to sample the set of all such subsets -- both those drawn with or without replacement -- seems computationally difficult to me.
Cheers, Bert "An educated person is one who can entertain new ideas, entertain others, and entertain herself." On Mon, Apr 14, 2025 at 8:37 AM Duncan Murdoch <murdoch.dun...@gmail.com> wrote: > On 2025-04-14 7:26 a.m., Brian Smith wrote: > > Hi, > > > > For my analytical work, I need to draw a sample of certain sample size > > from a denied population, where population members are marked by > > non-negative integers, such that sum of sample members if fixed. For > > example, > > > > Population = 0:100 > > Sample_size = 10 > > Sample_Sum = 20 > > > > Under this setup if my sample members are X1, X2, ..., X10 then I > > should have X1+X2+...+X10 = 20 > > > > Sample drawing scheme may be with/without replacement > > > > Is there any R function to achieve this? One possibility is to employ > > naive trial-error approach, but this doesnt seem to be practical as it > > would take long time to get the final sample with desired properties. > > > > Any pointer would be greatly appreciated. > > One general way to think of this problem is that you are defining a > distribution on the space of all possible samples of size 10, such that > the probability of a sample is X if the sum is 20, and zero otherwise, > and you want to sample from this distribution. > > There's probably a slick method to do that for your example, but if > you've got a general population instead of that special one, I doubt it. > > What I would do is the following: > > Define another distribution on samples that has probabilities that > depend on the sum of the sample, with the highest probabilities attached > to ones with the correct sum, and probabilities for other sums declining > with distance from the sum. For example, maybe > > P(sum) = Y/(1 + abs(sum - 20)) > > for some constant Y. > > You can use MCMC to sample from that distribution and then only keep the > samples where the sum is exactly equal to the target sum. If you do > that, you don't need to care about the value of Y. but you do need to > think about how proposed moves are made, and you probably need to use a > different function than the example above for acceptable efficiency. > > Duncan Murdoch > > ______________________________________________ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > https://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide https://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.