Look for a vectorized solution such as ?table or ?aggregate to obtain a list of combination counts, followed by logical indexing or ?subset to get a set of valid combinations, and then use ?sample to get your random selections of locations/surveyors and then process only those combinations from your original data set. Keep in mind that constraints that you apply late in the selection process that can be applied earlier or on less data will usually become more efficient.
If you provide actual sample statements that you have already tried, you may get a less abstract response. See the posting guide mentioned at the end of every posting for more guidance. "Ben Holt" <bh...@bio.ku.dk> wrote: >I have data similar to this: > >Location Surveyor Result >A 1 83 >A 2 76 >A 3 45 >B 1 71 >B 4 67 >C 2 23 >C 5 12 >D 3 34 >E 4 75 >F 4 46 >G 5 90 >etc (5 million records in total) > >I need to divide the data to many subsets then randomly select 5 different >locations and 5 different surveyors (one at each of the 5 randomly selected >locations) for each subset. > >The function I have written basically picks five locations and then 1 surveyor >in each location, checks that there are five different surveyors and if there >isn't tries again. The problem is that for some subsets this doesn't work. > >Some subsets don't have enough locations/surveyors or both, but this can be >checked for easily. The problem subsets do have enoughs locations and >surveyors but still cannot produce 5 locations each with a different surveyor. > The matrix below demonstrates such a subset: > > locations > A B C D E > 1 1 0 0 0 0 >Surveyors 2 1 0 0 0 0 > 3 1 0 0 0 0 > 4 1 0 0 0 0 > 5 1 1 1 1 1 > >I cannot think of a way to check for such a situation and therefore I have >simply programmed the function to give up after 100 attempts if it can't find >a solution. This is not very satisfactory however as the analysis takes a >very long time to run and it would also be very useful useful for me to know >how many suitable solution there are. > >I reckon some of you clever folk out there must be able to think of a better >solution. > >Any advice appreciated, > >Ben > >______________________________________________ >R-help@r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. --------------------------------------------------------------------------- Jeff Newmiller The ..... ..... Go Live... DCN:<jdnew...@dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/Batteries O.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --------------------------------------------------------------------------- Sent from my phone. Please excuse my brevity. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.