I can imagine it in CollectionUtils if it's generified. On Tue, Aug 27, 2013 at 2:12 AM, Phil Steitz <phil.ste...@gmail.com> wrote: > On 8/26/13 4:37 AM, Gilles wrote: >> On Sun, 25 Aug 2013 17:14:37 -0700, Phil Steitz wrote: >>> On 8/25/13 9:59 AM, Gilles wrote: >>>> On Sun, 25 Aug 2013 09:19:41 -0700, Phil Steitz wrote: >>>>> On 8/25/13 8:10 AM, Gilles wrote: >>>>>> On Sat, 24 Aug 2013 21:55:36 -0000, pste...@apache.org wrote: >>>>>> >>>>>>> [...] >>>>>> >>>>>> I wonder whether the utility should involve the creation of >>>>>> an instance of a bona fide "Combination" class. >>>>>> I.e. instead of a "naked" >>>>>> Iterator<int[]> combinationIterator(int n, int k) >>>>>> there would be >>>>>> Iterator<Combination> combinationIterator(int n, int k) >>>>>> where "Combination" would provide an accessor >>>>>> public int[] toArray() { >>>>>> // ... >>>>>> } >>>>> >>>>> I need the "naked arrays" generated very quickly with minimal >>>>> overhead for the K-S use case. >>>> >>>> The only "overhead" would be the creation of an instance for >>>> wrapping the "int[]". >>>> [As allocation is fairly cheap in Java, that should not be a >>>> problem.] >>>> >>>>> My use case uses the arrays >>>>> directly, but others could use the indices to extract objects from >>>>> lists. For example, what I thought would be natural would be to >>>>> add >>>>> public static <T> Iterator<List<T>> combinationsIterator(List<T> >>>>> list, int k) >>>> >>>> Would this be the base implementation, from which your case would >>>> be implemented by retrieving "Integer" from the list and >>>> copy/convert >>>> them into an "int[]"? >>> >>> No, the opposite actually. The base impl is what exists now. It >>> works efficiently directly on an int[] array to compute the indices >>> within the source list (in my use case, {0, 1, ..., n-1} which is >>> actually canonical). Then if you want to source subsets of objects >>> from a list of objects, you can compute the int[] array that >>> actually *is* the combination and extract the corresponding elements >>> from the list. >> >> I got it. >> >>>> >>>>> >>>>> What you would get out of this is element lists. And the >>>>> implementation could postpone copy / creation until the indices >>>>> have >>>>> been selected using the existing method. >>>> >>>> I don't follow ... >>> >>> See above. I am probably still not explaining clearly what I have >>> in mind. To me, a "combination" is a subset of the integers {0, >>> ..., n-1}. Technically it is a set, not a list, but there is a >>> natural order available and enumeration can work easier if you use >>> that order and adopt the convention that you represent combinations >>> as ordered lists of integers. The LexicographicCombinationIterator >>> that I committed does that, using primitive arrays to represent the >>> lists. The iterator itself uses a primitive array of int[] to >>> maintain state and create successive iterates. Use of a single >>> primitive array eliminates all object creation / garbage collection, >>> which is not necessary to just enumerate the combinations. Once you >>> have a "true" combination represented as an array of int[], you can >>> use it to specify a "combination" (really sample or subset) of a >>> list of any kind of object. >> >> So, IIUC it would be useful to have a utility like >> ----- >> public static List<T> getSample(List<T> elements, >> int[] combination) { >> final List<T> samples = new ArrayList<T>(combination.length); >> for (int index : combination) { >> samples.add(elements.get(index)); >> } >> } > > Yes, that is what the List version would use. Not sure how > generically useful it would be, but I can imagine other use cases. > It might be best to call it filter or subList or something and put > it in MathArrays. Not sure where it would belong as a generic utility. > > >> ----- >> >>>> >>>>> I am not yet sold on the >>>>> value of "Combination" as an abstraction, since what you really >>>>> use >>>>> is just the list of selected elements. >>>> >>>> Maybe "Combination" is not worth implementing indeed; but what >>>> about >>>> a "Combinations" class: >>>> >>>> public class Combinations implements Iterable<int[]> { >>>> private final int n; >>>> private final int k; >>>> >>>> public Combinations(int n, int k) { >>>> // ... >>>> } >>>> >>>> public Iterator<int[]> iterator() { >>>> return new LexicographicCombinationIterator(n, k); >>>> } >>>> >>>> public Comparator<int[]> comparator() { >>>> return new LexicographicCombinationComparator(n, k); >>>> } >>>> >>>> private static class LexicographicCombinationIterator implements >>>> Iterator<int[]> { >>>> // [Your implementation] >>>> } >>>> >>>> private static class LexicographicCombinationComparator >>>> implements Comparator<int[]> { >>>> public int compare(int[], int[]) { >>>> // ... using e.g. "lexNorm" implementation. >>>> } >>>> } >>>> } >>>> >>>> Being "Iterable" allows nice usage in a "for" loop: >>>> >>>> for (int[] combination : new Combinations(4, 2)) { >>>> // ... etc. >>>> } >>> >>> I thought about that and that is what I meant by the original >>> suggestion that we possibly make the >>> LexicographicCombinationsIterator a public class. I just don't see >>> the practical use cases for it as a standalone class. >> >> The "for"-loop example above uses the iterator functionality but >> in a much less verbose way (no explicit calls to "hasNext()" and >> "next()"). >> This is standard practice for everything that can be handled as a >> collection. >> >> Then, we can also have your iterator factory method: >> ----- >> public static Iterator<int[]> combinationsIterator(int n, int k) { >> return new Combinations(4, 2).iterator(); >> } >> ----- >> >> Then, we also don't hide some code in "test" where it could be made >> visible and be used elsewhere (cf. "lexNorm"). > > OK, I see your point. >> >>> What I have >>> immediate practical need for is just a way to get an Iterator<int[]> >>> based on <n,k>. As you note above, it is the Iterator that is >>> useful. I can imagine use cases for the object-list version >>> public static <T> Iterator<List<T>> combinationsIterator(List<T> >>> list, int k ) though. If we do extract a public class, I would >>> prefer that it be called something like CombinationsIterator. >> >> I think that a design based on the "Iterable" interface is >> preferrable. >> From it you can always get the iterator functionality (by >> definition of >> the interface), while the reverse is not true. > > Agreed. >> >>> If we >>> have other iteration strategies that we want to implement, we could >>> make it an abstract parent for LexicographicCombinationsIterator. I >>> would recommend holding off complicating the design / structure >>> though until we have these though. >> >> Sorry to insist but it is not a complication, it is a >> _simplification_ ;-) >> Instead of a specifically named method ("combinationsIterator"), >> we rely >> on the standard API ("Iterable") of the Java language. > > Yeah, I get it now. Go for it or open a ticket so we don't forget. > > Phil >> >> >> Gilles >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org >
--------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org