On 8/26/13 4:37 AM, Gilles wrote: > On Sun, 25 Aug 2013 17:14:37 -0700, Phil Steitz wrote: >> On 8/25/13 9:59 AM, Gilles wrote: >>> On Sun, 25 Aug 2013 09:19:41 -0700, Phil Steitz wrote: >>>> On 8/25/13 8:10 AM, Gilles wrote: >>>>> On Sat, 24 Aug 2013 21:55:36 -0000, pste...@apache.org wrote: >>>>> >>>>>> [...] >>>>> >>>>> I wonder whether the utility should involve the creation of >>>>> an instance of a bona fide "Combination" class. >>>>> I.e. instead of a "naked" >>>>> Iterator<int[]> combinationIterator(int n, int k) >>>>> there would be >>>>> Iterator<Combination> combinationIterator(int n, int k) >>>>> where "Combination" would provide an accessor >>>>> public int[] toArray() { >>>>> // ... >>>>> } >>>> >>>> I need the "naked arrays" generated very quickly with minimal >>>> overhead for the K-S use case. >>> >>> The only "overhead" would be the creation of an instance for >>> wrapping the "int[]". >>> [As allocation is fairly cheap in Java, that should not be a >>> problem.] >>> >>>> My use case uses the arrays >>>> directly, but others could use the indices to extract objects from >>>> lists. For example, what I thought would be natural would be to >>>> add >>>> public static <T> Iterator<List<T>> combinationsIterator(List<T> >>>> list, int k) >>> >>> Would this be the base implementation, from which your case would >>> be implemented by retrieving "Integer" from the list and >>> copy/convert >>> them into an "int[]"? >> >> No, the opposite actually. The base impl is what exists now. It >> works efficiently directly on an int[] array to compute the indices >> within the source list (in my use case, {0, 1, ..., n-1} which is >> actually canonical). Then if you want to source subsets of objects >> from a list of objects, you can compute the int[] array that >> actually *is* the combination and extract the corresponding elements >> from the list. > > I got it. > >>> >>>> >>>> What you would get out of this is element lists. And the >>>> implementation could postpone copy / creation until the indices >>>> have >>>> been selected using the existing method. >>> >>> I don't follow ... >> >> See above. I am probably still not explaining clearly what I have >> in mind. To me, a "combination" is a subset of the integers {0, >> ..., n-1}. Technically it is a set, not a list, but there is a >> natural order available and enumeration can work easier if you use >> that order and adopt the convention that you represent combinations >> as ordered lists of integers. The LexicographicCombinationIterator >> that I committed does that, using primitive arrays to represent the >> lists. The iterator itself uses a primitive array of int[] to >> maintain state and create successive iterates. Use of a single >> primitive array eliminates all object creation / garbage collection, >> which is not necessary to just enumerate the combinations. Once you >> have a "true" combination represented as an array of int[], you can >> use it to specify a "combination" (really sample or subset) of a >> list of any kind of object. > > So, IIUC it would be useful to have a utility like > ----- > public static List<T> getSample(List<T> elements, > int[] combination) { > final List<T> samples = new ArrayList<T>(combination.length); > for (int index : combination) { > samples.add(elements.get(index)); > } > }
Yes, that is what the List version would use. Not sure how generically useful it would be, but I can imagine other use cases. It might be best to call it filter or subList or something and put it in MathArrays. Not sure where it would belong as a generic utility. > ----- > >>> >>>> I am not yet sold on the >>>> value of "Combination" as an abstraction, since what you really >>>> use >>>> is just the list of selected elements. >>> >>> Maybe "Combination" is not worth implementing indeed; but what >>> about >>> a "Combinations" class: >>> >>> public class Combinations implements Iterable<int[]> { >>> private final int n; >>> private final int k; >>> >>> public Combinations(int n, int k) { >>> // ... >>> } >>> >>> public Iterator<int[]> iterator() { >>> return new LexicographicCombinationIterator(n, k); >>> } >>> >>> public Comparator<int[]> comparator() { >>> return new LexicographicCombinationComparator(n, k); >>> } >>> >>> private static class LexicographicCombinationIterator implements >>> Iterator<int[]> { >>> // [Your implementation] >>> } >>> >>> private static class LexicographicCombinationComparator >>> implements Comparator<int[]> { >>> public int compare(int[], int[]) { >>> // ... using e.g. "lexNorm" implementation. >>> } >>> } >>> } >>> >>> Being "Iterable" allows nice usage in a "for" loop: >>> >>> for (int[] combination : new Combinations(4, 2)) { >>> // ... etc. >>> } >> >> I thought about that and that is what I meant by the original >> suggestion that we possibly make the >> LexicographicCombinationsIterator a public class. I just don't see >> the practical use cases for it as a standalone class. > > The "for"-loop example above uses the iterator functionality but > in a much less verbose way (no explicit calls to "hasNext()" and > "next()"). > This is standard practice for everything that can be handled as a > collection. > > Then, we can also have your iterator factory method: > ----- > public static Iterator<int[]> combinationsIterator(int n, int k) { > return new Combinations(4, 2).iterator(); > } > ----- > > Then, we also don't hide some code in "test" where it could be made > visible and be used elsewhere (cf. "lexNorm"). OK, I see your point. > >> What I have >> immediate practical need for is just a way to get an Iterator<int[]> >> based on <n,k>. As you note above, it is the Iterator that is >> useful. I can imagine use cases for the object-list version >> public static <T> Iterator<List<T>> combinationsIterator(List<T> >> list, int k ) though. If we do extract a public class, I would >> prefer that it be called something like CombinationsIterator. > > I think that a design based on the "Iterable" interface is > preferrable. > From it you can always get the iterator functionality (by > definition of > the interface), while the reverse is not true. Agreed. > >> If we >> have other iteration strategies that we want to implement, we could >> make it an abstract parent for LexicographicCombinationsIterator. I >> would recommend holding off complicating the design / structure >> though until we have these though. > > Sorry to insist but it is not a complication, it is a > _simplification_ ;-) > Instead of a specifically named method ("combinationsIterator"), > we rely > on the standard API ("Iterable") of the Java language. Yeah, I get it now. Go for it or open a ticket so we don't forget. Phil > > > Gilles > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org