On Sep 2, 6:49 pm, Travis Parks <jehugalea...@gmail.com> wrote: > On Sep 2, 4:09 pm, Ian Kelly <ian.g.ke...@gmail.com> wrote: > > > > > > > On Fri, Sep 2, 2011 at 10:59 AM, Travis Parks <jehugalea...@gmail.com> > > wrote: > > > Hello: > > > > I am working on an algorithms library. It provides LINQ like > > > functionality to Python iterators. Eventually, I plan on having > > > feaures that work against sequences and mappings. > > > > I have the code up athttp://code.google.com/p/py-compass. > > > > This is my first project in Python, so I'd like some feedback. I want > > > to know if I am following conventions (overall style and quality of > > > code). > > > Sure, here are my comments. > > > In the "forever" and "__forever" functions, your use of the term > > "generator" is confusing. "__forever" is a generator function, > > because it has a yield statement. Its argument, called "generator", > > appears to be a callable, not a generator or even necessarily a > > generator function. Also, note that __forever(lambda: value) is > > functionally equivalent to the more efficient itertools.repeat(value). > > > The staticmethod __next(iterator) accesses the class it is defined in, > > which suggests that it might be better written as a classmethod > > __next(cls, iterator). > > > Each of the LINQ-style methods is divided into two parts: the public > > method that contains the docstring and some argument checks, and a > > private staticmethod that contains the implementation. I'm not > > certain what the purpose of that is. If it's to facilitate overriding > > the implementation in subclasses, then you need to change the names of > > the private methods to start with only one _ character so that they > > won't be mangled by the compiler. > > > The comments before each method that only contain the name of the > > immediately following method are redundant. > > > aggregate: the default aggregator is unintuitive to me. I would make > > it a required field and add a separate method called sum that calls > > aggregate with the operator.add aggregator. > > Also, the implementation doesn't look correct. Instead of passing in > > each item to the aggregator, you're passing in the number of items > > seen so far? The LINQ Aggregate method is basically reduce, so rather > > than reinvent the wheel I would suggest this: > > > # MISSING is a unique object solely defined to represent missing arguments. > > # Unlike None we can safely assume it will never be passed as actual data. > > MISSING = object() > > > def aggregate(self, aggregator, seed=MISSING): > > if seed is self.MISSING: > > return reduce(aggregator, self._iterable) > > else: > > return reduce(aggregator, self._iterable, seed) > > > Note for compatibility that in Python 3 the reduce function has been > > demoted from a builtin to a member of the functools module. > > > any: the name of this method could cause some confusion with the "any" > > builtin that does something a bit different. > > > compare: the loop would more DRY as a for loop: > > > def __compare(first, second, comparison): > > for firstval, secondval in itertools.izip_longest(first, second, > > fillvalue=self.MISSING): > > if firstval is self.MISSING: > > return -1 > > elif secondval is self.MISSING: > > return 1 > > else: > > result = comparison(firstval, secondval) > > if result != 0: > > return result > > return 0 > > > concatenate: again, no need to reinvent the wheel. This should be > > more efficient: > > > def concatenate(self, other): > > return extend(itertools.chain(self.__iterable, other)) > > > equals: could be just "return self.compare(other, comparison) == 0" > > > __last: the loop could be a for loop: > > > # assume we're looking at the last item and try moving to the next > > item = result.Value > > for item in iterator: pass > > return item > > > lastOrDefault: there's a lot of repeated logic here. This could just be: > > > def lastOrDefault(self, default=None): > > try: > > return self.last() > > except ValueError: > > return default > > > map / forEach: .NET has to separate these into separate methods due to > > static typing. It seems a bit silly to have both of them in Python. > > Also, map would be more efficient as "return itertools.imap(mapper, > > self.__iterable)" > > > max / min: it would be more efficient to use the builtin: > > def max(self, key): > > return max(self.__iterable, key=key) > > If somebody really needs to pass a comparison function instead of a > > key function, they can use functools.cmp_to_key. > > > randomSamples: a more canonical way to pass the RNG would be to pass > > an instance of random.Random rather than an arbitrary function. Then > > to get a random integer you can just call generator.randrange(total). > > Note that for the default you can just use the random module itself, > > which provides default implementations of all the Random methods. > > Also, for loops: > > > def __randomSamples(iterable, sampleCount, generator): > > samples = [] > > iterator = iter(iterable) > > # fill up the samples with the items from the beginning of the > > iterable > > for i, value in zip(xrange(sampleCount), iterator): > > samples.append(value) > > # replace items if the generated number is less than the total > > total = len(samples) > > for value in iterator: > > total += 1 > > index = generator.randrange(total) > > if index < len(samples): > > samples[index] = result > > return samples > > > __reverse: you could just "return reversed(list(iterable))" > > > __rotateLeft: > > def __rotateLeft(iterable, shift): > > iterator = iter(iterable) > > front = list(itertools.islice(iterator, shift)) > > return itertools.chain(iterator, front) > > > skipWhile: suggest using itertools.dropwhile > > take: suggest using itertools.islice > > takeWhile: suggest using itertools.takewhile > > __toList: "return list(iterable)" > > __toSet: "return set(iterable)" > > __toTuple: "return tuple(iterable)". Note that as currently written > > this actually returns a generator, not a tuple. > > __where: suggest using itertools.ifilter > > __zip: suggest using a for loop over itertools.izip(first, second) > > > Lookup: is inconsistent. The overridden __iter__ method returns an > > iterator over the values of the groups, but all the inherited methods > > are going to iterate over the keys. Probably you want to pass > > groups.values() to the superclass __init__ method. > > > Cheers, > > Ian > > Awesome tips. I appreciate the time you spent commenting on just about > every function. I really like your suggestions about using itertools > more, and for loops. I was feeling like some things were becoming way >too complicated. > > I also noted the bugs you discovered. I will incorporate your > suggestions. > > Thanks again!- Hide quoted text - > > - Show quoted text -
You commented that the itertools algorithms will perform faster than the hand-written ones. Are these algorithms optimized internally? -- http://mail.python.org/mailman/listinfo/python-list