New submission from Sven Rahmann <svenrahm...@googlemail.com>:

The syntax of generator expressions suggests that they can be used similarly to lists (at least when iterated over). However, as was pointed out to me, the resulting generators are iterators and can be used only once. This is inconvenient in situations where some function expects an iterable argument but needs to iterate over it more than once.
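The single-use behaviour can be seen in a minimal sketch; the name `squares` and the sample values are only illustrative and do not come from the attached file:

```python
# A generator expression yields its values once and is then exhausted.
squares = (n * n for n in range(4))

first_pass = list(squares)   # consumes the generator
second_pass = list(squares)  # nothing is left to yield

# A generator is its own iterator, so iter() returns the same object
# rather than a fresh one; a list hands out a new iterator every time.
same_object = iter(squares) is squares
values = [0, 1, 4, 9]
fresh_each_time = iter(values) is not iter(values)

print(first_pass, second_pass, same_object, fresh_each_time)
```

The second pass is silently empty rather than an error, which is why a function that iterates twice can fail in a place far from where the generator was created.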
Consider the following function (see also the attached file reusable_generators.py for a complete example):

    def secondmax(iterable):
        """return the second largest value in iterable"""
        m = max(iterable)
        return max(i for i in iterable if i < m)

It works fine when passed a list or other iterable container, but consider the following situation. We have a huge matrix A (list of lists) and want to pass a column to the function. Using a list works fine, but requires copying the column's values and needs additional memory:

    col2_list = [a[2] for a in A]  # new list created from column 2

There is no reason why we shouldn't be able to create an iterable object that returns, one by one, the values from the column:

    col2_gen = (a[2] for a in A)

The problem is that secondmax(col2_gen) does not work (try the attached file): col2_gen can be iterated over only once. I can imagine many situations where I need or want to iterate over such a "view" object several times; I don't see a reason why it shouldn't be possible or why it would be unwanted.

We can do the following, but it is not elegant: wrap the generator expression into a closure and a class.

    class ReusableGenerator():
        def __init__(self, g):
            self.g = g
        def __iter__(self):
            return self.g()

    col2_re = ReusableGenerator(lambda: (a[2] for a in A))  # I want this!

This works, but it is not a generator object (e.g., it doesn't have a next method). We also need the lambda detour for this to work.

Note that in some situations, the "problem" I describe does not occur or can be easily circumvented. For example, instead of writing

    col2 = (a[2] for a in A)
    for x in col2: foo(x)
    for x in col2: foo(x)  # doesn't work

we could just repeat the generator expression (and create a new iterator whenever we need it):

    for x in (a[2] for a in A): foo(x)
    for x in (a[2] for a in A): foo(x)  # works fine

But exactly this is impossible if I want to pass the generator expression or generator function to another function (such as secondmax()).
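A minimal sketch of both the failure and one possible workaround, assuming a small stand-in matrix `A`. The `ColumnView` class is a hypothetical helper, not part of the attached file or of the standard library; it stores the matrix and a column index and builds a fresh generator on every `__iter__` call, so no values are copied and no lambda is needed:

```python
def secondmax(iterable):
    """Return the second largest value in iterable."""
    m = max(iterable)
    return max(i for i in iterable if i < m)


class ColumnView:
    """Hypothetical re-iterable view of one column of a list of lists."""

    def __init__(self, matrix, index):
        self.matrix = matrix
        self.index = index

    def __iter__(self):
        # A new generator per call, so the view can be iterated repeatedly.
        return (row[self.index] for row in self.matrix)


A = [[1, 2, 30], [4, 5, 10], [7, 8, 20]]  # stand-in for the huge matrix

col2_gen = (a[2] for a in A)
try:
    secondmax(col2_gen)          # the first max() exhausts the generator ...
    gen_failed = False
except ValueError:               # ... so the second max() sees no values
    gen_failed = True

col2_view = ColumnView(A, 2)
result = secondmax(col2_view)    # each pass gets a fresh iterator
print(gen_failed, result)
```

Like the ReusableGenerator wrapper, such a view is an iterable rather than an iterator, so it still is not a generator object; it only removes the lambda detour for this one use case.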
I believe this restriction contradicts the Python philosophy that functions can be passed around just like other objects.

My proposal is probably unrealistic, but I would like to see generator functions and generator expressions change so that they return not iterators but iterables; then the problem described here would not occur, and wrapper classes would be unnecessary. In Java that distinction is very clear; in Python less so, I think (which is good, because iterators are a pain to use in Java).

Admittedly, I have no idea why generator functions and expressions are implemented as they are; there are probably lots of good reasons, and it may not be possible to change this any time soon or at all. However, I think the change would make Python a more consistent language.

----------
files: reusable_generators.py
messages: 87473
nosy: svenrahmann
severity: normal
status: open
title: re-usable generators / generator expressions should return iterables
type: feature request
versions: Python 3.1
Added file: http://bugs.python.org/file13936/reusable_generators.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5973>
_______________________________________
_______________________________________________
Python-bugs-list mailing list