"Raymond Hettinger" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > [EMAIL PROTECTED] wrote: > > izip's uses can be partitioned two ways: > > 1. All iterables have equal lengths > > 2. Iterables have different lengths. > > > > Case 1 is no problem obviously. > > In Case 2 there are two sub-cases: > > > > 2a. You don't care what values occur in the other iterators > > after then end of the shortest. > > 2b. You do care. > > > > In my experience 1 and 2b are the cases I encounter the most. > > Seldom do I need case 2a. That is, when I can have iterators > > of unequal length, usually I want to do *something* with the > > extra items in the longer iterators. Seldom do I want to just > > ignore them. > > That is a reasonable use case that is not supported by zip() or izip() > as currently implemented.

I haven't thought a lot about zip because I haven't needed to.
I would phrase this as "...not supported by the itertools module...".
If it makes sense to extend izip() to provide end-of-longest
iteration, fine. If not, then add an izip_longest() to itertools
(and perhaps a corresponding imap variant and whatever else shares
the terminate-at-shortest behavior).

> > The whole point of using izip is to make the code shorter,
> > more concise, and easier to write and understand.
>
> That should be the point of using anything in Python. The specific
> goal for izip() was for an iterator version of zip(). Unfortunately,
> neither tool fits your problem. At the root of it is the iterator
> protocol not having an unget() method for pushing back unused elements
> of the data stream.

I don't understand this. Why do you need look-ahead? (I mean that
literally; I am not disagreeing in a veiled way.) This is my
(mis?)understanding of how izip works:
- izip is a class
- when instantiated, it returns another iterator object, call it "x".
- the x object (being an iterator) has a next method that returns a
  tuple of the next values returned by all the iterators given when x
  was created.

So why can't izip's next method collect the results of its set of
argument iterators, as I presume it does now, except that when one of
them starts raising StopIteration, an alternate value is placed in
the result tuple? When all the iterators are raising StopIteration,
izip itself raises StopIteration to signal that all the iterators
have reached exhaustion. This is what the code I posted in a message
last night does. Why is something like that not acceptable?

All this talk of pushbacks and returning shorter lists of unexhausted
iterators makes me think I am misunderstanding something.

> > This should be pointed out in the docs,
>
> I'll add a note to the docs.
>
> > However, it would be better if izip could be made useful
> > for case 2b situations. Or maybe, an izip2 (or something)
> > added.
>
> Feel free to submit a feature request to the SF tracker (surprisingly,
> this behavior has not been previously reported, nor have there been any
> related feature requests, nor was the use case contemplated in the PEP
> discussions: http://www.python.org/peps/pep-0201 ).

Yes, this is interesting. In the "print multiple columns" example
I presented, I felt the use of izip() met the "one obvious way" test.
The resulting code was simple and clear. The real-world case where
I ran into the problem was comparing two files until two different
lines were found. Again, izip was the "one obvious way". So yes, it
is surprising and disturbing that these use cases were not identified.
I wonder what other features that "should" be in Python were similarly
missed? And, more importantly, what needs to change to fix the problem?
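
For concreteness, here is a minimal sketch of the end-of-longest behavior I described earlier in this message: collect one value from each argument iterator, substitute an alternate value for any that are exhausted, and stop only when all of them are. It is not the code from my earlier post and not anything in itertools. The names izip_longest_sketch, fillvalue and first_difference are hypothetical, chosen only for illustration, and the keyword-only argument assumes Python 3 syntax.

```python
def izip_longest_sketch(*iterables, fillvalue=None):
    """Yield tuples until every input is exhausted, padding with fillvalue.

    Hypothetical helper for illustration; not part of itertools.
    """
    iterators = [iter(it) for it in iterables]
    exhausted = [False] * len(iterators)
    while iterators:
        row = []
        for i, it in enumerate(iterators):
            if exhausted[i]:
                # This input ran out on an earlier pass: pad the slot.
                row.append(fillvalue)
                continue
            try:
                row.append(next(it))
            except StopIteration:
                exhausted[i] = True
                row.append(fillvalue)
        if all(exhausted):
            # Every input is done, so the combined iterator is done too.
            return
        yield tuple(row)


def first_difference(lines_a, lines_b):
    """Return the 1-based position of the first differing line, or None."""
    missing = object()  # sentinel that never equals a real line
    pairs = izip_longest_sketch(lines_a, lines_b, fillvalue=missing)
    for lineno, (a, b) in enumerate(pairs, 1):
        if a != b:
            return lineno
    return None
```

No look-ahead or pushback is needed here, because no value is ever taken from an iterator and then discarded. The file-comparison case then falls out directly: first_difference(["x\n", "y\n"], ["x\n", "y\n", "z\n"]) reports position 3, the extra line in the longer input, which a terminate-at-shortest izip would never show.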