Raymond Hettinger <[EMAIL PROTECTED]> wrote:
>
> > > History of zip()
> > > ----------------
> > > PEP 201 (lock-step iteration) documents that a fill-in feature was
> > > contemplated and rejected for the zip() built-in introduced in Py2.0.
> > > In the years before and after, SourceForge logs show no requests for a
> > > fill-in feature.
> >
> > My perception is that many people view the process
> > of advocating for a library addition as
> > 1. Very time consuming due to the large amount of
> > work involved in presenting and defending a proposal.
>
> I would characterize it as time consuming due to the amount of
> research, discussion, and analysis it takes to determine whether or not
> a proposal is a good idea.
>
> > 2. Having a very small chance of acceptance.
>
> It is less a matter of chance and more a matter of quality. Great
> ideas usually make it. Crummy ideas have no chance unless no one takes
> the time to think them through.

Great and crummy are not the problem, since the answer in those
cases is obvious. It is the middle ground, where the answer is
not clear and different people can hold different views, that
is the problem.

> > I do not know whether this is really the case or even if my
> > perception is correct, but if it is, it could account for the
> > lack of feature requests.
>
> I've been monitoring and adjudicating feature requests for five years.
> Pythonistas are not known for the lack of assertiveness. If a core
> feature has usability problems, we tend to hear about it quickly.
> Also, at PyCon, people are not shy about discussing issues that have
> arisen.

Yet these are the people both most familiar with the library
as it exists and most able to work around any limitations
easily, maybe without even thinking about it. So I am not
surprised that this might not have come up.

To me, the izip solution for my use case was "obvious". None
of the other solutions posted here were. Of course that could
be fixed with documentation.

> The lack of requests is not a definitive answer; however, it does
> suggest that there is not a strong unmet need. The lack of examples
> in the standard library and other code scans corroborates that notion.
> This newsgroup query will further serve to gauge the level of interest
> and to ferret-out real-world use cases. The jury is still out.

Comments at end re use cases.

> > How well correlated is the use of map()-with-fill with the
> > (need for) the use of zip/izip-with-fill?
>
> Close to 100%. A non-iterator version of izip_longest() is exactly
> equivalent to map(None, it1, it2, ...).

Isn't the difference between non-iterator and iterator very
significant? If I use map() I can trivially determine the
arguments' lengths and deal with unequal lengths before calling
map(). With iterators that is more difficult. So I can imagine
many cases where izip might be applicable but map not, and a
lack of map use cases is then not representative of izip use cases.
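
To make the list-versus-iterator distinction concrete, here is a
minimal sketch. It is written for present-day Python 3 (an assumption,
since this thread concerns Python 2), where the built-in zip() is lazy
like izip() and the standard library offers itertools.zip_longest()
with a fillvalue keyword. The helper name pad_sequences is made up for
illustration.

```python
from itertools import zip_longest


def pad_sequences(a, b, fill=None):
    """Pad the shorter of two *sequences* so a plain zip() covers both.

    Hypothetical helper: it relies on len(), so it only works for
    sequences (lists, tuples, strings), not for one-shot iterators.
    """
    n = max(len(a), len(b))
    a = list(a) + [fill] * (n - len(a))
    b = list(b) + [fill] * (n - len(b))
    return list(zip(a, b))


# With lists, unequal lengths can be dealt with before zipping.
pairs = pad_sequences([1, 2, 3], ["a"], fill="")

# One-shot iterators have no len(), so the padding has to happen
# during iteration; that is the job of a zip-to-longest.
lazy_pairs = list(zip_longest(iter([1, 2, 3]), iter(["a"]), fillvalue=""))
```

Both approaches give the same pairs here, but only the second one
works when the inputs are files or generators.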

> Since "we already got one", the real issue is whether it has been so
> darned useful that it warrants a second variant with two new features
> (returns an iterator instead of a list and allows a user-specifiable
> fill value).

I don't see it as having one and adding a second variant.
I see it as having 1/2 and adding the other 1/2.

> > > FWIW, the OP's use case involved printing files in multiple
> > > columns:
> > >
> > > for f, g in itertools.izip_longest(file1, file2, fillin_value=''):
> > >     print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())
> . . .
>
> > Actually my use case did not have quite so much
> > perlish line noise :-)
>
> The code was not intended to recapitulate your thread; instead, it was
> a compact way of summarizing the problem context that first suggested
> some value to izip_longest().

I realize that. I just thought that having a lot of extraneous
stuff like the formatting made it look, at first glance,
messier than it should.

> > for i1, i2 in itertools.izip (iterable_1, iterable_2):
> >     print '%-20s\t|\t%-20s' % (i1.rstrip(), i2.rstrip())
> >
> > can be replaced by:
> > while 1:
> >     i1 = iterable_1.next()
> >     i2 = iterable_2.next()
> >     print '%-20s\t|\t%-20s' % (i1.rstrip(), i2.rstrip())
> >
> > yet that was not justification for rejecting izip()'s
> > inclusion in itertools.
>
> Two thoughts:
>
> 1) The easily-coded-simple-alternative argument applies less strongly
> to common cases (equal sequence lengths and finite sequences mixed with
> infinite suppliers) than it does to less common cases (unequal sequence
> lengths where order is important and missing data elements have
> meaning).
>
> 2) The replacement code is not quite accurate -- the StopIteration
> exception needs to be trapped.

Yes, but I don't think that negates the point.

> > The other use case I had was a simple file diff.
> > All I cared about was if the files were the same or
> > not, and if not, what were the first differing lines.
>
> Did you look at difflib?

Yes, but it was way overkill for what I needed.

> Raymond

~~~

Thanks for your response, but I'm curious why you mailed it
rather than posted?

I am still left with a difficult-to-express feeling of
dissatisfaction at this process.

Please try to see it from the point of view of someone who is
not an expert at Python:

Here is izip(). My conception is that it takes two sequence
generators and matches up the items from each. (I am talking
about overall conceptual models here, not details.)

Here is my problem. I have two files that produce lines and
I want to compare each line. Seems like a perfect fit.

So I read that izip() only goes to the shortest iterable, and
I think, "Why only the shortest? Why not the longest? What's
so special about the shortest?"

At this point explanations involving lack of use cases are
not very convincing. I have a use. All the alternative
solutions are more code, less clear, less obvious, less
right. But most importantly, there seems to be a symmetry
between the two cases (shortest vs longest) that makes the
lack of support for matching-to-longest somehow a defect.

Now if there is something fundamental about matching items
in parallel lists that makes it a sensible thing to do only
for equal lists (or to the shortest list), that's fine. You
seem to imply that's the case by referencing Haskell, ML,
etc. If so, that needs to be pointed out in izip's docs.
(Though nothing I have read in this thread has been convincing.)

If it is the case that a matching-longest izip is easily
handled by adding a line or two to code using izip-shortest,
that should be pointed out in the docs.

But if the answer is to write out an equivalent generator in
basic Python, I cannot see izip as anything but excessively
specialized and in need of fixing.

Re use cases...

Use cases seem to be sought from readers of c.l.p. and
python-dev. That is a pretty small percentage of Python users,
and those who choose to respond are self-selecting.
I would expect the distribution of responders to be skewed
toward advanced users, for example. The other source seems to
be a search of the standard libraries, but isn't that also
unlikely to be representative of all the code out in the wild?

Also, can anyone really remember their code well enough to
recall when some proposed enhancement would be beneficial?

What I am suggesting is that use cases are important, but it
should also be realized that they may not always give an
accurate quantitative picture, and that some things still
might be good ideas even without use cases (and the converse,
of course), not because the use cases don't exist, but because
they may not be seen by the current use case solicitation
process.
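
For reference, the file-comparison use case described earlier (are two
files the same, and if not, what are the first differing lines?) can be
sketched roughly as follows. This assumes present-day Python 3 and its
itertools.zip_longest(); the names first_difference and _MISSING are
hypothetical, chosen only for this illustration.

```python
from itertools import zip_longest

# Sentinel marking "this input ran out of lines". A private object is
# used so that it can never be confused with a real line of text.
_MISSING = object()


def first_difference(lines_a, lines_b):
    """Return (line_number, a, b) for the first mismatch, or None if equal.

    lines_a and lines_b may be any iterables of lines, including open
    files. A side that ended early is reported as None.
    """
    pairs = zip_longest(lines_a, lines_b, fillvalue=_MISSING)
    for lineno, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return (
                lineno,
                None if a is _MISSING else a,
                None if b is _MISSING else b,
            )
    return None
```

When one input is a strict prefix of the other, this reports a
difference at the first extra line. A loop over a zip-to-shortest would
stop at the end of the shorter input and see no difference at all.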