New submission from Raphael Michel:

The documentation given for itertools.zip_longest contains a "roughly 
equivalent" pure-python implementation of the function that is intended to help 
the user understand what zip_longest does on a functional level.

However, the given implementation is hard to read for newcomers and 
experienced Python programmers alike: it uses a custom-defined exception for 
control flow, a nested function, a condition that is always true if any 
arguments are passed ("while iterators"), and two other non-trivial functions 
from itertools (chain and repeat).

For future reference, this is the currently given implementation:

    from itertools import chain, repeat

    class ZipExhausted(Exception):
        pass

    def zip_longest(*args, **kwds):
        # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
        fillvalue = kwds.get('fillvalue')
        counter = len(args) - 1
        def sentinel():
            nonlocal counter
            if not counter:
                raise ZipExhausted
            counter -= 1
            yield fillvalue
        fillers = repeat(fillvalue)
        iterators = [chain(it, sentinel(), fillers) for it in args]
        try:
            while iterators:
                yield tuple(map(next, iterators))
        except ZipExhausted:
            pass

This is far more complex than necessary to teach the concept of zip_longest. 
With this issue, I will submit a pull request with a new example implementation 
that seems to be at the same level of "roughly equivalent" but is much easier 
to read, since it only uses two loops and no complicated control flow:

    def zip_longest(*args, **kwds):
        # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
        fillvalue = kwds.get('fillvalue')
        iterators = [iter(it) for it in args]

        while True:
            exhausted = 0
            values = []

            for it in iterators:
                try:
                    values.append(next(it))
                except StopIteration:
                    values.append(fillvalue)
                    exhausted += 1

            if exhausted < len(args):
                yield tuple(values)
            else:
                break


Looking at the C code of the actual implementation, I don't see that either of 
the two implementations is obviously "more equivalent". I'm unsure about 
performance -- I haven't measured it, but I don't think that's the point of 
this learning implementation.
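
As a quick sanity check of the "roughly equivalent" claim, the proposed 
implementation can be compared against itertools.zip_longest on a few inputs. 
This is only a sketch: the name zip_longest_sketch and the sample inputs are 
chosen here for illustration and are not part of the pull request.

```python
import itertools


def zip_longest_sketch(*args, **kwds):
    # Same logic as the proposed documentation example, renamed
    # (hypothetical name) so it does not shadow itertools.zip_longest.
    fillvalue = kwds.get('fillvalue')
    iterators = [iter(it) for it in args]

    while True:
        exhausted = 0
        values = []

        for it in iterators:
            try:
                values.append(next(it))
            except StopIteration:
                values.append(fillvalue)
                exhausted += 1

        if exhausted < len(args):
            yield tuple(values)
        else:
            break


# Illustrative inputs (assumptions, not taken from test_itertools.py).
samples = [
    (('ABCD', 'xy'), {'fillvalue': '-'}),
    (('abc', [], range(5)), {}),
    ((), {}),
    (([1, 2, 3],), {'fillvalue': 0}),
]

results = []
for args, kwds in samples:
    expected = list(itertools.zip_longest(*args, **kwds))
    actual = list(zip_longest_sketch(*args, **kwds))
    results.append(actual == expected)

print(results)
```

Agreement on a handful of inputs does not prove equivalence, of course; it 
only shows that the two behave the same for ordinary iterables.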

I ran all tests from Lib/test/test_itertools.py against both the old and the 
new implementation. The new implementation fails three tests, while the old 
implementation fails four. Of the three remaining failures, two are related to 
TypeErrors not being raised on invalid input and one is related to pickling 
the resulting object. I believe all three are fine to ignore in this sample, 
as they are not relevant to its documentation purpose.
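
The sketch below illustrates the kind of difference that tests about 
TypeErrors and pickling would presumably pick up. Which exact test cases fail 
is not listed here, so the two checks are assumptions: a generator function 
only validates its arguments once iteration starts, and generator objects 
cannot be pickled. The name zip_longest_sketch is hypothetical.

```python
import itertools
import pickle


def zip_longest_sketch(*args, **kwds):
    # Same logic as the proposed documentation example (hypothetical name).
    fillvalue = kwds.get('fillvalue')
    iterators = [iter(it) for it in args]

    while True:
        exhausted = 0
        values = []

        for it in iterators:
            try:
                values.append(next(it))
            except StopIteration:
                values.append(fillvalue)
                exhausted += 1

        if exhausted < len(args):
            yield tuple(values)
        else:
            break


# The C implementation rejects a non-iterable argument at call time.
try:
    itertools.zip_longest(3)
    builtin_raises_eagerly = False
except TypeError:
    builtin_raises_eagerly = True

# The generator version accepts the call and only fails on the first next().
gen = zip_longest_sketch(3)
try:
    next(gen)
    sketch_raises_on_next = False
except TypeError:
    sketch_raises_on_next = True

# Generator objects cannot be pickled.
try:
    pickle.dumps(zip_longest_sketch('ab', 'c'))
    sketch_picklable = True
except (TypeError, pickle.PicklingError):
    sketch_picklable = False

print(builtin_raises_eagerly, sketch_raises_on_next, sketch_picklable)
```

Both differences follow from the example being a plain generator function, so 
they would apply to any pure-Python sample written in this style.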

Therefore, I believe the documentation should be changed as suggested. I'd be 
happy to receive any feedback or further ideas to improve its readability!

----------
assignee: docs@python
components: Documentation
messages: 300788
nosy: docs@python, rami
priority: normal
severity: normal
status: open
title: Simplify documentation of itertools.zip_longest
type: enhancement
versions: Python 3.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue31270>
_______________________________________