[Python-ideas] Re: extended for-else, extended continue, and a rant about zip()

Andrew Barnert via Python-ideas Mon, 27 Apr 2020 16:41:54 -0700

On Apr 27, 2020, at 14:38, Soni L. <[email protected]> wrote:

[snipping a long unanswered reply]


> The explicit case for zip is if you *don't* want it to consume anything after 
> the stop.

Sure, but *when do you want that*? What’s an example of code you want to write 
that would be more readable, or easier to write, or whatever, if you could work 
around consuming anything after the stop?

> btw: I suggest reading the whole post as one rather than trying to pick it 
> apart.

I did read the whole post, and then went back to reply to each part in-line. 
You can tell by the fact that I refer to things later in the post. For example, 
when I refer to your proposed code being better than “the ugly mess that you 
posted below” as the current alternative, it should be pretty clear that I’ve 
already read the ugly mess that you posted below.

So why did I format it as replies inline? Because that’s standard netiquette 
that goes back to the earliest days of email lists. Most people find it 
confusing (and sometimes annoying) to read a giant quote and then a giant reply 
and try to figure out what’s being referred to where, so when you have a giant 
message to reply to, it’s helpful to reply inline.

But as a bonus, writing a reply that way makes it clear to yourself if you’ve 
left out anything important. You didn’t reply to multiple issues that I raised. 
I doubt that’s because you don’t have any answers and are just trying to hide 
that fact to trick people into accepting your proposal anyway; more likely you 
just forgot to get to some things, because it’s easy to miss important stuff 
when you’re not replying inline.

> the purpose of the proposal, as a whole, is to make it easier to pick things 
> - generators in particular - apart. I tried to make that clear but clearly I 
> failed.

No, you did make that part clear; what you didn’t make clear is (a) what 
exactly you’re trying to pick apart from the generators and why, (b) what 
actual problems look like, (c) how your proposal could make that code better, 
and (d) why existing solutions (like manually nexting iterators in a while 
loop, or using tools like peekable) don’t already solve the problem.
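
For reference, a minimal sketch of the "manually nexting iterators in a while loop" approach mentioned in (d). The function name pair_up and its return shape are hypothetical, chosen only for illustration. It pairs items the way zip() does, but hands back whatever was left over on either side instead of silently dropping it.

```python
def pair_up(left, right):
    """Pair items like zip(), but also return the leftovers from each side.

    Hypothetical helper for illustration; not part of the standard library.
    """
    left, right = iter(left), iter(right)
    pairs = []
    while True:
        try:
            a = next(left)
        except StopIteration:
            # left ended first: right was not advanced in this round.
            return pairs, [], list(right)
        try:
            b = next(right)
        except StopIteration:
            # right ended first: a was already taken from left, so keep it.
            return pairs, [a, *left], []
        pairs.append((a, b))
```

The second except branch is the case the thread is about: zip() would have consumed a and discarded it.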

Without any of that, all you’re doing is offering something abstract that might 
conceivably be useful, but it’s not clear where or why or even whether it would 
ever come up, so for all we know it’ll *never* actually be useful. Nobody’s 
likely to get on board with such a change.

> Side note, here's one case where it'd be better than using zip_longest:

Your motivating example should not be a “side note”; it should be the core of 
any proposal.

But it should also be a real example, not a meaningless toy example. Especially 
not one where even you can’t imagine an actual similar use case. “We should add 
this feature because it would let you write code that I can’t imagine ever 
wanting to write” isn’t a rationale that’s going to attract much support.

> for a, b, c, d, e, f, g in zip(*[iter(x)]*7): # this pattern is suggested by 
> the zip() docs, btw.
>    use_7x_algorithm(a, b, c, d, e, f, g)
> else as x: # leftovers that didn't fit the 7-tuple.
>    use_slow_variable_arity_algorithm(*x)

Why do you want to unpack into 7 variables with meaningless names just to pass 
those 7 variables? And if you don’t need that part, why can’t you just write 
this with zip_skip (which, as mentioned in the other thread, is pretty easy to 
write around zip_longest)?
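
A rough sketch of how zip_skip could be written around zip_longest. This assumes zip_skip means "like zip_longest, but omit the missing values instead of filling them in"; the exact semantics discussed in the other thread may differ, and the _MISSING sentinel is just an implementation detail of this sketch.

```python
from itertools import zip_longest

_MISSING = object()  # private sentinel so that None stays a legal item


def zip_skip(*iterables):
    """Like zip_longest, but drop missing values rather than filling them."""
    for row in zip_longest(*iterables, fillvalue=_MISSING):
        yield tuple(item for item in row if item is not _MISSING)


def groups_of(x, n):
    """Grouper idiom on top of zip_skip: the last group may be short."""
    return zip_skip(*[iter(x)] * n)
```

With this, the quoted loop could check len(group) == 7 to choose between the fast path and the variable-arity fallback, with no new syntax.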

The best guess I can come up with is that in a real life example maybe that 
would have some performance cost that’s hard to see in this toy. But then if 
that’s the case, given that x is clearly not an iterator, is it a sequence? You 
could then presumably get much more optimization by looping over slices instead 
of using the grouper idiom in the first place. Or, as you say, by using numpy.
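
If x really is a sequence, looping over slices might look something like the sketch below. The name split_full_groups is hypothetical; it returns the full n-item slices plus the leftover tail, so the caller can send each part to the appropriate algorithm.

```python
def split_full_groups(seq, n):
    """Return (list of full n-item slices, leftover tail) for a sequence."""
    full = len(seq) - len(seq) % n  # index where the last full group ends
    groups = [seq[i:i + n] for i in range(0, full, n)]
    return groups, seq[full:]
```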

> I haven't found a real use-case for this yet, tho.
> SIMD is handled by numpy, which does a better job than you could ever hope 
> for in plain python, and for SIMD you could use zip_longest with a suitable 
> dummy instead. but... yeah, not really useful.

> (actually: why do the docs for zip() even suggest this stuff anyway? seems 
> like something nobody would actually use.)

That grouping idiom is useful for all kinds of things that _aren’t_ about 
optimization. Maybe the zip docs aren’t the best place for it (but it’s also in 
the itertools recipes, which probably is the best place for it), but it’s 
definitely useful. In fact, I used it less than a week ago. We’ve got this tool 
that writes a bunch of 4-line files, and someone concatenated a bunch of them 
together and wrote this horrible code to pull them back apart in another 
language I won’t mention here, and rather than debug their code, I just rewrote 
it in Python like this:

   with open(path) as f:
       for entry in chunkify(f, 4):
           process(entry)

I used a function called chunkify because I think that’s a lot easier to 
understand (especially for colleagues who don’t use Python very often), and we 
already had it lying around in a utils module, but it’s just implemented as 
zip(*[iter(it)]*n).
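
Spelled out as a function, that one-liner would be roughly the following. Only the zip expression comes from the description above; the signature and docstring are assumptions.

```python
def chunkify(it, n):
    """Yield consecutive n-tuples from it, using the grouper idiom.

    The same iterator is passed to zip() n times, so each output tuple
    takes the next n items. A trailing group shorter than n is dropped.
    """
    return zip(*[iter(it)] * n)
```

Note that an incomplete final chunk is silently discarded, which is fine for a file that is known to consist of complete 4-line entries.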

Also, compare this other example for processing a different file format:

   with open(path) as f:
       for entry in split(f, '\n'):
           process(entry)

It’s pretty obvious what the difference is here: one is reading entries that 
are groups of 4 lines; the other is reading entries that are groups of 
arbitrary numbers of lines but separated by blank lines. At most you might need 
to look at the help for chunkify and split to be absolutely sure they mean what 
you think they mean. (Although maybe I should have used functions from 
more-itertools rather than our own custom functions that do effectively the 
same thing but are kind of weird and probably not so well tested and whose 
names don’t come up in a web search.)
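
For comparison, a split helper along these lines could be sketched as below. This is a guess at the behaviour, not the actual utility: in particular, yielding an empty group for two consecutive separators and dropping an empty trailing group are assumptions. In more-itertools, chunked and split_at cover similar ground.

```python
def split(it, sep):
    """Yield lists of consecutive items from it, broken at items equal to sep."""
    group = []
    for item in it:
        if item == sep:
            yield group  # the separator itself is not included
            group = []
        else:
            group.append(item)
    if group:
        yield group  # final group when the input does not end with sep
```

When iterating over a text file, a blank line arrives as '\n', which is why the call above passes '\n' as the separator.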

