[Python-ideas] Re: Keyword arguments self-assignment
On 2020-04-19 07:23, Richard Damon wrote:
> There is also the issue that if we are building a function that might be
> used with another function, we will have an incentive to name our
> keyword parameters that there is a reasonable chance would also be
> passed to that other function with the same keyword name, even if that
> might not be the most descriptive name.

Many of the function keyword parameters I deal with are data property
names; so it makes sense that the data has the same name throughout the
codebase. The incentive to align our variable names would be a good
thing. Consider pymysql, and the connect parameters

> connect(
>     host=host,
>     port=port,
>     user=username,
>     passwd=password
> )

With the proposal, right, or wrong, there would be an incentive for me
to write the caller to use pymysql property names, and the callers of
that caller to also use the same property names. This will spread until
the application has a standard name for username and password: There is
less guessing about the property names. I have done this in ES6 code,
and it looks nice.

Maybe aligning variable names with function keyword parameters is an
anti-pattern, but I have not seen it.

I reviewed my code: of 20,360 keyword arguments, 804 (4%) have the
x=x format. I do not know if this is enough to justify such a proposal,
but I would suggest that is a minimum: Currently there is no incentive
to have identical names for identical things through a call chain; an
incentive will only increase the use of this pattern.
___
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at
https://mail.python.org/archives/list/[email protected]/message/UQGVKME6PO3RYJE4S2BTRB7L66UB4B6B/
Code of Conduct: http://python.org/psf/codeofconduct/
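The x=x pattern discussed above can be sketched in today's Python as follows. This is only an illustration: `connect` here is a hypothetical stand-in that borrows the four parameter names quoted in the message, not the real driver, and `open_unaligned` / `open_aligned` are invented names.

```python
# Hypothetical stand-in for a database driver's connect(); it only
# echoes its arguments so the example runs without a database.
def connect(host, port, user, passwd):
    return {"host": host, "port": port, "user": user, "passwd": passwd}


def open_unaligned(host, port, username, password):
    # The caller uses its own names, so two arguments need translating.
    return connect(host=host, port=port, user=username, passwd=password)


def open_aligned(host, port, user, passwd):
    # The caller adopts the callee's names; every argument now has the
    # x=x form, which is the repetition the proposal would abbreviate.
    return connect(host=host, port=port, user=user, passwd=passwd)


a = open_unaligned("db.example", 3306, "kyle", "secret")
b = open_aligned("db.example", 3306, "kyle", "secret")
```

Both calls pass the same values; the only difference is which layer owns the parameter names.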
[Python-ideas] Re: zip(x, y, z, strict=True)
On Wed, Apr 22, 2020 at 12:23 PM David Mertz wrote:
> On Wed, Apr 22, 2020, 4:24 AM Antoine Pitrou
>
>> But, as far as I'm concerned, the number of times where I took
>> advantage of zip()'s current acceptance of heterogeneously-sized inputs
>> is extremely small. In most of my uses of zip(), a size difference
>> would have been a logic error that deserves noticing and fixing.
>>
>
> Your experience is very different from mine.
>
I'm in Antoine's camp on this one. A lot of our work is data analysis,
where we get for example simulation results as X, Y, Z components then zip
them up into coordinate triples, so any mismatch is a bug. Having
zip_equal as a first-class function would replace zip in easily 90% of our
use cases, but it needs to be fast as we often do this sort of thing in an
inner loop...
Message archived at
https://mail.python.org/archives/list/[email protected]/message/7HQKZ6RHJG57FM43JHLOAEXZSMWMECUC/
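For inputs that support len(), such as lists of simulation components, one cheap way to get this check is to compare lengths once before zipping, so the loop itself runs at plain zip() speed. A minimal sketch, assuming sized sequences; `zip_sized` is a hypothetical helper name, not an existing function:

```python
def zip_sized(*seqs):
    """Zip sequences that support len(), rejecting unequal lengths."""
    lengths = {len(s) for s in seqs}
    if len(lengths) > 1:
        raise ValueError(f"length mismatch: {sorted(lengths)}")
    # The lengths agree, so the ordinary zip() is safe here.
    return zip(*seqs)


xs = [0.0, 1.0, 2.0]
ys = [0.5, 1.5, 2.5]
zs = [9.0, 8.0, 7.0]
triples = list(zip_sized(xs, ys, zs))
```

This does not help with generators or files, which have no length to compare in advance.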
[Python-ideas] Re: Keyword arguments self-assignment
On Thu, 23 Apr 2020 14:47:48 -0400 Kyle Lahnakoski wrote:
> On 2020-04-19 07:23, Richard Damon wrote:
> > There is also the issue that if we are building a function that might be
> > used with another function, we will have an incentive to name our
> > keyword parameters that there is a reasonable chance would also be
> > passed to that other function with the same keyword name, even if that
> > might not be the most descriptive name.
>
> Many of the function keyword parameters I deal with are data property
> names; so it makes sense that the data has the same name throughout the
> codebase. The incentive to align our variable names would be a good
> thing. Consider pymysql, and the connect parameters
>
> > connect(
> >     host=host,
> >     port=port,
> >     user=username,
> >     passwd=password
> > )
>
> With the proposal, right, or wrong, there would be an incentive for me
> to write the caller to use pymysql property names, and the callers of
> that caller to also use the same property names. This will spread until
> the application has a standard name for username and password: There is
> less guessing about the property names. I have done this in ES6 code,
> and it looks nice.

(We'll assume, for the sake of discussion, that you meant either (a) to
use the same name for user and username and for passwd and password, or
(2) to use host and port in your explanatory paragraph. IMO, either
way, you're disproving your own point.)

> Maybe aligning variable names with function keyword parameters is an
> anti-pattern, but I have not seen it.

Sure, that works great. Until you decide to change from pymysql to
pywhizbangdb, and its connect function looks like this:

    connect(uri, user_id, password)

Yes, there will be "layers," or "functional blocks," or whatever your
architectural units might be called this week, inside of which it may
make sense to have a unified name for certain properties. But the whole
application? That sounds like a recipe for future inflexibility.

-- 
“Atoms are not things.” – Werner Heisenberg
Dan Sommers, http://www.tombstonezero.net/dan
Message archived at
https://mail.python.org/archives/list/[email protected]/message/MQUI7WE75X66PIZHGEAAKFBLDDM5XSOY/
[Python-ideas] Re: Keyword arguments self-assignment
On Thu, Apr 23, 2020 at 04:24:18PM -0400, Dan Sommers wrote:
> On Thu, 23 Apr 2020 14:47:48 -0400
> Kyle Lahnakoski wrote:
> > Many of the function keyword parameters I deal with are data property
> > names; so it makes sense that the data has the same name throughout the
> > codebase. The incentive to align our variable names would be a good
> > thing. Consider pymysql, and the connect parameters
> >
> > > connect(
> > >     host=host,
> > >     port=port,
> > >     user=username,
> > >     passwd=password
> > > )
> >
> > With the proposal, right, or wrong, there would be an incentive for me
> > to write the caller to use pymysql property names, and the callers of
> > that caller to also use the same property names. This will spread until
> > the application has a standard name for username and password: There is
> > less guessing about the property names. I have done this in ES6 code,
> > and it looks nice.
>
> (We'll assume, for the sake of discussion, that you meant either (a) to
> use the same name for user and username and for passwd and password, or
> (2) to use host and port in your explanatory paragraph. IMO, either
> way, you're disproving your own point.)

Kyle is explicitly discussing how this proposal will encourage names to
be aligned in the future, where today they are pointlessly using
synonyms. The point is that user/username and passwd/password
are not currently aligned, but this proposal will encourage them to
become so.

So, no, you should not be assuming that Kyle made a mistake with those
two pairs of names, as that would defeat the purpose of his comment.

As for your second point, host and port are already aligned. How does
the existence of aligned names today disprove the point that unaligned
names will, in the future, become aligned?

Kyle: "Yesterday I ate lunch. Tomorrow I intend to eat lunch."

You: "You ate lunch yesterday? That disproves your claim that you will
eat lunch tomorrow."

> > Maybe aligning variable names with function keyword parameters is an
> > anti-pattern, but I have not seen it.
>
> Sure, that works great. Until you decide to change from pymysql to
> pywhizbangdb, and its connect function looks like this:

This counter-point would be more credible if pywhizbangdb actually
existed, but okay, let's assume it exists.

> connect(uri, user_id, password)

Sounds to me that this is actually a point in favour of aligning
variables. Then changing to pywhizbangdb would be a relatively simple
"change variable name" refactoring for "username" to "user_id".

(There are refactoring tools in IDEs such as PyCharm that will do this
for you, rather than needing to do a search and replace yourself. But I
haven't used them and I cannot tell you how good they are.)

Whereas the change from host+port to uri is a more difficult (in the
relative sense, not in any absolute sense) two line change:

* use host and port to generate a new variable `uri`
* change the call to connect to use `uri` instead of host + port.

Either way, this doesn't seem like a particularly onerous migration to
me. If only all migrations of the backend were that simple!

> Yes, there will be "layers," or "functional blocks," or whatever your
> architectural units might be called this week, inside of which it may
> make sense to have a unified name for certain properties. But the whole
> application? That sounds like a recipe for future inflexibility.

Fortunately, this proposal doesn't make it mandatory for people to use
the same variable name throughout their entire application, and the
function call syntax `func(somename=anothername)` will remain valid.

-- 
Steven
Message archived at
https://mail.python.org/archives/list/[email protected]/message/F7LNHW4E52Q3U77EJFJU2BO2K77K43SZ/
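The two-line change from host + port to uri might look roughly like the sketch below. Everything in it is assumed for illustration: pywhizbangdb is the thread's made-up library, `connect` is a stand-in with the signature quoted above, and the `whizbang://` scheme and `make_uri` helper are invented.

```python
def connect(uri, user_id, password):
    # Hypothetical stand-in for pywhizbangdb.connect(); it only echoes
    # its arguments so the example is runnable.
    return {"uri": uri, "user_id": user_id, "password": password}


def make_uri(host, port, scheme="whizbang"):
    # The scheme name is an assumption; a real back end would define it.
    return f"{scheme}://{host}:{port}"


host, port, username, password = "db.example", 3306, "kyle", "secret"

# Step 1: use host and port to generate a new variable `uri`.
uri = make_uri(host, port)
# Step 2: change the call to connect to use `uri` instead of host + port.
conn = connect(uri=uri, user_id=username, password=password)
```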
[Python-ideas] Re: Keyword arguments self-assignment
On Fri, 24 Apr 2020 07:46:43 +1000 Steven D'Aprano wrote:
> On Thu, Apr 23, 2020 at 04:24:18PM -0400, Dan Sommers wrote:
> > On Thu, 23 Apr 2020 14:47:48 -0400
> > Kyle Lahnakoski wrote:
>
> > > Many of the function keyword parameters I deal with are data property
> > > names; so it makes sense that the data has the same name throughout the
> > > codebase. The incentive to align our variable names would be a good
> > > thing. Consider pymysql, and the connect parameters
> > >
> > > > connect(
> > > >     host=host,
> > > >     port=port,
> > > >     user=username,
> > > >     passwd=password
> > > > )
> > >
> > > With the proposal, right, or wrong, there would be an incentive for me
> > > to write the caller to use pymysql property names, and the callers of
> > > that caller to also use the same property names. This will spread until
> > > the application has a standard name for username and password: There is
> > > less guessing about the property names. I have done this in ES6 code,
> > > and it looks nice.
> >
> > (We'll assume, for the sake of discussion, that you meant either (a) to
> > use the same name for user and username and for passwd and password, or
> > (2) to use host and port in your explanatory paragraph. IMO, either
> > way, you're disproving your own point.)
>
> Kyle is explicitly discussing how this proposal will encourage names to
> be aligned in the future, where today they are pointlessly using
> synonyms. The point is that user/username and passwd/password
> are not currently aligned, but this proposal will encourage them to
> become so.

Hold that thought. My point is that aligning them for the sake of
aligning them leads to more work in the future, not less.

> So, no, you should not be assuming that Kyle made a mistake with those
> two pairs of names, as that would defeat the purpose of his comment.

Fair enough. I'm glad I said something, because Kyle's point was
apparently not as clear to me as it was to you.

> > Sure, that works great. Until you decide to change from pymysql to
> > pywhizbangdb, and its connect function looks like this:
>
> This counter-point would be more credible if pywhizbangdb actually
> existed, but okay, let's assume it exists.
>
> > connect(uri, user_id, password)
>
> Sounds to me that this is actually a point in favour of aligning
> variables. Then changing to pywhizbangdb would be a relatively simple
> "change variable name" refactoring for "username" to "user_id".

Given the definition of pymysql.connect and pywhizbangdb.connect, what
would I call the name of the user? user_id or user? When I switch back
ends, why should I refactor/rename anything? And what happens when I
have to support both back ends, rather than simply changing from one to
the other once for all time? (I probably should have presented this use
case first.)

> (There are refactoring tools in IDEs such as PyCharm that will do this
> for you, rather than needing to do a search and replace yourself. But I
> haven't used them and I cannot tell you how good they are.)

Yeah, they made me use Java at my last paying job, and IMO all such
tools do is encourage pointless renaming and huge commits where you
can't tell what's been changed vs. what's only been renamed. Sorry, I
digress.

> Whereas the change from host+port to uri is a more difficult (in the
> relative sense, not in any absolute sense) two line change:
>
> * use host and port to generate a new variable `uri`
> * change the call to connect to use `uri` instead of host + port.
>
> Either way, this doesn't seem like a particularly onerous migration to
> me. If only all migrations of the backend were that simple!

We agree that it's not a particularly onerous migration. My point
remains that by adopting a style that propagates any particular back
end's naming conventions into the rest of your application, any
migration is more work than it has to be.

> > Yes, there will be "layers," or "functional blocks," or whatever your
> > architectural units might be called this week, inside of which it may
> > make sense to have a unified name for certain properties. But the whole
> > application? That sounds like a recipe for future inflexibility.
>
> Fortunately, this proposal doesn't make it mandatory for people to use
> the same variable name throughout their entire application, and the
> function call syntax `func(somename=anothername)` will remain valid.

Absolutely, but it seemed to me that Kyle claimed that using the same
name throughout an entire application "looks nice," and I pointed out
that it does look nice until it makes what should be a tiny change into
a larger one.

-- 
“Atoms are not things.” – Werner Heisenberg
Dan Sommers, http://www.tombstonezero.net/dan
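The "support both back ends" case can be sketched as a thin adapter that keeps the application's own names and translates them to each back end's keywords in one place. This is a sketch under stated assumptions: both connect functions are hypothetical stand-ins using the signatures quoted in the thread, and `open_db` and the `whizbang://` scheme are invented for the example.

```python
# Hypothetical stand-ins for the two drivers' connect() functions.
def mysql_connect(host, port, user, passwd):
    return ("mysql", host, port, user, passwd)


def whizbang_connect(uri, user_id, password):
    return ("whizbang", uri, user_id, password)


def open_db(backend, host, port, username, password):
    """Translate application-level names to back-end keyword names."""
    if backend == "mysql":
        return mysql_connect(host=host, port=port,
                             user=username, passwd=password)
    if backend == "whizbang":
        uri = f"whizbang://{host}:{port}"  # assumed URI format
        return whizbang_connect(uri=uri, user_id=username,
                                password=password)
    raise ValueError(f"unknown backend: {backend!r}")


m = open_db("mysql", "db.example", 3306, "dan", "secret")
w = open_db("whizbang", "db.example", 3306, "dan", "secret")
```

With this shape, callers of `open_db` never see `user`, `passwd`, or `user_id`, so adding or swapping a back end touches only the adapter.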
[Python-ideas] Re: zip(x, y, z, strict=True)
On Thu, Apr 23, 2020 at 3:50 PM Eric Fahlgren wrote:
> On Wed, Apr 22, 2020 at 12:23 PM David Mertz wrote:
>
>> On Wed, Apr 22, 2020, 4:24 AM Antoine Pitrou
>>
>>> But, as far as I'm concerned, the number of times where I took
>>> advantage of zip()'s current acceptance of heterogeneously-sized inputs
>>> is extremely small. In most of my uses of zip(), a size difference
>>> would have been a logic error that deserves noticing and fixing.
>>>
>>
>> Your experience is very different from mine.
>>
>
> I'm in Antoine's camp on this one. A lot of our work is data analysis,
> where we get for example simulation results as X, Y, Z components then zip
> them up into coordinate triples, so any mismatch is a bug. Having
> zip_equal as a first-class function would replace zip in easily 90% of our
> use cases, but it needs to be fast as we often do this sort of thing in an
> inner loop...
>
+1
I write a lot of standalone data-munging scripts, and expecting zipped
inputs to have equal length is a common pattern.
How, for example, to collate lines from 3 potentially large files while
ensuring they match in length (without an external dependency)? The best I
can think of is rather ugly:
with open('a.txt') as a, open('b.txt') as b, open('c.txt') as c:
    for lineA, lineB, lineC in zip(a, b, c):
        do_something_with(lineA, lineB, lineC)
    assert next(a, None) is None
    assert next(b, None) is None
    assert next(c, None) is None
Changing the zip() call to zip(a, b, c, strict=True) would remove the
necessity of the asserts. Moreover, the concept of strict zip or zip_equal
should be intuitive to beginners, whereas my solution of next() with a
sentinel is not. (Oh, an alternative would be checking if a.readline(),
b.readline(), and c.readline() are nonempty, but that's not much better and
wouldn't generalize to non-file iterators.)
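A dependency-free alternative can be sketched with itertools.zip_longest and a private sentinel. `zip_equal` below is a hypothetical helper, not the proposed builtin behaviour, and it makes no claim about speed. One motivation for it: plain zip() pulls an item from the first iterator before it discovers that a later one is exhausted, so trailing next() checks can, as far as I can tell, miss the case where the first input is exactly one item longer.

```python
from itertools import zip_longest

_MISSING = object()  # private sentinel that cannot appear in real data


def zip_equal(*iterables):
    """Yield tuples like zip(), but raise if the inputs differ in length."""
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        if any(item is _MISSING for item in combo):
            raise ValueError("iterables have different lengths")
        yield combo


pairs = list(zip_equal("ab", "cd"))

# The one-item overrun that zip() plus next() checks would not notice:
a, b = iter([1, 2, 3]), iter([10, 20])
consumed = list(zip(a, b))
overrun_hidden = next(a, None) is None and next(b, None) is None
```

Because it works on arbitrary iterators, the same helper applies to open files, generators, and lists alike.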
Nathan
Message archived at
https://mail.python.org/archives/list/[email protected]/message/WZY55AKLDLGO3GSHSNALNFUV45JBJHSJ/
[Python-ideas] Re: Keyword arguments self-assignment
Kyle Lahnakoski writes:
> Maybe aligning variable names with function keyword parameters is an
> anti-pattern, but I have not seen it.
I consider this an anti-pattern for my use case. I review a lot of
different students' similar code for different applications. They
borrow from each other, cargo cult fashion, which I consider a good
thing. However, it's only a good thing to a point, because at some
high enough level they're doing different things. I want variable
names to reflect those differences, pretty much down to the level of
the computational algorithms or network protocols.
Despite Steven D'Aprano, I also agree with Dan Sommers:
> Sure, that works great. Until you decide to change from pymysql to
> pywhizbangdb, and its connect function looks like this:
The reason I disagree with Steven is that almost certainly pymysql and
pywhizbangdb *already have copied the APIs of the libraries they
wrap*. So the abbreviated keyword arguments do not go "all the way
down", and *never will*, because the underlying libraries aren't
implemented in Python. Nor will the implementers be thinking "we
should choose the names our callers are using" -- for one thing, at
that point there won't be any callers yet! So the coordination can
only go so far.
Then consider something like Mailman. In Mailman 2, there is no
abstract concept of "user". There are email addresses, there are
lists, and there is the relation "subscription" between addresses and
lists. Mailman 2 APIs frequently use the term "member" to indicate a
subscriber (ie, email address), and this is not ambiguous: member =
subscriber. In Mailman 3, this simple structure has become
unsupportable. We now have concepts of user, who may have multiple
email addresses, and role, where users may fulfil multiple roles
(subscriber via one of their addresses, moderator, owner) in a list.
For reasons I don't know, in Mailman 3 a member is any user associated
with a list, who need not be subscribed. (Nonmembers are email
addresses that post to the list but do not have proper users
associated with them.) This confused me as a Mailman developer, let
alone list owners who never thought about these internals until they
upgraded. I don't think either definition ("subscriber" vs.
"associated user") is "unnatural", but they are different, this is
confusing, and yet given the history I don't see how the term "member"
can be avoided.
So I see situations (such as the proverbial 3-line wrapper) where
coordinating names is natural, obvious, and to be encouraged, others
where it's impossible (Python wrappers of external libraries), and
still others where thinking about names should be encouraged, and it's
an antipattern to encourage coordinating them with function arguments
by allowing abbreviated actual arguments.
Message archived at
https://mail.python.org/archives/list/[email protected]/message/MEWFODIQQVDZ3UC4CJPUXH53D5KI3VJ3/
[Python-ideas] Re: Keyword arguments self-assignment
On Thu, 2020-04-23 at 14:47 -0400, Kyle Lahnakoski wrote:
> I reviewed my code: of 20,360 keyword arguments, 804 (4%) have the
> x=x format. I do not know if this is enough to justify such a
> proposal,

Is that script somewhere? I got a bit curious and wasted some time on
making my own script to search projects (link at end), and then applied
it on cpython and the scientific python stack.
Likely I simply missed an earlier email that already did this, and
hopefully it is all correct.

For cpython I got:

    Scanned 985 python files.
    Total kwargs: 10105 out which identical: 1407
    Thus 13.9% are identical.

    Table of most common identical kwargs:

    name     | value
    ---------|------
    file     |    43
    encoding |    42
    context  |    36
    loop     |    33
    argname  |    31
    name     |    31
    errors   |    24
    limit    |    21

For the scientific python stack (NumPy, SciPy, pandas, astropy,
sklearn, matplotlib, skimage), I got:

    Overall
    Scanned 1884 python files.
    Total kwargs: 39381 out which identical: 12229
    Thus 31.1% are identical.

    Table of most common identical kwargs:

    name          | value
    --------------|------
    axis          |   606
    dtype         |   471
    copy          |   296
    out           |   224
    name          |   205
    mode          |   122
    fill_value    |   115
    random_state  |   114
    sample_weight |   109
    verbose       |   106

For example, including tests, etc. reduces the percentages considerably,
to 10.5% and 15.1% respectively. These are focusing on the main
namespace(s) for the scientific python projects, which exaggerates
things by ~7% as well (your numbers will vary depending on which files
to scan/analyze). Many of the projects have around 30% in their main
namespace, pandas has 40% outside of tests.

Since I somewhat liked the arguments for `=arg` compared to `arg=` and
I noticed that it is not uncommon to have something like:

    function(..., dtype=arr.dtype)

I included such patterns for fun. The percentages increase to 18.9% or
36.6% (again no tests, etc.) respectively. Not really suggesting it
here, just thought it was interesting :).

It would be interesting to see how common the patterns are for
dictionary literals if this discussion continues (although I may have
missed it, maybe it was already done).

Another, trickier but very interesting thing would be to see how many
positional arguments of identical name are passed. If such an argument
is not the first or second, it may be cleaner as kwargs, but someone
opted to not use them, maybe due to being verbose.

Anyway, I am not advocating anything, I was mainly curious. Personally,
I am not yet convinced that any of the syntax proposals are nice when
considering the additional noise of having more (less-common) syntax.

Cheers,

Sebastian

PS: My hacky script is at:
https://gist.github.com/seberg/548a2fa9187739ff33ec406e933fa8a4

> but I would suggest that is a minimum: Currently there is no
> incentive
> to have identical names for identical things through a call chain;
> an
> incentive will only increase the use of this pattern.
Message archived at
https://mail.python.org/archives/list/[email protected]/message/4FD7LQHTK3LQ7ED2KHAJSMPC2VAYPTRG/
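The kind of count reported above can be approximated with the standard ast module. The following is a minimal sketch of the idea, not the linked script: `count_identical_kwargs` is a hypothetical name, it scans a single source string rather than a project tree, and it only treats `name=name` (a bare variable of the same name) as identical.

```python
import ast


def count_identical_kwargs(source):
    """Return (total keyword arguments, those written as name=name)."""
    total = identical = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if kw.arg is None:  # a **mapping unpacking has no keyword name
                continue
            total += 1
            # Identical means the value is a bare variable with the
            # same name as the keyword, e.g. host=host.
            if isinstance(kw.value, ast.Name) and kw.value.id == kw.arg:
                identical += 1
    return total, identical


sample = """
connect(host=host, port=port, user=username, passwd=password)
open(path, encoding=encoding, **extra)
"""
total, identical = count_identical_kwargs(sample)
```

Since the source is only parsed, never executed, the names in the scanned code do not need to be defined. Extending the check to attribute values such as `dtype=arr.dtype` would mean also matching `ast.Attribute` nodes.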
[Python-ideas] Re: zip(x, y, z, strict=True)
> On Apr 22, 2020, at 14:09, Steven D'Aprano wrote:
>
> On Wed, Apr 22, 2020 at 10:33:24AM -0700, Andrew Barnert via Python-ideas wrote:
>
>> If that is your long-term goal, I think you could do it in three steps.
>
> I think the first step is a PEP. This is not a small change that can be
> just done on a whim.

Yes, I agree. Each of the three steps will very likely require a PEP.

And not only that, the PEP for this first step has to make it clear that
it’s useful on its own: not just to people like Serhiy who eventually
want to replace zip and see it as a first step, but also to people who do
not want zip to ever change but do want a convenient way to opt in to
checking zips (and don’t find more-itertools convenient enough) and see
this as the _only_ step.

>> And of course after the first two steps you can proselytize for the
>> next one. If you can convince lots of people that they should care
>> about the choice more often and get them using the explicit functions,
>> it’ll be a lot harder to argue that everyone is happy with today’s
>> behavior.
>
> If they need to be *convinced* to use the new function, then they don't
> really need it and didn't want it.

I had to be convinced that I wanted str.format. (The guy who convinced me
was enthusiastic enough that he went through the effort of writing a
__format__ method for my Fixed1616 class to show how easily extensible it
is.) But really, I did want it, and just didn’t know it yet. Hell, I had
to be convinced to use Python instead of sticking with Perl and Tcl, but
it turned out I did want it.

Let’s assume that the proponents of adding zip_strict are right that
using it will often give you early failures on some common uses that are
today painful to debug. If so, most people don’t know that today, and
aren’t going to think of it just because a new function shows up in
itertools, or a new flag on a builtin, or whatever. Someone will have to
convince them to use it. But then, one evening, they’ll get an exception
and realize, “Whoa, that would have taken me hours to debug otherwise, if
I’d even spotted the bug…”, and they’ll realize they needed it, just as
much as the handful who noticed the need in advance and went looking.

The proponents of the bigger, longer-term change of eventually making
this the default behavior for zip may be right too. If so, many of the
people who were convinced to use zip_strict will find it helpful so
often, and zip_shortest so unusual in their code, that they start asking
why the hell strict isn’t the default instead of shortest. And then it’ll
be a lot easier for Serhiy or whoever to sell such a big change.

Of course if that doesn’t ever happen, it’ll be a lot harder to sell the
change, but in that case, the change would be a mistake, so that’s good
too.
Message archived at
https://mail.python.org/archives/list/[email protected]/message/PPSOSLWFLGV4KF2X44THDJ53XPIOSZTY/
