Re: Seeking deeper understanding of python equality (==)

2022-05-14 Thread Jonathan Kaczynski
Trying some new searches, I came across slotdefs in ./Objects/typeobject.c,
and those are used in the resolve_slotdups function.

The comment preceding the function says, "Note that multiple names may map
to the same slot (e.g. __eq__, __ne__ etc. all map to tp_richcompare)".

So, I'm still wondering how Py_TYPE(v)->tp_richcompare resolves to __eq__
on a user-defined class. Conversely, my understanding is, for a type
defined in cpython, like str, there is usually an explicitly
defined tp_richcompare function.

Thank you,
Jonathan


On Fri, May 13, 2022 at 8:23 PM Jonathan Kaczynski <
jonathan.kaczyn...@guildeducation.com> wrote:

> Thank you for your responses, Sam and Greg.
>
> The do_richcompare function is where my research originally took me, but I
> feel like I'm still missing some pieces to the puzzle.
>
> Here is my updated research since you posted your responses (I'll attach a
> pdf copy too):
> https://docs.google.com/document/d/10zgOMetEQtZCiYFnSS90pDnNZD7I_-MFohSy83pOieA/edit#
> The summary section, in the middle, is where I've summarized my reading of
> the source code.
>
> Greg, your response here,
>
>> Generally what happens with infix operators is that the interpreter
>> first looks for a dunder method on the left operand. If that method
>> doesn't exist or returns NotImplemented, it then looks for a dunder
>> method on the right operand.
>
> reads like the contents of the do_richcompare function.
>
> What I think I'm missing is how do the dunder methods relate to
> the tp_richcompare function?
>
> Thank you,
> Jonathan
>
>
> On Fri, May 6, 2022 at 11:55 PM Greg Ewing 
> wrote:
>
>> On 7/05/22 12:22 am, Jonathan Kaczynski wrote:
>> > Stepping through the code with gdb, we see it jump from the compare
>> > operator to the dunder-eq method on the UUID object. What I want to be
>> > able to do is explain the in-between steps.
>>
>> Generally what happens with infix operators is that the interpreter
>> first looks for a dunder method on the left operand. If that method
>> doesn't exist or returns NotImplemented, it then looks for a dunder
>> method on the right operand.
>>
>> There is an exception if the right operand is a subclass of the
>> left operand -- in that case the right operand's dunder method
>> takes precedence.
>>
>> > Also, if you change `x == y` to `y
>> > == x`, you still see the same behavior, which I assume has to do with
>> > dunder-eq being defined on the UUID class and thus given priority.
>>
>> No, in that case the comparison method of str will be getting
>> called first, but you won't see that in pdb because it doesn't
>> involve any Python code. Since strings don't know how to compare
>> themselves with uuids, it will return NotImplemented and the
>> interpreter will then call uuid's method.
>>
>> --
>> Greg
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>>
>
-- 
https://mail.python.org/mailman/listinfo/python-list
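
Greg's description of the operand order can be observed from pure Python.
The following is a minimal sketch; the class names Base, Other and Sub and
the calls list are made up for illustration and do not come from the thread.
It records which __eq__ runs, and in what order, for an unrelated right
operand and for a right operand that is a subclass of the left one.

```python
calls = []  # records which __eq__ implementations ran, in order


class Base:
    def __eq__(self, other):
        calls.append("Base.__eq__")
        return NotImplemented  # "I don't know how to compare with that"


class Other:
    def __eq__(self, other):
        calls.append("Other.__eq__")
        return True


class Sub(Base):
    def __eq__(self, other):
        calls.append("Sub.__eq__")
        return True


# Unrelated types: the left operand is tried first, then the right one.
calls.clear()
unrelated_result = Base() == Other()
order_unrelated = list(calls)

# Right operand is a subclass of the left operand's type: it goes first.
calls.clear()
subclass_result = Base() == Sub()
order_subclass = list(calls)

# Built-in str declines to compare itself with a type it does not know.
str_result = "abc".__eq__(42)
```

Here order_unrelated shows the left-then-right fallback, order_subclass
shows the subclass taking precedence, and str_result is NotImplemented,
which is why the UUID method ends up being called for `y == x`.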


Re: Seeking deeper understanding of python equality (==)

2022-05-14 Thread Eryk Sun
On 5/14/22, Jonathan Kaczynski  wrote:
>
> So, I'm still wondering how Py_TYPE(v)->tp_richcompare resolves to __eq__
> on a user-defined class. Conversely, my understanding is, for a type
> defined in cpython, like str, there is usually an explicitly
> defined tp_richcompare function.

Sometimes it's simplest to directly examine an object using a native
debugger (e.g. gdb in Linux; cdb/windbg in Windows).

With a debugger attached to the interpreter, create two classes, one
that doesn't override __eq__() and one that does:

>>> class C:
...     pass
...
>>> class D:
...     __eq__ = lambda s, o: False
...

In CPython, the id() of an object is its address in memory:

>>> hex(id(C))
'0x2806a705790'
>>> hex(id(D))
'0x2806a6bbfe0'

Break into the attached debugger to examine the class objects:

>>> kernel32.DebugBreak()

(1968.1958): Break instruction exception - code 8003 (first chance)
KERNELBASE!wil::details::DebugBreak+0x2:
7ffd`8818fd12 cc  int 3

Class C uses the default object_richcompare():

0:000> ?? *((python310!PyTypeObject *)0x2806a705790)->tp_richcompare
 0x7ffd`55cac288
 _object*  python310!object_richcompare+0(
_object*,
_object*,
int)

Class D uses slot_tp_richcompare():

0:000> ?? *((python310!PyTypeObject *)0x2806a6bbfe0)->tp_richcompare
 0x7ffd`55cdef1c
 _object*  python310!slot_tp_richcompare+0(
_object*,
_object*,
int)

Source code of slot_tp_richcompare():

https://github.com/python/cpython/blob/v3.10.4/Objects/typeobject.c#L7610-L7626
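
Without a native debugger, part of the same picture is visible from Python
itself. This is only a sketch of the observable behaviour in CPython, not a
view of the C struct: the variable names are made up for illustration, and
the remark about the slot being updated on assignment is my reading of
typeobject.c rather than something shown above.

```python
import types


class C:
    pass


class D:
    __eq__ = lambda s, o: False


# C defines no __eq__ of its own, so lookup finds object's slot wrapper.
c_inherits_object_eq = C.__eq__ is object.__eq__

# D stores an ordinary Python function in its class dict; that function is
# what slot_tp_richcompare() looks up and calls.
d_eq_is_python_function = isinstance(D.__dict__["__eq__"], types.FunctionType)

# With the default comparison, two distinct instances are unequal.
c_equal_before = C() == C()

# D's __eq__ always answers False, even for the very same object.
d = D()
d_equal_to_itself = d == d

# Assigning __eq__ on the class afterwards takes effect immediately
# (assumption: CPython re-resolves tp_richcompare when the attribute is set).
C.__eq__ = lambda s, o: True
c_equal_after = C() == C()
```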


Re: Changing calling sequence

2022-05-14 Thread dn
On 12/05/2022 01.33, Michael F. Stemper wrote:
> I have a function that I use to retrieve daily data from a
> home-brew database. Its calling sequence is:
> 
> def TempsOneDay( year, month, date ):
> 
> After using it (and its friends) for a few years, I've come to
> realize that there are times where it would be advantageous to
> invoke it with a datetime.date as its single argument.
> 
> As far as I can tell, there are three ways for me to proceed:
> 1. Write a similar function that takes a single datetime.date
>    as its argument.
> 2. Rewrite the existing function so that it takes a single
>    argument, which can be either a tuple of (year,month,date)
>    or a datetime.date argument.
> 3. Rewrite the existing function so that its first argument
>    can be either an int (for year) or a datetime.date. The
>    existing month and date arguments would be optional, with
>    default=None. But, if the first argument is an int, and
>    either of month or date is None, an error would be raised.
> 
> The first would be the simplest. However, it is obviously WET
> rather than DRY.
> 
> The second isn't too bad, but a change like this would require that
> I find all places that the function is currently used and insert a
> pair of parentheses. Touching this much code is risky, as well
> as being a bunch of work. (Admittedly, I'd only do it once.)
> 
> The third is really klunky, but wouldn't need to touch anything
> besides this function.
> 
> What are others' thoughts? Which of the approaches above looks
> least undesirable (and why)? Can anybody see a fourth approach?


Reading the above, it seems that the options are limited to using
positional-arguments only. Because I keep tripping-over my long, grey,
beard; I'm reminded that relying upon my/human memory is, um, unreliable
(at least in my case). Accordingly, by the time a function's definition
reaches three parameters, I'll be converting it to use keyword-arguments
as a matter of policy. YMMV!

Remember: if keyword arguments are not used (ie existing/legacy code),
Python will still use positional logic.

Once the function's signature has been changed, we could then add
another keyword-parameter to cover the datetime option.
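
A minimal sketch of that idea follows. The fourth parameter name `day`, the
error messages, and the stand-in body (which just returns the resolved
tuple instead of querying the database) are assumptions made for
illustration; only the name TempsOneDay and its three original parameters
come from the OP.

```python
import datetime


def TempsOneDay(year=None, month=None, date=None, *, day=None):
    """Stand-in body: returns the resolved (year, month, date) tuple."""
    if day is not None:
        # The datetime.date form excludes the three-integer form.
        if (year, month, date) != (None, None, None):
            raise TypeError("pass either day= or year/month/date, not both")
        year, month, date = day.year, day.month, day.day
    elif None in (year, month, date):
        raise TypeError("year, month and date are all required without day=")
    return (year, month, date)


# Existing positional callers keep working unchanged.
legacy = TempsOneDay(2022, 5, 14)
# New callers can name the arguments...
keyword = TempsOneDay(year=2022, month=5, date=14)
# ...or pass a single datetime.date through the added keyword parameter.
from_date = TempsOneDay(day=datetime.date(2022, 5, 14))
```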


That said, a function which starts with a list of ifs-buts-and-maybes*
which are only there to ascertain which set of arguments have been
provided by the calling-routine; obscures the purpose/responsibility of
the function and decreases its readability (perhaps not by much, but
varying by situation).

Accordingly, if the function is actually a method, recommend following
@Stefan's approach, ie multiple-constructors. Although, this too can
result in lower readability.

Assuming it is a function, and that there are not many alternate
APIs/approaches (here we're discussing only two), I'd probably create a
wrapper-function which has the sole task of re-stating the datetime
whilst calling the existing three-parameter function. The readability
consideration here, is to make a good choice of (new) function-name!
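
Such a wrapper could look roughly like this. The name temps_for_date is
only a placeholder for that "good choice of (new) function-name", and the
body of TempsOneDay is a stand-in that returns its arguments rather than
reading the OP's database.

```python
import datetime


def TempsOneDay(year, month, date):
    """Stand-in for the existing three-parameter function."""
    return (year, month, date)


def temps_for_date(day):
    """Re-state a datetime.date as the three arguments TempsOneDay expects."""
    return TempsOneDay(day.year, day.month, day.day)


wrapped = temps_for_date(datetime.date(2022, 5, 14))
direct = TempsOneDay(2022, 5, 14)
```

The existing function and all of its callers stay untouched; the only new
code is the one-line translation.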


* Python version >= 3.10? Consider using the match-case construct keyed on
parameter-type
-- 
Regards,
=dn


Re: Changing calling sequence

2022-05-14 Thread 2QdxY4RzWzUUiLuE
On 2022-05-15 at 10:22:15 +1200,
dn  wrote:

> That said, a function which starts with a list of ifs-buts-and-maybes*
> which are only there to ascertain which set of arguments have been
> provided by the calling-routine; obscures the purpose/responsibility
> of the function and decreases its readability (perhaps not by much,
> but varying by situation).

Agreed.

> Accordingly, if the function is actually a method, recommend following
> @Stefan's approach, ie multiple-constructors. Although, this too can
> result in lower readability.

(Having proposed that approach myself (and having used it over the
decades for functions, methods, procedures, constructors, ...), I also
agree.)

Assuming good names,¹ how can this lead to lower readability?  I guess
if there's too many of them, or programmers have to start wondering
which one to use?  Or is this in the same generally obfuscating category
as the ifs-buts-and-maybes at the start of a function?

¹ and properly invalidated caches


Mypy alternatives

2022-05-14 Thread Dan Stromberg
Hello people.

I've used Mypy and liked it in combination with MonkeyType.

I've heard there are alternatives to Mypy that are faster, and I'm looking
at using something like this on a 457,000 line project.

Are there equivalents to MonkeyType that will work with these alternatives
to Mypy?

And has Mypy become the defacto standard for how type annotations should
look?  That is, are there other tools that assume Mypy's format too, and
does most doc about type annotations assume Mypy's style?

Thanks!


Re: Changing calling sequence

2022-05-14 Thread dn
On 15/05/2022 11.34, 2qdxy4rzwzuui...@potatochowder.com wrote:
> On 2022-05-15 at 10:22:15 +1200,
> dn  wrote:
> 
>> That said, a function which starts with a list of ifs-buts-and-maybes*
>> which are only there to ascertain which set of arguments have been
>> provided by the calling-routine; obscures the purpose/responsibility
>> of the function and decreases its readability (perhaps not by much,
>> but varying by situation).
> 
> Agreed.
> 
>> Accordingly, if the function is actually a method, recommend following
>> @Stefan's approach, ie multiple-constructors. Although, this too can
>> result in lower readability.
> 
> (Having proposed that approach myself (and having used it over the
> decades for functions, methods, procedures, constructors, ...), I also
> agree.)
> 
> Assuming good names,¹ how can this lead to lower readability?  I guess
> if there's too many of them, or programmers have to start wondering
> which one to use?  Or is this in the same generally obfuscating category
> as the ifs-buts-and-maybes at the start of a function?
> 
> ¹ and properly invalidated caches

Allow me to extend the term "readability" to include "comprehension".
Then add the statistical expectation that a class has only __init__().
Thus, assuming this is the first time (or, ... for a while) that the
class is being employed, one has to read much further to realise that
there are choices of constructor.


Borrowing from the earlier example:

>   This would be quite pythonic. For example, "datetime.date"
>   has .fromtimestamp(timestamp), .fromordinal(ordinal),
>   .fromisoformat(date_string), ...
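
For reference, a short sketch of those alternate constructors in use. The
variable names are arbitrary; the methods themselves are the standard
library's datetime.date classmethods.

```python
import datetime

# The "normal" constructor takes year, month, day.
plain = datetime.date(2022, 5, 14)

# Alternate constructors build the same kind of object from other inputs.
from_iso = datetime.date.fromisoformat("2022-05-14")
from_ordinal = datetime.date.fromordinal(plain.toordinal())

all_equal = plain == from_iso == from_ordinal
```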

Please remember that this is only relevant if the function is actually a
method - which does not appear to be the case from the OP (IMHO).

The alternatives' names are well differentiated and (apparently#)
appropriately named*.


* PEP-008 hobgoblins will quote:
"Function names should be lowercase, with words separated by underscores
as necessary to improve readability.
Variable names follow the same convention as function names."
- but this is a common observation/criticism of code that has been in
the PSL for a long time.

# could also criticise as not following the Software Craftsmanship/Clean
Code ideal of 'programming to the interface rather than the
implementation' - which we see in PEP-008 as "usage rather than
implementation"
(but please don't ask me how to differentiate between them, given that
the only reason for the different interfaces is the
function's/parameters' implementation!)

NB usual caveats apply to PEP-008 quotations!


So, I agree with you - it comes down to those pernicious
'ifs-buts-and-maybes'. If the interface/parameter-processing starts to
obfuscate the function's actual purpose, maybe it can be 'farmed-out' to
a helper-function. However, that would start to look very much like the
same effort (and comprehension-challenge) as having a wrapper-function!


Continuing the 'have to read further' criticism (above), it could
equally-well be applied to my preference for keyword-arguments, in that
I've suggested defining four parameters but the user will only call the
function with either three or one argument(s). Could this be described
as potentially-confusing?


Given that the OP wouldn't want to have to redefine the existing
interface, the next comment may not be applicable - but in the interests
of completeness: anyone contemplating such architecture might like to
consider "Single-dispatch generic functions"
(https://peps.python.org/pep-0443/). At least the decorators signal that
there are alternative-choices...
-- 
Regards,
=dn


Re: Changing calling sequence

2022-05-14 Thread Chris Angelico
On Sun, 15 May 2022 at 14:27, dn  wrote:
>
> On 15/05/2022 11.34, 2qdxy4rzwzuui...@potatochowder.com wrote:
> > On 2022-05-15 at 10:22:15 +1200,
> > dn  wrote:
> >
> >> That said, a function which starts with a list of ifs-buts-and-maybes*
> >> which are only there to ascertain which set of arguments have been
> >> provided by the calling-routine; obscures the purpose/responsibility
> >> of the function and decreases its readability (perhaps not by much,
> >> but varying by situation).
> >
> > Agreed.
> >
> >> Accordingly, if the function is actually a method, recommend following
> >> @Stefan's approach, ie multiple-constructors. Although, this too can
> >> result in lower readability.
> >
> > (Having proposed that approach myself (and having used it over the
> > decades for functions, methods, procedures, constructors, ...), I also
> > agree.)
> >
> > Assuming good names,¹ how can this lead to lower readability?  I guess
> > if there's too many of them, or programmers have to start wondering
> > which one to use?  Or is this in the same generally obfuscating category
> > as the ifs-buts-and-maybes at the start of a function?
> >
> > ¹ and properly invalidated caches
>
> Allow me to extend the term "readability" to include "comprehension".
> Then add the statistical expectation that a class has only __init__().

(Confusing wording here: a class usually has far more than just
__init__, but I presume you mean that the signature of __init__ is the
only way to construct an object of that type.)

> Thus, assuming this is the first time (or, ... for a while) that the
> class is being employed, one has to read much further to realise that
> there are choices of constructor.

Yeah. I would generally say, though, that any classmethod should be
looked at as a potential alternate constructor, or at least an
alternate way to obtain objects (eg preconstructed objects with
commonly-used configuration - imagine a SecuritySettings class with a
classmethod to get different defaults).
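
A small sketch of that SecuritySettings idea. The attribute names and the
two presets are invented for illustration; only the class name and the
notion of classmethods handing out preconfigured objects come from the
paragraph above.

```python
class SecuritySettings:
    def __init__(self, min_password_length, require_2fa):
        self.min_password_length = min_password_length
        self.require_2fa = require_2fa

    @classmethod
    def strict(cls):
        """Alternate way to obtain an object: a locked-down preset."""
        return cls(min_password_length=16, require_2fa=True)

    @classmethod
    def relaxed(cls):
        """Another preset, e.g. for a development environment."""
        return cls(min_password_length=8, require_2fa=False)


prod = SecuritySettings.strict()
dev = SecuritySettings.relaxed()
```

Because the classmethods call cls(...), a subclass gets presets of its own
type for free.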

> Borrowing from the earlier example:
>
> >   This would be quite pythonic. For example, "datetime.date"
> >   has .fromtimestamp(timestamp), .fromordinal(ordinal),
> >   .fromisoformat(date_string), ...
>
> Please remember that this is only relevant if the function is actually a
> method - which does not appear to be the case from the OP (IMHO).
>
> The alternatives' names are well differentiated and (apparently#)
> appropriately named*.
>
>
> * PEP-008 hobgoblins will quote:
> "Function names should be lowercase, with words separated by underscores
> as necessary to improve readability.

Note the "as necessary". Underscores aren't required when readability
is fine without them (see for instance PEP 616, which recently added
two methods to strings "removeprefix" and "removesuffix", no
underscores - part of the argument here was consistency with other
string methods, but it's also not a major problem for readability
here).

> Variable names follow the same convention as function names."
> - but this is a common observation/criticism of code that has been in
> the PSL for a long time.
>
> # could also criticise as not following the Software Craftsmanship/Clean
> Code ideal of 'programming to the interface rather than the
> implementation' - which we see in PEP-008 as "usage rather than
> implementation"
> (but please don't ask me how to differentiate between them, given that
> the only reason for the different interfaces is the
> function's/parameters' implementation!)
>
> NB usual caveats apply to PEP-008 quotations!

Notably here, the caveat that PEP 8 is not a permanent and unchanging
document. It is advice, not rules, and not all code in the standard
library fully complies with its current recommendations.

> Continuing the 'have to read further' criticism (above), it could
> equally-well be applied to my preference for keyword-arguments, in that
> I've suggested defining four parameters but the user will only call the
> function with either three or one argument(s). Could this be described
> as potentially-confusing?

Yes, definitely. Personally, I'd split it into two, one that takes the
existing three arguments (preferably with the same name, for
compatibility), and one with a different name that takes just the one
arg. That could be a small wrapper that calls the original, or the
original could become a wrapper that calls the new one, or the main
body could be refactored into a helper that they both call. It all
depends what makes the most sense internally, because that's not part
of the API at that point.

But it does depend on how the callers operate. Sometimes it's easier
to have a single function with switchable argument forms, other times
it's cleaner to separate them.

ChrisA