[issue46972] Documentation: Reference says AssertionError is raised by `assert`, but not all AssertionErrors are.

2022-03-10 Thread Thomas Fischbacher


New submission from Thomas Fischbacher :

The Python reference says:

(1) https://docs.python.org/3/library/exceptions.html#concrete-exceptions

exception AssertionError
Raised when an assert statement fails.

(2) https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement

"assert ..." is equivalent to "if __debug__: ..."

From this, one can infer the guarantee "the -O flag will suppress 
AssertionError exceptions from being raised".

However, there is code in the Python standard library that does a direct "raise 
AssertionError" (strictly speaking, in violation of (1)), and it is just 
reasonable to assume that other code following the design of that would then 
also want to do a direct "raise AssertionError".

This happens e.g. in many methods defined in: unittest/mock.py
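
A minimal sketch of the difference, assuming CPython. The `optimize=1` argument of the built-in `compile()` strips `assert` statements in the same way the -O flag does, so the comparison can run in one process. The helper names `raises_assertion_error` and `mock_assertion_fails` are hypothetical. The mock call only shows that the method raises AssertionError; it is not itself compiled with `optimize=1`.

```python
from unittest import mock


def raises_assertion_error(source, optimize):
    """Compile `source` at the given optimization level and report
    whether executing it raises AssertionError."""
    code = compile(source, "<example>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except AssertionError:
        return True
    return False


def mock_assertion_fails():
    """Report whether a never-called Mock fails assert_called_once()."""
    try:
        mock.Mock().assert_called_once()
    except AssertionError:
        return True
    return False


# optimize=0 keeps assert statements; optimize=1 removes them, like -O.
results = {
    "assert, optimize=0": raises_assertion_error("assert False", 0),
    "assert, optimize=1": raises_assertion_error("assert False", 1),
    "raise, optimize=1": raises_assertion_error(
        "raise AssertionError('raised directly')", 1),
}
mock_raised = mock_assertion_fails()
```

Only the `assert` statement is suppressed at `optimize=1`; an explicit `raise AssertionError` is not affected.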

The most appropriate fix here may be to change the documentation to not say:

===
exception AssertionError
Raised when an assert statement fails.
===

but instead:

===
exception AssertionError
An assert[{add reference to `assert` definition}] statement fails, or a unit 
testing related assert{...}() callable detects an assertion violation.
===

--
messages: 414837
nosy: tfish2
priority: normal
severity: normal
status: open
title: Documentation: Reference says AssertionError is raised by `assert`, but 
not all AssertionErrors are.

___
Python tracker 
<https://bugs.python.org/issue46972>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46972] Documentation: Reference says AssertionError is raised by `assert`, but not all AssertionErrors are.

2022-03-10 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

The documentation of exceptions in the reference is one of the places that 
makes the life of users substantially harder than it ought to be, since the 
documentation appears to not have been written with the intent to give 
guarantees that users can expect correctly written code to follow.

I would argue that "The reference documentation for X states that it gets 
raised under condition Y" generally should be understood as "this is a 
guarantee that also includes the guarantee that it is not raised under other 
conditions in correctly written code".

Other languages often appear to be somewhat stricter w.r.t. interpreting the 
reference documentation as binding for correct code - and for Python, having 
this certainly would help a lot when writing code that can give binding 
guarantees.

--




[issue46972] Documentation: Reference says AssertionError is raised by `assert`, but not all AssertionErrors are.

2022-03-24 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

Addendum

Serhiy, I agree that my assessment was incorrect.
It actually is unittest/mock.py that has quite a few 'raise AssertionError' 
that are not coming from an 'assert' keyword statement.

At a deeper level, the problem here is as follows:

Every programming language has to make an awkward choice: either it excludes 
some authors ("must be forklift certified"), or it adds a lot of bureaucratic 
scaffolding to have some mechanisms that allow code authors to enforce API 
contracts (as if this would help to "keep out the tide" of unprincipled code 
authors), or it takes a more relaxed perspective - as also Perl did - of "we 
are all responsible users" / "do not do this because you are not invited, not 
because the owner has a shotgun". I'd call this third approach quite reasonable 
overall, but then the understanding is that "everybody treats documentation as 
binding and knows how to write good documentation".

After all, we need to be able to reason about code, and in order to do that, it 
matters to have guarantees such as for example: "Looking up a nonexistent key 
for a mapping by evaluating the_mapping[the_key] can raise an exception, and 
when it does, that exception is guaranteed to be an instance of KeyError".
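
As a small illustration of code that relies on exactly this kind of guarantee, here is a sketch for the built-in `dict`. The helper name `lookup_or_default` is hypothetical; the code is only correct if a failed lookup raises KeyError and nothing else.

```python
def lookup_or_default(the_mapping, the_key, default=None):
    """Return the_mapping[the_key], or `default` if the key is missing.

    Relies on the documented guarantee that a failed lookup raises KeyError.
    """
    try:
        return the_mapping[the_key]
    except KeyError:
        return default


prices = {"apple": 3}
found = lookup_or_default(prices, "apple", 0)
missing = lookup_or_default(prices, "pear", 0)
```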

Unfortunately, Python on the one hand emphasizes "responsible behavior" - i.e. 
"people know how to write and read documentation, and the written documentation 
creates a shared understanding between its author and reader", but on the other 
hand is often really bad at properly documenting its interfaces. If I had to 
name one thing that really needs fixing with Python, it would be this.

--




[issue47121] math.isfinite() can raise exception when called on a number

2022-03-25 Thread Thomas Fischbacher


New submission from Thomas Fischbacher :

>>> help(math.isfinite)
isfinite(x, /)
Return True if x is neither an infinity nor a NaN, and False otherwise.

So, one would expect the following expression to return `True` or `False`. We 
instead observe:

>>> math.isfinite(10**1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: int too large to convert to float

(There likewise is a corresponding issue with other, similar, functions).

This especially hurts since PEP-484 states that having a Sequence[float] `xs` 
does not allow us to infer that `all(issubclass(type(x), float) for x in xs)` 
actually holds - since a PEP-484 "float" actually does also include "int" (and 
still, issubclass(int, float) == False).
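
One possible workaround, sketched under the assumption that callers want every Python `int` treated as finite. The helper name `isfinite_number` is hypothetical and not part of the `math` module.

```python
import math


def isfinite_number(x):
    """Like math.isfinite(), but never overflows on large ints."""
    if isinstance(x, int):
        # Python ints have arbitrary precision and are always finite,
        # so there is no need to convert them to float first.
        return True
    return math.isfinite(x)


def overflows_in_math_isfinite(x):
    """Report whether math.isfinite(x) raises OverflowError."""
    try:
        math.isfinite(x)
    except OverflowError:
        return True
    return False
```

With this helper, `isfinite_number(10**1000)` evaluates to `True`, whereas `math.isfinite(10**1000)` raises OverflowError.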

Now, strictly speaking, `help(math)` states that

DESCRIPTION
This module provides access to the mathematical functions
defined by the C standard.

...but according to "man 3 isfinite", the math.h "isfinite" is a macro and not 
a function - and the man page does not show type information for that reason.

--
messages: 416010
nosy: tfish2
priority: normal
severity: normal
status: open
title: math.isfinite() can raise exception when called on a number

___
Python tracker 
<https://bugs.python.org/issue47121>
___



[issue47121] math.isfinite() can raise exception when called on a number

2022-03-25 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

The problem with PEP-484 is that if one wants to use static type analysis, 
neither of these options is good:

- Use static annotations on functions, and additionally spec
  out expectations in docstrings. Do note that the two types of places
  where "float" is mentioned here refer to different concepts.
  This looks as if there were duplication, but there actually
  isn't, since the claims are different. This is confusing as hell.

def foo(x: float) -> float:
  """Foos the barbaz

  Args:
    x: float, the foobar
  Returns:
    float, the foofoo"""

The floats in the docstring give me a guarantee: "If I feed in a float, I am 
guaranteed to receive back a float". The floats in the static type annotation 
merely say "yeah, can be float or int, and I'd call it ok in these cases" - 
that's a very different statement.

- Just go with static annotations, drop mention of types
  from docstrings, and accept that we lose the ability to
  stringently reason about the behavior of code.

With respect to this latter option, I think we can wait for "losing the ability 
to stringently reason about the behavior of code" to cause major security 
headaches. That's basically opening up the door to many problems at the level 
of "I can crash the webserver by requesting the url http://lpt1".

--




[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-05 Thread Thomas Fischbacher

New submission from Thomas Fischbacher :

Here is a major general problem with python-static-typing as it is
described by PEP-484: The approach described in
https://peps.python.org/pep-0484/#the-numeric-tower negatively impacts
our ability to reason about the behavior of code with stringency.

I would like to clarify one thing in advance: this is a real problem
if we subscribe to some of the important ideas that Dijkstra
articulated in his classic article "On the role of scientific thought"
(e.g.: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD447.html).
Specifically, this part:

"""
Let me try to explain to you, what to my taste is characteristic
for all intelligent thinking. It is, that one is willing to study in
depth an aspect of one's subject matter in isolation for the sake of
its own consistency, all the time knowing that one is occupying
oneself only with one of the aspects. We know that a program must be
correct and we can study it from that viewpoint only; we also know
that it should be efficient and we can study its efficiency on another
day, so to speak. In another mood we may ask ourselves whether, and if
so: why, the program is desirable. But nothing is gained —on the
contrary!— by tackling these various aspects simultaneously. It is
what I sometimes have called "the separation of concerns", which, even
if not perfectly possible, is yet the only available technique for
effective ordering of one's thoughts, that I know of. This is what I
mean by "focussing one's attention upon some aspect": it does not mean
ignoring the other aspects, it is just doing justice to the fact that
from this aspect's point of view, the other is irrelevant. It is being
one- and multiple-track minded simultaneously.
"""

So, "code should be easy to reason about".

Now, let us look at this function - I am here (mostly) following the
Google Python style guide (https://google.github.io/styleguide/pyguide.html) 
for now:

=== Example 1, original form ===

def middle_mean(xs):
  """Compute the average of the nonterminal elements of `xs`.

  Args:
    `xs`: a list of floating point numbers.

  Returns:
    A float, the mean of the elements in `xs[1:-1]`.

  Raises:
    ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
    raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
==

Let's not discuss performance, or whether it makes sense to readily
generalize this to operate on other sequences than lists, but focus,
following Dijkstra, on one specific concern here: Guaranteed properties.

Given the function as it is above, I can make statements that are
found to be correct when reasoning with mathematical rigor, such as
this specific one that we will come back to:

=== Theorem 1 ===
  If we have an object X that satisfies these properties...:

  1. type(X) is list
  2. len(X) == 4
  3. all(type(x) is float for x in X)

  ...then we are guaranteed that `middle_mean(X)` evaluates to a value Y
  which satisfies:

  - type(Y) is float
  - Y == (X[1] + X[2]) * 0.5 or math.isnan(Y)
===
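
Theorem 1 can be spot-checked (not proven) with a few inputs. The sketch below repeats `middle_mean` without its docstring so that it is self-contained; the helper name `check_theorem_1` and the sample inputs are hypothetical.

```python
import math


def middle_mean(xs):
    if len(xs) < 3:
        raise ValueError('Need at least 3 elements to compute middle mean.')
    return sum(xs[1:-1]) / (len(xs) - 2)


def check_theorem_1(X):
    """Return True if X meets the premises and Y meets the conclusions."""
    # Premises 1-3 of the theorem.
    if not (type(X) is list and len(X) == 4
            and all(type(x) is float for x in X)):
        raise ValueError('X does not satisfy the premises of Theorem 1.')
    Y = middle_mean(X)
    # Conclusions of the theorem.
    return (type(Y) is float
            and (Y == (X[1] + X[2]) * 0.5 or math.isnan(Y)))


ordinary = check_theorem_1([1.0, 2.0, 4.0, 8.0])
# inf + (-inf) is NaN, which the second clause of the conclusion covers.
nan_case = check_theorem_1([0.0, math.inf, -math.inf, 0.0])
```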

Now, following PEP-484, we would want to re-write our function, adding type 
annotations.
Doing this mechanically would give us:

=== Example 1, with mechanically added type information ===
from typing import List


def middle_mean(xs: List[float]) -> float:
  """Compute the average of the nonterminal elements of `xs`.

  Args:
    `xs`: a list of floating point numbers.

  Returns:
    A float, the mean of the elements in `xs[1:-1]`.

  Raises:
    ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
    raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
==

(We are also deliberately not discussing another question here: given
this documentation and type annotation, should the callee be
considered to be permitted to mutate the input list?)


So, given the above form, we now find that there seems to be quite a
bit of redundancy here. After all, we have the type annotation but
also repeat some typing information in the docstring. Hence, the
obvious proposal here is to re-write the above definition again, obtaining:

=== Example 1, "cleaned up" ===
def middle_mean(xs: List[float]) -> float:
  """Compute the average of the nonterminal elements of `xs`.

  Args:
    `xs`: numbers to average, with terminals ignored.

  Returns:
    The mean of the elements in `xs[1:-1]`.

  Raises:
    ValueError: If `len(xs) < 3`.
  """
  if len(xs) < 3:
    raise ValueError('Need at least 3 elements to compute middle mean.')
  return sum(xs[1:-1]) / (len(xs) - 2)
==

But now, what does this change mean for the contract? Part of the "If
arguments have these properties, then these are the guarantees" [...]

[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-08 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

This is not a partial duplicate of https://bugs.python.org/issue47121 about 
math.isfinite().
The problem there is about a specific function on which the documentation may 
be off -
I'll comment separately on that.


The problem here is: There is a semantic discrepancy between what the
term 'float' means "at run time", such as in a check like:

issubclass(type(x), float)

(I am deliberately writing it that way, given that isinstance() can, in general 
[but actually not for float], lie.)

and what the term 'float' means in a statically-checkable type annotation like:

def f(x: float) -> ... : ...

...and this causes headaches.
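
A short sketch of the discrepancy. The function `double` is a hypothetical example: under the PEP-484 numeric tower rule a static checker accepts an `int` argument for a `float` parameter, while the run-time check reports that the value is not a `float`.

```python
def double(x: float) -> float:
    return x * 2


from_float = double(1.5)
# Accepted by PEP-484 type checkers: int is acceptable where float is
# annotated. At run time, however, the result is an int.
from_int = double(3)

float_result_is_float = issubclass(type(from_float), float)
int_result_is_float = issubclass(type(from_int), float)
```

Here `int_result_is_float` is `False`, even though the annotation on the return value reads `float`.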


The specific example ('middle_mean') illustrates the sort of weird
situations that arise due to this. (I discovered this recently when
updating some of our company's Python onboarding material, where the
aspiration naturally is to be extremely accurate with all claims.)
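
One such situation, as a sketch. The input below is an assumption chosen for illustration, not necessarily the case discussed in the original report: a list of ints satisfies the `List[float]` annotation for a static checker, yet the call can fail in a way that cannot happen when every element really is a `float`.

```python
def middle_mean(xs):
    if len(xs) < 3:
        raise ValueError('Need at least 3 elements to compute middle mean.')
    return sum(xs[1:-1]) / (len(xs) - 2)


def outcome(xs):
    """Return the mean, or the name of the exception that was raised."""
    try:
        return middle_mean(xs)
    except OverflowError as exc:
        return type(exc).__name__


with_floats = outcome([0.0, 1.0, 3.0, 0.0])
# The int sum is exact, but the true division cannot produce a float
# that large, so Python raises OverflowError.
with_big_ints = outcome([0, 10**1000, 10**1000, 0])
```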

So, basically, there is a choice to make between these options:

Option A: Give up on the idea that "we want to be able to reason with
stringency about the behavior of code" / "we accept that there will be
gaps between what code does and what we can reason about".  (Not
really an option, especially with an eye on "writing secure code
requires being able to reason out everything with stringency".)

Option B: Accept the discrepancy and tell people that they have to be
mindful about float-the-dynamic-type being a different concept from
float-the-static-type.

Option C: Realize that having "float" mean different things for
dynamic and static typing was not a great idea to begin with, and get
everybody who wants to state things such as "this function parameter
can be any instance of a real number type" to use the type
`numbers.Real` instead (which may well need better support by
tooling), or to express "can be int or float" as `Union[int,
float]`.

Also, there is Option D: PEP-484 has quite a lot of other problems
where the design does not meet rather natural requirements, such as:
"I cannot introduce a newtype for 'a mapping where I know the key to
be a particular enum-type, but the value is type-parametric'
(so the new type would also be 1-parameter type-parametric)", and
this float-mess is merely one symptom of "maybe PEP-484 was approved
too hastily and should have been also scrutinized by people
from a community with more static typing experience".
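
The two spellings named under Option C could look as follows. This is a sketch only: the function names are hypothetical, and the run-time `isinstance` check is added for illustration because static checkers may not treat `int` or `float` as `numbers.Real` without extra support.

```python
import numbers
from fractions import Fraction
from typing import Union


def halve_real(x: numbers.Real) -> numbers.Real:
    """Accepts any instance of a registered real number type."""
    if not isinstance(x, numbers.Real):
        raise TypeError(f'expected a real number, got {type(x).__name__}')
    return x / 2


def halve_int_or_float(x: Union[int, float]) -> float:
    """States 'can be int or float' explicitly in the annotation."""
    return x / 2


third = halve_real(Fraction(2, 3))
from_int = halve_int_or_float(3)
```

`halve_real` keeps a `Fraction` as a `Fraction`, accepts `int` and `float`, and rejects a complex number such as `1j` with a TypeError.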


Basically, Option B would spell out as: 'We expect users who use
static type annotations to write code like this, and expect them to be
aware of the fact that the four places where the term "float" occurs
refer to two different concepts':

def foo(x: float) -> float:
  """Returns the foo of the number `x`.

  Args:
    x: float, the number to foo.

  Returns:
    float, the value of the foo-function at `x`.
  """
  ...

...which actually is shorthand for...:

def foo(x: float  # Note: means float-or-int
        ) -> float:  # Note: means float-or-int
  """Returns the foo of the number `x`.

  Args:
    x: the number to foo, an instance of the `float` type.

  Returns:
    The value of the foo-function at `x`,
    as an instance of the `float` type.
  """
  ...

Option C (and perhaps D) appear - to me - to be the only viable
choices here. The pain with Option C is that it invalidates/changes
the meaning of already-written code that claims to follow PEP-484,
and the main point of Option D is all about: "If we have to cause
a new wound and open up the patient again, let's try to minimize
the number of times we have to do this."

Option C would amount to changing the meaning of...:

def foo(x: float) -> float:
  """Returns the foo of the number `x`.

  Args:
    x: float, the number to foo.

  Returns:
    float, the value of the foo-function at `x`.
  """
  ...

to "static type annotation float really means instance-of-float here"
(I do note that issubclass(numpy.float64, float), so passing a
numpy-float64 is expected to work here, which is good), and ask people
who would want to have functions that can process more generic real
numbers to announce this properly. So, we would end up with basically
a list of different things that a function-sketch like the one above
could turn into - depending on the author's intentions for
the function, some major cases being perhaps:

(a) ("this is supposed to strictly operate on float")
def foo(x: float) -> float:
  """Returns the foo of the number `x`.

  Args:
    x: the number to foo.

  Returns:
    the value of the foo-function at `x`.
  """

(b) ("this will eat any kind of real number")

def foo(x: numbers.Real) -> numbers.Real:
  """ [...]

[issue47121] math.isfinite() can raise exception when called on a number

2022-04-08 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

Tim, the problem may well be simply due to the documentation of math.isfinite() 
being off here.

This is what we currently have:

https://docs.python.org/3/library/math.html#math.isfinite

===
math.isfinite(x)
Return True if x is neither an infinity nor a NaN, and False otherwise. (Note 
that 0.0 is considered finite.)

New in version 3.2.
===

If this were re-worded as follows (and corresponding changes were made to other 
such functions), everyone would know what the expectations and behavior are:

===
math.isfinite(x)

If `x` is a `float` instance, this evaluates to `True` if `x` is
neither a float infinity nor a NaN, and `False` otherwise.
If `x` is not a `float` instance, this evaluates to
`math.isfinite(float(x))`.

New in version 3.2.
===

This would be an accurate defining description of the actual behavior. Note 
that, "thanks to PEP-484", this abbreviation would currently be ambiguous 
though:

===
math.isfinite(x)

If `x` is a float, this evaluates to `True` if `x` is
neither a float infinity nor a NaN, and `False` otherwise.
If `x` is not a float, this evaluates to
`math.isfinite(float(x))`.

New in version 3.2.
===

("ambiguous" since "float" means different things as a static type and as a 
numbers class - and it is not clear what would be referred to here).

Changing/generalizing the behavior might potentially be an interesting other 
proposal, but I would argue that then one would want to change the behavior of 
quite a few other functions here as well, and all this should then perhaps go 
into some other `xmath` (or so) module - a bit like it is with `cmath`.

However, since the Python philosophy is to not rely on bureaucracy to enforce 
contracts (as C++, Java, etc. do it), but instead to rely on people's ability 
to define their own contracts, making the math.isfinite() contract more 
accurate w.r.t. actual behavior in the CPython implementation via extra 
clarification looks like a good thing to do, no?
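
The reworded description proposed above can be modelled in a few lines of Python. This is a sketch that follows the proposed wording literally, under a hypothetical name; it is not the CPython implementation. One known difference: `float("1.5")` succeeds, so this model accepts numeric strings, whereas `math.isfinite("1.5")` raises TypeError.

```python
import math
from fractions import Fraction


def isfinite_as_reworded(x):
    """Model of the proposed wording for math.isfinite()."""
    if isinstance(x, float):
        return not (math.isinf(x) or math.isnan(x))
    # Not a float instance: defer to the float conversion, which may
    # raise OverflowError for ints that are too large.
    return isfinite_as_reworded(float(x))


samples = [0.0, 1, -2.5, math.inf, math.nan, Fraction(1, 3), 10**300]
agrees = all(isfinite_as_reworded(v) == math.isfinite(v) for v in samples)
```

For `10**1000`, both the model and `math.isfinite()` raise OverflowError, which is the behavior the rewording is meant to make explicit.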

--




[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation

2022-04-08 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

Re AlexWaygood:

If these PEP-484 related things were so obvious that they would admit a compact 
description of the problem in 2-3 lines, these issues would likely have been 
identified much earlier. We would not be seeing them now, given that Python by 
and large is a somewhat mature language.

--

___
Python tracker 
<https://bugs.python.org/issue47234>
___



[issue45135] dataclasses.asdict() incorrectly calls __deepcopy__() on values.

2021-09-08 Thread Thomas Fischbacher


New submission from Thomas Fischbacher :

This problem may also be the issue underlying some other dataclasses.asdict() 
bugs:

https://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40sort=-activity&%40filter=status&%40action=searchid&ignore=file%3Acontent&%40search_text=dataclasses.asdict&submit=search&status=-1%2C1%2C2%2C3

The documentation of dataclasses.asdict() states:

https://docs.python.org/3/library/dataclasses.html#dataclasses.asdict

===
Converts the dataclass instance to a dict (by using the factory function 
dict_factory). Each dataclass is converted to a dict of its fields, as name: 
value pairs. dataclasses, dicts, lists, and tuples are recursed into. For 
example: (...)
===

Given this documentation, the expectation about behavior is roughly:

def _dataclasses_asdict_equivalent_helper(obj, dict_factory=dict):
  rec = lambda x: (
    _dataclasses_asdict_equivalent_helper(x,
                                          dict_factory=dict_factory))
  if isinstance(obj, (list, tuple)):
    return type(obj)(rec(x) for x in obj)
  elif isinstance(obj, dict):
    return type(obj)((k, rec(v)) for k, v in obj.items())
  elif hasattr(type(obj), '__dataclass_fields__'):
    # ^ approx check for "is this a dataclass instance"?
    # Not 100% correct. For illustration only.
    return dict_factory(
      [(name, rec(getattr(obj, name)))
       for name in type(obj).__dataclass_fields__])
  # Any other value is used as-is, without copying.
  return obj


def dataclasses_asdict_equivalent(x, dict_factory=dict):
  if not hasattr(type(x), '__dataclass_fields__'):
    raise ValueError(f'Not a dataclass: {x!r}')
  return _dataclasses_asdict_equivalent_helper(x, dict_factory=dict_factory)


In particular, field values that are not dicts, lists, tuples, or 
dataclass instances are expected to be used as-is, i.e. as the identical object.

What actually happens however is that .asdict() DOES call __deepcopy__ on field 
values it has no business inspecting:

===
import dataclasses


@dataclasses.dataclass
class Demo:
  field_a: object


class Obj:
  def __init__(self, x):
    self._x = x

  def __deepcopy__(self, *args):
    raise ValueError('BOOM!')


###
d1 = Demo(field_a=Obj([1,2,3]))
dd = dataclasses.asdict(d1)

# ...Execution does run into a "BOOM!" ValueError.
===

Apart from this: It would be very useful if dataclasses.asdict() came with a 
recurse={boolish} parameter with which one can turn off recursive translation 
of value-objects.
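
Until such a parameter exists, a non-recursive conversion can be written with `dataclasses.fields()`. This is a sketch; the helper name `shallow_asdict` is hypothetical, and the `Demo` and `Obj` classes are repeated from the example above so the block runs on its own.

```python
import dataclasses


def shallow_asdict(obj):
    """Return {field name: field value} without recursing or copying."""
    if isinstance(obj, type) or not dataclasses.is_dataclass(obj):
        raise TypeError(f'Not a dataclass instance: {obj!r}')
    return {f.name: getattr(obj, f.name) for f in dataclasses.fields(obj)}


@dataclasses.dataclass
class Demo:
    field_a: object


class Obj:
    def __deepcopy__(self, *args):
        raise ValueError('BOOM!')


value = Obj()
as_dict = shallow_asdict(Demo(field_a=value))
```

The field value in `as_dict` is the identical `Obj` instance, and `__deepcopy__` is never called.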

--
components: Library (Lib)
messages: 401360
nosy: tfish2
priority: normal
severity: normal
status: open
title: dataclasses.asdict() incorrectly calls __deepcopy__() on values.
type: behavior

___
Python tracker 
<https://bugs.python.org/issue45135>
___



[issue45135] dataclasses.asdict() incorrectly calls __deepcopy__() on values.

2021-09-08 Thread Thomas Fischbacher


Thomas Fischbacher  added the comment:

The current behavior deviates from the documentation in a way that might evade 
tests and hence has the potential to cause production outages.

Is there a way to fix the documentation so that it correctly describes current 
behavior - without having to wait for a new release? Eliminating the risk in 
such a way would be highly appreciated.

In the longer run, there may be some value in having a differently named method 
(perhaps .as_dict()?) that basically returns
{k: v for k, v in self.__dict__.items()}, but without going through reflection? 
The current approach to recurse looks as if it were based on quite a few 
doubtful assumptions.

(Context: some style guides, such as Google's Python style guide,
limit the use of reflection in order to keep some overall undesirable processes 
in check: https://google.github.io/styleguide/pyguide.html#2191-definition)

--
