[issue46972] Documentation: Reference says AssertionError is raised by `assert`, but not all AssertionErrors are.
New submission from Thomas Fischbacher:

The Python reference says:

(1) https://docs.python.org/3/library/exceptions.html#concrete-exceptions

    exception AssertionError
        Raised when an assert statement fails.

(2) https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement

    "assert ..." is equivalent to "if __debug__: ..."

From this, one can infer the guarantee "the -O flag will suppress AssertionError exceptions from being raised".

However, there is code in the Python standard library that does a direct "raise AssertionError" (strictly speaking, in violation of (1)), and it is just reasonable to assume that other code following the design of that would then also want to do a direct "raise AssertionError".

This happens e.g. in many methods defined in: unittest/mock.py

The most appropriate fix here may be to change the documentation to not say:

===
exception AssertionError
Raised when an assert statement fails.
===

but instead:

===
exception AssertionError
An assert[{add reference to `assert` definition}] statement fails, or a unit testing related assert{...}() callable detects an assertion violation.
===

--
messages: 414837
nosy: tfish2
priority: normal
severity: normal
status: open
title: Documentation: Reference says AssertionError is raised by `assert`, but not all AssertionErrors are.

___
Python tracker <https://bugs.python.org/issue46972>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
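The difference described in this report can be reproduced with a short script. This is only a sketch: the helper names `check_with_assert`, `check_with_raise` and `raises_assertion_error` are made up for illustration, while `unittest.mock.Mock.assert_called_once()` is the real standard-library method. Running the script once normally and once with `python -O` should show that only the `assert` statement is affected by the flag.

```python
from unittest import mock


def check_with_assert(value):
    # Compiled away entirely when Python runs with -O (__debug__ is False).
    assert value > 0, "value must be positive"
    return value


def check_with_raise(value):
    # An ordinary raise statement: not affected by -O.
    if value <= 0:
        raise AssertionError("value must be positive")
    return value


def raises_assertion_error(func, *args):
    """Return True if calling func(*args) raises AssertionError."""
    try:
        func(*args)
    except AssertionError:
        return True
    return False


if __name__ == "__main__":
    print("__debug__ =", __debug__)
    print("assert statement:", raises_assertion_error(check_with_assert, -1))
    print("explicit raise:  ", raises_assertion_error(check_with_raise, -1))
    # A never-called Mock fails assert_called_once() via AssertionError.
    print("mock assertion:  ",
          raises_assertion_error(mock.Mock().assert_called_once))
```

The explicit `raise` and the mock method behave the same in both modes, which is why "raised when an assert statement fails" cannot be read as "never raised under -O".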
[issue46972] Documentation: Reference says AssertionError is raised by `assert`, but not all AssertionErrors are.
Thomas Fischbacher added the comment: The documentation of exceptions in the reference is one of the places that makes the life of users substantially harder than it ought to be, since the documentation appears to not have been written with the intent to give guarantees that users can expect correctly written code to follow. I would argue that "The reference documentation for X states that it gets raised under condition Y" generally should be understood as "this is a guarantee that also includes the guarantee that it is not raised under other conditions in correctly written code". Other languages often appear to be somewhat stricter w.r.t. interpreting the reference documentation as binding for correct code - and for Python, having this certainly would help a lot when writing code that can give binding guarantees.
[issue46972] Documentation: Reference says AssertionError is raised by `assert`, but not all AssertionErrors are.
Thomas Fischbacher added the comment:

Addendum

Serhiy, I agree that my assessment was incorrect. It actually is unittest/mock.py that has quite a few 'raise AssertionError' that are not coming from an 'assert' keyword statement.

At a deeper level, the problem here is as follows: Every programming language has to make an awkward choice: either it excludes some authors ("must be forklift certified"), or it adds a lot of bureaucratic scaffolding to have some mechanisms that allow code authors to enforce API contracts (as if this would help to "keep out the tide" of unprincipled code authors), or it takes a more relaxed perspective - as also Perl did - of "we are all responsible users" / "do not do this because you are not invited, not because the owner has a shotgun".

I'd call this third approach quite reasonable overall, but then the understanding is that "everybody treats documentation as binding and knows how to write good documentation". After all, we need to be able to reason about code, and in order to do that, it matters to have guarantees such as for example: "Looking up a nonexistent key for a mapping by evaluating the_mapping[the_key] can raise an exception, and when it does, that exception is guaranteed to be an instance of KeyError".

Unfortunately, Python on the one hand emphasizes "responsible behavior" - i.e. "people know how to write and read documentation, and the written documentation creates a shared understanding between its author and reader", but on the other hand is often really bad at properly documenting its interfaces. If I had to name one thing that really needs fixing with Python, it would be this.
[issue47121] math.isfinite() can raise exception when called on a number
New submission from Thomas Fischbacher:

>>> help(math.isfinite)
isfinite(x, /)
    Return True if x is neither an infinity nor a NaN, and False otherwise.

So, one would expect the following expression to return `True` or `False`. We instead observe:

>>> math.isfinite(10**1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: int too large to convert to float

(There likewise is a corresponding issue with other, similar, functions).

This especially hurts since PEP-484 states that having a Sequence[float] `xs` does not allow us to infer that `all(issubclass(type(x), float) for x in xs)` actually holds - since a PEP-484 "float" actually does also include "int" (and still, issubclass(int, float) == False).

Now, strictly speaking, `help(math)` states that

DESCRIPTION
    This module provides access to the mathematical functions
    defined by the C standard.

...but according to "man 3 isfinite", the math.h "isfinite" is a macro and not a function - and the man page does not show type information for that reason.

--
messages: 416010
nosy: tfish2
priority: normal
severity: normal
status: open
title: math.isfinite() can raise exception when called on a number

___
Python tracker <https://bugs.python.org/issue47121>
___
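For code that has to accept PEP-484 "float" arguments (and therefore also `int`), one possible workaround is sketched below. The function name `isfinite_number` is hypothetical and not part of the `math` module; the sketch only special-cases integers, on the assumption that every Python `int` is finite, and otherwise defers to `math.isfinite()`.

```python
import math
from numbers import Integral


def isfinite_number(x):
    """Like math.isfinite(), but never overflows on large integers.

    Assumption of this sketch: any Integral value is finite, so it is
    answered without converting to float. Everything else is passed to
    math.isfinite() unchanged (and may still raise there).
    """
    if isinstance(x, Integral):
        return True
    return math.isfinite(x)


if __name__ == "__main__":
    try:
        math.isfinite(10**1000)
    except OverflowError as exc:
        print("math.isfinite raised:", exc)
    print("isfinite_number:", isfinite_number(10**1000))
```

Other numeric types whose conversion to float overflows (for example a very large `fractions.Fraction`) are deliberately left untreated here.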
[issue47121] math.isfinite() can raise exception when called on a number
Thomas Fischbacher added the comment:

The problem with PEP-484 is that if one wants to use static type analysis, neither of these options is good:

- Use static annotations on functions, and additionally spec out expectations in docstrings. Do note that the two types of places where "float" is mentioned here refer to different concepts. This looks as if there were duplication, but there actually isn't, since the claims are different. This is confusing as hell.

    def foo(x: float) -> float:
        """Foos the barbaz

        Args:
          x: float, the foobar

        Returns:
          float, the foofoo
        """

  The floats in the docstring give me a guarantee: "If I feed in a float, I am guaranteed to receive back a float". The floats in the static type annotation merely say "yeah, can be float or int, and I'd call it ok in these cases" - that's a very different statement.

- Just go with static annotations, drop mention of types from docstrings, and accept that we lose the ability to stringently reason about the behavior of code.

With respect to this latter option, I think we can wait for "losing the ability to stringently reason about the behavior of code" to cause major security headaches. That's basically opening up the door to many problems at the level of "I can crash the webserver by requesting the url http://lpt1".
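The gap between the two meanings of "float" can be made concrete with a few lines. This is a minimal sketch with a made-up function `double`; the only external fact it relies on is the PEP-484 rule that an `int` argument is acceptable where `float` is annotated, so a static checker is expected to accept both calls.

```python
def double(x: float) -> float:
    """Return twice `x`."""
    return x * 2


if __name__ == "__main__":
    for arg in (1.5, 3):
        result = double(arg)
        # The runtime type of the result follows the argument,
        # not the annotation.
        print(f"double({arg!r}) -> {result!r} ({type(result).__name__})")
    print("issubclass(int, float):", issubclass(int, float))
```

A docstring claim such as "Returns: float" is therefore only true for callers that pass an actual `float` instance.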
[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation
New submission from Thomas Fischbacher : Here is a major general problem with python-static-typing as it is described by PEP-484: The approach described in https://peps.python.org/pep-0484/#the-numeric-tower negatively impacts our ability to reason about the behavior of code with stringency. I would like to clarify one thing in advance: this is a real problem if we subscribe to some of the important ideas that Dijkstra articulated in his classic article "On the role of scientific thought" (e.g.: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD447.html). Specifically, this part: """ Let me try to explain to you, what to my taste is characteristic for all intelligent thinking. It is, that one is willing to study in depth an aspect of one's subject matter in isolation for the sake of its own consistency, all the time knowing that one is occupying oneself only with one of the aspects. We know that a program must be correct and we can study it from that viewpoint only; we also know that it should be efficient and we can study its efficiency on another day, so to speak. In another mood we may ask ourselves whether, and if so: why, the program is desirable. But nothing is gained —on the contrary!— by tackling these various aspects simultaneously. It is what I sometimes have called "the separation of concerns", which, even if not perfectly possible, is yet the only available technique for effective ordering of one's thoughts, that I know of. This is what I mean by "focussing one's attention upon some aspect": it does not mean ignoring the other aspects, it is just doing justice to the fact that from this aspect's point of view, the other is irrelevant. It is being one- and multiple-track minded simultaneously. """ So, "code should be easy to reason about". 
Now, let us look at this function - I am here (mostly) following the Google Python style guide (https://google.github.io/styleguide/pyguide.html) for now:

=== Example 1, original form ===

def middle_mean(xs):
    """Compute the average of the nonterminal elements of `xs`.

    Args:
      `xs`: a list of floating point numbers.

    Returns:
      A float, the mean of the elements in `xs[1:-1]`.

    Raises:
      ValueError: If `len(xs) < 3`.
    """
    if len(xs) < 3:
        raise ValueError('Need at least 3 elements to compute middle mean.')
    return sum(xs[1:-1]) / (len(xs) - 2)

==

Let's not discuss performance, or whether it makes sense to readily generalize this to operate on other sequences than lists, but focus, following Dijkstra, on one specific concern here: Guaranteed properties.

Given the function as it is above, I can make statements that are found to be correct when reasoning with mathematical rigor, such as this specific one that we will come back to:

=== Theorem 1 ===

If we have an object X that satisfies these properties...:

  1. type(X) is list
  2. len(X) == 4
  3. all(type(x) is float for x in X)

...then we are guaranteed that `middle_mean(X)` evaluates to a value Y which satisfies:

  - type(Y) is float
  - Y == (X[1] + X[2]) * 0.5 or math.isnan(Y)

===

Now, following PEP-484, we would want to re-write our function, adding type annotations. Doing this mechanically would give us:

=== Example 1, with mechanically added type information ===

def middle_mean(xs: List[float]) -> float:
    """Compute the average of the nonterminal elements of `xs`.

    Args:
      `xs`: a list of floating point numbers.

    Returns:
      A float, the mean of the elements in `xs[1:-1]`.

    Raises:
      ValueError: If `len(xs) < 3`.
    """
    if len(xs) < 3:
        raise ValueError('Need at least 3 elements to compute middle mean.')
    return sum(xs[1:-1]) / (len(xs) - 2)

==

(We are also deliberately not discussing another question here: given this documentation and type annotation, should the callee be considered to be permitted to mutate the input list?)
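As a sanity check, Theorem 1 can be turned into executable form. The sketch below repeats the annotated `middle_mean` (docstring shortened) and adds a hypothetical helper `satisfies_theorem_1` that is not part of the original report. It also tries one input that a PEP-484 checker is expected to accept for `List[float]`, namely a list of very large integers, for which the documented contract no longer holds.

```python
import math
from typing import List


def middle_mean(xs: List[float]) -> float:
    """Compute the average of the nonterminal elements of `xs`."""
    if len(xs) < 3:
        raise ValueError('Need at least 3 elements to compute middle mean.')
    return sum(xs[1:-1]) / (len(xs) - 2)


def satisfies_theorem_1(X) -> bool:
    """Hypothetical checker: test the conclusion of Theorem 1 for X."""
    premises = (type(X) is list and len(X) == 4
                and all(type(x) is float for x in X))
    if not premises:
        raise ValueError('Theorem 1 makes no claim about this input.')
    Y = middle_mean(X)
    return type(Y) is float and (Y == (X[1] + X[2]) * 0.5 or math.isnan(Y))


if __name__ == "__main__":
    print(satisfies_theorem_1([1.0, 2.0, 4.0, 8.0]))
    # Integers are valid "float" arguments for a static checker, yet:
    try:
        middle_mean([0, 10**1000, 10**1000, 0])
    except OverflowError as exc:
        print("OverflowError:", exc)
```

The `OverflowError` is neither a `ValueError` as documented under "Raises" nor a float result, so the theorem's guarantee depends on the runtime premise `type(x) is float`, which the annotation alone does not express.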
So, given the above form, we now find that there seems to be quite a bit of redundancy here. After all, we have the type annotation but also repeat some typing information in the docstring. Hence, the obvious proposal here is to re-write the above definition again, obtaining:

=== Example 1, "cleaned up" ===

def middle_mean(xs: List[float]) -> float:
    """Compute the average of the nonterminal elements of `xs`.

    Args:
      `xs`: numbers to average, with terminals ignored.

    Returns:
      The mean of the elements in `xs[1:-1]`.

    Raises:
      ValueError: If `len(xs) < 3`.
    """
    if len(xs) < 3:
        raise ValueError('Need at least 3 elements to compute middle mean.')
    return sum(xs[1:-1]) / (len(xs) - 2)

==

But now, what does this change mean for the contract? Part of the "If arguments have these properties, then these are the guarantees&quo
[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation
Thomas Fischbacher added the comment:

This is not a partial duplicate of https://bugs.python.org/issue47121 about math.isfinite(). The problem there is about a specific function on which the documentation may be off - I'll comment separately on that.

The problem here is: There is a semantic discrepancy between what the term 'float' means "at run time", such as in a check like:

    issubclass(type(x), float)

(I am deliberately writing it that way, given that isinstance() can, in general [but actually not for float], lie.)

and what the term 'float' means in a statically-checkable type annotation like:

    def f(x: float) -> ... : ...

...and this causes headaches.

The specific example ('middle_mean') illustrates the sort of weird situations that arise due to this. (I discovered this recently when updating some of our company's Python onboarding material, where the aspiration naturally is to be extremely accurate with all claims.)

So, basically, there is a choice to make between these options:

Option A: Give up on the idea that "we want to be able to reason with stringency about the behavior of code" / "we accept that there will be gaps between what code does and what we can reason about". (Not really an option, especially with an eye on "writing secure code requires being able to reason out everything with stringency".)

Option B: Accept the discrepancy and tell people that they have to be mindful about float-the-dynamic-type being a different concept from float-the-static-type.

Option C: Realizing that having "float" mean different things for dynamic and static typing was not a great idea to begin with, and get everybody who wants to state things such as "this function parameter can be any instance of a real number type" to use the type `numbers.Real` instead (which may well need better support by tooling), respectively express "can be int or float" as `Union[int, float]`.
Also, there is Option D: PEP-484 has quite a lot of other problems where the design does not meet rather natural requirements, such as: "I cannot introduce a newtype for 'a mapping where I know the key to be a particular enum-type, but the value is type-parametric' (so the new type would also be 1-parameter type-parametric)", and this float-mess is merely one symptom of "maybe PEP-484 was approved too hastily and should have been also scrutinized by people from a community with more static typing experience".

Basically, Option B would spell out as: 'We expect users who use static type annotations to write code like this, and expect them to be aware of the fact that the four places where the term "float" occurs refer to two different concepts':

def foo(x: float) -> float:
    """Returns the foo of the number `x`.

    Args:
      x: float, the number to foo.

    Returns:
      float, the value of the foo-function at `x`.
    """
    ...

...which actually is shorthand for...:

def foo(
    x: float,  # Note: means float-or-int
) -> float:  # Note: means float-or-int
    """Returns the foo of the number `x`.

    Args:
      x: the number to foo, an instance of the `float` type.

    Returns:
      The value of the foo-function at `x`, as an instance of the `float` type.
    """
    ...

Option C (and perhaps D) appear - to me - to be the only viable choices here. The pain with Option C is that it invalidates/changes the meaning of already-written code that claims to follow PEP-484, and the main point of Option D is all about: "If we have to cause a new wound and open up the patient again, let's try to minimize the number of times we have to do this."

Option C would amount to changing the meaning of...:

def foo(x: float) -> float:
    """Returns the foo of the number `x`.

    Args:
      x: float, the number to foo.

    Returns:
      float, the value of the foo-function at `x`.
    """
    ...
to "static type annotation float really means instance-of-float here" (I do note that issubclass(numpy.float64, float), so passing a numpy-float64 is expected to work here, which is good), and ask people who would want to have functions that can process more generic real numbers to announce this properly.

So, we would end up with basically a list of different things that a function-sketch like the one above could turn into - depending on the author's intentions for the function, some major cases being perhaps:

(a) ("this is supposed to strictly operate on float")

def foo(x: float) -> float:
    """Returns the foo of the number `x`.

    Args:
      x: the number to foo.

    Returns:
      the value of the foo-function at `x`.
    """

(b) ("this will eat any kind of real number")

def foo(x: numbers.Real) -> numbers.Real:
    "&qu
[issue47121] math.isfinite() can raise exception when called on a number
Thomas Fischbacher added the comment:

Tim, the problem may well be simply due to the documentation of math.isfinite() being off here.

This is what we currently have: https://docs.python.org/3/library/math.html#math.isfinite

===
math.isfinite(x)
Return True if x is neither an infinity nor a NaN, and False otherwise. (Note that 0.0 is considered finite.)

New in version 3.2.
===

If this were re-worded as follows (and corresponding changes were made to other such functions), everyone would know what the expectations and behavior are:

===
math.isfinite(x)
If `x` is a `float` instance, this evaluates to `True` if `x` is neither a float infinity nor a NaN, and `False` otherwise. If `x` is not a `float` instance, this evaluates to `math.isfinite(float(x))`.

New in version 3.2.
===

This would be an accurate defining description of the actual behavior. Note that, "thanks to PEP-484", this abbreviation would currently be ambiguous though:

===
math.isfinite(x)
If `x` is a float, this evaluates to `True` if `x` is neither a float infinity nor a NaN, and `False` otherwise. If `x` is not a float, this evaluates to `math.isfinite(float(x))`.

New in version 3.2.
===

("ambiguous" since "float" means different things as a static type and as a numbers class - and it is not clear what would be referred to here).

Changing/generalizing the behavior might potentially be an interesting other proposal, but I would argue that then one would want to change the behavior of quite a few other functions here as well, and all this should then perhaps go into some other `xmath` (or so) module - bit like it is with `cmath`.

However, since the Python philosophy is to not rely on bureaucracy to enforce contracts (as C++, Java, etc. do it), but instead to rely on people's ability to define their own contracts, making the math.isfinite() contract more accurate w.r.t. actual behavior in the CPython implementation via extra clarification looks like a good thing to do, no?
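The proposed wording can be spot-checked against the current behavior. The following is only a sketch over a handful of sample values; the class `AlwaysNan` is invented for illustration, and agreement on these samples does not prove the equivalence in general.

```python
import math
from decimal import Decimal
from fractions import Fraction


class AlwaysNan:
    """Made-up type whose float conversion always yields NaN."""

    def __float__(self):
        return float("nan")


# Non-float values that can be converted with float().
SAMPLES = [1, True, Fraction(1, 3), Decimal("Infinity"), AlwaysNan()]

if __name__ == "__main__":
    for x in SAMPLES:
        direct = math.isfinite(x)
        via_float = math.isfinite(float(x))
        print(type(x).__name__, direct, via_float)
```

For each of these values the two expressions are expected to agree, which is the behavior the re-worded documentation would promise.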
[issue47234] PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation
Thomas Fischbacher added the comment: Re AlexWaygood: If these PEP-484 related things were so obvious that they would admit a compact description of the problem in 2-3 lines, these issues would likely have been identified much earlier. We would not be seeing them now, given that Python by and large is a somewhat mature language. -- ___ Python tracker <https://bugs.python.org/issue47234> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45135] dataclasses.asdict() incorrectly calls __deepcopy__() on values.
New submission from Thomas Fischbacher:

This problem may also be the issue underlying some other dataclasses.asdict() bugs:

https://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40sort=-activity&%40filter=status&%40action=searchid&ignore=file%3Acontent&%40search_text=dataclasses.asdict&submit=search&status=-1%2C1%2C2%2C3

The documentation of dataclasses.asdict() states:

https://docs.python.org/3/library/dataclasses.html#dataclasses.asdict

===
Converts the dataclass instance to a dict (by using the factory function dict_factory). Each dataclass is converted to a dict of its fields, as name: value pairs. dataclasses, dicts, lists, and tuples are recursed into. For example: (...)
===

Given this documentation, the expectation about behavior is roughly:

def _dataclasses_asdict_equivalent_helper(obj, dict_factory=dict):
    rec = lambda x: (
        _dataclasses_asdict_equivalent_helper(x, dict_factory=dict_factory))
    if isinstance(obj, (list, tuple)):
        return type(obj)(rec(x) for x in obj)
    elif isinstance(obj, dict):
        return type(obj)((k, rec(v)) for k, v in obj.items())
    elif hasattr(type(obj), '__dataclass_fields__'):
        # ^ approx check for "is this a dataclass instance"?
        # Not 100% correct. For illustration only.
        return dict_factory(
            [(name, rec(getattr(obj, name)))
             for name in type(obj).__dataclass_fields__])
    # Any other value is used as-is, without copying.
    return obj


def dataclasses_asdict_equivalent(x, dict_factory=dict):
    if not hasattr(type(x), '__dataclass_fields__'):
        raise ValueError(f'Not a dataclass: {x!r}')
    return _dataclasses_asdict_equivalent_helper(x, dict_factory=dict_factory)

In particular, field-values that are neither dict, list, tuple, or dataclass-instances are expected to be used identically.
What actually happens however is that .asdict() DOES call __deepcopy__ on field values it has no business inspecting:

===
import dataclasses


@dataclasses.dataclass
class Demo:
    field_a: object


class Obj:
    def __init__(self, x):
        self._x = x

    def __deepcopy__(self, *args):
        raise ValueError('BOOM!')

###

d1 = Demo(field_a=Obj([1,2,3]))
dd = dataclasses.asdict(d1)
# ...Execution does run into a "BOOM!" ValueError.
===

Apart from this: It would be very useful if dataclasses.asdict() came with a recurse={boolish} parameter with which one can turn off recursive translation of value-objects.

--
components: Library (Lib)
messages: 401360
nosy: tfish2
priority: normal
severity: normal
status: open
title: dataclasses.asdict() incorrectly calls __deepcopy__() on values.
type: behavior

___
Python tracker <https://bugs.python.org/issue45135>
___
[issue45135] dataclasses.asdict() incorrectly calls __deepcopy__() on values.
Thomas Fischbacher added the comment: The current behavior deviates from the documentation in a way that might evade tests and hence has the potential to cause production outages. Is there a way to fix the documentation so that it correctly describes current behavior - without having to wait for a new release? Eliminating the risk in such a way would be highly appreciated. In the longer run, there may be some value in having a differently named method (perhaps .as_dict()?) that basically returns {k: v for k, v in self.__dict__.items()}, but without going through reflection? The current approach to recurse looks as if it were based on quite a few doubtful assumptions. (Context: some style guides, such as Google's Python style guide, limit the use of reflection in order to keep some overall undesirable processes in check: https://google.github.io/styleguide/pyguide.html#2191-definition)
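A non-recursive conversion of the kind asked for in this thread can be sketched with the public `dataclasses.fields()` function. The helper name `shallow_asdict` is hypothetical and not part of the `dataclasses` module; the `Demo` and `Obj` classes repeat the reproduction from the original report. Field values are placed into the result as-is, so `__deepcopy__` is never called.

```python
import dataclasses


@dataclasses.dataclass
class Demo:
    field_a: object


class Obj:
    def __init__(self, x):
        self._x = x

    def __deepcopy__(self, *args):
        raise ValueError('BOOM!')


def shallow_asdict(instance):
    """Hypothetical helper: one-level, non-copying conversion to a dict."""
    if not dataclasses.is_dataclass(instance) or isinstance(instance, type):
        raise TypeError(f'Not a dataclass instance: {instance!r}')
    return {f.name: getattr(instance, f.name)
            for f in dataclasses.fields(instance)}


if __name__ == "__main__":
    d1 = Demo(field_a=Obj([1, 2, 3]))
    shallow = shallow_asdict(d1)
    # The very same object is stored, not a copy.
    print(shallow['field_a'] is d1.field_a)
```

Nested dataclasses, lists and dicts are left untouched by this sketch, which is the trade-off of switching recursion off entirely.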