New submission from STINNER Victor <vstin...@python.org>:

At exit, Python calls Py_Finalize(), which tries to clear every single Python 
object. The order in which Python objects are cleared is not fully 
deterministic. Py_Finalize() uses a heuristic to attempt to clear the modules 
in sys.modules in the "best" order.

The current code creates a weak reference to a module, sets sys.modules[name] 
to None, and then clears the module attributes if and only if the module object 
was not destroyed (if the weak reference still points to the module).
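
As a rough illustration of the weak-reference approach described above, here 
is a pure-Python sketch. It is not the actual C code: finalize_current_sketch() 
and clear_attributes() are hypothetical names, the input is a plain dict shaped 
like sys.modules, and the real code has additional steps that are not modelled 
here.

```python
import types
import weakref


def clear_attributes(module):
    # Simplified stand-in for the C-level clearing: overwrite every
    # attribute with None so the objects they referenced can be released.
    namespace = module.__dict__
    for key in list(namespace):
        namespace[key] = None


def finalize_current_sketch(modules):
    # `modules` is a plain dict shaped like sys.modules (name -> module).
    weak_refs = []
    for name in list(modules):
        module = modules[name]
        if module is None:
            continue
        weak_refs.append((name, weakref.ref(module)))
        modules[name] = None
    module = None  # drop the loop variable's strong reference

    cleared = []
    for name, ref in weak_refs:
        module = ref()
        if module is not None:  # something else still references the module
            clear_attributes(module)
            cleared.append(name)
    return cleared


fake_modules = {
    "kept": types.ModuleType("kept"),
    "dropped": types.ModuleType("dropped"),
}
keeper = fake_modules["kept"]  # extra strong reference, outside the dict
keeper.value = 1

cleared = finalize_current_sketch(fake_modules)
print(cleared)       # ['kept'] on CPython (reference counting)
print(keeper.value)  # None
```

Only the module that is still referenced elsewhere has its attributes cleared; 
the other module object is simply destroyed and its attributes are never 
visited explicitly.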

The problem is that even if a module object is destroyed, the module dictionary 
can remain alive thanks to various kinds of strong references to it.
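
A small sketch of how this can happen, assuming CPython's reference counting: 
a function defined in a module keeps the module dictionary alive through its 
__globals__ attribute. The module name "demo" and the names value and get are 
made up for the illustration.

```python
import types
import weakref

# Build a throwaway module and define a function inside its namespace.
mod = types.ModuleType("demo")
exec("value = 42\ndef get():\n    return value\n", mod.__dict__)

get = mod.get                  # strong reference to the function only
module_ref = weakref.ref(mod)  # lets us observe the module object's death

del mod                        # last strong reference to the module object

print(module_ref() is None)    # True on CPython: the module object is gone
print(get())                   # 42: the module dict is still alive
print(get.__globals__["value"])
```

The weak reference is dead, so code that only checks the weak reference 
concludes there is nothing left to clear, although the dictionary and 
everything stored in it are still reachable.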

Worst case example:
---
class VerboseDel:
    def __del__(self):
        print("Goodbye Cruel World")
obj = VerboseDel()

def func():
    pass

import os
os.register_at_fork(after_in_child=func)
del os
del func

print("exit")
---

Output:
---
$ python3.9 script.py
exit
---

=> The VerboseDel object is never destroyed :-( BUG!


Explanation:

* os.register_at_fork(after_in_child=func) stores func in 
PyInterpreterState.after_forkers_child -> func is kept alive until 
interpreter_clear() calls Py_CLEAR(interp->after_forkers_child);

* func has a reference to the module dictionary (func.__globals__)

I'm not sure why the VerboseDel object is not destroyed.


I propose to rewrite finalize_modules() to clear modules in a more 
deterministic order:

* start by clearing __main__ module variables
* then iterate over reversed(sys.modules.values()) and clear the module variables
* module attributes are cleared by _PyModule_ClearDict(): iterate over 
reversed(module.__dict__) and set dict values to None
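
The three steps above can be sketched in pure Python as follows. This is only 
a model of the proposal, not the C implementation: finalize_modules_sketch(), 
clear_dict() and the Tracked class are hypothetical names, the input is a plain 
dict standing in for sys.modules, skipping __main__ in the second pass is my 
assumption, and reversed() on a dict needs Python 3.8 or newer.

```python
import types

log = []


class Tracked:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        log.append(self.name)


def clear_dict(namespace):
    # Model of the proposed dict clearing: newest key first, values set to None.
    for key in list(reversed(namespace)):
        namespace[key] = None


def finalize_modules_sketch(modules):
    # Step 1: clear the __main__ module variables.
    main = modules.get("__main__")
    if main is not None:
        clear_dict(main.__dict__)
    # Step 2: clear the other modules, most recently added first.
    for module in reversed(list(modules.values())):
        if module is not None and module is not main:  # assumption: skip __main__
            clear_dict(module.__dict__)


main = types.ModuleType("__main__")
main.m1 = Tracked("m1")
mod_a = types.ModuleType("a")
mod_a.a1 = Tracked("a1")
mod_a.a2 = Tracked("a2")
mod_b = types.ModuleType("b")
mod_b.b1 = Tracked("b1")

fake_modules = {"__main__": main, "a": mod_a, "b": mod_b}
finalize_modules_sketch(fake_modules)
print(log)  # ['m1', 'b1', 'a2', 'a1'] on CPython
```

With reference counting, each object is destroyed as soon as its name is set 
to None, so the log follows the clearing order directly.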


Drawback: it is a backward incompatible change. Code which worked by luck 
previously no longer works. I'm talking about applications which rely on 
__del__() methods being called in an exact order and expect Python to be in a 
specific state.

Example:
---
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

a = VerboseDel("a")
b = VerboseDel("b")
c = VerboseDel("c")
---

Output:
---
c
b
a
---

=> Module attributes are deleted in the reverse order of their definition: the 
most recent object is deleted first, the oldest is deleted last.


Example 2 with 3 modules (4 files):
---
$ cat a.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

a = VerboseDel("a")


$ cat b.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

b = VerboseDel("b")


$ cat c.py 
class VerboseDel:
    def __init__(self, name):
        self.name = name
    def __del__(self):
        print(self.name)

c = VerboseDel("c")


$ cat z.py 
import a
import b
import c
---

Output:
---
$ ./python z.py 
c
b
a
---

=> Modules are deleted from the most recently imported (import c) to the least 
recently imported module (import a).

----------
components: Interpreter Core
messages: 383265
nosy: vstinner
priority: normal
severity: normal
status: open
title: Make the Python finalization more deterministic
versions: Python 3.10

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue42671>
_______________________________________