New submission from Pascal Chambon:

Hello,

we have repeatedly hit a very nasty bug in our framework: several times, tests 
or even production code (served by mod_wsgi) ended up in a broken state where 
imports like "from . import processing_exceptions", which were NOT circular 
imports and referred to submodules that definitely existed, raised exceptions 
like "ImportError: cannot import name processing_exceptions". Restarting the 
test run or the server fixed it, and we never found out what had happened.

I've come across several forum threads on similar issues, but only recently 
did I find one that gives a way to reproduce the bug:
http://stackoverflow.com/questions/12830901/why-does-import-error-change-to-cannot-import-name-on-the-second-import

So attached is a Python 2 sample (Python 3 has the same problem) showing the 
bug (just run the included test_import.py).

What happens here is that a package "mypkg" fails to import because of an 
exception (e.g. a temporary DB failure), but only AFTER successfully importing 
a submodule, mypkg.module_a.
Thus "mypkg.module_a" IS loaded and stays in sys.modules, but "mypkg" is 
removed from sys.modules (as the documentation on Python imports describes).
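
For readers without the attachment, here is a minimal sketch of the same 
scenario. It is not the attached sample: the temporary directory, the 
RuntimeError and the file contents are assumptions made for illustration; only 
the names "mypkg" and "module_a" come from the description above. It prints the 
state of sys.modules after the failed import and then whatever a second attempt 
reports, which may differ between interpreter versions.

```python
import os
import shutil
import sys
import tempfile

# Build a throwaway package whose __init__ imports a submodule and then fails.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "mypkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "module_a.py"), "w") as f:
    f.write("VALUE = 1\n")
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from mypkg import module_a\n")
    f.write("raise RuntimeError('simulated DB failure')\n")

sys.path.insert(0, root)
try:
    try:
        import mypkg  # first attempt: fails after module_a was loaded
    except RuntimeError:
        pass

    # The parent package is gone from the cache, the submodule is not.
    parent_cached = "mypkg" in sys.modules
    child_cached = "mypkg.module_a" in sys.modules
    print("mypkg cached: %s" % parent_cached)
    print("mypkg.module_a cached: %s" % child_cached)

    try:
        import mypkg  # second attempt, same process
        second_attempt = "succeeded"
    except Exception as exc:
        second_attempt = "%s: %s" % (type(exc).__name__, exc)
    print("second attempt: %s" % second_attempt)
finally:
    # Remove the temporary path entry and files created for the sketch.
    sys.path.remove(root)
    shutil.rmtree(root, ignore_errors=True)
```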

The next time we try, from within the same process, to import "mypkg" and 
reach "from mypkg import module_a" in mypkg's __init__.py, it SEEMS that the 
import system checks sys.modules and, seeing "mypkg.module_a" there, ASSUMES 
that mypkg is already initialized and has a name "module_a" in its global 
namespace. Hence errors like "cannot import name processing_exceptions".

Writing the import of "module_a" as an absolute or a relative import changes 
nothing; however, using "import mypkg.module_a" instead solves the problem (I 
don't know why).

Another workaround is to clean up sys.modules at the top of mypkg/__init__.py, 
so that leftovers from a previously failed attempt to import the package do 
not get in our way.
    
    # at the top of "mypkg/__init__.py"
    import sys

    # Drop submodules left over from a previously failed import of this package.
    stale_modules = [k for k in list(sys.modules) if k.startswith("mypkg.")]
    for k in stale_modules:
        del sys.modules[k]
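
The same cleanup could also be packaged as a small helper, so that it can be 
tried out without touching the real module cache. This is only a sketch: the 
function name purge_submodules and its "modules" parameter are hypothetical, 
and the fake cache below stands in for sys.modules.

```python
import sys


def purge_submodules(package_name, modules=None):
    """Remove cached "package_name.*" entries and return the removed names."""
    if modules is None:
        modules = sys.modules
    prefix = package_name + "."
    # Copy the keys first so the mapping is not mutated while iterating.
    stale = [name for name in list(modules) if name.startswith(prefix)]
    for name in stale:
        del modules[name]
    return stale


# Try it on a fake cache rather than on the real sys.modules.
fake_cache = {
    "mypkg.module_a": object(),
    "mypkg.sub.mod": object(),
    "mypkgother": object(),  # shares the prefix text but is not a submodule
    "os": object(),
}
removed = purge_submodules("mypkg", fake_cache)
print(sorted(removed))
print(sorted(fake_cache))
```

Matching on "mypkg." (with the trailing dot) keeps unrelated top-level modules 
whose names merely start with "mypkg" out of the purge.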
        
Anyway, I don't know enough about Python's import internals to understand why 
exactly, on the second import attempt, the system attempts what looks like a 
failing getattr(mypkg, "module_a") instead of simply returning 
sys.modules["mypkg.module_a"], which exists.
Could anyone help with that?
This is a very damaging issue, in my opinion, since web server workers can end 
up in a completely broken state because of it.
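
To make the question concrete, here is a toy model of the state described 
above. It does not run the real import machinery: the "cache" dict is a 
stand-in for sys.modules and the module objects are created by hand, so treat 
it as an illustration of the suspected mismatch, not as an explanation of the 
actual internals.

```python
import types

cache = {}  # stand-in for sys.modules

# First attempt: the submodule is loaded, cached and bound on its parent.
first_parent = types.ModuleType("mypkg")
child = types.ModuleType("mypkg.module_a")
cache["mypkg.module_a"] = child
first_parent.module_a = child
# The package then fails, so first_parent never stays in the cache.

# Second attempt: a brand-new package object is created and cached.
second_parent = types.ModuleType("mypkg")
cache["mypkg"] = second_parent

# The submodule is still cached, but nothing has bound it on the new parent,
# so an attribute lookup on the package finds nothing.
cached = "mypkg.module_a" in cache
bound_on_new_parent = hasattr(second_parent, "module_a")
print(cached, bound_on_new_parent)
```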

PS: more generally, I guess Python users lack insight into the behaviour of 
"from xxx import yyy", especially when yyy is both a real submodule of xxx and 
a variable initialized in xxx/__init__.py (it seems the real module overrides 
the variable), or when the __all__ list of xxx could prevent the import of a 
submodule of xxx by not including it.
Provided I get a better understanding of how all this works (I heard it has 
changed quite a bit recently), I'd be willing to summarize it for the Python 
docs.

----------
components: Interpreter Core
files: ImportFailPy2.zip
messages: 186738
nosy: Pascal.Chambon
priority: normal
severity: normal
status: open
title: IMPORTANT - Process corruption on partly failed imports
type: behavior
versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5
Added file: http://bugs.python.org/file29798/ImportFailPy2.zip

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17716>
_______________________________________