Jeff Robbins <je...@livedata.com> added the comment:
Steve, I did some more digging into why the native condition variable approach might be causing problems on Windows. Testing my fix revealed that there is at least one place in Modules\overlapped.c that either a) waits too long to call GetLastError(), or b) reveals an expectation that Py_END_ALLOW_THREADS won't change the results of GetLastError(). Py_BEGIN_ALLOW_THREADS ret = GetQueuedCompletionStatus(CompletionPort, &NumberOfBytes, &CompletionKey, &Overlapped, Milliseconds); save_err = GetLastError(); Py_END_ALLOW_THREADS err = ret ? ERROR_SUCCESS : GetLastError(); The problem in this code is that it allows *other* Windows API calls between the original Windows API call (in this case GetQueuedCompletionStatus()) and the call to GetLastError(). If those other Windows API calls change the thread-specific GetLastError state, the info we need is lost. To test for this possibility, I added a diagnostic test right after the code above if (!ret && (err != save_err)) { printf("GetQueuedCompletionStatus returned 0 but we lost the error=%d lost=%d Overlapped=%d\n", save_err, err, (long)Overlapped); } and ran a test that eventually produced this on the console: GetQueuedCompletionStatus returned 0 but we lost the error=258 lost=0 Overlapped=0 error 258 is WAIT_TIMEOUT. The next lines of code are looking for that "error" in order to decide if GetQueuedCompletionStatus failed, or merely timed out. if (Overlapped == NULL) { if (err == WAIT_TIMEOUT) Py_RETURN_NONE; else return SetFromWindowsErr(err); } So the impact of this problem is severe. Instead of returning None to the caller (in this case _poll in asyncio\windows_events.py), it will raise an error! while True: status = _overlapped.GetQueuedCompletionStatus(self._iocp, ms) if status is None: break And, to make things extra confusing, the error raised via SetFromWindowsErr(err) (where err == 0) ends up looking like this: OSError: [WinError 0] The operation completed successfully A valid WAIT_TIMEOUT thus gets converted to a Python error, but also loses the original Windows Error Code of 258, so you are left scratching your head about how a WinError 0 (ERROR_SUCCESS) could have crashed your call to, say, asyncio.run()? (See traceback below.) So either we need to make sure that all calls to GetLastError() are made immediately after the relevant Windows API call, without any intervening other Windows API calls, and thereby prevent case a) above, or as in case b), the GIL code (using either emulated or native condition variables from condvar.h) needs to preserve the Error state. Some code in Python\thread_nt.h in fact does this already, e.g. void * PyThread_get_key_value(int key) { /* because TLS is used in the Py_END_ALLOW_THREAD macro, * it is necessary to preserve the windows error state, because * it is assumed to be preserved across the call to the macro. * Ideally, the macro should be fixed, but it is simpler to * do it here. */ DWORD error = GetLastError(); void *result = TlsGetValue(key); SetLastError(error); return result; } Of course there might be *other* problems associated with using native condition variables on Windows, but this is the only one I've experienced after some fairly heavy use of Python 3.7.2 asyncio on Windows. traceback: ------------------------------------------------------------------------------------------------------------------------ asyncio.run(self.main()) File "C:\Users\jeffr\Documents\projects\Python-3.7.2\lib\asyncio\runners.py", line 43, in run return loop.run_until_complete(main) File "C:\Users\jeffr\Documents\projects\Python-3.7.2\lib\asyncio\base_events.py", line 571, in run_until_complete self.run_forever() File "C:\Users\jeffr\Documents\projects\Python-3.7.2\lib\asyncio\base_events.py", line 539, in run_forever self._run_once() File "C:\Users\jeffr\Documents\projects\Python-3.7.2\lib\asyncio\base_events.py", line 1739, in _run_once event_list = self._selector.select(timeout) File "C:\Users\jeffr\Documents\projects\Python-3.7.2\lib\asyncio\windows_events.py", line 405, in select self._poll(timeout) File "C:\Users\jeffr\Documents\projects\Python-3.7.2\lib\asyncio\windows_events.py", line 703, in _poll status = _overlapped.GetQueuedCompletionStatus(self._iocp, ms) OSError: [WinError 0] The operation completed successfully ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue35662> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com