bee13oy added the comment:

# Python logic error when dealing with re and multi-threading

## Bug Description

Using `re` together with multi-threading triggers the bug.

Bug type: `Logic Error`
Test Environment:

* `Windows 7 SP1 x64 + python 3.4.3`
* `Linux kali 3.14-kali1-amd64 + python 2.7.3`

-----------------------------Normal Case------------------------

1. main-thread: join(timeout), wait for the sub-thread to finish
2. sub-thread: while(1), an infinite loop

----------------------------------------------------------------

Test Code:

    #!/usr/bin/python
    __author__ = 'bee13oy'

    import re
    import threading

    timeout = 2
    source = "(.*(.)?)*bcd\\t\\n\\r\\f\\a\\e\\071\\x3b\\$\\\\\?caxyz"

    def run(source):
        # Pure-Python infinite loop in the sub-thread.
        while(1):
            print("test1")

    def handle():
        try:
            t = threading.Thread(target=run,args=(source,))
            t.setDaemon(True)
            t.start()
            # Wait at most `timeout` seconds for the sub-thread.
            t.join(timeout)
            print("thread finished...It's an normal case!\n")
        except:
            print("exception ...\n")

    handle()

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----------------------------Bug Case---------------------------

1. main-thread: join(timeout), wait for the sub-thread to finish
2. sub-thread:
   1) we construct the special pattern "(.*(.)?)*bcd\\t\\n\\r\\f\\a\\e\\071\\x3b\\$\\\\\?caxyz"
   2) regexp.search() can't deal with it and hangs
   3) join(timeout) expires while the sub-thread is still running; at this point the main thread should have regained control of the program, but it didn't.
----------------------------------------------------------------

POC:

    #!/usr/bin/python
    __author__ = 'bee13oy'

    import re
    import os
    import threading

    timeout = 2
    source = "(.*(.)?)*bcd\\t\\n\\r\\f\\a\\e\\071\\x3b\\$\\\\\?caxyz"

    def run(source):
        # The pattern is compiled and then searched against its own text;
        # regexp.search() hangs on it.
        regexp = re.compile(r''+source+'')
        sgroup = regexp.search(source)

    def handle():
        try:
            t = threading.Thread(target=run,args=(source,))
            t.setDaemon(True)
            t.start()
            # Expected to return after `timeout` seconds, but it does not.
            t.join(timeout)
            print("finished...\n")
        except:
            print("exception ...\n")

    handle()

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

----------------------------------------------------------------
-                         Bug Analysis                         -
----------------------------------------------------------------

When we use Python multithreading, we call `join(timeout)` to wait until the **thread terminates** or the wait **times out**.

1. In the normal case, the sub-thread runs a `while` loop, and the main thread regains control of the program once the timeout expires.
2. In our POC, the main thread cannot continue even after the timeout has expired.

After analyzing, I found that the main thread is trapped in an infinite loop as well. Execution first enters the sub-thread, which never ends normally. Meanwhile, `join(timeout)` waits for the sub-thread to return or for the timeout to expire, at which point the main thread should regain control of the program. The bug is that the sub-thread is stuck in an infinite loop and the main thread is stuck in an infinite loop too, which causes the program to hang.
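The interplay between the two loops can be illustrated with a toy model. The sketch below is not CPython's implementation: `ToyGIL`, `measure_wait`, and the `cooperative` flag are hypothetical names invented for illustration. The waiter mimics the wait loop of `take_gil` (wait with a timeout, then set a drop request), while the holder either polls that request, as a pure-Python loop effectively does, or ignores it for a fixed amount of work, as the regex matching loop appears to do. The work is bounded here so that the example terminates.

```python
import threading
import time


class ToyGIL:
    """Toy stand-in for the GIL; NOT CPython's real implementation."""

    def __init__(self, interval=0.005):
        self.cond = threading.Condition()
        self.locked = False
        self.drop_request = False
        self.interval = interval  # plays the role of INTERVAL in take_gil

    def acquire(self):
        """Wait loop modelled on take_gil; returns the number of drop requests made."""
        requests = 0
        with self.cond:
            while self.locked:
                notified = self.cond.wait(self.interval)
                if not notified and self.locked:
                    # Timed out and still locked: ask the holder to let go.
                    self.drop_request = True
                    requests += 1
            self.locked = True
            self.drop_request = False
        return requests

    def release(self):
        with self.cond:
            self.locked = False
            self.cond.notify_all()


def measure_wait(cooperative, work_seconds=0.5):
    """Return (seconds the waiter was blocked, drop requests it had to make)."""
    gil = ToyGIL()
    holding = threading.Event()

    def holder():
        gil.acquire()
        deadline = time.monotonic() + work_seconds
        holding.set()
        while time.monotonic() < deadline:
            # A cooperative holder polls the request; a stubborn one never looks.
            if cooperative and gil.drop_request:
                break
            time.sleep(0.001)
        gil.release()

    worker = threading.Thread(target=holder)
    worker.start()
    holding.wait()  # make sure the holder owns the toy lock first
    start = time.monotonic()
    requests = gil.acquire()
    waited = time.monotonic() - start
    gil.release()
    worker.join()
    return waited, requests


if __name__ == "__main__":
    for mode in (True, False):
        waited, requests = measure_wait(cooperative=mode)
        print("cooperative=%s waited=%.3fs requests=%d" % (mode, waited, requests))
```

In this model the waiter is released shortly after its first request when the holder cooperates, and only when the work is finished when it does not. With unbounded work, as in the POC, the waiter would keep repeating its request indefinitely.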
By analyzing the source code of Python, we found that:

- the sub-thread is in an infinite loop (code block 0)
- the main thread is in an infinite loop (code block 1)

-----------------------------code block 0----------------------------------

The following code is where the sub-thread is trapped in an **infinite loop**:

```
LOCAL(Py_ssize_t)
SRE(match)(SRE_STATE* state, SRE_CODE* pattern, int match_all)
{
    SRE_CHAR* end = (SRE_CHAR *)state->end;
    Py_ssize_t alloc_pos, ctx_pos = -1;
    Py_ssize_t i, ret = 0;
    Py_ssize_t jump;
    unsigned int sigcount=0;

    SRE(match_context)* ctx;
    SRE(match_context)* nextctx;

    TRACE(("|%p|%p|ENTER\n", pattern, state->ptr));

    DATA_ALLOC(SRE(match_context), ctx);
    ctx->last_ctx_pos = -1;
    ctx->jump = JUMP_NONE;
    ctx->pattern = pattern;
    ctx->match_all = match_all;
    ctx_pos = alloc_pos;
    .....
    /* Loop that never returns */
    for (;;) {
        ++sigcount;
        if ((0 == (sigcount & 0xfff)) && PyErr_CheckSignals())
            RETURN_ERROR(SRE_ERROR_INTERRUPTED);

        switch (*ctx->pattern++) {

        case SRE_OP_MARK:
            /* set mark */
            /* <MARK> <gid> */
            TRACE(("|%p|%p|MARK %d\n", ctx->pattern,
                   ctx->ptr, ctx->pattern[0]));
            .....
}
```

-----------------------------code block 1----------------------------------

The following code is where the main thread is trapped in an **infinite loop**:

```
static void take_gil(PyThreadState *tstate)
{
    int err;
    if (tstate == NULL)
        Py_FatalError("take_gil: NULL tstate");

    err = errno;
    MUTEX_LOCK(gil_mutex);

    if (!_Py_atomic_load_relaxed(&gil_locked))
        goto _ready;

    /* Loop that never returns */
    while (_Py_atomic_load_relaxed(&gil_locked)) {
        int timed_out = 0;
        unsigned long saved_switchnum;

        saved_switchnum = gil_switch_number;
        COND_TIMED_WAIT(gil_cond, gil_mutex, INTERVAL, timed_out);
        /* If we timed out and no switch occurred in the meantime, it is time
           to ask the GIL-holding thread to drop it. */
        if (timed_out &&
            _Py_atomic_load_relaxed(&gil_locked) &&
            gil_switch_number == saved_switchnum) {
            SET_GIL_DROP_REQUEST();
        }
    }
    .....
}
```

----------
Added file: http://bugs.python.org/file39853/poc&bug_detail.zip

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24555>
_______________________________________