Qiangning Hong wrote:
> I actually had considered Queue and pop() before I wrote the above code.
> However, because there is a lot of data to get every time I call
> get_data(), I want a more CPU friendly way to avoid the while-loop and
> empty checking, and then the above code comes out. But I am not very
> sure whether it will cause serious problem or not, so I ask here. If
> anyone can prove it is correct, I'll use it in my program, else I'll go
> back to the Queue solution.

OK, here is a real failure mode. Here is the code and the disassembly:

 >>> class Collector(object):
 ...     def __init__(self):
 ...         self.data = []
 ...     def on_received(self, a_piece_of_data):
 ...         """This callback is executed in work bee threads!"""
 ...         self.data.append(a_piece_of_data)
 ...     def get_data(self):
 ...         x = self.data
 ...         self.data = []
 ...         return x
 ...
 >>> import dis
 >>> dis.dis(Collector.on_received)
   6           0 LOAD_FAST                0 (self)
               3 LOAD_ATTR                1 (data)
               6 LOAD_ATTR                2 (append)
               9 LOAD_FAST                1 (a_piece_of_data)
              12 CALL_FUNCTION            1
              15 POP_TOP
              16 LOAD_CONST               1 (None)
              19 RETURN_VALUE
 >>> dis.dis(Collector.get_data)
   8           0 LOAD_FAST                0 (self)
               3 LOAD_ATTR                1 (data)
               6 STORE_FAST               1 (x)

   9           9 BUILD_LIST               0
              12 LOAD_FAST                0 (self)
              15 STORE_ATTR               1 (data)

  10          18 LOAD_FAST                1 (x)
              21 RETURN_VALUE

Imagine the thread calling on_received() gets as far as LOAD_ATTR (data),
LOAD_ATTR (append) or LOAD_FAST (a_piece_of_data), so it has a reference
to self.data; then it blocks and the get_data() thread runs. The
get_data() thread could call get_data() and *finish processing the
returned list* before the on_received() thread runs again and actually
appends to the list. The appended value will never be processed.

If you want to avoid the overhead of a Queue.get() for each data element
you could just put your own mutex into on_received() and get_data().

Kent
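
A minimal sketch of the mutex suggestion above, assuming a plain threading.Lock is acceptable for this program. The _lock attribute name is my own choice for illustration and is not in the original code. The lock makes the attribute lookup plus append in on_received() one indivisible step with respect to the swap in get_data(), so an append can no longer land on a list that has already been handed out.

```python
import threading


class Collector(object):
    def __init__(self):
        self.data = []
        # Assumption: a single lock guards every access to self.data.
        self._lock = threading.Lock()

    def on_received(self, a_piece_of_data):
        """This callback is executed in worker threads."""
        # Look up self.data and append while holding the lock, so
        # get_data() cannot swap the list between those two steps.
        with self._lock:
            self.data.append(a_piece_of_data)

    def get_data(self):
        # Swap in a fresh list while holding the lock. This costs one
        # lock acquisition per batch, not one per data element.
        with self._lock:
            x = self.data
            self.data = []
        return x
```

The consumer still pays for only one lock acquisition per get_data() call, however many items are in the batch. Each on_received() call does take the lock once, so whether this is cheaper than a Queue in practice would need measuring.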