Previously, on Jun 14, Jeremy Jones said:

# Kent Johnson wrote:
#
# > Peter Hansen wrote:
# >
# > > Qiangning Hong wrote:
# > >
# > > > A class Collector, it spawns several threads to read from serial port.
# > > > Collector.get_data() will get all the data they have read since last
# > > > call. Who can tell me whether my implementation is correct?
# > >
# > > [snip sample with a list]
# > >
# > > > I am not very sure about the get_data() method. Will it cause data loss
# > > > if a thread is appending data to self.data at the same time?
# > >
# > > That will not work, and you will get data loss, as Jeremy points out.
# > >
# > > Normally Python lists are safe, but your key problem (in this code) is
# > > that you are rebinding self.data to a new list! If another thread calls
# > > on_received() just after the line "x = self.data" executes, then the new
# > > data will never be seen.
# >
# > Can you explain why not? self.data is still bound to the same list as x. At
# > least if the execution sequence is
# >     x = self.data
# >     self.data.append(a_piece_of_data)
# >     self.data = []
# >
# > ISTM it should work.
# >
# > I'm not arguing in favor of the original code, I'm just trying to understand
# > your specific failure mode.
# >
# > Thanks,
# > Kent
#
# Here's the original code:
#
# class Collector(object):
#     def __init__(self):
#         self.data = []
#         spawn_work_bees(callback=self.on_received)
#
#     def on_received(self, a_piece_of_data):
#         """This callback is executed in work bee threads!"""
#         self.data.append(a_piece_of_data)
#
#     def get_data(self):
#         x = self.data
#         self.data = []
#         return x
#
# The more I look at this, the more I'm not sure whether data loss will occur.
# For me, that's good enough reason to rewrite this code. I'd rather be clear
# and certain than clever any day.
#
# So, let's say you have a thread T1 which starts in ``get_data()`` and makes
# it as far as ``x = self.data``. Then another thread T2 comes along in
# ``on_received()`` and gets as far as ``self.data.append(a_piece_of_data)``.
# ``x`` in T1's ``get_data()`` (as you pointed out) is still pointing to the
# list that T2 just appended to and T1 will return that list. But what happens
# if you get multiple guys in ``get_data()`` and multiple guys in
# ``on_received()``? I can't prove it, but it seems like you're going to have
# an uncertain outcome. If you're just dealing with 2 threads, I can't see how
# that would be unsafe. Maybe someone could come up with a use case that would
# disprove that. But if you've got, say, 4 threads, 2 in each method....that's
# gonna get messy.
#
# And, honestly, I'm trying *really* hard to come up with a scenario that would
# lose data and I can't. Maybe someone like Peter or Aahz or some little 13
# year old in Topeka who's smarter than me can come up with something. But I do
# know this - thinking about whether this is unsafe or not is making my head
# hurt. If you have a piece of code that you have to spend that much time on
# trying to figure out if it is threadsafe or not, why would you leave it as
# is? Maybe the rest of you are more confident in your thinking and
# programming skills than I am, but I would quickly slap a Queue in there. If
# for nothing else than to rest from simulating in my head 1, 2, 3, 5, 10
# threads in the ``get_data()`` method while various threads are in the
# ``on_received()`` method. Aaaagghhh.....need....motrin......
#
#
# Jeremy Jones
#
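
For reference, the "slap a Queue in there" idea could look roughly like the sketch below. It is only an illustration under some assumptions: it uses Python 3's `queue` module (spelled `Queue` in the Python of this thread's era), the class name QueueCollector is made up here, and because spawn_work_bees() is never shown in the thread, demo() is a hypothetical stand-in that simply starts plain threads.

```python
import queue
import threading


class QueueCollector:
    """Hypothetical variant of Collector backed by a thread-safe queue."""

    def __init__(self):
        self._items = queue.Queue()

    def on_received(self, a_piece_of_data):
        # Called from worker threads; Queue.put() handles its own locking.
        self._items.put(a_piece_of_data)

    def get_data(self):
        # Drain whatever is queued right now, without blocking.
        drained = []
        while True:
            try:
                drained.append(self._items.get_nowait())
            except queue.Empty:
                return drained


def demo(n_threads=4, per_thread=1000):
    """Stand-in for spawn_work_bees() (an assumption): plain producer threads."""
    collector = QueueCollector()

    def worker(tid):
        for i in range(per_thread):
            collector.on_received((tid, i))

    workers = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for w in workers:
        w.start()

    seen = []
    # Keep calling get_data() while the producers are still running.
    while any(w.is_alive() for w in workers):
        seen.extend(collector.get_data())
    for w in workers:
        w.join()
    seen.extend(collector.get_data())  # final drain once all producers are done
    return seen
```

Each item goes through put() once and comes out of get_nowait() once, so there is no rebinding step left to reason about, which is the point Jeremy is making.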
I may be wrong here, but shouldn't you just use a stack? In other words, use the list as a stack and pop the data off the top. I believe lists already supply a pop() method for this. Since you wouldn't need the self.data = [] rebinding, this should let you safely remove the data you've already seen without accidentally removing data that may have been added in the meantime.

---
James Tanis
[EMAIL PROTECTED]
http://pycoder.org
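
A rough sketch of the pop() idea, for illustration only. It assumes CPython, where a single list.append() or list.pop() call is atomic with respect to other threads. The class name PopCollector is made up, and the sketch uses pop(0) rather than a plain pop() so that items come back in arrival order instead of newest-first.

```python
class PopCollector:
    """Hypothetical variant of Collector that never rebinds self.data."""

    def __init__(self):
        self.data = []

    def on_received(self, a_piece_of_data):
        # Called from worker threads; relies on append() being atomic in CPython.
        self.data.append(a_piece_of_data)

    def get_data(self):
        # Remove items one at a time; anything appended meanwhile stays in
        # self.data and is picked up by this loop or by the next call.
        drained = []
        while True:
            try:
                drained.append(self.data.pop(0))  # pop(0) keeps arrival order
            except IndexError:
                # The list is empty right now.
                return drained
```

Note that pop(0) is O(n) on a list; collections.deque with append() and popleft() would be the cheaper way to get the same behaviour.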
