On 3/29/23 13:13, Chris Angelico wrote:
On Thu, 30 Mar 2023 at 01:52, Jack Dangler <tdl...@gmail.com> wrote:
On 3/29/23 02:08, Chris Angelico wrote:
On Wed, 29 Mar 2023 at 16:56, Greg Ewing via Python-list
<python-list@python.org> wrote:
On 28/03/23 2:25 pm, Travis Griggs wrote:
Interestingly, the error only started showing up when I switched to
running statistics.mean() on one of these, instead of what I had been using,
statistics.median(). Apparently the kind of iteration done in a mean is more
conflict-prone than the kind done in a median?
It may be a matter of whether the GIL is held or not. I had a look
at the source for deque, and it doesn't seem to explicitly do
anything about locking, it just relies on the GIL.
So maybe statistics.median() is implemented in C and statistics.mean()
in Python, or something like that?
Both functions are implemented in Python, but median() starts out with
this notable line:
data = sorted(data)
which gives back a copy, iterated over rapidly in C. All subsequent
work is done on that copy.
The same protection could be had with mean() by first taking a snapshot
using list(q); I believe that would behave the same way (the source code
for the sorted() function begins by calling PySequence_List).
In any case, it makes *conceptual* sense to do your analysis on a copy
of the queue, thus ensuring that your stats are stable. The other
threads can keep going while you do your calculations, even if that
means changing the queue.
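
A minimal sketch of the snapshot idea, assuming the error in question is
the RuntimeError a deque raises when it changes mid-iteration. The names
q, demo and snapshot_mean are hypothetical, and the append inside the loop
merely stands in for what another thread might do:

```python
import statistics
from collections import deque

q = deque([3, 1, 4, 1, 5], maxlen=100)  # hypothetical shared buffer


def snapshot_mean(d):
    """Copy the deque first, then compute the mean on the private copy."""
    snapshot = list(d)  # a single pass over the deque, done in C in CPython
    return statistics.mean(snapshot)


# Iterating over a deque in Python code while it changes fails.
# The append below stands in for another thread touching the deque.
demo = deque([1, 2, 3])
mutated_error = None
try:
    for item in demo:
        demo.append(item)
except RuntimeError as exc:
    mutated_error = str(exc)

result = snapshot_mean(q)
```

Once the snapshot exists, later appends to q cannot affect the
calculation, since mean() only ever sees the private list.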
ChrisA
Sorry for any injected confusion here, but that line "data =
sorted(data)" appears as though it takes the value of the variable named
_data_, sorts it and returns it to the same variable store, so no copy
would be created. Am I missing something there?
The variable name "data" is the parameter to median(), so it's
whatever you ask for the median of. (I didn't make that obvious in my
previous post - an excess of brevity on my part.)
The sorted() function, UNlike list.sort(), returns a sorted copy of
what it's given. I delved into the CPython source code for that, and
it begins with the PySequence_List call to (effectively) call
list(data) to get a copy of it. It ought to be a thread-safe copy due
to holding the GIL the entire time. I'm not sure what would happen in
a GIL-free world, but most likely the lock on the input object would
still ensure thread safety.
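
A small illustration of the difference, as a sketch rather than the
actual statistics source. The function middle_value and the sample values
are made up for the example; only sorted() and list.sort() are the real
built-ins under discussion:

```python
from collections import deque


def middle_value(data):
    # Rebinds the local name "data" to a new sorted list.
    # The caller's object is left untouched.
    data = sorted(data)
    return data[len(data) // 2]


original = deque([5, 2, 9, 1, 7])
mid = middle_value(original)  # works on a sorted copy of the deque

items = [3, 1, 2]
new_list = sorted(items)         # returns a new list, items is unchanged here
in_place_result = items.sort()   # sorts items itself and returns None
```

After the call, original still holds its elements in their original
order, which is the point about the copy.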
ChrisA
Aah - thanks, Chris! That makes much more sense.