[issue36051] Drop the GIL during large bytes.join operations?

Josh Rosenberg Mon, 30 Dec 2019 17:45:43 -0800


Josh Rosenberg added the comment:


This will introduce a risk of data races that didn't previously exist. If you 
do:

    ba1 = bytearray(b'\x00') * 50000
    ba2 = bytearray(b'\x00') * 50000
    # ... pass references to a thread that mutates them ...
    ba3 = b''.join((ba1, ba2))

then two things will change from the existing behavior:

1. If the thread in question attempts to write to the bytearrays in place, then 
the join could conceivably pick up only part of what was written (after 
ba1[0], ba1[40000] = 2, 3 it could end up copying the result of the second 
write without the first; at present, it could only copy the first without the 
second).

2. If the thread tries to change the size of the bytearrays during the join 
(ba1 += b'123'), it'll die with a BufferError that wasn't previously possible.
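
The interleaving in item 1 can be sketched deterministically without real threads. This is only a simulation under stated assumptions: chunked_copy and mutate are hypothetical helpers, and the two-step copy stands in for a join that no longer holds the GIL for its whole duration; it is not how bytes.join is implemented.

```python
# Simulated interleaving only; no real threads and no real bytes.join internals.
ba1 = bytearray(b'\x00') * 50000

def chunked_copy(src, between_chunks):
    """Hypothetical stand-in for a copy that can be interrupted midway."""
    half = len(src) // 2
    out = bytearray(src[:half])   # first half is copied before the writes
    between_chunks()              # the "other thread" gets to run here
    out += src[half:]             # second half is copied after the writes
    return bytes(out)

def mutate():
    # Two in-place writes: index 0 first, then index 40000.
    ba1[0], ba1[40000] = 2, 3

result = chunked_copy(ba1, mutate)
print(result[0], result[40000])   # second write is visible, first is not
```

The copy ends up with the write to index 40000 but not the earlier write to index 0, which is the ordering that cannot be observed while the join holds the GIL throughout.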

#1 isn't terrible (as noted, data races in that case already existed, this just 
lets them happen in more ways), but #2 is a little unpleasant; code that 
previously had simple data races (the data might be inconsistent, but the code 
ran and produced some valid output) can now fail hard, nowhere near the actual 
call to join that introduced the behavioral change.
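
The failure mode in #2 can be reproduced in a single thread. In this minimal sketch a memoryview stands in (as an assumption) for the buffer export that join would hold while copying; the names view and error are illustrative only.

```python
ba = bytearray(b'\x00') * 8

# Assumption: this memoryview plays the role of the buffer export that
# join holds on each bytearray while it copies the data.
view = memoryview(ba)

ba[0] = 2                  # in-place writes are still allowed while exported

try:
    ba += b'123'           # resizing while a buffer is exported fails
except BufferError as exc:
    error = exc
else:
    error = None

view.release()             # once the export is gone...
ba += b'123'               # ...the same resize succeeds

print(type(error).__name__, len(ba))
```

In the threaded case the exception surfaces in the mutating thread, at the line doing the resize, rather than at the join call.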

I don't think this sinks the patch (loudly breaking code that was silently 
broken before isn't awful), but I feel like a warning of some kind in the 
documentation (if only a simple compatibility note in What's New) might be 
appropriate.

----------
nosy: +josh.r

_______________________________________
Python tracker
<https://bugs.python.org/issue36051>
_______________________________________