Some code needs to maintain an output buffer whose final size is unpredictable,
for example the bz2/lzma/zlib modules and _PyBytesWriter/_PyUnicodeWriter.
In the current code, each time the output buffer grows, resizing it causes
unnecessary memcpy() calls.
issue41486 uses memory blocks to represent the output buffer in the
bz2/lzma/zlib modules, which could eliminate the overhead of resizing.
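To illustrate the idea, here is a minimal Python sketch of a blocks-based
output buffer. It is not the code from issue41486: the class name
BlocksOutputBuffer, the fixed block_size and the method names are all
assumptions made up for this example. The point is only that existing blocks
are never resized, so the data is copied once, when the final result is
assembled.

```python
class BlocksOutputBuffer:
    """Hypothetical blocks-based output buffer (illustration only)."""

    def __init__(self, block_size=32 * 1024):
        self.block_size = block_size  # assumed fixed size, for simplicity
        self.blocks = []              # list of bytearray blocks, never resized
        self.used = 0                 # bytes used in the last block

    def write(self, data):
        view = memoryview(data)
        while len(view):
            # Allocate a fresh block when there is none or the last one is full.
            if not self.blocks or self.used == len(self.blocks[-1]):
                self.blocks.append(bytearray(self.block_size))
                self.used = 0
            block = self.blocks[-1]
            n = min(len(view), len(block) - self.used)
            # Same-length slice assignment: fills the block without resizing it.
            block[self.used:self.used + n] = view[:n]
            self.used += n
            view = view[n:]

    def finish(self):
        # The only full copy of the data happens here.
        if not self.blocks:
            return b""
        last = memoryview(self.blocks[-1])[:self.used]
        return b"".join([*self.blocks[:-1], last])
```

With a resizable buffer, earlier output may be copied again each time the
buffer grows. Here, growing only appends a new block.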
There are benchmark charts in issue41486: https://bugs.python.org/issue41486
_PyBytesWriter/_PyUnicodeWriter could use the same approach.
If someone writes a "general blocks output buffer", it could be used in
_PyBytesWriter/bz2/lzma/zlib. (The code in issue41486 is not very general,
since it uses a bytes object to represent each memory block.)
If a new _PyUnicodeWriter were written this way, it would have a chance to
eliminate the overhead of switching the PyUnicode kind, by recording the
position where the switch happens. For example:
'a' * 100_000_000 + '\uABCD'
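A rough Python model of "record the switching position" might look like the
sketch below. It only models the bookkeeping: Python's own str already
handles the real storage, and the names KindSwitchWriter, char_kind and
switches are hypothetical, not part of any CPython API. A C version would
presumably keep the earlier blocks at their narrow width and widen them only
once, when building the final string.

```python
def char_kind(text):
    """Bytes per character needed for text: 1, 2 or 4 (text must be non-empty)."""
    top = max(map(ord, text))
    if top <= 0xFF:
        return 1
    if top <= 0xFFFF:
        return 2
    return 4


class KindSwitchWriter:
    """Hypothetical writer that records where the character width changes."""

    def __init__(self):
        self.parts = []     # written pieces, left untouched until finish()
        self.length = 0     # total characters written so far
        self.kind = 1       # widest kind seen so far
        self.switches = []  # (position of the write, new kind) records

    def write(self, text):
        if not text:
            return
        kind = char_kind(text)
        if kind > self.kind:
            # Record the switch instead of widening everything written so far.
            self.switches.append((self.length, kind))
            self.kind = kind
        self.parts.append(text)
        self.length += len(text)

    def finish(self):
        # Single assembly step at the end.
        return "".join(self.parts)
```

For the example above, writing 'a' * 100_000_000 and then '\uABCD' would
record one switch at position 100_000_000, and the long ASCII run would not
need to be touched until the end.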
If anyone has the time and is willing to try, that would be very welcome.
Otherwise I might do this at some point in the future.