New submission from STINNER Victor:

The attached patch modifies the unicode escape and raw unicode escape
encoders to use the new _PyBytesWriter API.

The patch is optimized for encoding Latin1 characters: encoding a Latin1
string in which no character is escaped should not have to call
_PyBytes_Resize() at all.

When characters are escaped, or a BMP or non-BMP string is encoded,
overallocation is used to reduce the number of _PyBytes_Resize() calls. The
patch uses the _PyBytesWriter overallocation strategy instead of always
overallocating for the worst case.
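
To illustrate why the worst case differs so much from the common case, here
is a small Python sketch (not part of the patch) that measures how many
output bytes each encoder produces for one input character. The sample
characters and the helper name escaped_sizes are only chosen for this
illustration.

```python
# One sample character per class discussed above.
samples = {
    "printable ASCII": "a",
    "Latin1 non-ASCII": "\xe9",
    "BMP": "\u20ac",
    "non-BMP": "\U0001f600",
}


def escaped_sizes(codec):
    """Return the encoded length in bytes of each sample character."""
    return {name: len(ch.encode(codec)) for name, ch in samples.items()}


unicode_escape_sizes = escaped_sizes("unicode_escape")
raw_unicode_escape_sizes = escaped_sizes("raw_unicode_escape")

for name in samples:
    print(name, unicode_escape_sizes[name], raw_unicode_escape_sizes[name])
```

A single character can expand to as many as 10 bytes (the \UXXXXXXXX form),
while unescaped characters take 1 byte, so sizing every buffer for the
worst case would usually waste most of the allocation.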

_PyBytesWriter also embeds a small buffer allocated on the stack, which
avoids calls to _PyBytes_Resize() when the output fits into 512 bytes.
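
The effect of the small buffer and of overallocation on the number of
resizes can be sketched with a toy model in Python. This is not the real
_PyBytesWriter: the class ToyWriter, the function count_resizes, and the
25% growth policy are assumptions made for the illustration. Only the
512-byte small buffer size comes from the description above.

```python
class ToyWriter:
    """Toy model of a byte writer that counts buffer resizes."""

    SMALL_BUFFER = 512  # size of the embedded buffer quoted above

    def __init__(self, overallocate):
        self.capacity = self.SMALL_BUFFER
        self.size = 0
        self.resizes = 0
        self.overallocate = overallocate

    def write(self, nbytes):
        needed = self.size + nbytes
        if needed > self.capacity:
            # Hypothetical policy: reserve 25% extra when overallocating,
            # otherwise grow to exactly the size needed.
            extra = needed // 4 if self.overallocate else 0
            self.capacity = needed + extra
            self.resizes += 1
        self.size = needed


def count_resizes(chunks, overallocate):
    """Write each chunk size in turn and return the number of resizes."""
    writer = ToyWriter(overallocate)
    for nbytes in chunks:
        writer.write(nbytes)
    return writer.resizes


# 512 one-byte writes fit in the small buffer: no resize at all.
print(count_resizes([1] * 512, overallocate=True))
# 1000 six-byte escapes: compare exact growth with overallocation.
print(count_resizes([6] * 1000, overallocate=False))
print(count_resizes([6] * 1000, overallocate=True))
```

In this model, output that fits into the small buffer never triggers a
resize, and overallocation cuts the resize count sharply once the output
outgrows it.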

----------
files: unicode_escape.patch
keywords: patch
messages: 252599
nosy: haypo, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Use _PyBytesWriter for unicode escape and raw unicode escape encoders
type: performance
versions: Python 3.6
Added file: http://bugs.python.org/file40727/unicode_escape.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25353>
_______________________________________