New submission from Ammar Askar:

Currently, for string concatenation, ceval has a nice little branch it can take if both operands are unicode types. However, since this check is an exact type check, subtypes of unicode end up going through the slow code path: PyNumber_Add -> PyUnicode_Concat.
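To make the distinction concrete, here is a minimal Python-level sketch of the condition described above. It is only an analogue of the C-level check, not the interpreter's actual code: `TaggedStr`, `is_exact_str` and `would_take_fast_path` are hypothetical names invented for illustration, and `type(obj) is str` stands in for the exact check.

```python
class TaggedStr(str):
    """Hypothetical str subclass, standing in for something like Markup."""


def is_exact_str(obj):
    # Python-level analogue of an exact type check:
    # instances of subclasses do not qualify.
    return type(obj) is str


def would_take_fast_path(left, right):
    # Assumption for illustration: the optimized branch is taken only
    # when both operands are exactly str.
    return is_exact_str(left) and is_exact_str(right)


plain = "abc"
tagged = TaggedStr("abc")

print(would_take_fast_path(plain, plain))    # both operands are exact str
print(would_take_fast_path(tagged, tagged))  # subclass instances fail the exact check
print(isinstance(tagged, str))               # yet they are still strings
print(type(tagged + tagged).__name__)        # default str.__add__ returns a plain str
```

Because the subclass here does not override `__add__`, the result of the concatenation is the same string either way; only the route taken to compute it differs.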
This patch aims to allow subtypes to take that optimized branch without breaking any existing behavior and without any more memory copies than necessary.

The motivation for this change is that some templating engines (Mako/Jinja2/Cheetah) use libraries like MarkupSafe, which is implemented with a unicode subtype called `Markup`. Concatenating these custom objects (pretty common for templating engines) is fairly slow. This change modifies and reuses the existing CPython code to make it a fair bit faster.

I think the only real "dangerous" change in here is in the cast_unicode_subtype_to_base function, which uses a trick at the end to prevent deallocation of memory. I've made sure to keep it well commented, but I'd appreciate any feedback on it.

From what I can tell from running the test suite, all tests pass and there don't seem to be any new reference leaks.

----------
components: Interpreter Core
files: python.diff
keywords: patch
messages: 269849
nosy: ammar2, benjamin.peterson, ezio.melotti, haypo, lemburg, pitrou
priority: normal
severity: normal
status: open
title: Allow subtypes of unicode/str to hit the optimized unicode_concatenate block
type: performance
Added file: http://bugs.python.org/file43631/python.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27458>
_______________________________________