[Python-Dev] Hash computation enhancement for {buffer, string, unicode}object
Hi All,

This is Alecsandru from Server Scripting Languages Optimization team at Intel Corporation.

I would like to submit a patch that improves the performance of the hash computation code on stringobject, bufferobject and unicodeobject. As can be seen from the attached sample performance results from the Grand Unified Python Benchmark, speedups up to 40% were observed. Furthermore, we see a 5-7% performance improvement on OpenStack/Swift, where most of the code is in Python 2.7.

Attached is the patch that modifies the Objects/stringobject.c, Objects/bufferobject.c and Objects/unicodeobject.c files. We built and tested this patch for Python 2.7 on our Linux machines (CentOS 7/Ubuntu Server 14.04, Intel Xeon Haswell/Broadwell with 18/8 cores).

I've also opened an issue on the bug tracker: http://bugs.python.org/issue25106

Steps to apply the patch:
1. hg clone https://hg.python.org/cpython cpython
2. cd cpython
3. hg update 2.7
4. Copy hash8.patch to the current directory
5. hg import --no-commit hash8.patch
6. ./configure
7. make

In the following, please find our sample performance results measured on a Xeon Haswell machine.
Hardware (HW): Intel XEON (Haswell) 18 Cores

BIOS settings:
    Intel Turbo Boost Technology: false
    Hyper-Threading: false

Operating System: Ubuntu 14.04.3 LTS trusty

OS configuration:
    CPU freq set at fixed: 2.0GHz by
        echo 200 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
        echo 200 > /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
    Address Space Layout Randomization (ASLR) disabled (to reduce run to run variation) by
        echo 0 > /proc/sys/kernel/randomize_va_space

GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04)

Benchmark: Grand Unified Python Benchmark (GUPB)
GUPB Source: https://hg.python.org/benchmarks/

Python2.7 results:
    Python source: hg clone https://hg.python.org/cpython cpython
    Python Source: hg update 2.7

    Benchmarks          Speedup(%)
    unpack_sequence     40.32733766
    chaos               24.84002537
    chameleon           23.01392651
    silent_logging      22.27202911
    django              20.83842317
    etree_process       20.46968294
    nqueens             20.34234985
    pathlib             19.63445919
    pidigits            19.34722148
    etree_generate      19.25836634
    pybench             19.06895825
    django_v2           18.06073108
    etree_iterparse     17.3797149
    fannkuch            17.08120879
    pickle_list         16.60363602
    raytrace            16.0316265
    slowpickle          15.86611184
    pickle_dict         15.30447114
    call_simple         14.42909032
    richards            14.2949594
    simple_logging      13.6522626
    etree_parse         13.38113097
    json_dump_v2        12.2655
    float               11.88164311
    mako                11.20606516
    spectral_norm       11.04356684
    hg_startup          10.57686164
    mako_v2             10.37912648
    slowunpickle        10.24030714
    go                  10.03567319
    meteor_contest      9.956231435
    normal_startup      9.607401586
    formatted_logging   9.601244811
    html5lib            9.082603748
    2to3                8.741557816
    html5lib_warmup     8.268150981
    nbody               7.507012306
    regex_compile       7.153922724
    bzr_startup         7.140244739
    telco               6.869411927
    slowspitfire        5.746323922
    tornado_http        5.24360121
    rietveld            3.865704876
    regex_v8            3.777622219
    hexiom2             3.586305282
    json_dump           3.477551682
    spambayes           3.183991854
    fastunpickle        2.971645347
    fastpickle          0.673086656
    regex_effbot        0.127946837
    json_load           0.023727176

Thank you,
Alecsandru

hash8-v01.patch
Description: hash8-v01.patch
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] semantics of subclassing things from itertools
On 10.09.15 15:50, Maciej Fijalkowski wrote:
> On Thu, Sep 10, 2015 at 10:26 AM, Serhiy Storchaka wrote:
> > There is another reason why itertools iterators can't be implemented as
> > simple generator functions. All iterators are pickleable in 3.x.
>
> maybe the documentation should reflect that? (note that generators are
> pickleable on pypy anyway)

This pickling is not compatible with CPython. So even if itertools classes were not subclassable, you would need to implement itertools iterators as classes.
[Python-Dev] What happens of the Python 3.4 branch?
Hi,

Python 3.5.0 was released. What happens to the 3.4 branch in Mercurial? Does it still accept bugfixes, or is it only for security fixes now?

Victor
Re: [Python-Dev] What happens of the Python 3.4 branch?
On 09/14/2015 09:29 AM, Victor Stinner wrote:
> Python 3.5.0 was released. What happens to the 3.4 branch in Mercurial?
> Does it still accept bugfixes, or is it only for security fixes now?

Nothing has been announced or decided. As release manager I suppose I get some say. Here, I'll propose something:

    Python 3.4.4 rc1 should be released on Sunday October 4th.
    Python 3.4.4 final should be released on Sunday October 13th.

After the tag of 3.4.4, Python 3.4 should enter security-fixes-only mode, and any future releases (3.4.5+) will be source code only.

How's that?

//arry/
Re: [Python-Dev] What happens of the Python 3.4 branch?
On 14/09/2015 10:49, Larry Hastings wrote:
> On 09/14/2015 09:29 AM, Victor Stinner wrote:
> > Python 3.5.0 was released. What happens to the 3.4 branch in Mercurial?
> > Does it still accept bugfixes, or is it only for security fixes now?
>
> Nothing has been announced or decided. As release manager I suppose I
> get some say. Here, I'll propose something:
>
>     Python 3.4.4 rc1 should be released on Sunday October 4th.
>     Python 3.4.4 final should be released on Sunday October 13th.
>
> After the tag of 3.4.4, Python 3.4 should enter security-fixes-only
> mode, and any future releases (3.4.5+) will be source code only.
>
> How's that?
>
> //arry/

Sorry but Sunday October 13th doesn't suit me, how about Sunday October 11th or Sunday October 18th?

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
Re: [Python-Dev] What happens of the Python 3.4 branch?
On 09/14/2015 11:37 AM, Mark Lawrence wrote:
> Sorry but Sunday October 13th doesn't suit me, how about Sunday October
> 11th or Sunday October 18th?

Fair enough. Sunday October 11th, 2015.

On second thought it's probably best to not wait until 2019,

//arry/
Re: [Python-Dev] [Python-checkins] cpython: In-line the append operations inside deque_inplace_repeat().
Would it be worth adding a comment that the block of code is an inlined
copy of deque_append()? Or maybe even turn the append() function into a
macro so you minimize code duplication?
On Sat, 12 Sep 2015 at 08:00 raymond.hettinger
wrote:
> https://hg.python.org/cpython/rev/cb96ffe6ff10
> changeset: 97943:cb96ffe6ff10
> parent: 97941:b8f3a01937be
> user: Raymond Hettinger
> date: Sat Sep 12 11:00:20 2015 -0400
> summary:
> In-line the append operations inside deque_inplace_repeat().
>
> files:
> Modules/_collectionsmodule.c | 22 ++
> 1 files changed, 18 insertions(+), 4 deletions(-)
>
>
> diff --git a/Modules/_collectionsmodule.c b/Modules/_collectionsmodule.c
> --- a/Modules/_collectionsmodule.c
> +++ b/Modules/_collectionsmodule.c
> @@ -567,12 +567,26 @@
>      if (n > MAX_DEQUE_LEN)
>          return PyErr_NoMemory();
>
> +    deque->state++;
>      for (i = 0 ; i < n-1 ; i++) {
> -        rv = deque_append(deque, item);
> -        if (rv == NULL)
> -            return NULL;
> -        Py_DECREF(rv);
> +        if (deque->rightindex == BLOCKLEN - 1) {
> +            block *b = newblock(Py_SIZE(deque) + i);
> +            if (b == NULL) {
> +                Py_SIZE(deque) += i;
> +                return NULL;
> +            }
> +            b->leftlink = deque->rightblock;
> +            CHECK_END(deque->rightblock->rightlink);
> +            deque->rightblock->rightlink = b;
> +            deque->rightblock = b;
> +            MARK_END(b->rightlink);
> +            deque->rightindex = -1;
> +        }
> +        deque->rightindex++;
> +        Py_INCREF(item);
> +        deque->rightblock->data[deque->rightindex] = item;
>      }
> +    Py_SIZE(deque) += i;
>      Py_INCREF(deque);
>      return (PyObject *)deque;
>  }
>
> --
> Repository URL: https://hg.python.org/cpython
> ___
> Python-checkins mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-checkins
>
Re: [Python-Dev] [Numpy-discussion] The process I intend to follow for any proposed changes to NumPy
Travis,

I'm sure you appreciate that this might all look a bit scary, given the recent discussion about numpy governance.

But it's an open-source project, and I, at least, fully understand that going through a big process is NOT the way to get a new idea tried out and implemented. So I think this is a great development -- I know I want to see something like this dtype work done.

So, as someone who has been around this community for a long time, and dependent on Numeric, numarray, and numpy over the years, this looks like a great development.

And, in fact, with the new governance effort -- I think less scary -- people can go off and work on a branch or fork, do good stuff, and we, as a community, can be assured that API (or even ABI) changes won't be thrust upon us unawares :-)

As for the technical details -- I get a bit lost, not fully understanding the current dtype system either, but do your ideas take us in the direction of having dtypes independent of the container and ufunc machinery -- and thus easier to create new dtypes (even in Python?) 'cause that would be great.

I hope you find the partner you're looking for -- that's a challenge!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

[email protected]
Re: [Python-Dev] [Python-checkins] cpython: In-line the append operations inside deque_inplace_repeat().
> On Sep 14, 2015, at 12:49 PM, Brett Cannon wrote:
>
> Would it be worth adding a comment that the block of code is an inlined copy of deque_append()?
> Or maybe even turn the append() function into a macro so you minimize code duplication?

I don't think either would be helpful. The point of the inlining was to let the code evolve independently from deque_append().

Once separated from the mother ship, the code in deque_inplace_repeat() could now shed the unnecessary work. The state variable is updated once. The updates within a single block are now in their own inner loop. The deque size is updated outside of that loop, etc. In other words, they are no longer the same code.

The original append-in-a-loop version was already being in-lined by the compiler but was doing way too much work. For each item written in the original, there were 7 memory reads, 5 writes, 6 predictable compare-and-branches, and 5 add/sub operations. In the current form, there are 0 reads, 1 write, 2 predictable compare-and-branches, and 3 add/sub operations.

FWIW, my work flow is that periodically I expand the code with new features (the upcoming work is to add slicing support http://bugs.python.org/issue17394), then once it is correct and tested, I make a series of optimization passes (such as the work I just described above). After that, I come along and factor-out common code, usually with clean, in-lineable functions rather than macros (such as the recent check-in replacing redundant code in deque_repeat with a call to the common code in deque_inplace_repeat).

My schedule lately hasn't given me any big blocks of time to work with, so I do the steps piecemeal as I get snippets of development time.

Raymond

P.S. For those who are interested, here is the before and after:

before:

    L1152:
        movq    __Py_NoneStruct@GOTPCREL(%rip), %rdi
        cmpq    $0, (%rdi)                  <
        je      L1257
    L1159:
        addq    $1, %r13
        cmpq    %r14, %r13
        je      L1141
        movq    16(%rbx), %rsi              <
    L1142:
        movq    48(%rbx), %rdx              <
        addq    $1, 56(%rbx)                <>
        cmpq    $63, %rdx
        je      L1143
        movq    32(%rbx), %rax              <
        addq    $1, %rdx
    L1144:
        addq    $1, 0(%rbp)                 <>
        leaq    1(%rsi), %rcx
        movq    %rdx, 48(%rbx)              >
        movq    %rcx, 16(%rbx)              >
        movq    %rbp, 8(%rax,%rdx,8)        >
        movq    64(%rbx), %rax              <
        cmpq    %rax, %rcx
        jle     L1152
        cmpq    $-1, %rax
        je      L1152

after:

    L777:
        cmpq    $63, %rdx
        je      L816
    L779:
        addq    $1, %rdx
        movq    %rbp, 16(%rsi,%rbx,8)       <
        addq    $1, %rbx
        leaq    (%rdx,%r9), %rcx
        subq    %r8, %rcx
        cmpq    %r12, %rbx
        jl      L777

        # outside the inner-loop
        movq    %rdx, 48(%r13)
        movq    %rcx, 0(%rbp)
        cmpq    %r12, %rbx
        jl      L780
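The difference Raymond describes (per-item bookkeeping in a general append versus bookkeeping hoisted out of a specialized loop) can be sketched on a toy structure. This is only an illustration under stated assumptions, not the CPython code: `toy_deque`, `toy_append`, `toy_append_repeated` and `toy_demo` are hypothetical names, items are plain `int`s with no reference counting, and `BLOCKLEN` is set to 4 here purely to exercise the block-linking path with small inputs.

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCKLEN 4  /* assumption: tiny block size, chosen only for the demo */

typedef struct block {
    struct block *leftlink;
    struct block *rightlink;
    int data[BLOCKLEN];
} block;

typedef struct {
    block *rightblock;  /* block currently being filled */
    int rightindex;     /* last used slot in rightblock, -1 if none */
    long size;          /* number of stored items */
    long state;         /* mutation counter */
} toy_deque;

static int toy_init(toy_deque *d)
{
    d->rightblock = calloc(1, sizeof(block));
    d->rightindex = -1;
    d->size = 0;
    d->state = 0;
    return d->rightblock == NULL ? -1 : 0;
}

static void toy_free(toy_deque *d)
{
    block *b = d->rightblock;
    while (b != NULL) {          /* walk leftwards, freeing every block */
        block *prev = b->leftlink;
        free(b);
        b = prev;
    }
    d->rightblock = NULL;
}

/* Link a fresh block on the right when the current one is full. */
static int toy_grow_if_full(toy_deque *d)
{
    if (d->rightindex == BLOCKLEN - 1) {
        block *b = calloc(1, sizeof(block));
        if (b == NULL)
            return -1;
        b->leftlink = d->rightblock;
        d->rightblock->rightlink = b;
        d->rightblock = b;
        d->rightindex = -1;
    }
    return 0;
}

/* General-purpose append: updates state and size for every item. */
static int toy_append(toy_deque *d, int item)
{
    if (toy_grow_if_full(d) < 0)
        return -1;
    d->state++;
    d->size++;
    d->rightindex++;
    d->rightblock->data[d->rightindex] = item;
    return 0;
}

/* Specialized repeat: state bumped once, size added once after the loop. */
static int toy_append_repeated(toy_deque *d, int item, long n)
{
    long i;
    d->state++;
    for (i = 0; i < n; i++) {
        if (toy_grow_if_full(d) < 0) {
            d->size += i;        /* account for items already written */
            return -1;
        }
        d->rightindex++;
        d->rightblock->data[d->rightindex] = item;
    }
    d->size += i;
    return 0;
}

/* Returns 1 if both strategies store the same number of items in the same
   position, with n state updates in one case and a single one in the other. */
int toy_demo(long n)
{
    toy_deque a, b;
    int ok;
    long i;

    assert(n >= 0);
    ok = (toy_init(&a) == 0);
    ok = (toy_init(&b) == 0) && ok;
    for (i = 0; ok && i < n; i++)
        ok = (toy_append(&a, 7) == 0);
    ok = ok && (toy_append_repeated(&b, 7, n) == 0);
    ok = ok && a.size == b.size && a.rightindex == b.rightindex
            && a.state == n && b.state == 1;
    toy_free(&a);
    toy_free(&b);
    return ok;
}
```

Calling `toy_demo(10)` fills both toy deques with ten items; the two end up with the same size and right index, while the per-item version touches `state` and `size` ten times each and the specialized loop touches each once.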
Re: [Python-Dev] [Python-checkins] cpython: In-line the append operations inside deque_inplace_repeat().
On Mon, 14 Sep 2015 at 15:37 Raymond Hettinger wrote:

> > On Sep 14, 2015, at 12:49 PM, Brett Cannon wrote:
> >
> > Would it be worth adding a comment that the block of code is an inlined copy of deque_append()?
> > Or maybe even turn the append() function into a macro so you minimize code duplication?
>
> I don't think either would be helpful. The point of the inlining was to
> let the code evolve independently from deque_append().

OK, commit message just didn't point that out as the reason for the inlining (I guess in the future call it a fork of the code to know it is meant to evolve independently?).

-Brett

> Once separated from the mother ship, the code in deque_inplace_repeat()
> could now shed the unnecessary work. The state variable is updated once.
> The updates within a single block are now in their own inner loop. The deque
> size is updated outside of that loop, etc. In other words, they are no
> longer the same code.
>
> The original append-in-a-loop version was already being in-lined by the
> compiler but was doing way too much work. For each item written in the
> original, there were 7 memory reads, 5 writes, 6 predictable
> compare-and-branches, and 5 add/sub operations. In the current form, there
> are 0 reads, 1 write, 2 predictable compare-and-branches, and 3 add/sub
> operations.
>
> FWIW, my work flow is that periodically I expand the code with new
> features (the upcoming work is to add slicing support
> http://bugs.python.org/issue17394), then once it is correct and tested, I
> make a series of optimization passes (such as the work I just described
> above). After that, I come along and factor-out common code, usually with
> clean, in-lineable functions rather than macros (such as the recent
> check-in replacing redundant code in deque_repeat with a call to the common
> code in deque_inplace_repeat).
>
> My schedule lately hasn't given me any big blocks of time to work with, so
> I do the steps piecemeal as I get snippets of development time.
>
>
> Raymond
>
>
> P.S. For those who are interested, here is the before and after:
>
> before:
>
>     L1152:
>         movq    __Py_NoneStruct@GOTPCREL(%rip), %rdi
>         cmpq    $0, (%rdi)                  <
>         je      L1257
>     L1159:
>         addq    $1, %r13
>         cmpq    %r14, %r13
>         je      L1141
>         movq    16(%rbx), %rsi              <
>     L1142:
>         movq    48(%rbx), %rdx              <
>         addq    $1, 56(%rbx)                <>
>         cmpq    $63, %rdx
>         je      L1143
>         movq    32(%rbx), %rax              <
>         addq    $1, %rdx
>     L1144:
>         addq    $1, 0(%rbp)                 <>
>         leaq    1(%rsi), %rcx
>         movq    %rdx, 48(%rbx)              >
>         movq    %rcx, 16(%rbx)              >
>         movq    %rbp, 8(%rax,%rdx,8)        >
>         movq    64(%rbx), %rax              <
>         cmpq    %rax, %rcx
>         jle     L1152
>         cmpq    $-1, %rax
>         je      L1152
>
> after:
>
>     L777:
>         cmpq    $63, %rdx
>         je      L816
>     L779:
>         addq    $1, %rdx
>         movq    %rbp, 16(%rsi,%rbx,8)       <
>         addq    $1, %rbx
>         leaq    (%rdx,%r9), %rcx
>         subq    %r8, %rcx
>         cmpq    %r12, %rbx
>         jl      L777
>
>         # outside the inner-loop
>         movq    %rdx, 48(%r13)
>         movq    %rcx, 0(%rbp)
>         cmpq    %r12, %rbx
>         jl      L780
