Jeethu Rao added the comment:
It's also interesting that in
https://gist.github.com/pitrou/29eb7592fa1eae2be390f3bfa3db0a3a :
| django_template | 307 ms | 312 ms | 1.02x slower | Not significant |
It seems to be slower, and the benchmarks before it
New submission from Jeethu Rao :
In one of the patches I'm building (yet another attempt at caching
LOAD_GLOBALS)[1], I'm using the private APIs from PEP 523 to store an array
with every code object. I'm calling _PyEval_RequestCodeExtraIndex with
PyMem_Free for the freefunc
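The mechanism described here is C-level: `_PyEval_RequestCodeExtraIndex()` reserves a co_extra slot (per PEP 523) and `PyMem_Free` is registered as the freefunc for the stored array. As a rough pure-Python illustration only of the underlying idea, one cache slot per LOAD_GLOBAL name in a code object, here is a sketch (the names and structure below are my own, not the patch's):

```python
# Pure-Python analog (my illustration, NOT the actual C mechanism from the
# patch): enumerate the names a code object loads via LOAD_GLOBAL and build
# one cache entry per name, resolved against the function's globals.
import dis

def global_name_cache(func):
    code = func.__code__
    # one cache slot per LOAD_GLOBAL site in the code object
    names = [ins.argval for ins in dis.get_instructions(code)
             if ins.opname == "LOAD_GLOBAL"]
    # builtins like len/range are not in __globals__, so those slots start
    # out as None here; the real patch resolves them at the C level
    return {name: func.__globals__.get(name) for name in names}

def f():
    return len(range(3))

cache = global_name_cache(f)
print(sorted(cache))  # → ['len', 'range']
```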
Jeethu Rao added the comment:
> What is 54640?
That's the pid of the process.
> I'm interested to know which benchmarks call list.insert() 40k times.
The django_template benchmark.
Jeethu Rao added the comment:
> > I still think those numbers are misleading or downright bogus. There is no
> > existing proof that list.insert() is a critical path in those benchmarks.
> Can someone check if these benchmarks really use list.insert() in hot code? If
> yes,
Jeethu Rao added the comment:
> FWIW, we've encountered a number of situations in the past when something
> that improved the timings on one compiler would make timings worse on another
> compiler. There was also variance between timings on 32-bit builds versus
> 64-
Change by Jeethu Rao :
--
nosy: +jeethu
___
Python tracker
<https://bugs.python.org/issue30604>
Change by Jeethu Rao :
--
nosy: +jeethu
___
Python tracker
<https://bugs.python.org/issue28521>
Jeethu Rao added the comment:
> Be careful. Moving "l.insert" lookup of the loop might make the code slower.
> I never looked why. But Python 3.7 was also optimized in many places to call
> methods, so I'm not sure anymore :)
Thanks again! Here's a gist wit
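Victor's caveat above is easy to check directly. The following is a minimal sketch of my own (not the contents of the gist) comparing the two forms; correctness is asserted, and the timing winner genuinely can flip between 3.x versions because of the method-call optimizations he mentions:

```python
# Sketch: does hoisting the l.insert lookup out of the loop help?
# The answer varies by CPython version, so measure rather than assume.
import timeit

def insert_with_lookup(n):
    l = []
    for _ in range(n):
        l.insert(0, None)   # attribute lookup on every iteration
    return l

def insert_hoisted(n):
    l = []
    ins = l.insert          # bound method looked up once
    for _ in range(n):
        ins(0, None)
    return l

# both forms must produce the same result
assert insert_with_lookup(100) == insert_hoisted(100)

for fn in (insert_with_lookup, insert_hoisted):
    t = timeit.timeit(lambda: fn(100), number=1000)
    print(f"{fn.__name__}: {t:.4f}s")
```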
Jeethu Rao added the comment:
Victor: I’m booting with the isolcpus and rcu_nocbs flags, and running
pyperformance with the --affinity flag to pin the benchmark to the isolated CPU
cores. I’ve also run `python -m perf system tune`, and the OS is Ubuntu 17.10. Thanks for
the tip on using perf timeit
Jeethu Rao added the comment:
Built and benchmarked both the baseline and the patch without PGO; the
differences are less pronounced, but still present.
https://gist.github.com/jeethu/abd404e39c6dfcbabb4c01661b9238d1
Jeethu Rao added the comment:
I rebased my branch off of master and rebuilt it, and also rebuilt the baseline
from master. Both versions were configured with --with-lto and
--enable-optimizations. The benchmark numbers are rather different this
time[1]. pidigits is slower, but nbody is still
Jeethu Rao added the comment:
I managed to tune an i7-7700K desktop running Ubuntu 17.10 per this doc[1], and
ran the pyperformance benchmarks[2].
I also tried various thresholds with this benchmark and 16 still seems to be the
sweet spot.
The geometric mean of the relative changes across all
Jeethu Rao added the comment:
I tried it with a couple of different thresholds, twice each, ignoring the
results of the first run. 16 seems to be the sweet spot.
THRESHOLD = 0
jeethu@dev:cpython (3.7_list_insert_memmove)$ ./python -m timeit -s "l = []"
"for _ in range(100): l
Change by Jeethu Rao :
--
keywords: +patch
pull_requests: +5017
stage: -> patch review
___
Python tracker
<https://bugs.python.org/issue32534>
New submission from Jeethu Rao :
I've noticed that replacing the for loop in the ins1 function in listobject.c
with a memmove when the number of pointers to move is greater than 16 seems to
speed up list.insert by about 3 to 4x on a contrived benchmark.
# Before
jeethu@dev:cpython (m
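The effect being described, replacing ins1's element-by-element pointer shift with one overlapping block copy, can be modelled in pure Python with `ctypes.memmove`. This is a sketch of the idea only (the actual change is C code in listobject.c; the helper below and its names are mine):

```python
# Model of the patched ins1: shift the slots arr[idx:n] up by one with a
# single memmove (which handles the overlapping ranges), then write the
# new value into the freed slot at idx.
import ctypes

def insert_via_memmove(arr, n, idx, value):
    """Insert value at idx into a ctypes array holding n items
    (capacity must be at least n + 1)."""
    itemsize = ctypes.sizeof(arr._type_)
    base = ctypes.addressof(arr)
    ctypes.memmove(base + (idx + 1) * itemsize,   # destination: idx+1
                   base + idx * itemsize,         # source: idx
                   (n - idx) * itemsize)          # bytes for n-idx items
    arr[idx] = value

arr = (ctypes.c_long * 8)(10, 20, 30, 40, 0, 0, 0, 0)
insert_via_memmove(arr, 4, 1, 99)
print(list(arr[:5]))  # → [10, 99, 20, 30, 40]
```

In the C patch the same single memmove replaces a loop over `Py_ssize_t` indices moving one `PyObject *` per iteration, which is why the win only appears once the number of pointers to move crosses a threshold.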