INADA Naoki added the comment:

The current code and my patch call insertdict_clean() or insert_index() for
each entry.
Serhiy's patch, on the other hand, calls build_indices() once.
That may be faster when the compiler doesn't inline the helper function.
As a bonus, we can use memcpy to copy the entries.
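
To make the difference concrete, here is a minimal sketch of the two resize
strategies. It is not the code from either patch: entry_t, EMPTY, the linear
probing, and both resize_* functions are simplified assumptions, and
insert_index() / build_indices() here only borrow the names of the real
helpers. It also assumes the old entries are dense (no deleted slots) and that
the table size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EMPTY (-1)              /* hypothetical marker for a free index slot */

typedef struct {                /* simplified stand-in for a dict entry */
    size_t hash;
    int key;
    int value;
} entry_t;

/* Record entry number `pos` in the first free slot of its probe sequence.
 * Linear probing is used for brevity; the real dict perturbs the probe. */
static void insert_index(int32_t *indices, size_t size, size_t hash, int32_t pos)
{
    size_t mask = size - 1;
    size_t i = hash & mask;
    while (indices[i] != EMPTY)
        i = (i + 1) & mask;
    indices[i] = pos;
}

/* Strategy A (one pass): copy and index each entry in the same loop,
 * paying one helper call per entry. */
static void resize_per_entry(const entry_t *old, size_t n,
                             entry_t *entries, int32_t *indices, size_t size)
{
    assert(size > n && (size & (size - 1)) == 0);
    memset(indices, 0xff, size * sizeof(int32_t));   /* every slot = EMPTY */
    for (size_t i = 0; i < n; i++) {
        entries[i] = old[i];
        insert_index(indices, size, old[i].hash, (int32_t)i);
    }
}

/* Index all entries in one call; the probe loop is written inline. */
static void build_indices(const entry_t *entries, size_t n,
                          int32_t *indices, size_t size)
{
    size_t mask = size - 1;
    for (size_t pos = 0; pos < n; pos++) {
        size_t i = entries[pos].hash & mask;
        while (indices[i] != EMPTY)
            i = (i + 1) & mask;
        indices[i] = (int32_t)pos;
    }
}

/* Strategy B (two passes): bulk-copy the entries with memcpy, then walk
 * them a second time to build the index table. */
static void resize_two_pass(const entry_t *old, size_t n,
                            entry_t *entries, int32_t *indices, size_t size)
{
    assert(size > n && (size & (size - 1)) == 0);
    memset(indices, 0xff, size * sizeof(int32_t));
    memcpy(entries, old, n * sizeof(entry_t));       /* pass 1: copy */
    build_indices(entries, n, indices, size);        /* pass 2: index */
}
```

Both functions produce the same table. Strategy B touches the entries twice,
which is where the cache concern comes from.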

The downside of Serhiy's patch is that it makes two passes. If the entries are
larger than the L2 cache, the two passes may fetch more from the L3 cache.
So I can't say that Serhiy's patch is faster until we benchmark it.

(My guess is that there is no performance difference between my patch and
Serhiy's on an amd64 machine, and that Serhiy's patch is faster on weaker CPUs.)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28199>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
