Raymond Hettinger added the comment: > May be worth to move this test outside of the loop as in the > set_lookup function.
The loop used to be this way but testing showed no benefit, so a while back I recombined it back to a j=0 start point and the code looked much nicer. I don't really want to go back if I don't have to. There may or may not be a microscopic benefit to moving the leaq 9(%rcx), %rsi / cmpq %rsi, %rax / jae L237 sequence before the table[i+0]->key == NULL check. I don't still have access to VTune to verify that this highly predictable branch is essentially free. I do have a preference at this point for the simpler code -- that will make the future optimization work easier (perhaps inlining the lookkey code into set_insert_key, set_discard, and set_contains). Also, looking at the disassembly of your patch, there is an additional cost for setting up the table[i+1] entry that wasn't there before (two new extra instructions: leaq 1(%rcx), %rsi; salq $4, %rsi; occur before the normal lookup and test sequence: leaq 0(%rbp,%rsi), %rdx; cmpq $0, (%rdx); je L237) That said, if you think it is important to move the table[i+0] test outside the (i + LINEAR_PROBES <= mask) test, just say so and I'll apply your version of the patch instead of mine. Otherwise, I prefer the cleaner code (both C and generated assembly) of my patch. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue23269> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com