STINNER Victor <vstin...@python.org> added the comment:
I am closing this issue. It is most likely just a hiccup in the PGO compilation, which is not something we can easily control. The good news is that the common code path, iter(list), is efficient ;-)

> The code for listiter_next() and listreviter_next() is almost the same.

Right. That cannot explain a 2x slowdown.

> python -m timeit -s "a = list(range(1000))" "list(iter(a))"
> 50000 loops, best of 5: 5.73 usec per loop

Since the list has 1000 items, that is around 5.73 ns per item. This is almost "nothing": just a few CPU cycles. For such a microbenchmark, you are very close to the bare metal, and you have to take into account low-level CPU metrics such as CPU cache usage.

> Another possible cause is that this is just a random build outcome due to PGO
> or incidental branch mis-prediction from aliasing (as described in
> https://stackoverflow.com/a/17906589/1001643 ).

If someone cares about such a microbenchmark, I suggest getting access to a profiling tool and measuring CPU cache usage and similar metrics. On Linux, I know the "perf" command can be used. I don't know the performance tooling on Windows; maybe look at the Intel developer tools.

I expect that list(iter(a)) makes better use of the CPU (cache? branch predictor?) than list(reversed(a)) because of how listiter_next() and listreviter_next() have been optimized. Bad code placement has a high performance cost on such microbenchmarks. See:

* https://llvmdevelopersmeetingbay2016.sched.org/event/8YzY/causes-of-performance-instability-due-to-code-placement-in-x86
* https://vstinner.github.io/analysis-python-performance-issue.html

----------
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39521>
_______________________________________
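
For anyone who wants to repeat the comparison discussed in the comment, here is a rough sketch that times list(iter(a)) and list(reversed(a)) on a 1000-item list and converts the "usec per loop" figure into nanoseconds per item (5.73 usec / 1000 items is about 5.73 ns per item). The helper names per_item_ns and best_usec are made up for this illustration, and the loop counts are arbitrary; absolute numbers will vary with the build, the CPU, and system load.

```python
import timeit


def per_item_ns(usec_per_loop, n_items):
    """Convert a timeit 'usec per loop' figure into nanoseconds per list item."""
    return usec_per_loop * 1000.0 / n_items


def best_usec(stmt, setup, number=2000, repeat=5):
    """Return the best-of-`repeat` time for one execution of stmt, in microseconds.

    number and repeat are arbitrary choices for this sketch.
    """
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    setup = "a = list(range(1000))"
    for stmt in ("list(iter(a))", "list(reversed(a))"):
        usec = best_usec(stmt, setup)
        print(f"{stmt}: {usec:.2f} usec per loop, "
              f"{per_item_ns(usec, 1000):.2f} ns per item")
```

At this scale, a difference of a few nanoseconds per item is within reach of cache and branch-prediction effects, so a script like this can only show that a gap exists on a given build, not why.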