On Tue, Mar 31, 2020 at 2:31 AM Andres Freund <and...@anarazel.de> wrote:
> I think the form of lea generated here is among the ones that can only
> be executed on port 1. Whereas e.g. an register+register/immediate add
> can be executed on four different ports.

I looked into slow vs. fast leas, and I think the above are actually
fast because they have 2 operands (base and index, no displacement):

leal (%rdi,%rdi,2), %eax

A 3-op lea would look like this:

leal 42(%rdi,%rdi,8), %ecx

In other words, the scale doesn't count as an operand, although I've
seen a couple of places say that a non-1 scale adds a cycle of latency
on some AMD chips.

There is some interesting discussion in these LLVM reviews from 2017
about avoiding slow leas:

https://reviews.llvm.org/D32277
https://reviews.llvm.org/D32352

-- 
John Naylor                   https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
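
To make the 2-op vs. 3-op distinction concrete, here is a minimal C sketch of source expressions that map onto the two addressing forms shown above. The function names are made up for illustration, and I'm only assuming that a compiler such as gcc or clang at -O2 on x86-64 would pick a lea for these; the actual instruction selection depends on the compiler version and tuning flags, so check the assembly output rather than taking this as given.

```c
#include <stdint.h>

/*
 * Hypothetical helper: x + x*2 has the shape base + index*scale with no
 * displacement, i.e. the 2-operand form "leal (%rdi,%rdi,2), %eax".
 * Whether a lea is actually emitted is up to the compiler.
 */
uint32_t
times3(uint32_t x)
{
    return x + x * 2;
}

/*
 * Hypothetical helper: 42 + x + x*8 has the shape
 * displacement + base + index*scale, i.e. the 3-operand form
 * "leal 42(%rdi,%rdi,8), %ecx". A compiler that avoids slow leas may
 * instead split this into a 2-operand lea followed by an add.
 */
uint32_t
times9_plus_42(uint32_t x)
{
    return 42 + x + x * 8;
}
```

Compiling this with -O2 -S and comparing the output across compilers or -mtune settings should show whether the 3-operand form is kept or split.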