https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115749
--- Comment #4 from kim.walisch at gmail dot com ---
One possible explanation for why GCC's current assembly sequence for integer division by a constant was chosen back in the day (I guess one or two decades ago) is that it uses only 1 mul instruction, whereas Clang's uses 2 mul instructions. Historically, multiplication instructions were slower than add, sub and shift instructions on nearly all CPU architectures, so it made sense to avoid mul instructions whenever possible. However, in the past decade this performance gap has narrowed, and it is now more important to avoid long instruction dependency chains, which GCC's current assembly sequence for integer modulo by a constant suffers from.
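
To make the trade-off concrete, here is a rough C-level sketch of the two strategies for an unsigned 32-bit modulo. It is only an illustration under assumptions, not the actual output of either compiler: the divisor 10, the helper quot10 and the function names are hypothetical, and an optimizing compiler is free to emit the same instructions for both functions. Both variants get the quotient from one widening multiply by the magic constant 0xCCCCCCCD, which is ceil(2^35 / 10). They differ in how q * 10 is rebuilt before the final subtraction.

```c
#include <stdint.h>

/* Quotient x / 10 via multiply-and-shift.
 * 0xCCCCCCCD == ceil(2^35 / 10); the 64-bit product cannot overflow
 * because both factors are below 2^32. */
static uint32_t quot10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Variant with a single multiply: q * 10 is rebuilt from shifts and an
 * add (8q + 2q), so the remainder sits at the end of a longer chain of
 * dependent instructions. */
uint32_t mod10_one_mul(uint32_t x)
{
    uint32_t q = quot10(x);
    uint32_t q10 = (q << 3) + (q << 1);
    return x - q10;
}

/* Variant with two multiplies: q * 10 is a second multiply, which
 * shortens the dependency chain at the cost of one more mul. */
uint32_t mod10_two_mul(uint32_t x)
{
    uint32_t q = quot10(x);
    return x - q * 10u;
}
```

Both functions should return the same value as x % 10 for every uint32_t input. Which one is faster depends on the relative latency of mul versus the shift/add chain on the target CPU, which is the point made above.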