On Sun, Jan 31, 2016 at 8:43 PM, Jim Wilson <jim.wil...@linaro.org> wrote:
>> Are we certain that the libcall is a win for any target?
>> I would have expected a default of
>>   q = x / y
>>   r = x - (q * y)
>> to be most efficient on modern machines.  Even more so on targets like ARM
>> that have multiply-and-subtract instructions.

If there is a div insn, then yes, gcc will emit a div and a multiply.
However, a div insn is a relatively recent addition to the 32-bit ARM
architecture.  Without the div insn, we get a div libcall and a mod
libcall.  That means two libcalls, both of which are likely implemented
by calling the divmod libcall and returning the desired part of the
result.  One call to a divmod libcall is clearly more efficient than two
calls to a divmod libcall.  So that makes the transformation useful.

Prathamesh's patch has a number of conditions required to trigger the
optimization, such as a divmod insn, or a lack of a div insn and the
presence of a divmod libcall.

Jim
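
To illustrate the call-count argument above, here is a minimal C sketch.
The helper names (divmod_helper, div_helper, mod_helper) and the call
counter are hypothetical stand-ins invented for this example; they are
not actual libgcc or ARM runtime entry points, and the real libcalls may
be structured differently.  The sketch only assumes, as the text does,
that the separate div and mod routines each go through a combined divmod
routine.

```c
#include <assert.h>

/* Hypothetical result type for a combined divide-and-modulo routine. */
typedef struct {
    int quot;
    int rem;
} divmod_result;

/* Counts how many times the combined routine is entered. */
static int divmod_calls;

/* Hypothetical stand-in for a target's divmod libcall. */
static divmod_result divmod_helper(int x, int y)
{
    divmod_result r;
    assert(y != 0);
    divmod_calls++;
    r.quot = x / y;
    r.rem = x - r.quot * y;   /* r = x - (q * y) */
    return r;
}

/* Assumed shape of separate div and mod libcalls: each one calls the
   combined routine and returns only the part it needs. */
static int div_helper(int x, int y) { return divmod_helper(x, y).quot; }
static int mod_helper(int x, int y) { return divmod_helper(x, y).rem; }

/* Without the transformation: one div call plus one mod call.
   Returns the number of times the combined routine ran. */
static int separate_div_and_mod(int x, int y, int *q, int *r)
{
    divmod_calls = 0;
    *q = div_helper(x, y);
    *r = mod_helper(x, y);
    return divmod_calls;
}

/* With the transformation: a single combined call yields both results.
   Returns the number of times the combined routine ran. */
static int combined_divmod(int x, int y, int *q, int *r)
{
    divmod_result res;
    divmod_calls = 0;
    res = divmod_helper(x, y);
    *q = res.quot;
    *r = res.rem;
    return divmod_calls;
}
```

Under these assumptions, separate_div_and_mod enters the combined
routine twice for one x/y and x%y pair, while combined_divmod enters it
once and produces the same quotient and remainder.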