https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #19 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
I think the problem with the anddi, iordi and xordi instructions is that they obscure the data flow between the low and high half words. When they are not enabled, the low and high parts are expanded independently, but with the DImode instructions it is not clear which of the half words propagate from input to output.

With my new patch, we have 2328 bytes of stack for hard floating point, and only 272 bytes for arm-none-eabi, which is a target I care about. This is still not perfect, but certainly a big improvement.

Wilco, where have you seen the additional registers used with my previous patch? Maybe we can try to fix that somehow.
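
To illustrate the half-word point, here is a minimal C sketch, not taken from the patch. It shows that a 64-bit and/ior/xor is just two independent 32-bit operations, with the low half of the result depending only on the low halves of the inputs, and likewise for the high half. All names here (di_halves, di_split, di_join, and_split, ior_split, xor_split, check_split_ops) are hypothetical and exist only for this example; this models the arithmetic, not what the compiler actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of these names exist in GCC. */

/* A 64-bit (DImode-sized) value viewed as two 32-bit (SImode-sized) halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} di_halves;

static di_halves di_split(uint64_t x)
{
    di_halves h = { (uint32_t)x, (uint32_t)(x >> 32) };
    return h;
}

static uint64_t di_join(di_halves h)
{
    return ((uint64_t)h.hi << 32) | h.lo;
}

/* Each operation works on the low and high halves separately;
   no bit ever crosses from one half to the other. */
static uint64_t and_split(uint64_t a, uint64_t b)
{
    di_halves x = di_split(a), y = di_split(b), r;
    r.lo = x.lo & y.lo;
    r.hi = x.hi & y.hi;
    return di_join(r);
}

static uint64_t ior_split(uint64_t a, uint64_t b)
{
    di_halves x = di_split(a), y = di_split(b), r;
    r.lo = x.lo | y.lo;
    r.hi = x.hi | y.hi;
    return di_join(r);
}

static uint64_t xor_split(uint64_t a, uint64_t b)
{
    di_halves x = di_split(a), y = di_split(b), r;
    r.lo = x.lo ^ y.lo;
    r.hi = x.hi ^ y.hi;
    return di_join(r);
}

/* Self-check: the split forms agree with the native 64-bit operators,
   and changing only the high half of an input leaves the low half
   of the result untouched. */
static void check_split_ops(void)
{
    const uint64_t a = 0x0123456789abcdefULL;
    const uint64_t b = 0xfedcba9876543210ULL;
    const uint64_t a_hi_flipped = a ^ 0xffffffff00000000ULL;

    assert(and_split(a, b) == (a & b));
    assert(ior_split(a, b) == (a | b));
    assert(xor_split(a, b) == (a ^ b));
    assert(di_split(xor_split(a, b)).lo ==
           di_split(xor_split(a_hi_flipped, b)).lo);
}
```

Calling check_split_ops() from a test program exercises all three operations. When the halves are kept as separate 32-bit values like this, it is visible which input half feeds which output half, which is the information a single 64-bit operation hides.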