https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78090
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Even with this patch reverted, I can't reproduce the #c0 difference, either in 7.3.1 or on the trunk. And even if we emit the direct inter-unit conversions in cold sections, the code is significantly smaller (4 vs. 10 bytes when going through the stack), so it is what we are looking for.
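
To illustrate the size argument, here is a minimal sketch. It is not the testcase from #c0, and the function name is made up. An int to double conversion on x86-64 can either convert straight from the general-purpose register or spill the integer to the stack and convert from memory. The byte counts in the comment are a tally of the usual encodings, assuming an %rsp-relative slot with an 8-bit displacement; the actual output depends on the GCC version and tuning, and the compiler may emit additional instructions around the conversion.

```c
/* Hypothetical example, not taken from the PR.  */
double
int_to_double (int x)
{
  /* Direct inter-unit form:  cvtsi2sdl %edi, %xmm0       (4 bytes)
     Through the stack:       movl      %edi, -4(%rsp)    (4 bytes)
                              cvtsi2sdl -4(%rsp), %xmm0   (6 bytes)
     Sizes assume the common encodings; check with objdump -d.  */
  return x;
}
```

Compiling this with -O2 -S and comparing the two forms should show whether the direct or the through-memory sequence was chosen for a given tuning.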