https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116458
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
It's likely the tail padding that we end up inspecting. Now that the loop is unrolled twice to reduce the number of badly predictable branches, we can end up inspecting a completely uninitialized qword. Valgrind possibly fails to realize that the uninitialized data doesn't actually influence the jump, so IMO it's a valgrind issue.
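
To illustrate the general idea (not the code from this report), here is a minimal sketch in C of a comparison that loads a whole qword, tail padding included, yet produces a result that does not depend on the padding bits. The 5-byte payload / 3-byte padding split and the names `payload_equal`, `make_object` and `padding_is_ignored` are assumptions made up for this example, and the "uninitialized" padding is simulated by filling it with an arbitrary byte so the program stays well defined.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 5 payload bytes followed by 3 tail-padding bytes,
   so one object occupies a single 8-byte qword. */
enum { PAYLOAD_BYTES = 5, QWORD_BYTES = 8 };

/* Load a whole qword (padding included) and mask the padding away
   before comparing, so the padding bits never decide the branch. */
static int payload_equal(const unsigned char *a, const unsigned char *b)
{
    unsigned char mask_bytes[QWORD_BYTES] = {0};
    uint64_t x, y, mask;

    /* Build the mask from bytes so it is independent of endianness. */
    memset(mask_bytes, 0xff, PAYLOAD_BYTES);
    memcpy(&mask, mask_bytes, sizeof mask);
    memcpy(&x, a, sizeof x);
    memcpy(&y, b, sizeof y);
    return ((x ^ y) & mask) == 0;
}

/* Stand-in for uninitialized padding: fill the qword with an arbitrary
   junk byte, then write the 5 payload bytes over the front. */
static void make_object(unsigned char *dst, const char *payload,
                        unsigned char junk)
{
    memset(dst, junk, QWORD_BYTES);
    memcpy(dst, payload, PAYLOAD_BYTES);
}

/* Returns 1 when the comparison gives the same answer whatever the
   padding holds, and still distinguishes different payloads. */
static int padding_is_ignored(void)
{
    unsigned char a[QWORD_BYTES], b[QWORD_BYTES], c[QWORD_BYTES];

    make_object(a, "hello", 0x00);
    make_object(b, "hello", 0xa5);
    make_object(c, "world", 0x00);
    return payload_equal(a, b) && !payload_equal(a, c);
}
```

A checker that follows definedness through the mask would see only defined bits reaching the branch. One that gives up earlier may report a conditional jump depending on uninitialized data even though the outcome is the same for every padding value.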