https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82965
--- Comment #8 from amker at gcc dot gnu.org ---
I think the semantics are inconsistent between the call in vect_do_peeling:

    scale_loop_profile (prolog, prob_prolog, bound_prolog);

and the implementation of scale_loop_profile. When the loop is predicted to execute too many times, scale_loop_profile works in a way that messes up the count information. The issue is made worse by the code below:

    count_delta -= e->count ();

together with the subtraction behavior defined in profile_count:

    profile_count &operator-= (const profile_count &other)
      {
        if (*this == profile_count::zero () || other == profile_count::zero ())
          return *this;
        if (!initialized_p () || !other.initialized_p ())
          return *this = profile_count::uninitialized ();
        else
          {
            gcc_checking_assert (compatible_p (other));
            m_val = m_val >= other.m_val ? m_val - other.m_val : 0;
            m_quality = MIN (m_quality, other.m_quality);
          }
        return *this;
      }

in which the result is 0 if count_delta < e->count (), which is always the case because the new iteration bound is smaller than the originally guessed one.

Given that scale_loop_profile is only used by vect_do_peeling, I will see how we should rewrite it in line with the caller's semantics.

Thanks.
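
To illustrate the clamping described in this comment, here is a minimal sketch of the saturating subtraction in isolation. It is not GCC's profile_count: the names toy_count and subtract_exit are hypothetical, and the zero and uninitialized special cases are deliberately left out, so it only models the final else-branch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for GCC's profile_count; NOT the real class.
// Only the saturating else-branch of operator-= is modelled here.
struct toy_count
{
  uint64_t val;  // execution count
  int quality;   // assumed: a lower number means a less reliable count

  // Subtract with clamping at zero and keep the weaker quality.
  toy_count &operator-= (const toy_count &other)
  {
    val = val >= other.val ? val - other.val : 0;
    quality = std::min (quality, other.quality);
    return *this;
  }
};

// Hypothetical helper: subtract one exit-edge count from a running delta,
// in the spirit of "count_delta -= e->count ()".
inline toy_count
subtract_exit (toy_count delta, toy_count exit_count)
{
  delta -= exit_count;
  return delta;
}
```

With a delta smaller than the edge count, the result clamps to 0 instead of going negative, so the information about how far off the delta was is lost. That is the effect the comment attributes to the smaller new iteration bound.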