So this is another nasty latent bug exposed by ext-dce.
Similar to the prior m68k failure it's another problem with how we
handle paradoxical subregs on big endian targets.
In this instance when we remove the hard subregs we take something like:
(subreg:DI (reg:SI 0) 0)
And turn it into
(reg:SI -1)
Which is clearly wrong. (reg:SI 0) is correct.
The transformation happens in alter_subreg, but I really wanted to fix
this in subreg_regno since we could have similar problems in some of the
other callers of subreg_regno.
Unfortunately reload depends on the current behavior of subreg_regno; in
the cases where the return value is an invalid register, the wrong half
of a register pair, etc the resulting bogus value is detected by reload
and triggers reloading of the inner object. So that's the new comment
in subreg_regno.
The second best place to fix is alter_subreg which is what this patch
does. If presented with a paradoxical subreg, then the base register
number should always be REGNO (SUBREG_REG (object)). It's just how
paradoxicals are designed to work.
I haven't tried to fix the other places that call subreg_regno. After
being burned by reload, I'm more than a bit worried about unintended
fallout.
I must admit I'm surprised we haven't stumbled over this before and that
it didn't fix any failures on the big endian embedded targets.
Boostrapped & regression tested on x86_64, also went through all the
embedded targets in my tester and bootstrapped on m68k & s390x to get
some additional big endian testing.
Pushing to the trunk.
jeff
commit e9738e77674e23f600315ca1efed7d1c7944d0cc
Author: Jeff Law <j...@ventanamicro.com>
Date: Mon Aug 12 07:29:25 2024 -0600
[rtl-optimization/116244] Don't create bogus regs in alter_subreg
So this is another nasty latent bug exposed by ext-dce.
Similar to the prior m68k failure it's another problem with how we handle
paradoxical subregs on big endian targets.
In this instance when we remove the hard subregs we take something like:
(subreg:DI (reg:SI 0) 0)
And turn it into
(reg:SI -1)
Which is clearly wrong. (reg:SI 0) is correct.
The transformation happens in alter_subreg, but I really wanted to fix this
in
subreg_regno since we could have similar problems in some of the other
callers
of subreg_regno.
Unfortunately reload depends on the current behavior of subreg_regno; in the
cases where the return value is an invalid register, the wrong half of a
register pair, etc the resulting bogus value is detected by reload and
triggers
reloading of the inner object. So that's the new comment in subreg_regno.
The second best place to fix is alter_subreg which is what this patch does.
If
presented with a paradoxical subreg, then the base register number should
always be REGNO (SUBREG_REG (object)). It's just how paradoxicals are
designed
to work.
I haven't tried to fix the other places that call subreg_regno. After being
burned by reload, I'm more than a bit worried about unintended fallout.
I must admit I'm surprised we haven't stumbled over this before and that it
didn't fix any failures on the big endian embedded targets.
Boostrapped & regression tested on x86_64, also went through all the
embedded
targets in my tester and bootstrapped on m68k & s390x to get some additional
big endian testing.
Pushing to the trunk.
rtl-optimization/116244
gcc/
* rtlanal.cc (subreg_regno): Update comment.
* final.cc (alter_subrg): Always use REGNO (SUBREG_REG ()) to get
the base regsiter for paradoxical subregs.
gcc/testsuite/
* g++.target/m68k/m68k.exp: New test driver.
* g++.target/m68k/pr116244.C: New test.
diff --git a/gcc/final.cc b/gcc/final.cc
index eb9e065d9f0..0167b2f8602 100644
--- a/gcc/final.cc
+++ b/gcc/final.cc
@@ -3123,7 +3123,17 @@ alter_subreg (rtx *xp, bool final_p)
unsigned int regno;
poly_int64 offset;
- regno = subreg_regno (x);
+ /* A paradoxical should always be REGNO (y) + 0. Using subreg_regno
+ for something like (subreg:DI (reg:SI N) 0) on a WORDS_BIG_ENDIAN
+ target will return N-1 which is catastrophic for N == 0 and just
+ wrong for other cases.
+
+ Fixing subreg_regno would be a better option, except that reload
+ depends on its current behavior. */
+ if (paradoxical_subreg_p (x))
+ regno = REGNO (y);
+ else
+ regno = subreg_regno (x);
if (subreg_lowpart_p (x))
offset = byte_lowpart_offset (GET_MODE (x), GET_MODE (y));
else
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 4158a531bdd..6f6e6544755 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -4313,7 +4313,16 @@ lowpart_subreg_regno (unsigned int regno, machine_mode
xmode,
return simplify_subreg_regno (regno, xmode, offset, ymode);
}
-/* Return the final regno that a subreg expression refers to. */
+/* Return the final regno that a subreg expression refers to.
+
+ Callers such as reload_inner_reg_of_subreg rely on this returning
+ the simplified hard reg, even if that result is not a valid regno for
+ the given mode. That triggers reloading the inner part of the
+ subreg.
+
+ That inherently means other uses of this routine probably need
+ to be audited for their behavior when requested subreg can't
+ be expressed as a hard register after apply the offset. */
unsigned int
subreg_regno (const_rtx x)
{
diff --git a/gcc/testsuite/g++.target/m68k/m68k.exp
b/gcc/testsuite/g++.target/m68k/m68k.exp
new file mode 100644
index 00000000000..8f6416e9fdf
--- /dev/null
+++ b/gcc/testsuite/g++.target/m68k/m68k.exp
@@ -0,0 +1,34 @@
+# Copyright (C) 2019-2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't a m68k target.
+if ![istarget m68k*-*-*] then {
+ return
+}
+
+# Load support procs.
+load_lib g++-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] "" ""
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/g++.target/m68k/pr116244.C
b/gcc/testsuite/g++.target/m68k/pr116244.C
new file mode 100644
index 00000000000..2e78832efbb
--- /dev/null
+++ b/gcc/testsuite/g++.target/m68k/pr116244.C
@@ -0,0 +1,226 @@
+// { dg-do compile }
+// { dg-additional-options "-std=gnu++20 -O2 -w -march=isaa" }
+namespace std {
+struct __conditional {
+ template <typename _Tp, typename> using type = _Tp;
+};
+template <bool, typename _If, typename _Else>
+using __conditional_t = __conditional::type<_If, _Else>;
+template <typename> constexpr bool is_lvalue_reference_v = true;
+template <typename> bool is_same_v;
+template <typename _From, typename _To>
+constexpr bool is_convertible_v = __is_convertible(_From, _To);
+template <typename> bool is_invocable_v;
+template <typename...> using common_reference_t = int;
+template <typename _Iterator> struct iterator_traits : _Iterator {};
+namespace __detail {
+template <typename, typename _Up>
+concept __same_as = is_same_v<_Up>;
+}
+template <typename _Tp, typename _Up>
+concept same_as = __detail::__same_as<_Up, _Tp>;
+template <typename _Derived, typename _Base>
+concept derived_from = is_convertible_v<_Derived, _Base>;
+template <typename, typename>
+concept common_reference_with =
+ same_as<common_reference_t<>, common_reference_t<>>;
+namespace __detail {
+template <typename _Tp>
+concept __boolean_testable = requires(_Tp __t) { __t; };
+template <typename _Tp, typename>
+concept __weakly_eq_cmp_with = requires(_Tp __t) { __t; };
+} // namespace __detail
+template <typename... _Args>
+concept invocable = is_invocable_v<_Args...>;
+template <typename>
+concept regular_invocable = invocable<>;
+template <typename _Fn>
+concept predicate = __detail::__boolean_testable<_Fn>;
+template <typename _Rel, typename, typename>
+concept relation = predicate<_Rel>;
+template <typename _Rel, typename _Tp, typename _Up>
+concept strict_weak_order = relation<_Rel, _Tp, _Up>;
+namespace {
+struct duration {
+ duration() = default;
+ template <typename _Rep2> duration(_Rep2 __rep) : __r(__rep) {}
+ long long count() { return __r; }
+ long long __r;
+};
+long long operator+(duration __lhs, duration __rhs) {
+ long __trans_tmp_1 = __lhs.count();
+ return __trans_tmp_1 + __rhs.count();
+}
+struct time_point {
+ time_point();
+ time_point(duration __dur) : __d(__dur) {}
+ duration time_since_epoch() { return __d; }
+ duration __d;
+};
+time_point operator+(time_point __lhs, duration __rhs) {
+ duration __trans_tmp_2 = __lhs.time_since_epoch();
+ return time_point(__trans_tmp_2 + __rhs);
+}
+template <typename> using sys_time = time_point;
+} // namespace
+template <typename> class allocator;
+namespace {
+struct less {};
+} // namespace
+struct forward_iterator_tag {};
+namespace __detail {
+template <typename _Iter> struct __iter_traits_impl {
+ using type = iterator_traits<_Iter>;
+};
+template <typename _Iter> using __iter_traits =
__iter_traits_impl<_Iter>::type;
+template <typename> struct __iter_concept_impl;
+template <typename _Iter>
+ requires requires { typename _Iter; }
+struct __iter_concept_impl<_Iter> {
+ using type = __iter_traits<_Iter>::iterator_concept;
+};
+template <typename _Iter>
+using __iter_concept = __iter_concept_impl<_Iter>::type;
+template <typename _In>
+concept __indirectly_readable_impl = common_reference_with<_In, _In>;
+} // namespace __detail
+template <typename _In>
+concept indirectly_readable = __detail::__indirectly_readable_impl<_In>;
+template <typename _Iter>
+concept input_or_output_iterator = requires(_Iter __i) { __i; };
+template <typename _Sent, typename _Iter>
+concept sentinel_for = __detail::__weakly_eq_cmp_with<_Sent, _Iter>;
+template <typename _Iter>
+concept forward_iterator =
+ derived_from<__detail::__iter_concept<_Iter>, forward_iterator_tag>;
+template <typename _Fn, typename>
+concept indirectly_regular_unary_invocable = regular_invocable<_Fn>;
+template <typename _Fn, typename _I1, typename _I2>
+concept indirect_strict_weak_order = strict_weak_order<_Fn, _I1, _I2>;
+namespace __detail {
+template <typename, typename> struct __projected;
+}
+template <indirectly_readable _Iter,
+ indirectly_regular_unary_invocable<_Iter> _Proj>
+using projected = __detail::__projected<_Iter, _Proj>;
+namespace ranges::__access {
+template <typename _Tp> auto __begin(_Tp __t) { return __t.begin(); }
+} // namespace ranges::__access
+namespace __detail {
+template <typename _Tp>
+using __range_iter_t = decltype(ranges::__access::__begin(_Tp()));
+}
+template <typename _Tp> struct iterator_traits<_Tp *> {
+ typedef forward_iterator_tag iterator_concept;
+ using reference = _Tp;
+};
+} // namespace std
+template <typename _Iterator, typename> struct __normal_iterator {
+ using iterator_concept = std::__detail::__iter_concept<_Iterator>;
+ std::iterator_traits<_Iterator>::reference operator*();
+ void operator++();
+};
+template <typename _IteratorL, typename _IteratorR, typename _Container>
+bool operator==(__normal_iterator<_IteratorL, _Container>,
+ __normal_iterator<_IteratorR, _Container>);
+int _M_get_sys_info_ri;
+namespace std {
+template <typename> struct allocator_traits;
+template <typename _Tp> struct allocator_traits<allocator<_Tp>> {
+ using pointer = _Tp *;
+};
+namespace {
+namespace __detail {
+template <typename _Tp>
+concept __maybe_borrowed_range = is_lvalue_reference_v<_Tp>;
+}
+int end;
+template <typename>
+concept range = requires { end; };
+template <typename _Tp>
+concept borrowed_range = __detail::__maybe_borrowed_range<_Tp>;
+template <typename _Tp> using iterator_t = std::__detail::__range_iter_t<_Tp>;
+template <typename _Tp>
+concept forward_range = forward_iterator<iterator_t<_Tp>>;
+struct dangling;
+template <input_or_output_iterator _It, sentinel_for<_It> _Sent = _It>
+struct subrange {
+ _It begin();
+ _Sent end();
+};
+template <range _Range>
+using borrowed_subrange_t =
+ __conditional_t<borrowed_range<_Range>, subrange<iterator_t<_Range>>,
+ dangling>;
+} // namespace
+template <typename _Alloc> struct _Vector_base {
+ typedef allocator_traits<_Alloc>::pointer pointer;
+};
+template <typename _Tp, typename _Alloc = allocator<_Tp>> struct vector {
+ __normal_iterator<typename _Vector_base<_Alloc>::pointer, int> begin();
+};
+template <typename _Tp> struct __shared_ptr_access {
+ _Tp *operator->();
+};
+namespace chrono {
+struct weekday {
+ char _M_wd;
+ weekday _S_from_days() {
+ long __trans_tmp_3;
+ return 7 > __trans_tmp_3;
+ }
+ weekday(unsigned __wd) : _M_wd(__wd == 7 ?: __wd) {}
+ weekday() : weekday{_S_from_days()} {}
+ unsigned c_encoding() { return _M_wd; }
+ friend duration operator-(weekday __x, weekday __y) {
+ auto __n = __x.c_encoding() - __y.c_encoding();
+ return int(__n) >= 0 ? __n : duration{};
+ }
+};
+struct year_month_day {
+ year_month_day _S_from_days(duration __dp) {
+ auto __r0 = int(__dp.count()), __u2 = 5 * __r0;
+ year_month_day{__u2};
+ }
+ year_month_day();
+ year_month_day(int);
+ year_month_day(sys_time<long> __dp)
+ : year_month_day(_S_from_days(__dp.time_since_epoch())) {}
+};
+struct time_zone {
+ void _M_get_sys_info() const;
+};
+} // namespace chrono
+namespace ranges {
+struct {
+ template <
+ forward_range _Range, typename _Tp, typename _Proj,
+ indirect_strict_weak_order<_Tp, projected<_Range, _Proj>> _Comp = less>
+ borrowed_subrange_t<_Range> operator()(_Range, _Tp, _Comp, _Proj);
+} equal_range;
+} // namespace ranges
+} // namespace std
+namespace std::chrono {
+struct Rule;
+unsigned day_of_week;
+struct _Node {
+ vector<Rule> rules;
+};
+char name;
+struct Rule {
+ int from;
+ void start_time(int, long) {
+ year_month_day ymd;
+ sys_time<long> date;
+ duration diff = day_of_week - weekday{};
+ ymd = date + diff;
+ }
+};
+__shared_ptr_access<_Node> __trans_tmp_5;
+void time_zone::_M_get_sys_info() const {
+ auto node = __trans_tmp_5;
+ auto rules = ranges::equal_range(node->rules, _M_get_sys_info_ri, {}, name);
+ for (auto rule : rules)
+ rule.start_time(rule.from, {});
+}
+} // namespace std::chrono