Richard Biener <rguent...@suse.de> writes:
> On Tue, 15 Aug 2023, Richard Sandiford wrote:
>
>> Richard Biener <rguent...@suse.de> writes:
>> > On Tue, 15 Aug 2023, Kewen.Lin wrote:
>> >
>> >> Hi Stefan,
>> >> 
>> >> on 2023/8/15 02:51, Stefan Schulze Frielinghaus wrote:
>> >> > Hi everyone,
>> >> > 
>> >> > I have bootstrapped and regtested the patch below on s390.  For the
>> >> > 64-bit target I do not see any changes regarding the testsuite.  For the
>> >> > 31-bit target I see the following failures:
>> >> > 
>> >> > FAIL: gcc.dg/vect/no-scevccp-outer-14.c (internal compiler error: in 
>> >> > require, at machmode.h:313)
>> >> > FAIL: gcc.dg/vect/no-scevccp-outer-14.c (test for excess errors)
>> >> > FAIL: gcc.dg/vect/pr50451.c (internal compiler error: in require, at 
>> >> > machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr50451.c (test for excess errors)
>> >> > FAIL: gcc.dg/vect/pr50451.c -flto -ffat-lto-objects (internal compiler 
>> >> > error: in require, at machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr50451.c -flto -ffat-lto-objects (test for excess 
>> >> > errors)
>> >> > FAIL: gcc.dg/vect/pr53773.c (internal compiler error: in require, at 
>> >> > machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr53773.c (test for excess errors)
>> >> > FAIL: gcc.dg/vect/pr53773.c -flto -ffat-lto-objects (internal compiler 
>> >> > error: in require, at machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr53773.c -flto -ffat-lto-objects (test for excess 
>> >> > errors)
>> >> > FAIL: gcc.dg/vect/pr71407.c (internal compiler error: in require, at 
>> >> > machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr71407.c (test for excess errors)
>> >> > FAIL: gcc.dg/vect/pr71407.c -flto -ffat-lto-objects (internal compiler 
>> >> > error: in require, at machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr71407.c -flto -ffat-lto-objects (test for excess 
>> >> > errors)
>> >> > FAIL: gcc.dg/vect/pr71416-1.c (internal compiler error: in require, at 
>> >> > machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr71416-1.c (test for excess errors)
>> >> > FAIL: gcc.dg/vect/pr71416-1.c -flto -ffat-lto-objects (internal 
>> >> > compiler error: in require, at machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr71416-1.c -flto -ffat-lto-objects (test for excess 
>> >> > errors)
>> >> > FAIL: gcc.dg/vect/pr94443.c (internal compiler error: in require, at 
>> >> > machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr94443.c (test for excess errors)
>> >> > FAIL: gcc.dg/vect/pr94443.c -flto -ffat-lto-objects (internal compiler 
>> >> > error: in require, at machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr94443.c -flto -ffat-lto-objects (test for excess 
>> >> > errors)
>> >> > FAIL: gcc.dg/vect/pr97558.c (internal compiler error: in require, at 
>> >> > machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr97558.c (test for excess errors)
>> >> > FAIL: gcc.dg/vect/pr97558.c -flto -ffat-lto-objects (internal compiler 
>> >> > error: in require, at machmode.h:313)
>> >> > FAIL: gcc.dg/vect/pr97558.c -flto -ffat-lto-objects (test for excess 
>> >> > errors)
>> >> > FAIL: gcc.dg/vect/vect-reduc-pattern-3.c -flto -ffat-lto-objects 
>> >> > (internal compiler error: in require, at machmode.h:313)
>> >> > FAIL: gcc.dg/vect/vect-reduc-pattern-3.c -flto -ffat-lto-objects (test 
>> >> > for excess errors)
>> >> > UNRESOLVED: gcc.dg/vect/no-scevccp-outer-14.c compilation failed to 
>> >> > produce executable
>> >> > UNRESOLVED: gcc.dg/vect/pr53773.c -flto -ffat-lto-objects  
>> >> > scan-tree-dump-times optimized "\\* 10" 2
>> >> > UNRESOLVED: gcc.dg/vect/pr53773.c scan-tree-dump-times optimized "\\* 
>> >> > 10" 2
>> >> > UNRESOLVED: gcc.dg/vect/pr71416-1.c -flto -ffat-lto-objects compilation 
>> >> > failed to produce executable
>> >> > UNRESOLVED: gcc.dg/vect/pr71416-1.c compilation failed to produce 
>> >> > executable
>> >> > UNRESOLVED: gcc.dg/vect/vect-reduc-pattern-3.c -flto -ffat-lto-objects 
>> >> > compilation failed to produce executable
>> >> > 
>> >> > I've randomely picked pr50451.c and ran gcc against it which results in:
>> >> > 
>> >> > during GIMPLE pass: vect
>> >> > dump file: pr50451.c.174t.vect
>> >> > /gcc-verify-workdir/patched/src/gcc/testsuite/gcc.dg/vect/pr50451.c: In 
>> >> > function ?foo?:
>> >> > /gcc-verify-workdir/patched/src/gcc/testsuite/gcc.dg/vect/pr50451.c:5:1:
>> >> >  internal compiler error: in require, at machmode.h:313
>> >> > 0x1265d21 opt_mode<scalar_int_mode>::require() const
>> >> >         /gcc-verify-workdir/patched/src/gcc/machmode.h:313
>> >> > 0x1d7e4e9 opt_mode<machine_mode>::require() const
>> >> >         /gcc-verify-workdir/patched/src/gcc/vec.h:955
>> >> > 0x1d7e4e9 vect_verify_loop_lens
>> >> >         /gcc-verify-workdir/patched/src/gcc/tree-vect-loop.cc:1471
>> >> > 0x1da29ab vect_analyze_loop_2
>> >> >         /gcc-verify-workdir/patched/src/gcc/tree-vect-loop.cc:2929
>> >> > 0x1da40c7 vect_analyze_loop_1
>> >> >         /gcc-verify-workdir/patched/src/gcc/tree-vect-loop.cc:3330
>> >> > 0x1da499d vect_analyze_loop(loop*, vec_info_shared*)
>> >> >         /gcc-verify-workdir/patched/src/gcc/tree-vect-loop.cc:3484
>> >> > 0x1deed27 try_vectorize_loop_1
>> >> >         /gcc-verify-workdir/patched/src/gcc/tree-vectorizer.cc:1064
>> >> > 0x1deed27 try_vectorize_loop
>> >> >         /gcc-verify-workdir/patched/src/gcc/tree-vectorizer.cc:1180
>> >> > 0x1def5c1 execute
>> >> >         /gcc-verify-workdir/patched/src/gcc/tree-vectorizer.cc:1296
>> >> > Please submit a full bug report, with preprocessed source (by using 
>> >> > -freport-bug).
>> >> > Please include the complete backtrace with any bug report.
>> >> > See <https://gcc.gnu.org/bugs/> for instructions.
>> >> > 
>> >> 
>> >> It looks like s390 supports variable index vec_extract at -m31 but
>> >> no vector with length.  It seems we need to further check the vector
>> >> with length capability, with something like:
>> >> 
>> >> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
>> >> index 5ae9f69c7eb..ef754467baf 100644
>> >> --- a/gcc/tree-vect-loop.cc
>> >> +++ b/gcc/tree-vect-loop.cc
>> >> @@ -10327,10 +10327,11 @@ vectorizable_live_operation (vec_info *vinfo, 
>> >> stmt_vec_info stmt_info,
>> >>                  vect_record_loop_mask (loop_vinfo,
>> >>                                         &LOOP_VINFO_MASKS (loop_vinfo),
>> >>                                         1, vectype, NULL);
>> >> -              else if (can_vec_extract_var_idx_p (
>> >> +              else if (get_len_load_store_mode (TYPE_MODE (vectype), 
>> >> true)
>> >> +                         .exists ()
>> >> +                       && can_vec_extract_var_idx_p (
>> >>                           TYPE_MODE (vectype), TYPE_MODE (TREE_TYPE 
>> >> (vectype))))
>> >> -                vect_record_loop_len (loop_vinfo,
>> >> -                                      &LOOP_VINFO_LENS (loop_vinfo),
>> >> +                vect_record_loop_len (loop_vinfo, &LOOP_VINFO_LENS 
>> >> (loop_vinfo),
>> >>                                        1, vectype, 1);
>> >>                else
>> >>                  {
>> >> 
>> >> sigh, the formatting looks odd.
>> >
>> > I think the error is in vect_verify_loop_lens which assumes that
>> > when we record _any_ length related op the target has to support
>> > both len_load and len_store.  Now that we have many other _len
>> > functions that's certainly not true.
>> >
>> > Instead a vect_verify_loop_lens-local "fix" would be to not use
>> > .require () but instead when !.exists () simply return false.
>> > That would still effectively require both len-load and len-store
>> > for any -len predicated loop, but at least avoid the ICE.
>> 
>> Yeah, agree that would be the simplest workaround.  But I think
>> instead we should require vectorizable_load and vectorizable_store
>> to record the bias that they want to use (perhaps in a hash_set?).
>> Then vect_verify_loop_lens can return false if the set has more
>> than one element.  It can use a bias of 0 if the set is empty.
>
> But with all the other _LEN fns now also having a bias, never
> quering it but using the one from len_{load,store} having
> that supported looks like a requirement?  OTOH if we would require
> querying it for each used _LEN fn then supporting multiple
> different biases wouldn't be an issue either?

Yeah, the OTOH is what I think we should do.  The bias stuff dates
back to when loads and stores were the only LEN_* operations.  The reason
it checks both loads and stores is to ensure consistency between "all"
length operations.

If instead we took the position that LEN_LOAD determines the biases
for everything, we wouldn't need to check LEN_STORE biases.

But the biases need to be consistent because they're built into the IVs.
In principle we could create a different LEN input for each bias,
but that's pointless when no target needs it.

Thanks,
Richard

Reply via email to