Yes, we can defer it.

Thanks, Richard.
---- Replied Message ----
From: Richard Biener <richard.guent...@gmail.com>
Date: 11/30/2023 20:19
To: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Cc: tamar.christina <tamar.christ...@arm.com>,
    gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code
On Thu, Nov 30, 2023 at 12:11 PM juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai> wrote:
BIAS should be:

        signed char biasval
          = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
        tree bias = build_int_cst (intQI_type_node, biasval);

Currently, only IBM targets set LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS to -1, and only in some len_load/len_store situations.
Otherwise, it is always 0.
But for consistency, I think we should use the code above.

I see your patch is large and split into multiple sub-patches.
Do you have a single patch that can be applied directly to get the whole feature?
I would like to add length support and test it on top of your patch.

Can we defer LEN support for a followup?  I think we still need to set partial loop support
as disabled when there are any lengths with the initial patch for correctness.

Richard.
 
Thanks.


 
Date: 2023-11-30 18:58
Subject: RE: RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

Hi Juzhe,

 

I meant that `lens` is undefined. From looking around, I guess it needs to be

  vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);

For `bias`, I meant that in

    cond = gimple_build (&cond_gsi, IFN_VCOND_MASK_LEN, truth_type,
                         all true mask, cond, all false mask, len, bias);

the variable `bias` isn’t defined, and I can’t find any other place that creates IFN_VCOND_MASK_LEN to figure out what it’s supposed to be.

Is it just an SImode 0?

 

Thanks,

Tamar

 

 

From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Sent: Thursday, November 30, 2023 11:49 AM
To: Tamar Christina <tamar.christ...@arm.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <richard.guent...@gmail.com>
Subject: Re: RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

 

Thanks Tamar.

 

I am not sure whether I am on the same page as you.

IMHO, ARM SVE will use final mask = loop mask (generated by WHILE_ULT) & conditional mask, and then use that final mask to do the cbranch. Am I right?

If yes, I can leverage that for length and avoid too many code changes in your patch.

 

So, for RVV, the length plays much the same role as the loop mask in ARM SVE.

For example, suppose n = 4. In ARM SVE, WHILE_ULT (whilelo) generates mask = 0b11110000000...., and that mask is then used to control the operations.

For RVV it is the same: the length will be 4, so we only process the elements with index < 4.

 

As for bias, I don't think it will be an issue. Currently, BIAS is not used by RVV; it is only used by len_load/len_store on IBM targets.

So the bias value is 0 by default in all situations other than len_load/len_store on IBM targets.
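
The n = 4 comparison above can be sketched in standalone C++ (not GCC code). This is only a scalar model under stated assumptions: `while_ult_mask` and `length_mask` are hypothetical helper names, the vector is assumed to have 8 lanes in the usage note, and the bias is simply added to the length as in the `i < len + bias` formula quoted elsewhere in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of a WHILE_ULT-style loop mask:
// lane i is active while (start + i) < n.
std::vector<bool> while_ult_mask(std::size_t start, std::size_t n,
                                 std::size_t lanes) {
    std::vector<bool> mask(lanes);
    for (std::size_t i = 0; i < lanes; ++i)
        mask[i] = (start + i) < n;
    return mask;
}

// Hypothetical scalar model of length-based control:
// lane i is active while i < len + bias.
std::vector<bool> length_mask(long len, long bias, std::size_t lanes) {
    assert(len + bias >= 0);  // assumption: effective length is non-negative
    std::vector<bool> mask(lanes);
    for (std::size_t i = 0; i < lanes; ++i)
        mask[i] = static_cast<long>(i) < len + bias;
    return mask;
}
```

With 8 lanes, `while_ult_mask(0, 4, 8)` and `length_mask(4, 0, 8)` activate the same four leading lanes, which is the sense in which a length of 4 corresponds to the whilelo mask in the example.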

 


 

Date: 2023-11-30 18:39

Subject: RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

Hi Juzhe,

 

I’m happy to take the hunks; it’s just that I can’t test them and don’t know the specifics of how lens work.

I still need to read up on it.

I tried adding that chunk in, but for the first bit `lens` seems undefined, and for the second bit `bias` seems undefined.

I’ll also need to know what to do for vectorizable_live_operations: how to get the first element rather than the last.

 

Thanks,

Tamar

 

From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Sent: Thursday, November 30, 2023 4:48 AM
To: gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Richard Biener <richard.guent...@gmail.com>; Tamar Christina <tamar.christ...@arm.com>
Subject: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

 

Hi Richard and Tamar,

Sorry for bothering you.

I hope you don't mind if I give some comments:

Can we support partial vectors with length?

IMHO, we can do that as follows:

 

bool length_loop_p = LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo);

if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
  {
    if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
                                        OPTIMIZE_FOR_SPEED))
      vect_record_loop_len (loop_vinfo, lens, ncopies, vectype, 1);
    else
      vect_record_loop_mask (loop_vinfo, masks, ncopies, truth_type, NULL);
  }

if (length_loop_p)
  {
    tree len = vect_get_loop_len (loop_vinfo, gsi, loop_lens, 1, vectype, 0, 0);
    /* Use VCOND_MASK_LEN (all true, cond, all false, len, bias) to generate
       final mask = i < len + bias ? cond[i] : false.  */
    cond = gimple_build (&cond_gsi, IFN_VCOND_MASK_LEN, truth_type,
                         all true mask, cond, all false mask, len, bias);
  }
else if (masked_loop_p)
  {
    tree mask
      = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies, truth_type, 0);
    cond
      = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask, cond, &cond_gsi);
  }

 

This is a prototype. Does this idea seem reasonable to Richi?
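
The comment in the prototype describes the intended result as `final mask = i < len + bias ? cond[i] : false`. A minimal scalar sketch of that selection in standalone C++ (not GCC code) might look like the following. `vcond_mask_len_model`, `final_exit_mask`, and `any_lane_set` are hypothetical names, and treating "any lane set" as the early-exit condition is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of the select described in the prototype's comment:
// result[i] = (i < len + bias && mask[i]) ? then_v[i] : else_v[i].
std::vector<bool> vcond_mask_len_model(const std::vector<bool>& mask,
                                       const std::vector<bool>& then_v,
                                       const std::vector<bool>& else_v,
                                       long len, long bias) {
    assert(mask.size() == then_v.size() && mask.size() == else_v.size());
    std::vector<bool> result(mask.size());
    for (std::size_t i = 0; i < mask.size(); ++i) {
        bool in_range = static_cast<long>(i) < len + bias;
        result[i] = (in_range && mask[i]) ? then_v[i] : else_v[i];
    }
    return result;
}

// Mirrors the prototype's operand order (all true, cond, all false, len, bias),
// giving final mask = i < len + bias ? cond[i] : false.
std::vector<bool> final_exit_mask(const std::vector<bool>& cond,
                                  long len, long bias) {
    std::vector<bool> all_true(cond.size(), true);
    std::vector<bool> all_false(cond.size(), false);
    return vcond_mask_len_model(all_true, cond, all_false, len, bias);
}

// Assumed exit test: the branch is taken if any lane of the final mask is set.
bool any_lane_set(const std::vector<bool>& m) {
    for (bool b : m)
        if (b) return true;
    return false;
}
```

In this model, a condition that is true only in lane 5 does not trigger the exit when len is 4 and bias is 0, because lanes at or beyond index 4 are forced to false. The same condition does trigger it once len is 6.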

 

Thanks.

 

