Hi Andrew,

Thanks for looking into it and my apologies for not being clear.

My proposal was to use value ranges when expanding gimple to RTL and
eliminate redundant zero/sign extensions. I.e., if we know the value
generated by some gimple operation is already in the (zero/sign)
extended from based on our VR analysis, we could skip the SUBREG and
ZERO/SIGN_EXTEND (or set SRP_SIGNED_AND_UNSIGNED and likes).

However, the problem is, RTL operations are done in PROMOTE_MODE
precision while gimple value range is in natural types. This can be a
problem when type wraps (and shows up mainly for targets where
PROMOTE_MODE is DImode like alpha).

For example, as Uros pointed out with the reverted patch, for alpha-linux we had

FAIL: libgomp.fortran/simd7.f90   -O2  execution test
FAIL: libgomp.fortran/simd7.f90   -Os  execution test

The reason being that values wrap and in VR calculation we only
records the type precision (which is what matters for gimple) but in
order to eliminate the zero/sign extension we need the full precision
in the PROMOTE_MODE.

Extract from the testcase failing:
_343 = ivtmp.179_52 + 2147483645; [0x80000004, 0x800000043]
_344 = _343 * 2; [0x8, 0x86]
_345 = (integer(kind=4)) _344; [0x8, 0x86]

With the above VR of [0x8, 0x86] (in promoted precision which is
[0x100000008, 0x100000086]), my patch was  setting
SRP_SIGNED_AND_UNSIGNED which was wrong and causing the error
(eliminating and extension which was not redundant). If we had the VR
in promoted precision, the patch would be correct and used to
eliminate redundant zero/sign extensions.

Please let me know if my explanation is not clear and I will show it
with more examples.

Thanks,
Kugan


On Fri, 21 Jun 2019 at 23:27, Andrew MacLeod <amacl...@redhat.com> wrote:
>
> On 6/19/19 11:04 PM, Kugan Vivekanandarajah wrote:
>
> Hi Andrew,
>
> Thanks for working on this.
>
> Enable elimination of zext/sext with VRP patch had to be reverted in
> (https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00672.html) due to the
> need for value ranges in PROMOTED_MODE precision for at least 1 test
> case for alpha.
>
> Playing with ranger suggest that it is not possible to get value
> ranges in PROMOTED_MODE precision on demand. Or is there any way we
> can use on-demand ranger here?
>
> Thanks,
> Kugan
>
>
>
> I took a look at the thread, but I think I'm still missing some context.
>
> Lets go back to the beginning.  What is an example of the case we arent 
> getting that you want to get?
>
> I'm going to guess to start :-)
>
> short foo(unsigned char c)
>  {
>    c = c & (unsigned char)0x0F;
>    if( c > 7 )
>      return((short)(c - 5));
>    else
>      return(( short )c);
>  }
>
>
>
> A run of this thru the ranger shows me code that looks like (on x86 anyway):
>
>  =========== BB 2 ============
> c_4(D)  [0, 255] unsigned char
>     <bb 2> :
>     c_5 = c_4(D) & 15;
>     _9 = c_4(D) & 8;
>     if (_9 != 0)
>       goto <bb 3>; [INV]
>     else
>       goto <bb 4>; [INV]
>
> c_5 : [0, 15] unsigned char
> _9 : [0, 0][8, 8] unsigned char
> 2->3  (T)*c_5 :         [0, 15] unsigned char
> 2->3  (T) _9 :  [8, 8] unsigned char
> 2->4  (F)*c_5 :         [0, 15] unsigned char
> 2->4  (F) _9 :  [0, 0] unsigned char
>
> =========== BB 3 ============
> c_5     [0, 15] unsigned char
>     <bb 3> :
>     _1 = (unsigned short) c_5;
>     _2 = _1 + 65531;
>     _7 = (short int) _2;
>     // predicted unlikely by early return (on trees) predictor.
>     goto <bb 5>; [INV]
>
> _1 : [0, 15] unsigned short
> _2 : [0, 10][65531, 65535] unsigned short
> _7 : [-5, 10] short int
>
> =========== BB 4 ============
> c_5     [0, 15] unsigned char
>     <bb 4> :
>     _6 = (short int) c_5;
>     // predicted unlikely by early return (on trees) predictor.
>
>
> I think I see.  we aren't adjusting the range of c_5 going into blocks 3 and 
> 4.  its obvious from the original source where the code says < 7, but once 
> its been "bitmasked" that info becomes obfuscated.
> .
> If you were to see a range in bb3 of c_5 = [8,15], and a range in bb4 of c_4 
> = [0,7],  would that solve your problem?
>
> so in bb3 , we'd then see ranges that look like:
>
> _1 : [8, 15] unsigned short
> _2 : [3, 10] unsigned short
> _7 : [3, 10] short int
>
> and then later on we'd see there is no negative/wrap value, and you could 
> could just drop the extension then?
>
> SO.
>
> yes. this is fixable,  but is it easy? :-)
>
> We're in the process of replacing the range extraction code with the 
> range-ops/gori-computes  components from the ranger.  This is the part which 
> figures ranges out from individual statements on exit to a block..
>
> We have implemented mostly the same functionality of VRP to start, and one of 
> the things we have not done is enhanced the bitmask tracking side of it.
> The gori-computes component which performs backwards calculations does 
> virtually nothing with masks right now.
>
> lets look at this hunk
>
>     c_5 = c_4(D) & 15;
>     _9 = c_4(D) & 8;
>     if (_9 != 0)
>
> on the true side of this branch, we start with _9 having a range of [0,0][8,8]
>
> [0,0][8,8] = c_4(D) & 8
>
> unfortunately we dont know much about c_4, its effectively VARYING at this 
> point.   We do know c_5 has a range of [0,15].  if that statement were
>     _9 = c_5 & 8
>  this problem would be quite trivial  A quick tweak to the windback code for 
> bitwise_and would generate the correct results  because we';d see
> [1,1] = [0,15] & 8   and we'd come up with [8, 15]  for c_5 no problem.
>
> but its not.
>
> I do have some  "re-evalaution" code in the ranger which is only in the proof 
> of concept stages which might solve this problem down the road.
>
> what the re-evlaution code does is look at the any changes to ranges which 
> are in the definition chain of the ssa-name you are looking at.
> going back to the example again:
>     c_5 = c_4(D) & 15;
>     _9 = c_4(D) & 8;
>     if (_9 != 0)
>         _1 = (unsigned short) c_5;      <<--- here.
>
> When we ask for the range of c_5 here, we calculate the range normally and 
> come up with [0,15].
> The re-evalution code then takes a look and notices that c_5 is also 
> dependent on the value of c_4 , and queries if the range of c_4 changed at 
> this point.
> Well, we can know the value of c_4 must have the 0x8 bit set in this block.  
> This can be fed into a  "re-evaluation" of c_5, and also then know that c_5 
> must also have the 0x8 bit set.
> THis would then reduce the calculated range from the original [0,15] to a 
> range of [8, 15]
>
> So the mechanisms within the new code are sort of there, but they have not 
> been flushed out.  THis will require
>  a)  the re-evaluation code to be completed.  Im not sure if it can be 
> applied within EVRP (It seems it should be possible)  or whether it requires 
> the ranger.
>  b) some level of integration of bitfields with the range info so we can also 
> track bits set and cleared along with the range, and integrate them with the 
> calculations.
>
> SO assuming I have been working on the correct problem here, No. we dont get 
> this now.
>
> Once we get the range-ops and gori-computes code integrated and working with 
> EVRP and the bitfield stuff, we can again take a look and see where we stand. 
>  I will keep it in mind as a practical example of how we might want to 
> combine the information.
>
> So revisit this in September maybe?
>
> Or have i completely missed the mark of the problem you are trying to solve? 
> :-)
>
> Andrew
>
> PS
>
> Your original question is
>
>
> Playing with ranger suggest that it is not possible to get value
> ranges in PROMOTED_MODE precision on demand. Or is there any way we
> can use on-demand ranger here?
>
>
> I'm not sure i follow completely why, but in theory it would be possible to 
> add code to the ranger to walk a definition list calculating ranges in 
> whatever precision you want
>
> =========== BB 2 ============
> c_4(D)  [0, 255] unsigned char
>     <bb 2> :
>     c_5 = c_4(D) & 15;
>     _9 = c_4(D) & 8;
>     if (_9 != 0)
>       goto <bb 3>; [INV]
>     else
>       goto <bb 4>; [INV]
>
>
> =========== BB 3 ============
> c_5     [0, 15] unsigned char
>     <bb 3> :
>     _1 = (unsigned short) c_5;
>     _2 = _1 + 65531;
>     _7 = (short int) _2;
>
> Im not sure how that would help?   the root of the problem would appear to be 
> the range of c_5 is not properly calculated to be [8,15].  no matter what 
> precision you use, the [0,4] + 65531 part of the range is still going to 
> produce an undesireable set upper bit, and still require the zext/sext?
>
>

Reply via email to