Andrew MacLeod <amacl...@redhat.com> writes:
> On 05/23/2017 08:11 AM, Jakub Jelinek wrote:
>> On Tue, May 23, 2017 at 06:48:15AM -0400, Aldy Hernandez wrote:
>>> --- a/gcc/tree-ssanames.h
>>> +++ b/gcc/tree-ssanames.h
>>> @@ -45,14 +45,12 @@ struct GTY(()) ptr_info_def
>>>     unsigned int misalign;
>>>   };
>>>   
>>> -/* Value range information for SSA_NAMEs representing non-pointer 
>>> variables.  */
>>> -
>>> -struct GTY ((variable_size)) range_info_def {
>>> -  /* Minimum, maximum and nonzero bits.  */
>>> -  TRAILING_WIDE_INT_ACCESSOR (min, ints, 0)
>>> -  TRAILING_WIDE_INT_ACCESSOR (max, ints, 1)
>>> -  TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2)
>>> -  trailing_wide_ints <3> ints;
>>> +/* Used bits information for SSA_NAMEs representing non-pointer variables. 
>>>  */
>>> +
>>> +struct GTY ((variable_size)) nonzero_bits_def {
>>> +  /* Mask representing which bits are known to be used in an integer.  */
>>> +  TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 0)
>>> +  trailing_wide_ints <1> ints;
>>>   };
>>>   
>>>   
>>> --- a/gcc/tree-core.h
>>> +++ b/gcc/tree-core.h
>>> @@ -1416,11 +1413,15 @@ struct GTY(()) tree_ssa_name {
>>>     union ssa_name_info_type {
>>>       /* Pointer attributes used for alias analysis.  */
>>>       struct GTY ((tag ("0"))) ptr_info_def *ptr_info;
>>> -    /* Value range attributes used for zero/sign extension elimination.  */
>>> -    struct GTY ((tag ("1"))) range_info_def *range_info;
>>> +    /* Value range attributes.  */
>>> +    class GTY ((tag ("1"))) irange *range_info;
>>>     } GTY ((desc ("%1.typed.type ?" \
>>>             "!POINTER_TYPE_P (TREE_TYPE ((tree)&%1)) : 2"))) info;
>>>   
>>> +  /* For non-pointer types, this specifies a mask for the bits that
>>> +     are known to be set.  */
>>> +  struct nonzero_bits_def *nonzero_bits;
>>> +
>>>     /* Immediate uses list for this SSA_NAME.  */
>>>     struct ssa_use_operand_t imm_uses;
>>>   };
>> I'm worried a lot here about compile time memory usage increase, especially
>> with EVRP and IPA-VRP and even more so with LTO.
>> The changes mean that for every SSA_NAME of any kind we now need 8 more
>> bytes, and for every SSA_NAME that has range info (typically both range info
>> and nonzero_bits; in the old implementation the two were tied together for a
>> good reason, many ranges also imply some non-zero bits and from non-zero
>> bits one can in many cases derive a range) much more than that (through
>
> As follow on work we have discussed an interface which would be able to 
> calculate a bitmask (for either zeros or ones) from a range and vice 
> versa.. Meaning we wouldn't to need to store both unless the ssa_name is 
> used in such a way that it generates both.  ie.  when we create a range 
> from a bit operation, we simply store the bit version and not a range, 
> otherwise we create an irange but not a bitrange. When you query and ask 
> the ssa_name for an irange,  if there is no irange and there is a bit 
> pattern, we can generate the irange from that on demand.    Likewise, if 
> there is an irange and no bit pattern, we can generate any known on or 
> off bits from the irange. The only time there would be both would be 
> when we can somehow find some real benefit from having both.. I doubt 
> that would be very common.
>
> I think aldy simply copied the bitrange stuff wholesale in a separate 
> array just to get it working..  converting between the two was follow on 
> work to make things more efficient once it was fundamentally working.
>
>
>
>>   
>> the nonzero_bits_def you get all the overhead of trailing_wide_ints -
>> the 3 fields it has, just save on the actual 2 lengths and 2 HWI sets,
>> but then you add 3x 8 bytes and 6x size of the whole wide_int which is
>> from what I understood not really meant as the memory storage of wide ints
>> in data structures, but as something not memory efficient you can work
>> quickly on (so ideal for automatic variables in functions that work with
>> those).  So it is a 8 byte growth for SSA_NAMEs without ranges and
>> 8+3*8+6*32-2*4-2*8*x growth for SSA_NAMEs with ranges if x is the number
>> of HWIs needed to represent the integers of that type (so most commonly
>> x=1).  For x=1 8+3*8+6*32-2*4-2*8*x is 200 bytes growth.
>> With say 10000000 SSA_NAMEs, 5000000 of them with ranges, that will be
>> already a 1GB difference, dunno how many SSA_NAMEs are there e.g. in firefox
>> LTO build.
> why don't we actually try measuring it and see if it is noticeable? and 
> if it is, where the issues really are. You really think we have relevant 
> range info  for 50% of the ssa_names in a program?  Id be surprised.  
> Again, thats something we should probably measure and see whats 
> typical.  the range table should only be used when we can refine the 
> global range to something other than range_for_type().
>
> We can also change the representation of the range to be 2 ranges like 
> it is today..  we're just defaulting to 3 because it seems like it may 
> carry more useful information some times.   The long range plan is that 
> this can be tweaked at runtime to only use as much as we need or ant to 
> represent a range..  allowing an increase in the range precision for 
> specific optimizations that might care, and decreasing it the rest of 
> the time.  None of the external API enforces a particular number of ranges.
>
> More importantly, when the on-demand range work comes online it will end 
> the proliferation of SSA_NAMES which carry ASSERT_EXPR information.  We 
> won't see the trail of new  ssa_names the ASSET_EXPR chains create. I 
> suspect that will eliminate storing global ranges and bits for most SSA 
> names, which is likely to make this class a win regardless.
> That we will have to wait and see until my code is more functional in a 
> month or so
>> Can the nonzero_bits stuff be folded back into irange (and have code to
>> maintain nonzero_bits in addition to ranges as before (from setting a range
>> compute or update nonzero_bits and vice versa)?  Can you use
> the two are completely different representations of information, it 
> seems cleaner to have them separate but allow one to generate the other 
> when required.  I suspect most of the time we only need to save one or 
> the other. having another class and the ability to convert from one to 
> the other seems like a better solution.
>> trailing_wide_int for the irange storage of the values?  Can you allocate
>> only as many HWIs as you really need, rather than always 6?
> probably.
>> Again, it can be different between a class you use for accessing the
>> information and manipulating it and for actual underlying storage on
>> SSA_NAMEs.
> Im not familiar with the details of how wide_int and host vs target 
> integers really works. I don't think Aldy is either.  If there is a more 
> efficient way to store an integer fr this use case then by all means we 
> can do that.  to get things working we just used what was easily 
> available.   if there is a more efficient way to represent it, perhaps 
> it makes sense to create a class for a"memory_efficient_wide_int" and 
> allow it to convert back and forth from a wide_int as needed?

Like Jakub says, that's effectively what trailing_wide_int is.
You calculate exactly how many HWIs you need to represent the numbers,
then adjust the amount of allocated memory accordingly.  So once you've
computed the range in a local irange, a version of irange with
trailing_wide_ints would be better for long-term storage.
It has an overhead of 64 bits for <= 5 integers, on top of the
HWIs themselves.

Thanks,
Richard

Reply via email to