Andrew MacLeod <amacl...@redhat.com> writes: > On 05/23/2017 08:11 AM, Jakub Jelinek wrote: >> On Tue, May 23, 2017 at 06:48:15AM -0400, Aldy Hernandez wrote: >>> --- a/gcc/tree-ssanames.h >>> +++ b/gcc/tree-ssanames.h >>> @@ -45,14 +45,12 @@ struct GTY(()) ptr_info_def >>> unsigned int misalign; >>> }; >>> >>> -/* Value range information for SSA_NAMEs representing non-pointer >>> variables. */ >>> - >>> -struct GTY ((variable_size)) range_info_def { >>> - /* Minimum, maximum and nonzero bits. */ >>> - TRAILING_WIDE_INT_ACCESSOR (min, ints, 0) >>> - TRAILING_WIDE_INT_ACCESSOR (max, ints, 1) >>> - TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2) >>> - trailing_wide_ints <3> ints; >>> +/* Used bits information for SSA_NAMEs representing non-pointer variables. >>> */ >>> + >>> +struct GTY ((variable_size)) nonzero_bits_def { >>> + /* Mask representing which bits are known to be used in an integer. */ >>> + TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 0) >>> + trailing_wide_ints <1> ints; >>> }; >>> >>> >>> --- a/gcc/tree-core.h >>> +++ b/gcc/tree-core.h >>> @@ -1416,11 +1413,15 @@ struct GTY(()) tree_ssa_name { >>> union ssa_name_info_type { >>> /* Pointer attributes used for alias analysis. */ >>> struct GTY ((tag ("0"))) ptr_info_def *ptr_info; >>> - /* Value range attributes used for zero/sign extension elimination. */ >>> - struct GTY ((tag ("1"))) range_info_def *range_info; >>> + /* Value range attributes. */ >>> + class GTY ((tag ("1"))) irange *range_info; >>> } GTY ((desc ("%1.typed.type ?" \ >>> "!POINTER_TYPE_P (TREE_TYPE ((tree)&%1)) : 2"))) info; >>> >>> + /* For non-pointer types, this specifies a mask for the bits that >>> + are known to be set. */ >>> + struct nonzero_bits_def *nonzero_bits; >>> + >>> /* Immediate uses list for this SSA_NAME. */ >>> struct ssa_use_operand_t imm_uses; >>> }; >> I'm worried a lot here about compile time memory usage increase, especially >> with EVRP and IPA-VRP and even more so with LTO. >> The changes mean that for every SSA_NAME of any kind we now need 8 more >> bytes, and for every SSA_NAME that has range info (typically both range info >> and nonzero_bits; in the old implementation the two were tied together for a >> good reason, many ranges also imply some non-zero bits and from non-zero >> bits one can in many cases derive a range) much more than that (through > > As follow on work we have discussed an interface which would be able to > calculate a bitmask (for either zeros or ones) from a range and vice > versa.. Meaning we wouldn't to need to store both unless the ssa_name is > used in such a way that it generates both. ie. when we create a range > from a bit operation, we simply store the bit version and not a range, > otherwise we create an irange but not a bitrange. When you query and ask > the ssa_name for an irange, if there is no irange and there is a bit > pattern, we can generate the irange from that on demand. Likewise, if > there is an irange and no bit pattern, we can generate any known on or > off bits from the irange. The only time there would be both would be > when we can somehow find some real benefit from having both.. I doubt > that would be very common. > > I think aldy simply copied the bitrange stuff wholesale in a separate > array just to get it working.. converting between the two was follow on > work to make things more efficient once it was fundamentally working. > > > >> >> the nonzero_bits_def you get all the overhead of trailing_wide_ints - >> the 3 fields it has, just save on the actual 2 lengths and 2 HWI sets, >> but then you add 3x 8 bytes and 6x size of the whole wide_int which is >> from what I understood not really meant as the memory storage of wide ints >> in data structures, but as something not memory efficient you can work >> quickly on (so ideal for automatic variables in functions that work with >> those). So it is a 8 byte growth for SSA_NAMEs without ranges and >> 8+3*8+6*32-2*4-2*8*x growth for SSA_NAMEs with ranges if x is the number >> of HWIs needed to represent the integers of that type (so most commonly >> x=1). For x=1 8+3*8+6*32-2*4-2*8*x is 200 bytes growth. >> With say 10000000 SSA_NAMEs, 5000000 of them with ranges, that will be >> already a 1GB difference, dunno how many SSA_NAMEs are there e.g. in firefox >> LTO build. > why don't we actually try measuring it and see if it is noticeable? and > if it is, where the issues really are. You really think we have relevant > range info for 50% of the ssa_names in a program? Id be surprised. > Again, thats something we should probably measure and see whats > typical. the range table should only be used when we can refine the > global range to something other than range_for_type(). > > We can also change the representation of the range to be 2 ranges like > it is today.. we're just defaulting to 3 because it seems like it may > carry more useful information some times. The long range plan is that > this can be tweaked at runtime to only use as much as we need or ant to > represent a range.. allowing an increase in the range precision for > specific optimizations that might care, and decreasing it the rest of > the time. None of the external API enforces a particular number of ranges. > > More importantly, when the on-demand range work comes online it will end > the proliferation of SSA_NAMES which carry ASSERT_EXPR information. We > won't see the trail of new ssa_names the ASSET_EXPR chains create. I > suspect that will eliminate storing global ranges and bits for most SSA > names, which is likely to make this class a win regardless. > That we will have to wait and see until my code is more functional in a > month or so >> Can the nonzero_bits stuff be folded back into irange (and have code to >> maintain nonzero_bits in addition to ranges as before (from setting a range >> compute or update nonzero_bits and vice versa)? Can you use > the two are completely different representations of information, it > seems cleaner to have them separate but allow one to generate the other > when required. I suspect most of the time we only need to save one or > the other. having another class and the ability to convert from one to > the other seems like a better solution. >> trailing_wide_int for the irange storage of the values? Can you allocate >> only as many HWIs as you really need, rather than always 6? > probably. >> Again, it can be different between a class you use for accessing the >> information and manipulating it and for actual underlying storage on >> SSA_NAMEs. > Im not familiar with the details of how wide_int and host vs target > integers really works. I don't think Aldy is either. If there is a more > efficient way to store an integer fr this use case then by all means we > can do that. to get things working we just used what was easily > available. if there is a more efficient way to represent it, perhaps > it makes sense to create a class for a"memory_efficient_wide_int" and > allow it to convert back and forth from a wide_int as needed?
Like Jakub says, that's effectively what trailing_wide_int is. You calculate exactly how many HWIs you need to represent the numbers, then adjust the amount of allocated memory accordingly. So once you've computed the range in a local irange, a version of irange with trailing_wide_ints would be better for long-term storage. It has an overhead of 64 bits for <= 5 integers, on top of the HWIs themselves. Thanks, Richard