On Tue, Oct 4, 2022 at 3:27 PM Andrew MacLeod <amacl...@redhat.com> wrote:
>
> On 10/4/22 08:13, Aldy Hernandez via Gcc-patches wrote:
> > On Tue, Oct 4, 2022, 13:28 Aldy Hernandez <al...@redhat.com> wrote:
> >
> >> On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
> >> <richard.guent...@gmail.com> wrote:
> >>>
> >>> On 04.10.2022 at 09:36, Aldy Hernandez via Gcc-patches
> >>> <gcc-patches@gcc.gnu.org> wrote:
> >>>> The reason the nonzero mask was kept in a tree was basically
> >>>> inertia, as everything in irange is a tree.  However, there's no
> >>>> need to keep it in a tree, as the conversions to and from wide
> >>>> ints are very annoying.  That, plus the special-casing of NULL
> >>>> masks to mean -1, is error prone.
> >>>>
> >>>> I have not only rewritten all the uses to assume a wide int, but
> >>>> have also corrected a few places where we weren't propagating the
> >>>> masks, or rather were pessimizing them to -1.  This will become
> >>>> more important in upcoming patches that make better use of the
> >>>> masks.
> >>>>
> >>>> Performance testing shows a trivial improvement in VRP, as things
> >>>> like irange::contains_p() are tied to a tree.  Ugh, I can't wait
> >>>> for trees in iranges to go away.
> >>>
> >>> You want trailing wide int storage though.  A wide_int is quite
> >>> large.
> >>
> >> Absolutely; this is only for short-term storage.  Any time we need
> >> long-term storage, say for global ranges in SSA_NAME_RANGE_INFO, we
> >> go through vrange_storage, which streams things in a more
> >> memory-efficient manner.  For irange, vrange_storage will stream
> >> all the sub-ranges, including the nonzero bitmask (which is the
> >> first entry in such storage), as trailing_wide_ints.
> >>
> >> See irange_storage_slot for how it lives in GC memory.
> >
> > That being said, the ranger's internal cache uses iranges, albeit
> > with a squished-down number of subranges (the minimum needed to
> > represent the range).
> > So each cache entry will now be bigger by the difference between one
> > tree and one wide int.
> >
> > I wonder if we should change the cache to use vrange_storage.  If
> > not now, then when we convert all the subranges to wide ints.
> >
> > Of course, the memory pressure of the cache is not nearly as
> > problematic as that of SSA_NAME_RANGE_INFO.  The cache only stores
> > names it cares about.
>
> The ranger's cache can be a memory bottleneck in pathological cases.
> Certainly not as bad as it used to be, but I'm sure it can still be
> problematic.  It is supposed to be a memory-efficient representation
> because of that.  The cache can have an entry for any live ssa-name
> (which means all of them, at some point in the IL), multiplied by a
> factor involving the number of dominator blocks and outgoing edges
> that ranges are calculated on.  So while SSA_NAME_RANGE_INFO is a
> linear thing, the cache lies somewhere between a logarithmic and an
> exponential factor based on the CFG size.
Hmmm, perhaps the ultimate goal here should be to convert the cache to
use vrange_storage, which uses trailing wide ints for all of the end
points plus the masks (known_ones included, for the next release).

> If you are growing the common case of 1 or 2 endpoints to more than
> double the size (and most of the time the extra room would not be
> needed), that would not be very appealing :-P.  If we have any wide
> ints, they would need to be a memory-efficient version.  The cache
> uses an irange_allocator, which is supposed to provide
> memory-efficient objects, hence why it trims the number of subranges
> down to only what is needed.  It seems like a trailing wide int might
> be in order based on that.
>
> Andrew
>
> PS. This will be more problematic if you eventually introduce a
> known_ones wide_int.  I thought the mask tracking was/could be
> something simple like a HOST_WIDE_INT, i.e. you would only track
> masks for types up to the size of a HOST_WIDE_INT.  Then storage and
> masking are all trivial, without going through a wide_int.  Is that
> not so/possible?

That's certainly easy and cheaper to do.  The hard part was fixing all
the places where we weren't keeping the masks up to date, and that's
done (sans any bugs ;-)).

Can we get consensus here on only tracking masks for types no wider
than a HOST_WIDE_INT?  I'd hate to do all the work only to realize we
need to track 512-bit masks on a 32-bit host cross :-).

Aldy