On 10/4/22 08:13, Aldy Hernandez via Gcc-patches wrote:
On Tue, Oct 4, 2022, 13:28 Aldy Hernandez <al...@redhat.com> wrote:
On Tue, Oct 4, 2022 at 9:55 AM Richard Biener
<richard.guent...@gmail.com> wrote:
Am 04.10.2022 um 09:36 schrieb Aldy Hernandez via Gcc-patches <
gcc-patches@gcc.gnu.org>:
The reason the nonzero mask was kept in a tree was basically inertia,
as everything in irange is a tree. However, there's no need to keep
it in a tree, as the conversions to and from wide ints are very
annoying. That, plus special casing NULL masks to be -1 is prone
to error.
I have not only rewritten all the uses to assume a wide int, but
have corrected a few places where we weren't propagating the masks, or
rather pessimizing them to -1. This will become more important in
upcoming patches where we make better use of the masks.
Performance testing shows a trivial improvement in VRP, as things like
irange::contains_p() are tied to a tree. Ughh, can't wait for trees in
iranges to go away.
You want trailing wide int storage though. A wide_int is quite large.
Absolutely, this is only for short term storage. Any time we need
long term storage, say global ranges in SSA_NAME_RANGE_INFO, we go
through vrange_storage which will stream things in a more memory
efficient manner. For irange, vrange_storage will stream all the
sub-ranges, including the nonzero bitmask which is the first entry in
such storage, as trailing_wide_ints.
See irange_storage_slot to see how it lives in GC memory.
That being said, the ranger's internal cache uses iranges, albeit with a
squished down number of subranges (the minimum amount to represent the
range). So each cache entry will now be bigger by the difference between
one tree and one wide int.
I wonder if we should change the cache to use vrange_storage. If not now,
then when we convert all the subranges to wide ints.
Of course, the memory pressure of the cache is not nearly as problematic as
SSA_NAME_RANGE_INFO. The cache only stores names it cares about.
Rangers cache can be a memory bottleneck in pathological cases..
Certainly not as bad as it use to be, but I'm sure it can still be
problematic. Its suppose to be a memory efficient representation
because of that. The cache can have an entry for any live ssa-name
(which means all of them at some point in the IL) multiplied by a factor
involving the number of dominator blocks and outgoing edges ranges are
calculated on. So while SSA_NAME_RANGE_INFO is a linear thing, the
cache lies somewhere between a logarithmic and exponential factor based
on the CFG size.
if you are growing the common cases of 1 to 2 endpoints to more than
double in size (and most of the time not be needed), that would not be
very appealing :-P If we have any wide-ints, they would need to be a
memory efficient version. The Cache uses an irange_allocator, which is
suppose to provide a memory efficient objects.. hence why it trims the
number of ranges down to only what is needed. It seems like a trailing
wide-Int might be in order based on that..
Andrew
PS. which will be more problematic if you eventually introduce a
known_ones wide_int. I thought the mask tracking was/could be
something simple like HOST_WIDE_INT.. then you only tracks masks in
types up to the size of a HOST_WIDE_INT. then storage and masking is
all trivial without going thru a wide_int. Is that not so/possible?
Aldy