Richard Kenner wrote:
There are many cases when you can prove the value can be treated as valid.
One interesting case is based on the fact that suppressing a
language-defined check is erroneous if that check would fail. So, for
A := B + 1;
you *can* assume A is valid.
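For concreteness, here is a minimal sketch of that kind of situation,
assuming the check is removed with pragma Suppress (the procedure name
and subtype are only illustrative, not from the original message):

   procedure Demo is
      pragma Suppress (Range_Check);
      subtype Small is Integer range 1 .. 10;
      B : Small := 10;
      A : Small := 1;
   begin
      A := B + 1;
      --  The suppressed range check on this assignment would fail, so
      --  execution is erroneous; that is why the compiler is formally
      --  allowed to assume A is valid afterwards.
   end Demo;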
Yes, but I don't think it is a good idea to do so. I think you get
unexpected behavior if you make this kind of assumption, and I do
not think anyone has demonstrated a sufficient optimization advantage
to make that unexpected behavior worthwhile. In my opinion, the
compiler should not assume validity unless it can prove that the
value is actually in the declared range.
So the "proper" way of dealing with this would be to have VRP propagate a
"valid" property using fairly easy-to-state rules and then only assume a
value within the subtype bounds if it can prove it to be valid (otherwise,
the type bounds must be used).
We've thought about doing this, but it's a lot of work and the benefit
isn't clear. From a theoretical point of view, the more an optimizer knows
about a program, the better it can do, and knowing that valid values are in
specified ranges certainly conveys some information. One could also argue
that such information is even more valuable in an IPA context.
But from a practical point of view, it's much less clear. How much does
the range information actually buy in practice? In how many cases do we
know that a variable will always be valid? Might there be something about
that subset that makes the range information less useful for it? We just
don't know the answers to these questions.
Right. Although what you say is theoretically true, I doubt it will
make a difference in performance in practice, and it will surely lead
to surprises for the programmer.
But in the absence of doing that, the present situation is problematic.
It's true that only the most pedantic programmer would care about the
distinction between a "bounded error" and "erroneous behavior",
I disagree. The rules in Ada 95 are formulated so that programmers do
NOT see unexpected behavior from, e.g., uninitialized values: if such
a value is invalid, the program behaves in some reasonable way and
is NOT erroneous, and that's definitely important.
so one
could argue that this problem isn't that serious, but there do exist such
people in the Ada world. Similarly, it's possible to find some solutions
to the "test for validity" case that don't involve the full solution above,
but they also require work. But the existence of both problems together
suggests that the present situation isn't workable, and it's not clear
that it's worth fixing "properly" at this time.
The current situation (which this change corrects) is not only
theoretically wrong, but I think it is pragmatically undesirable, in
that it can indeed produce unexpected results.
If you have
X, Y : Integer range 1 .. 10;
and X has the value 11 and Y has the value 10
then when you write
if X > Y then ...
you expect this to be true.
Formal Ada semantics says that it MUST be true if X is merely invalid
(e.g. it results from an uninitialized variable). Richard is pointing out
that it is allowed to be false if X is abnormal, because in that case
the program is erroneous (e.g. if the value in X comes from a suppressed
overflow check that would have failed).
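As an aside, a small sketch of how an invalid (but not abnormal) value
can arise, using the uninitialized-variable case mentioned above; the
names are only illustrative:

   procedure Invalid_Demo is
      subtype S is Integer range 1 .. 10;
      X : S;         --  never assigned: may hold an invalid value such as 11
      Y : S := 10;
   begin
      if X > Y then  --  reading X is a bounded error, not erroneous; the
         null;       --  question above is whether the compiler may assume
      end if;        --  X <= 10 here and fold the test to False
   end Invalid_Demo;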
Yes, true, but it is still 100% undesirable to give anything other
than True here.
Once again, my view is that the compiler should not assume that
variables are in the required range unless it can prove that they
are actually in that range; it is not good enough to prove they
are valid in the absence of erroneousness.
In practice, gcc will often be able to deduce the actual ranges
anyway, e.g. if you write
(for the above declarations)
X := X + 1;
Y := Y + 1;
and you have checks enabled, then the checks will inform gcc that
the variables are now in the range 1 .. 10.
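Roughly speaking (this is only an illustration of the idea, not the
exact expansion the front end produces), the checked assignment to X
behaves like:

   declare
      T : constant Integer := X + 1;
   begin
      if T not in 1 .. 10 then
         raise Constraint_Error;
      end if;
      X := T;
   end;

so on the path where no exception is raised, the optimizer can see
that X is again in 1 .. 10.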
The kind of case Richard is talking about is the following:
subtype R is Integer range 1 .. 10;
X : R := 1;
A : array (R) of Integer;
X := X + 1;
A (X) := 10;
-- can we assume X is in range 1 .. 10 and avoid the check?
If checks are enabled for both assignments, the answer is
yes, since the first assignment will do the check. Currently
the front end will generate checks for the array subscript,
but gcc should be able to eliminate the second check, based
on the knowledge from the first check.
If checks are disabled for both assignments, the answer is
yes, since the second assignment is erroneous if X is out
of range (and I think any programmer would expect random
memory destruction in this case, as we would get in C
with no checks).
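A sketch of this checks-disabled case, assuming the checks are removed
with a local pragma Suppress (compiling with -gnatp would have a
similar effect); the identifiers are those of the example above:

   declare
      pragma Suppress (Range_Check);
      pragma Suppress (Index_Check);
      subtype R is Integer range 1 .. 10;
      X : R := 1;
      A : array (R) of Integer;
   begin
      X := X + 1;    --  no range check on the assignment
      A (X) := 10;   --  no index check; erroneous if X is out of range
   end;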
The one case where Richard's argument holds theoretically is when
checks are disabled for the first assignment and enabled for the
second assignment.
In this case the compiler can theoretically omit the check on the
second assignment, since it can determine that either X is in range
or the program is erroneous.
However, I think it is not a good idea to take advantage
of this permission:
a) in any case, it will happen very rarely in practice;
b) there is no demonstration that it will be worthwhile;
c) it leads to unexpected behavior.
I think programmers expect a check to be present if
checks are enabled for the second assignment.
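A sketch of that mixed case, assuming the suppression is scoped to the
first assignment with a nested block (again purely illustrative):

   declare
      subtype R is Integer range 1 .. 10;
      X : R := 1;
      A : array (R) of Integer;
   begin
      declare
         pragma Suppress (Range_Check);
      begin
         X := X + 1;   --  unchecked; erroneous if the result were out of range
      end;
      A (X) := 10;     --  a check is requested here; the permission discussed
                       --  above would allow the compiler to omit it
   end;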
Because this decision may be revisited and it may be found worthwhile to
add the mechanism above and "do this right", it's important that we not
remove code that supports the ranges unless absolutely necessary, because
doing so would greatly increase the amount of work needed to do this
right and thus make it even less likely. (And, in any event, these types
*are* needed for array bounds, so must be supported at some level.)
*that* seems reasonable