Richard Kenner wrote:
There are many cases when you can prove the value can be treated as valid. One interesting case is based on the fact that suppressing a language-defined check is erroneous if that check would fail. So, for A := B + 1; you *can* assume A is valid.
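To make the quoted rule concrete, here is a minimal, hypothetical Ada sketch (the procedure and variable names are invented for illustration) of why a compiler is formally entitled to treat A as valid after such an assignment:

```ada
--  Hypothetical illustration of the rule quoted above. If a
--  suppressed check would have failed, execution is erroneous
--  (see RM 11.5); otherwise the result is a valid Integer.
procedure Demo is
   pragma Suppress (Overflow_Check);
   B : Integer := Integer'Last;  --  worst case: B + 1 would overflow
   A : Integer;
begin
   A := B + 1;
   --  Either the suppressed overflow check would not have failed,
   --  in which case A is a valid Integer, or the program is already
   --  erroneous, in which case anything is allowed. In both cases
   --  the compiler may assume A is valid from here on.
end Demo;
```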
Yes, but I don't think it is a good idea to do so. I think you get unexpected behavior if you make this kind of assumption, and no one has demonstrated a sufficient optimization advantage to make that unexpected behavior worthwhile. In my opinion, the compiler should not assume validity unless it can prove that the value is actually in the declared range.
So the "proper" way of dealing with this would be to have VRP propagate a "valid" property using fairly easy-to-state rules and then only assume a value within the subtype bounds if it can prove it to be valid (otherwise, the type bounds must be used). We've thought about doing this, but it's a lot of work and the benefit isn't clear. From a theoretical point of view, the more an optimizer knows about a program, the better it can do and knowing that valid values are in specified ranges certainly conveys some information. One could also argue that such information is even more valuable in an IPA context. But from a practical point of view, it's much less clear. How much does the range information actually buy in practice? In how many cases do we know that a variable will always be valid? Might there be something about that subset that makes the range information less useful for it? We just don't know the answers to these questions.
Right, although what you say is theoretically true, I doubt it will make a difference in performance in practice, and it will surely lead to surprises for the programmer.
But in the absence of doing that, the present situation is problematic. It's true that only the most pedantic programmer would care about the distinction between a "bounded error" and "erroneous behavior",
I disagree. The rules in Ada 95 are formulated so that programmers do NOT see unexpected behavior from, e.g., uninitialized values: if such a value is invalid, the program behaves in some reasonable way and is NOT erroneous, and that's definitely important.
so one could argue that this problem isn't that serious, but there do exist such people in the Ada world. Similarly, it's possible to find some solutions to the "test for validity" case that don't involve the full solution above, but they also require work. The existence of both problems together suggests that the present situation isn't workable, and it's not clear that it's worth fixing "properly" at this time.
The current situation (which this change corrects) is not only theoretically wrong, but I think is pragmatically undesirable, in
that it can indeed result in unexpected results. If you have

   X, Y : Integer range 1 .. 10;

and X has the value 11 and Y has the value 10, then when you write

   if X > Y then ...

you expect this to be true. Formal Ada semantics says that it MUST be true if X is merely invalid (e.g. results from an uninitialized variable). Richard is pointing out that it is allowed to be false if X is abnormal, because the program is erroneous in this case (e.g. if the value in X comes from a missed suppressed overflow check). Yes, true, but it is still 100% undesirable to give anything other than true here.

Once again, my view is that the compiler should not assume that variables are in the required range unless it can prove that they are actually in that range; it is not good enough to prove they are valid in the absence of erroneousness.

In practice, gcc will often be able to deduce the actual ranges anyway. For example, if you write (for the above declarations)

   X := X + 1;
   Y := Y + 1;

and you have checks enabled, then the checks will inform gcc that the variables are now in the range 1 .. 10.

The kind of case Richard is talking about is the following:

   subtype R is Integer range 1 .. 10;
   X : R := 1;
   A : array (R) of Integer;

   X := X + 1;
   A (X) := 10;  --  can we assume X is in range 1 .. 10 and avoid the check?

If checks are enabled for both assignments, the answer is yes, since the first assignment will do the check. Currently the front end will generate a check for the array subscript, but gcc should be able to eliminate this second check, based on the knowledge from the first check.

If checks are disabled for both assignments, the answer is also yes, since the second assignment is erroneous if X is out of range (and I think any programmer would expect random memory destruction in this case, as we would get in C with no checks).

The one case where Richard's argument holds theoretically is when checks are disabled for the first assignment and enabled for the second assignment.
In this case, theoretically the compiler can omit the check in the second assignment, since it can determine that either X is in range or the program is erroneous. However, I think it is not a good idea to take advantage of this permission:

   a) in any case it will happen very rarely in practice;
   b) there is no demonstration that this will be worthwhile;
   c) it leads to unexpected behavior.

I think programmers expect a check to be present if checks are enabled for the second assignment.
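The mixed case described above can be sketched as follows; this is a hypothetical illustration (procedure name and values invented), assuming the usual pragma Suppress granularity over a local declare block:

```ada
--  Hypothetical sketch of the one case where the permission applies:
--  checks suppressed for the first assignment, enabled for the second.
procedure Mixed is
   subtype R is Integer range 1 .. 10;
   X : R := 10;
   A : array (R) of Integer;
begin
   declare
      pragma Suppress (Range_Check);
   begin
      X := X + 1;  --  no check: X is now 11, outside R, and since the
                   --  suppressed check would have failed, the program
                   --  is erroneous
   end;
   A (X) := 10;    --  checks enabled here: formally the compiler may
                   --  omit this check (either X is in range or the
                   --  program is already erroneous), but the position
                   --  argued above is that the check should be kept
end Mixed;
```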
Because this decision may be revisited and it may be found worthwhile to add the mechanism above and "do this right", it's important that we not remove code that supports the ranges unless absolutely necessary, because doing so would greatly increase the amount of work needed to do this right and thus make it even less likely. (And, in any event, these types *are* needed for array bounds, so must be supported at some level.)
*that* seems reasonable
