On Thursday, 18.04.2019, at 11:56 +0200, Richard Biener wrote:
> On Thu, Apr 18, 2019 at 11:31 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> > 
> > On Wed, Apr 17, 2019 at 4:12 PM Uecker, Martin
> > <martin.uec...@med.uni-goettingen.de> wrote:
> > > 
> > > On Wednesday, 17.04.2019, at 15:34 +0200, Richard Biener wrote:
> > > > On Wed, Apr 17, 2019 at 2:56 PM Uecker, Martin
> > > > <martin.uec...@med.uni-goettingen.de> wrote:

....
> > > Let's consider this example:
> > > 
> > > int x;
> > > int y;
> > > uintptr_t pi = (uintptr_t)&x;
> > > uintptr_t pj = (uintptr_t)&y;
> > > 
> > > if (pi + 4 == pj) {
> > > 
> > >    int* p = (int*)pj; // can be one-after pointer of 'x'
> > >    p[-1] = 1;         // well defined?
> > > }
> > > 
> > > If I understand correctly, a pointer obtained from
> > > pi + 4 would have an "anything" provenance (which is
> > > fine). But the pointer obtained from 'pj' would have the
> > > provenance of 'y' so the access to 'x' would not
> > > be allowed.
> > 
> > Correct.  This is the most difficult case for us to handle
> > exactly also because (also valid for the proposal?)
> > 
> > int x;
> > int y;
> > uintptr_t pi = (uintptr_t)&x;
> > uintptr_t pj = (uintptr_t)&y;
> > 
> > if (pi + 4 == pj) {
> > 
> >    int* p = (int*)(pi + 4); // can be one-after pointer of 'x'
> >    p[-1] = 1;         // well defined?
> > }
> > 
> > while well-handled by GCC in the written form (as you
> > say, pi + 4 yields "anything" provenance), GCC itself
> > may transform it into the first variant by noticing
> > the conditional equivalence and substituting pj for
> > pi + 4.

Integers are just integers in the proposal, so conditional
equivalence is not a problem for them. In my opinion this
is a strength of the proposal. Tracking provenance for
integers would mean that all computations would be affected
by such subtle semantic issues (where you cannot even
replace an integer by an equivalent one). In this
proposal this is limited to pointers where it at least
makes some sense.
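
As a minimal sketch of this point (the helper name pick_integer is
hypothetical and not part of the proposal): when integers carry no
provenance, both branches below return the same value, so a compiler
may substitute pj for pi + sizeof(int) under the condition without
changing the meaning of the program.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Both return statements yield the same integer value, so replacing
 * one expression by the other (conditional equivalence) is harmless
 * as long as integers are "just integers". */
static uintptr_t pick_integer(uintptr_t pi, uintptr_t pj)
{
    if (pi + sizeof(int) == pj)
        return pj;               /* known equal to pi + sizeof(int) here */
    return pi + sizeof(int);
}
```

The difficulty discussed in this thread only starts once such an
integer is converted back to a pointer.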

> > > But according to the preferred version of
> > > our proposal, the pointer could also be used to
> > > access 'x' because it is also exposed.
> > > 
> > > GCC could make pj have an "anything" provenance
> > > even though it is not modified. (This would break
> > > some optimization such as the one for Matlab.)
> > > 
> > > Maybe one could also refine this optimization to check
> > > for additional conditions which rule out the case
> > > that there is another object the pointer could point
> > > to.
> > 
> > The only feasible solution would be to not track
> > provenance through non-pointers and make
> > conversions of non-pointers to pointers have
> > "anything" provenance.

This would be one solution, yes. But you could
reattach the same provenance if you know that the
pointer points in the middle of an object (so is
not a first or one-after pointer) or if you know
that there is no exposed object directly adjacent
to this object, etc.

> > The additional issue that appears here though
> > is that we cannot even turn (int *)(uintptr_t)p
> > into p anymore since with the conditional
> > substitution we can then still arrive at
> > effectively (&y)[-1] = 1 which is of course
> > undefined behavior.
> > 
> > That is, your proposal makes
> > 
> >  ((int *)(uintptr_t)&y)[-1] = 1
> > 
> > well-defined (if &y - 1 == &x) but keeps
> > 
> >   (&y)[-1] = 1
> > 
> > as undefined which strikes me as a little bit
> > inconsistent.  If that's true it's IMHO worth
> > a defect report and second consideration.

This is true. But I would not call it inconsistent.
It is just unusual if you expect that casts to integers
and back are no-ops.  In this proposal a round-trip has
the effect of stripping the original provenance and
attaching a new one (which could be the same as the
old one).
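
A minimal sketch of the unambiguous case, assuming the round-trip
semantics described above (the function name is hypothetical): the
pointer refers to the middle of an array, so the provenance attached
after the round trip can only be that of the same array.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * The caller must pass a pointer to an array element that is not the
 * first element, so that q[-1] stays inside the same object. */
static int write_previous_via_roundtrip(int *elem, int value)
{
    uintptr_t bits = (uintptr_t)elem; /* cast to integer: object is exposed */
    int *q = (int *)bits;             /* cast back: provenance is attached anew */
    q[-1] = value;                    /* access within the same array */
    return q[-1];
}
```

Called with &a[2] for an array int a[3], this writes a[1]. The
controversial examples in this thread differ in that the converted
address sits exactly at the boundary between two objects.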

While in this specific scenario this might seem
unreasonable, there are other examples where you may
want to be able to get from one object to another,
and using casts to integers would then be the
blessed way to express this.

In my opinion, this is also intuitive: 
By casting to an integer one then gets simple discrete
pointer semantics where one does not have provenance.


> Similarly that
> 
> int x;
> int y;
> uintptr_t pj = (uintptr_t)&y;
> 
> if (&x + 1 == &y) {
> 
>    int* p = (int*)pj; // can be one-after pointer of 'x'
>    p[-1] = 1;         // well defined?
> }
> 
> is undefined but when I add a no-op
> 
>  (uintptr_t)&x;
> 
> it is well-defined is undesirable.  Can this no-op
> stmt appear in another function?  Or even in
> another translation unit (if x and y are global variables)?
> And does such stmt have to be present (in another
> TU) to make the example valid in this case?

Without that statement, the example is not valid as the
address of 'x' is not exposed. With the statement this
becomes valid and it does not matter where this statement
appears. Again, I agree that the fact that such a statement
has a side effect is something one needs to get used to.

But address-taking already has a side effect which could be
surprising, doesn't it? If I understood your answer
above correctly, for GCC you get this side-effect already
without the cast:

&x;


For the statement to appear elsewhere, the address must
escape first. I would expect a compiler to treat a
cast to an integer identically to an escaped address.

> To me all this makes requiring exposal through a cast
> to a non-pointer (or accessing its representation) not
> in any way more "useful" for an optimizing compiler than
> modeling exposal through address-taking.

There would be a difference for cases like this:

int x[3];
int y;

x[0] = 1; 
uintptr_t pj = (uintptr_t)&y;

if (pi + 4 == pj) {

  int* p = (int*)(pi + 4);
  p[-1] = 1; 
}

Here 'x' is not exposed in our proposal, so the assignment
via 'p' is invalid, even though the address of 'x' is taken
implicitly (x[0] decays the array to a pointer).

Other examples are storage allocated via malloc/alloca,
where there is always a pointer involved but which is
not automatically exposed in our proposal.
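
A minimal sketch of the malloc case, assuming the exposure rule as
described in this mail (the function name is hypothetical): the
storage is reachable through a pointer from the start, but it only
counts as exposed once that pointer is cast to an integer.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only.
 * Returns 3 on success, -1 if the allocation fails. */
static int roundtrip_heap(void)
{
    int *buf = malloc(2 * sizeof *buf);
    if (buf == NULL)
        return -1;

    buf[0] = 1;                      /* ordinary pointer use: not exposed yet */

    uintptr_t bits = (uintptr_t)buf; /* the cast is what exposes the storage */
    int *again = (int *)bits;        /* provenance attached from the exposed object */
    again[1] = 2;

    int sum = buf[0] + again[1];
    free(buf);
    return sum;
}
```

Without the cast to uintptr_t, a compiler following the proposal could
assume that no pointer created from an integer refers to this storage.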


Best,
Martin


