On Fri, Mar 19, 2021 at 11:29 AM Erick Ochoa <eoc...@gcc.gnu.org> wrote:
>
> Hello Richard,
>
> I have a related question. I have the following GCC program (which is
> again an altered version of ipa-pta-16.c:
>
> ```
>
> struct X
> {
>   long l1;
>   struct Y
>     {
>       long l2;
>       int *p;
>     } y;
> };
> int i;
> static int __attribute__((noinline))
> foo (struct X *x)
> {
>   struct Y y = x->y;
>   *y.p = 0;
>    //i = 1;
>   return *y.p;
> }
> extern void abort (void);
> int main()
> {
>   struct X x;
>   struct X t;
>   x.y.p = &i;
>   long* e = &t.l1;
>   if (foo(&x) != 1)
>     abort ();
>   return e;
> }
> ```
>
> I am compiling it with gcc -O2 -ftree-pta -fdump-tree-alias
> -ftree-salias ipa-pta-16.c
>
> and I get the following constraints for function foo.
>
> ```
> ANYTHING = &ANYTHING
> ESCAPED = *ESCAPED
> ESCAPED = ESCAPED + UNKNOWN
> *ESCAPED = NONLOCAL
> NONLOCAL = &NONLOCAL
> NONLOCAL = &ESCAPED
> INTEGER = &ANYTHING
> ISRA.4 = &NONLOCAL
> derefaddrtmp(9) = &NULL
> *ISRA.4 = derefaddrtmp(9)
> ```
>
> I am just ordering to make this easier:
>
> ```
> ANYTHING = &ANYTHING
> NONLOCAL = &NONLOCAL
> NONLOCAL = &ESCAPED
> INTEGER = &ANYTHING
> ISRA.4 = &NONLOCAL
> derefaddrtmp(9) = &NULL
>
> ESCAPED = ESCAPED + UNKNOWN
>
> ESCAPED = *ESCAPED
> *ESCAPED = NONLOCAL
> *ISRA.4 = derefaddrtmp(9)
> ```
>
> I now construct a constraint graph. It is basically a collection of
> unconnected nodes, except for ESCAPED which points to itself.
>
> I then walk over this graph and first encounter the constraint
> "ESCAPED = ESCAPED + UNKNOWN". Since Sol(ESCAPED) is the empty set
> nothing happens.
> Similar for the constraints "ESCAPED = *ESCAPED" and "*ESCAPED = NONLOCAL".
>
> I then walk to "*ISRA.4 = derefaddrtmp(9)". Sol(ISRA.4) = { NONLOCAL
> }. Then there's possibly an edge made from NONLOCAL to derefaddrtmp.
> (I am not sure of the implementation, since NONLOCAL doesn't change it
> might not require an added edge.). And because NONLOCAL is a special
> variable, Sol(NONLOCAL) does not change.
>
> Now, none of the solutions have changed. We have possibly added an
> edge to NONLOCAL, so let's just process the graph once more. Again
> Sol(ESCAPED) is still empty so "ESCAPED = ESCAPED + UNKNOWN", "ESCAPED
> = *ESCAPED", and "*ESCAPED = NONLOCAL" should have no effect ."Then
> *ISRA.4 = derefaddrtmp(9)" also has no effect since nothing can
> happen.
>
> My question here is, why if we look at the points-to sets dumped by GCC we see
>
> ESCAPED = { NULL }

That's likely because ISRA points to global memory (NONLOCAL) and thus
all stores to it, like *ISRA.4 = derefaddrtmp(9), escape.  That detail
is handled
in the solver in do_ds_constraint:

             /* If v is a global variable then this is an escape point.  */
              if (v->is_global_var
                  && !escaped_p)
                {
                  t = find (escaped_id);
                  if (add_graph_edge (graph, t, rhs)
                      && bitmap_ior_into (get_varinfo (t)->solution, sol))
                    bitmap_set_bit (changed, t);
                  /* Enough to let rhs escape once.  */
                  escaped_p = true;
                }

which is an artifact on how we have split NONLOCAL and ESCAPED
(for the benefit of getting some flow sensitivity, function start and later).

You can also see special-casing for stores to ANYTHING in this
function.

> Here is the full points-to sets:
>
> computed for foo:
>
> ```
> ANYTHING = { ANYTHING }
> ESCAPED = { NULL }
> NONLOCAL = { ESCAPED NONLOCAL }
> STOREDANYTHING = { }
> INTEGER = { ANYTHING }
> ISRA.4 = { NONLOCAL }
> derefaddrtmp(9) = { NULL }
> foo.constprop.0.isra.0 = { }
> ```
>
> I think this might have to do with "+ UNKNOWN" but I am unsure about
> its semantics in the constraint set.
>
> Thanks, any help is appreciated!
>
>
> On Thu, 18 Mar 2021 at 13:36, Erick Ochoa <eoc...@gcc.gnu.org> wrote:
> >
> > Perfect, thank you! This is exactly the answer that I was looking for.
> >
> >
> > On Thu, 18 Mar 2021 at 13:27, Richard Biener <richard.guent...@gmail.com> 
> > wrote:
> > >
> > > On Wed, Mar 17, 2021 at 4:17 PM Erick Ochoa <eoc...@gcc.gnu.org> wrote:
> > > >
> > > > Hi Richard, I think I misunderstood yesterday's answer and deviated a
> > > > little bit. But now I want to focus on this:
> > > >
> > > > > > * the process in GCC that generates the constraints for NULL works
> > > > > > fine (i.e., feeding the constraints generated by GCC to an external
> > > > > > solver should yield a conservatively correct answer) but the process
> > > > > > that solves the constraints relaxes the solutions for the NULL
> > > > > > constraint variable (i.e., GCC has deviated from the constraint
> > > > > > solving algorithm somehow)
> > > > >
> > > > > No, that part should work OK.
> > > > >
> > > >
> > > > So, let's ignore the other solver for now and instead focus on the
> > > > concrete example I presented on the previous email. If GCC is solving
> > > > these constraints:
> > > >
> > > > ```
> > > > ANYTHING = &ANYTHING
> > > > ESCAPED = *ESCAPED
> > > > ESCAPED = ESCAPED + UNKNOWN
> > > > *ESCAPED = NONLOCAL
> > > > NONLOCAL = &NONLOCAL
> > > > NONLOCAL = &ESCAPED
> > > > INTEGER = &ANYTHING
> > > > ISRA.4 = &NONLOCAL
> > > > derefaddrtmp(9) = &NULL
> > > > *ISRA.4 = derefaddrtmp(9)
> > > > i = NONLOCAL
> > > > i = &NONLOCAL
> > > > ESCAPED = &NONLOCAL
> > > > _2 = *ISRA.4
> > > > ```
> > > >
> > > > What would a hand calculated solution gives us vs what was the
> > > > solution given by GCC?
> > > >
> > > > I am following the algorithm stated on Section 3.3 of Structure
> > > > Aliasing in GCC, and I will be ignoring the ESCAPED = ESCAPED +
> > > > UNKNOWN constraint since there isn't any other field offset that needs
> > > > to be calculated.
> > > >
> > > > First, I want to make some adjustments. I am going to be using "=" to
> > > > signify the \supseteq symbol and I will be adding curly braces to
> > > > specify the element in a set as opposed to the whole set. Therefore
> > > > the constraints will now become (ordered slightly differently):
> > > >
> > > > ```
> > > > ====direct contraints========
> > > > ANYTHING = { ANYTHING }
> > > > ESCAPED = { NONLOCAL }
> > > > NONLOCAL = { NONLOCAL }
> > > > NONLOCAL =  { ESCAPED }
> > > > INTEGER = { ANYTHING }
> > > > ISRA.4 = { NONLOCAL }
> > > > derefaddrtmp(9) = { NULL }
> > > > i = { NONLOCAL }
> > > >
> > > > ====complex constraints======
> > > > ESCAPED = *ESCAPED
> > > > *ESCAPED = NONLOCAL
> > > > *ISRA.4 = derefaddrtmp(9)
> > > > _2 = *ISRA.4
> > > >
> > > > ===== copy-constraints======
> > > > ESCAPED = ESCAPED // again ignoring the + UNKNOWN since I don't think
> > > > it will matter...
> > > > i = NONLOCAL
> > > > ```
> > > >
> > > > Solution sets are basically the direct constraints at the moment.
> > > >
> > > > Let's now create the graph
> > > >
> > > > 1. node ESCAPED has an edge going to itself (due to the copy constraint)
> > > > 2. node ISRA.4 has no outgoing copy edges
> > > > 3. node derefaddrtmp(9) has no outgoing edges
> > > > 4. node _2 has no outgoing edges
> > > > 5. node i has an outgoing edge to NONLOCAL (due to the copy constraint)
> > > > 6. node NONLOCAL has no outgoing edge
> > > >
> > > > Now, we can iterate over this set of nodes
> > > >
> > > > 1. Walking over node ESCAPED. Sol(ESCAPED) = {NONLOCAL}. There are no
> > > > edges, but it has complex-constraints. Let's modify the graph.
> > > >   1. Looking at ESCAPED = *ESCAPED we note that we need to add a copy
> > > > edge from ESCAPED to NONLOCAL.
> > > >   2. Looking at *ESCAPED = NONLOCAL we note that we need to add a copy
> > > > edge from NONLOCAL to NONLOCAL
> > > >
> > > > The graph is now transformed to
> > > >
> > > > 1. node ESCAPED has an edge going to ESCAPED and NONLOCAL
> > > > 2. node ISRA.4 has no outgoing copy edges
> > > > 3. node derefaddrtmp(9) has no outgoing edges
> > > > 4. node _2 has no outgoing edges
> > > > 5. node i has an outgoing edge to NONLOCAL (due to the copy constraint)
> > > > 6. node NONLOCAL has an edge going to NONLOCAL
> > > >
> > > > The solution set of escaped is now Sol(ESCAPED) = Sol(ESCAPED) U
> > > > Sol(NONLOCAL) = {NONLOCAL, ESCAPED}
> > > >
> > > > Now we continue walking
> > > >
> > > > 2. Walking over node ISRA.4. It has the solution set { NONLOCAL }.
> > > > There are no edges, but it has complex-constraints. Let's modify the
> > > > graph.
> > > >   1. Looking at *ISRA.4 = derefaddrtmp(9), we note that we need to add
> > > > a copy edge from NONLOCAL to derefaddrtmp(9).
> > > >
> > > > The graph is now transformed to
> > > >
> > > > 1. node ESCAPED has an edge going to ESCAPED and NONLOCAL
> > > > 2. node ISRA.4 has no outgoing copy edges
> > > > 3. node derefaddrtmp(9) has no outgoing edges
> > > > 4. node _2 has no outgoing edges
> > > > 5. node i has an outgoing edge to NONLOCAL (due to the copy constraint)
> > > > 6. node NONLOCAL has an edge going to NONLOCAL, derefaddrtmp(9)
> > > >
> > > > The Sol(NONLOCAL) = Sol(NONLOCAL) U Sol(derefaddrtmp(9)) = {NONLOCAL,
> > > > ESCAPED, NULL}.
> > > >
> > > > Now I could continue, but here is already something that is not shown
> > > > in the points-to sets in the dump. It shows that
> > > >
> > > > NONLOCAL = {NONLOCAL, ESCAPED, NULL}
> > > >
> > > > Looking at the data that I showed yesterday:
> > > >
> > > > ```
> > > > NONLOCAL = { ESCAPED NONLOCAL } same as i
> > > > ```
> > > >
> > > > we see that NULL is not in the solution set of NONLOCAL.
> > > >
> > > > Now, yesterday you said that NULL is not conservatively correctly
> > > > represented in the constraints. You also said today the points-to
> > > > analysis should be solving the constraints fine. What I now understand
> > > > from this is that while NULL may be pointed to by some constraints, it
> > > > doesn't mean that not being on the set means that a pointer will not
> > > > point to NULL. However, it should still be shown in the dumps where
> > > > the points-to sets are shown for the constraint variables since it is
> > > > solved using the same analysis? Is this correct? Am I doing the points
> > > > to analysis by hand wrong somehow? Why would NULL not be in
> > > > Sol(NONLOCAL) if it is solving the same constraints that I am solving
> > > > by hand?
> > >
> > > Because NONLOCAL is a special var and those do not get "solved".
> > > Their points-to set is fixed.  NONLOCAL is the set of "global"
> > > variables _at function entry_ (otherwise it would be the same as
> > > ESCAPED).
> > >
> > > Richard.

Reply via email to