Re: More questions on points-to analysis

Richard Biener via Gcc Wed, 17 Mar 2021 03:51:26 -0700

On Wed, Mar 17, 2021 at 11:34 AM Erick Ochoa via Gcc <[email protected]> wrote:
>
> Hello,
>
> I'm still trying to compare the solution generated from the
> intraprocedural points-to analysis in GCC against an external solver.
>
> Yesterday it was pointed out that "NULL is not conservatively
> correctly represented in the constraints". Can someone expand on this?
> To me this sounds like a couple of things:
> * even though malloc may return NULL, NULL is not added to the
> points-to sets of whatever variable is on the left hand side of the
> malloc call.
> * the process in GCC that generates the constraints for NULL somehow
> does not generate enough constraints to treat NULL conservatively and
> therefore there might be points-to sets which should contain NULL but
> don't. (However, doesn't this mean that feeding the constraints to an
> external solver should still give the same answers?)


Yes, there is an unknown number of places that get this "wrong".  Getting
it "right" wasn't needed until we started using pt->null for anything.

> * the process in GCC that generates the constraints for NULL works
> fine (i.e., feeding the constraints generated by GCC to an external
> solver should yield a conservatively correct answer) but the process
> that solves the constraints relaxes the solutions for the NULL
> constraint variable (i.e., GCC has deviated from the constraint
> solving algorithm somehow)

No, that part should work OK.

> Also, "at some point we decided to encode optimistic info into
> pt->null which means points-to now has to compute a conservatively
> correct pt->null." Doesn't this contradict itself? How can pt->null
> be optimistic first and conservative now? Is what this is trying to
> say that:
>
> * NULL constraints were conservative first
> * pt->null optimistic first
> * Then conversion to SSA happened and NULL constraints became not
> conservatively represented in the constraints (effectively becoming
> somewhat optimistic)
> * To avoid NULL and pt->null both being unsafe, pt->null was changed to
> be conservative

The SSA points-to solution is what gets used; it is now populated not only
by PTA but also by range analysis, which may set pt->null to false.

PTA now simply sets pt->null to true as a conservative measure because
it cannot guarantee that when !pt->null the pointer is actually never NULL
(because of the above issues).
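
As a rough illustration of that division of labour, here is a small
self-contained model. It is not GCC code: PtInfo, pta_fill and
range_analysis_fill are hypothetical names, and the only behaviour taken
from the text above is that PTA always sets null to true while range
analysis may clear it.

```cpp
#include <cassert>

// Simplified stand-in for the per-pointer points-to info (hypothetical).
struct PtInfo {
  bool null = true;  // true: the pointer may be NULL
};

// Models PTA as described above: it never claims "not NULL".
void pta_fill(PtInfo &pi) { pi.null = true; }

// Models range analysis (assumed interface): it clears the flag only
// when it has proven the pointer to be non-NULL.
void range_analysis_fill(PtInfo &pi, bool proven_nonnull) {
  if (proven_nonnull)
    pi.null = false;
}

// A consumer can rely on null == false, but learns nothing from true.
bool may_be_null(const PtInfo &pi) { return pi.null; }
```

In this model an external solver that tracks NULL precisely would be
expected to disagree with the null flag whenever PTA alone filled it in.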

> I've been looking at find_what_var_points_to and have changed my code
> which verifies the constraint points-to sets. Basically, I now find
> which variables have been collapsed and only for "real" constraint
> pointer variables I take a look at the points to solution struct.
> Before looking into vars, I take a look at the fields and compare the
> null, anything, escape, etc, against the id of the pointee-variable.
> Checking vars is slightly confusing for me at the moment, since it
> appears that there are at least 3 plausible ways of validating the
> solution (I haven't actually gotten there because assertions are being
> triggered).
>
> ```
> for (auto &output : *orel) {
>        int from_i;
>        int to_i;
>
>       // Since find_what_var_points_to
>       // doesn't change the solution for collapsed
>       // variables, only verify the answer for the real ones.
>        varinfo_t from_var = get_varinfo(from_i);
>        varinfo_t vi = get_varinfo (find (from_i));
>        if (from_var->id != vi->id) continue;
>        if (!from_var->may_have_pointers) continue;
>
>        // compute the pt_solution
>        pt_solution solution = find_what_var_points_to (cfun->decl, from_var);
>
>        // pointee variable according to external analysis
>        varinfo_t vi_to = get_varinfo(to_i);
>
>        // Since some artificial variables are stored in fields instead
>        // of the bitset, assert based on field values.
>        // However you can see that I already had to disable some of the
>        // assertions.
>        if (vi_to->is_artificial_var)
>         {
>           if (vi_to->id == nothing_id)
>           {
>             gcc_assert(solution.null && vi_to->id == nothing_id);
>             continue;
>           }
>           else if (vi_to->id == escaped_id)
>             {
>               if (in_ipa_mode)
>               {
>                 gcc_assert(solution.ipa_escaped && vi_to->id == escaped_id);
>               }
>               else
>               {
>                 //gcc_assert(solution.escaped && vi_to->id == escaped_id);
>               }
>               continue;
>               /* Expand some special vars of ESCAPED in-place here. ??*/
>             }
>           // More...
>      }
>
>        if (solution.anything) continue;
>
>        bitmap vars = solution.vars;
>        if (!vars) continue;
>
>
>        if (dump_file) fprintf(dump_file, "SAME = %s\n",
> bitmap_bit_p(vars, DECL_PT_UID(vi_to->decl)) ? "true" : "false");
>        if (dump_file) fprintf(dump_file, "SAME2 = %s\n",
> bitmap_bit_p(vars, to_i) ? "true" : "false");
>        if (dump_file) fprintf(dump_file, "SAME3 = %s\n",
> bitmap_bit_p(from_var->solution, to_i) ? "true" : "false");
> ```
>
> Can someone help me figure out why even though I have a "real"
> variable and I compute its solution with the "find_what_var_points_to"
> method the solution does not have the fields that I expect to be set?
> (I would expect solution.escaped to be escaped if the pointee variable
> vi_to has an id = escaped_id).

solution.escaped is 1 if the pointer variable may point to everything
that escaped.  That is, the effective set is really
solution.vars | cfun->gimple_df->escaped (there is also ipa_escaped).
Have a look at pt_solution_includes_1 for an outline of how to interpret
the fields when asking whether DECL is a member of the points-to solution.
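
A minimal sketch of that interpretation, written as a standalone model
rather than actual GCC code. PtSolution, Decl and solution_includes are
hypothetical names; std::set stands in for the bitmap, ipa_escaped is left
out, and the check order only loosely follows pt_solution_includes_1.

```cpp
#include <cassert>
#include <set>

// Simplified stand-in for struct pt_solution (hypothetical model).
struct PtSolution {
  bool anything = false;    // may point to any object
  bool nonlocal = false;    // may point to any global
  bool escaped = false;     // also includes the function's ESCAPED set
  std::set<unsigned> vars;  // explicitly listed pointees, keyed by UID
};

// Hypothetical model of a declaration.
struct Decl {
  unsigned pt_uid;
  bool is_global;
};

// Membership test in the spirit of pt_solution_includes_1.  ESCAPED_SET
// plays the role of cfun->gimple_df->escaped.
bool solution_includes(const PtSolution &pt, const PtSolution &escaped_set,
                       const Decl &decl) {
  if (pt.anything)
    return true;
  if (pt.nonlocal && decl.is_global)
    return true;
  if (pt.vars.count(decl.pt_uid))
    return true;
  // The escaped flag means "union in the ESCAPED set"; the address check
  // keeps the model from recursing on the ESCAPED set itself.
  if (pt.escaped && &pt != &escaped_set
      && solution_includes(escaped_set, escaped_set, decl))
    return true;
  return false;
}
```

Under this reading, a pointee that only appears in the ESCAPED set is
found through the escaped flag, not through solution.vars, so checking
vars alone in a verifier would report a mismatch.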

>
> And also, how is the DECL_PT_UID different from the varinfo id field?
> Shouldn't they be the same? It seems that during
> "find_what_var_points_to" DECL_PT_UID is being used to set the bit in
> the bitmap, but in previous instances it was the varinfo id offset?

The "SSA" points-to sets contain DECL_UIDs, not varinfo IDs; the latter are
only transitional during points-to computation.  DECL_PT_UID preserves
the original UID even after inlining duplicates decls, so the points-to
solutions remain valid after inlining (or cloning, or IPA ICF, etc.).
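
To make the difference concrete, here is a small standalone model (not GCC
code; Decl, decl_pt_uid and copy_decl are hypothetical names). It assumes
the points-to UID falls back to the decl's own UID until a copy is made,
which is how I would sketch the idea behind DECL_PT_UID.

```cpp
#include <cassert>
#include <set>

// Hypothetical model of a decl: uid is unique per decl, while pt_uid
// (when set) remembers the UID of the decl it was copied from.
struct Decl {
  unsigned uid = 0;
  unsigned pt_uid = -1u;  // -1u means "not set, fall back to uid"
};

// Models the idea behind DECL_PT_UID: use pt_uid if set, else uid.
unsigned decl_pt_uid(const Decl &d) {
  return d.pt_uid == -1u ? d.uid : d.pt_uid;
}

// Models duplicating a decl during inlining or cloning: the copy gets a
// fresh uid but keeps the original's points-to UID.
Decl copy_decl(const Decl &orig, unsigned fresh_uid) {
  Decl copy;
  copy.uid = fresh_uid;
  copy.pt_uid = decl_pt_uid(orig);
  return copy;
}
```

A points-to set keyed by decl_pt_uid still matches the copy, whereas one
keyed by the fresh uid (or by a varinfo ID, which only exists while the
solver runs) would not.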

Richard.
