On Wed, Mar 17, 2021 at 11:34 AM Erick Ochoa via Gcc <gcc@gcc.gnu.org> wrote: > > Hello, > > I'm still trying to compare the solution generated from the > intraprocedural points-to analysis in GCC against an external solver. > > Yesterday it was pointed out that "NULL is not conservatively > correctly represented in the constraints". Can someone expand on this? > To me this sounds like a couple of things: > * even though malloc may return NULL, NULL is not added to the > points-to sets of whatever variable is on the left hand side of the > malloc call. > * the process in GCC that generates the constraints for NULL somehow > does not generate enough constraints to treat NULL conservatively and > therefore there might be points-to sets which should contain NULL but > don't. (However, doesn't this mean that feeding the constraints to an > external solver should still give the same answers?)
Yes, there is an unknown number of places that get this "wrong". Getting it "right" wasn't needed until we started using pt->null for anything. > * the process in GCC that generates the constraints for NULL works > fine (i.e., feeding the constraints generated by GCC to an external > solver should yield a conservatively correct answer) but the process > that solves the constraints relaxes the solutions for the NULL > constraint variable (i.e., GCC has deviated from the constraint > solving algorithm somehow) No, that part should work OK. > Also, "at some point we decided to encode optimistic info into > pt->null which means points-to now has to compute a conservatively > correct pt->null." Doesn't this contradict itself? How is a pt->null > first optimistically and now conservatively? Is what this is trying to > say that: > > * NULL constraints were conservative first > * pt->null optimistic first > * Then conversion to SSA happened and NULL constraints became not > conservatively represented in the constraints (effectively becoming > somewhat optimistic) > * To avoid NULL and pt->null be both unsafe, pt->null was changed to > be conservative The SSA points-to solution is what gets used, it is now populated not only by PTA but also by range analysis which eventually sets pt->null to false. PTA now simply sets pt->null to true as a conservative measure because it cannot guarantee that when !pt->null the pointer is actually never NULL (because of the above issues). > I've been looking at find_what_vars_points_to and have changed my code > which verifies the constraint points-to sets. Basically, I now find > which variables have been collapsed and only for "real" constraint > pointer variables I take a look at the points to solution struct. > Before looking into vars, I take a look at the fields and compare the > null, anything, escape, etc, against the id of the pointee-variable. > Checking vars is slightly confusing for me at the moment, since it > appears that there are at least 3 plausible ways of validating the > solution (I haven't actually gotten there because assertions are being > triggered). > > ``` > for (auto &output : *orel) { > int from_i; > int to_i; > > // Since find_what_var_points_to > // doesn't change the solution for collapsed > // variables, only verify the answer for the real ones. > varinfo_t from_var = get_varinfo(from_i); > varinfo_t vi = get_varinfo (find (from_i)); > if (from_var->id != vi->id) continue; > if (!from_var->may_have_pointers) continue; > > // compute the pt_solution > pt_solution solution = find_what_var_points_to (cfun->decl, from_var); > > // pointee variable according to external analysis > varinfo_t vi_to = get_varinfo(to_i); > > // Since some artificial variables are stored in fields instead > of the bitset > // assert based on field values. > // However you can see that I already had to disable some of the > assertions. > if (vi_to->is_artificial_var) > { > if (vi_to->id == nothing_id) > { > gcc_assert(solution.null && vi_to->id == nothing_id); > continue; > } > else if (vi_to->id == escaped_id) > { > if (in_ipa_mode) > { > gcc_assert(solution.ipa_escaped && vi_to->id == escaped_id); > } > else > { > //gcc_assert(solution.escaped && vi_to->id == escaped_id); > } > continue; > /* Expand some special vars of ESCAPED in-place here. ??*/ > } > // More... > } > > if (solution.anything) continue; > > bitmap vars = solution.vars; > if (!vars) continue; > > > if (dump_file) fprintf(dump_file, "SAME = %s\n", > bitmap_bit_p(vars, DECL_PT_UID(vi_to->decl)) ? "true" : "false"); > if (dump_file) fprintf(dump_file, "SAME2 = %s\n", > bitmap_bit_p(vars, to_i) ? "true" : "false"); > if (dump_file) fprintf(dump_file, "SAME3 = %s\n", > bitmap_bit_p(from_var->solution, to_i) ? "true" : "false"); > ``` > > Can someone help me figure out why even though I have a "real" > variable and I compute its solution with the "find_what_var_points_to" > method the solution does not have the fields that I expect to be set? > (I would expect solution.escaped to be escaped if the pointee variable > vi_to has an id = escaped_id). solution.escaped is 1 if the pointer variable may point to everything that escaped. That is, solution.vars is really solution.vars | cfun->gimple_df->escaped (there is also ipa_escaped). Have a look at pt_solution_includes_1 for an outline how to interpret the fields when asking whether DECL is a member of the points-to solution. > > And also, how is the DECL_PT_UID different from the varinfo id field? > Shouldn't they be the same? It seems that during > "find_what_var_points_to" DECL_PT_UID is being used to set the bit in > the bitmap, but in previous instances it was the varinfo id offset? The "SSA" points-to sets contain DECL_UIDs, not varinfo IDs which are only transitional during points-to computation. DECL_PT_UID maintains the original UID also after inlining duplicates decls so the points-to solutions remain valid even after inlining (or cloning, or IPA ICF, etc.). Richard.