Re: fanalyzer: debugging zero state machine

Tim Lange Sun, 12 Jun 2022 11:27:22 -0700

On Do, Jun 9 2022 at 13:40:06 -0400, David Malcolm<[email protected]> wrote:

On Thu, 2022-06-09 at 16:49 +0200, Tim Lange wrote:


  > On Mi, Jun 8 2022 at 11:12:52 -0400, David Malcolm
 <[email protected]> wrote:
  > > On Wed, 2022-06-08 at 01:42 +0200, Tim Lange wrote:
  > >
  > > Hi Dave,


Hi Tim; various responses inline below...

  > >
  > > I did spent some time to think about the zero state machine. I
 first
  > > thought about distinguishing between "assigned zero" and "EQ 0
  > > condition on the path" for cases where e.g. unreachable() is
 used
 to
  > > say that some variable will never be zero according to the
 programmer.
  > > In that case the dev might not want zero warnings from
 conditions
  > > outside the if body itself for dev build where unreachable
 expands
 to
  > > some panic exit. But as the condition constraint is not pruned
 when the
  > > state machine is distinguishing the states, I'm not sure how to
 know
  > > whether the analysis already left the if body?
  >
  > The analyzer works on the gimple-ssa representation, which uses
 basic
  > blocks in a CFG, rather than an AST, so the only remants we have
 of
  > scope is in "clobber" statements (a special kind of assignment
 stmt),
  > which the gimplify adds as variables go out of scope.
 If the constraints only lived until the immediate dominator of `if
 (cond)`, I could easily distinguish:
 1. if (x == 0) && still inside the if => zero
 2. if (x == 0) && outside if => maybe zero
 but as this seems to be not the case, if I want to distinguish 1. &
 2.,
 I'd have to find another way.
  >
  > For pruning, the analyzer's state_machine class has a
 "can_purge_p"
  > virtual function:
  >
  > /* Return true if it safe to discard the given state (to help
  > when simplifying state objects).
  > States that need leak detection should return false. */
  > virtual bool can_purge_p (state_t s) const = 0;
  >
  > which should return true for a "zeroness" state machine, in that
 we
  > always consider pruning states for svalues that aren't needed
 anymore
  > along a path.
 Is implemented and returns true.
  >
  > Is there some other kind of state explosion you're running into?
 It's
  > hard to figure this out further without seeing code.
 No, my code is by far not that mature to be tested. I just had in my
 head that I wanted to find out if I can distinguish the two cases.
  >
  >
  > > Also, while trying out different things, it seems simple
 assignments on
  > > phi like here
  > > int x;
  > > if (argc == 1) {
  > > x = 1; // x_5
  > > } else {
  > > x = 0; // x_4
  > > }
  > > printf("%i", 5 / x); // x_2

> > automatically work such that x_2 already inherits the statefrom

  > > x_4/x_5 without me doing anything inside my sm's on_phi
 function.
 Same

> > for the simple y = 0; x = y; case. Where does this happeninside

 the
  > > code?
  >
  > With the caveat that I'm seeing your code, what's probably
 happening
 is
  > that we have either:
  >
  > BB (a):
  > x_5 = 1;
  > goto BB (c);
  >
  > BB (b):
  > x_4 = 0;
  > goto BB (c);
  >
  > BB (c):
  > x_2 = PHI (x_5 from (a), x_4 from (b));
 I compiled it with -g, so this one is like the dumped gimple.
  >
  > or (at higher optimization levels):
  >
  > BB (a):
  > goto BB (c);
  >
  > BB (b):
  > goto BB (c);
  >
  > BB (c):
  > x_2 = PHI (1 from (a), 0 from (b));
  >

> and I think that at the phi node we haveregion_model::handle_phi,> which is setting x_2 to either the constant 1 or the constant 0in

 the
  > store, and is calling the on_phi vfunc, leading to on_phi being
 called
  > for all state machines.
 Thanks, that is the case. The set_value inside handle_phi seems to
 this
 for me.
  >
  > BTW, are you implementing an override for this vfunc:
  > virtual state_machine::state_t get_default_state (const svalue *)
  > const;
  >
  > to capture the inherently known zeroness/nonzeroness of
 constant_svalue
  > instances? That would make those constants have that state.
 Yeah, I saw that on your nullness check. I tried it on a small
 example

with and without, but didn't noticed a difference in warnings(except

 for not having zero(x_4) inside the supergraph.dot). So if I
 understood
 this right, this is just to have one state less for that
 variable/value[0]?


The states are stored in sm_state_map using a mapping from svalue to
state.

Given e.g. a parameter, this will be "initial_svalue (parm)", but in
the above case where x_4 either has value 1 or has value 0, it's not
storing a state for x_4; it's storing a state for 0 or a state for 1.
So without implementing the get_default_state vfunc, the sm_state_maps
will gradually aquire explicit "this is zero" or "this is nonzero" for
all of the constants that get used, leading to an explosion of states
for all the different combinations of constants that have been
encountered along an execution path, which is probably not going to be

useful. If you do implement the get_default_state vfunc forconstants,then the constants will implicitly have the zero vs nonzero states,and

this information won't get stored in the sm_state_map instances.

program_state::can_merge_with_p compares the sm_state_map instances
between the two program_state instances - and so if the
get_default_state vfunc is implemented for constants, the peer

sm_state_maps for the "zero" state machine will both be empty, andthus

equal, and thus it will merge state.

So it might make sense to revamp this sm_state_map comparison so that

we consider implicit states in sm_state_maps when merging the store,so

that we see that "1" and "0" have different sm-states for the "zero"
state machine, and thus we shouldn't merge states for these.  See
binding_cluster::can_merge_p, which calls svalue::can_merge_p.  It

might make sense to have svalue::can_merge_p "know" aboutsm-state-maps

for this.

svalue::can_merge_p calls model_merger::mergeable_svalue_p, so that we
can reject merging values that explicitly have different state-machine
states, but that code could be reworked so that it considers whether a

pair of svalues can be merged, using implicit state-machine states(and

thus have it reject the merger of a zero with a nonzero constant).

Or, whilst prototyping, you could simply hack in a rejection ofmerging

zero vs non-zero integer values into svalue::can_merge_p.


 If that is right, is it also favorable to "merge" the stop state and
 non_zero state inside the zero state machine because - for now -
 there
 is no warning planned on non-zero values?

 [0] less states are favorable because then the analyzer maybe has
 less
 different enodes to visit and thus less runtime(?)
  >
  > Thanks
  > Dave
  >
  > > - Tim

Also another question unrelated to the ones before. I do have aweird

 bug in my zero sm[1] but I'm unsure where my sm is flawed. Take for
 example, the probably simplest case:
   int x = 0;h
   printf("%d", 42 / x);
 If I use inform to emit a notice for my state machine result, it
 seems

to be correct and I do get following traversal order (by printingthe

 gimple stmt inside on_stmt):
   x_2 = 0;
   _1 = 42 % x_2;
   # .MEM_4 = VDEF <.MEM_3(D)>
   printf ("%d", _1);
   _5 = 0;
   <L0>:
   # VUSE <.MEM_4>
   return _5;
 But if i use my zero_diagnostics and sm_ctxt->warn to emit the
 warning,
 I get an unexpected traversal order of:
   x_2 = 0;
   _1 = 42 / x_2;
   # .MEM_4 = VDEF <.MEM_3(D)>
   printf ("%d", _1);
   _5 = 0;
   <L0>:
   # VUSE <.MEM_4>
   return _5;
   x_2 = 0;              <-- why do I get these stmts again but they
 are not as
 duplicated in the e-supergraph?
   _1 = 42 / x_2;
   _5 = 0;
   _1 = 42 / x_2;
   _5 = 0;


What exactly is your test source file?

It is just the two lines wrapped into main():
 #include <stdio.h>
 int main(int argc, char **argv) {
   int x = 0;
   printf("%d", 42 / x);
   return 0;
 }

Note that a function being called by another function will getexplored

in the exploded_graph twice: once being analyzed "standalone", and
again being analyzed as called from the other function.

When you say "I get these stmts again", do you mean the state machine
code is seeing them again, or the region model code?

In the state_machine::on_stmt function.


  sm_ctxt->warn (new foo)

will save an instance of a pending_diagnostic into the
diagnotic_manager, associating it with a particular exploded_node (and
some other info).

After the exploded_graph has been built, the diagnostic_manager does

some deduplication work, and then tries to find a feasible path toeach

pending_diagnostic's exploded node.

This feasibility analysis reruns much of the region_model code thatwas

run when building the exploded graph.  So that's another way you can
"see" stmts repeatedly.

Okay, that probably explains it. I had two distinct problems where Ithought they might be both related to seeing the statements again:

   |   13 |   int x = 0;
   |      |       ^
   |      |       |
   |      |       (1) set to zero here
   |      |       (3) set to zero here
   |   14 |   printf("%d", 42 / x);
   |      |   ~~~~~~~~~~~~~~~~~~~~
   |      |   |
   |      |   (4) division by zero
   |   15 |
   |   16 |   return 0;
   |      |          ~
   |      |          |
   |      |          (2) set to zero here
   |      |          (5) set to zero here
* In line 13, there are two state changes while I'd expect only one.

=> I re-looked at the other state machines and they transfer to thestop state after the warning. If I do the same, my problem in line 13is fixed.* The other problem in line 16 is that return 0 is not relevant towardthe division by zero.=> After looking at the diagnostic code, I think those should befiltered out but aren't because x (in gimple x_2) and return 0 (ingimple _5) share the (int)0 state. Insidediagnostic_manager::prune_for_sm_diagnostic there is a check that thestate_change->m_sval is equal to the sval of the saved_diagnostic,which is not enough to filter unrelated events in the zero statemachine case.I tried to filter the event by comparing the state of interest but thatdoesn't work if the unrelated event is after the related event on thepath.Similarly, I tried to add a new field m_var to state_change_event andprovide this as a comparison but in most functions that call theconstructor of state_change_event, the var is not available.

Is there is any easy way to filter these unrelated events withoutconsiderable changes [1]?

By the way, I can trigger the same behavior on the malloc null_derefwith gcc version 12.1.1 shipped in Fedora 36. Assume this simpleprogram:

 #include <stddef.h>
 int main (void)
 {
   int *p = NULL;
   *p = 42;

   int *q = NULL;

   return 0;
 }
results in the following warning:
/home/tim/Projects/simple_c/main.c: In function ‘main’:

/home/tim/Projects/simple_c/main.c:12:6: warning: dereference of NULL‘p’ [CWE-476] [-Wanalyzer-null-dereference]

  12 |   *p = 42;
     |   ~~~^~~~
 ‘main’: events 1-4
   |
   |   11 |   int *p = NULL;
   |      |        ^
   |      |        |
   |      |        (1) ‘p’ is NULL
   |   12 |   *p = 42;
   |      |   ~~~~~~~
   |      |      |
   |      |      (4) dereference of NULL ‘p’
   |   13 |
   |   14 |   int *q = NULL;
   |      |        ~
   |      |        |
   |      |        (2) ‘p’ is NULL
   |      |        (3) ‘p’ is NULL

 I tracked the cause down the call stack to
 m_saved_diagnostics.safe_push (sd); inside
 diagnostic_manager::add_diagnostic. For whatever reason, the pushing
 of
 diagnostics bricks the zero sm. It might be a embarrassing error on
 my
 side, but I'm stuck on this and can't seem to find what I'm doing
 wrong.


FWIW, assuming you're using the gdb support scripts:

https://gcc-newbies-guide.readthedocs.io/en/latest/debugging.html#support-scripts-for-gdb

there's a handy:

(gdb) break-on-saved-diagnostic

command which adds a breakpoint on the analyzer's diagnostic_manager
saving a pending_diagnostic.

You might also what to try -fdump-analyzer-stderr, which will dumplots

of information on what the analyzer is doing to stderr, which although
verbose is usually very helpful in figuring out why the analyzer is
doing something.

Hope this is helpful

Definitely helped me understand more of the internals!

- Tim

Dave
 - Tim

 [1]
https://github.com/timll/gcc/blob/castsize/gcc/analyzer/sm-zero.cc#L181

Re: fanalyzer: debugging zero state machine

Reply via email to