[PATCH] D103317: [Analyzer][engine][solver] Simplify complex constraints

Valeriy Savchenko via Phabricator via cfe-commits Mon, 07 Jun 2021 12:29:06 -0700

vsavchenko added a comment.

In D103317#2803238 <https://reviews.llvm.org/D103317#2803238>, @ASDenysPetrov 
wrote:


> @vsavchenko
>
> I appologize for my immaturity and for the audacity of my infantile 
> assumptions. I admit any constructive criticism. Thank you for referencing 
> the theory. I will thoroughly study the domain in all.

No need to apologize!  I do like your enthusiasm, great things come from it.  
But as I said, I don't want you to waste your own time pursuing things that 
won't work.

> As for the **assignments** and why I brought them up, that's because I just 
> tried to look at the problem from another angle. Consider two snippets:

OK, let me annotate the code, so we can hopefully close this topic once and for 
all.

>   void f(int a, int b, int c, int x, int y){

At this point we have 5 memory regions `a`, `b`, `c`, `x` and `y`.  They 
represent the memory cells allocated for the corresponding parameters on stack. 
 When we call function, these actually have some values, but we don't know them 
due to the limitations of the analysis.  For this reason, we say that they 
contain **symbolic values**. So, let's simply enumerate that so that there is 
no misunderstanding. Values: #1, #2, #3, #4, and #5. These are not real 
numbers, we enumerate some unknown values.  As I said, we introduced them as 
initial values for parameters, so our store (mapping from memory to values) 
looks likes this: `a -> #1, b -> #2, c -> #3, x -> #4, y -> #5`.  We don't know 
anything about these values at this point.

>   a = b;

What do we know at this point? Do we get more information about the value that 
was stored in `a` before? No, we overwrote this value entirely, and what 
changed here is the mapping: `a -> #2, b -> #2, c -> #3, x -> #4, y -> #5` 
because `b` contained value `#2`.  Symbolic value `#1` is gone, it's dead.  
No-one will ever see or use it again.

>   x = y;

Using the same logic: `a -> #2, b -> #2, c -> #3, x -> #5, y -> #5`.

>   c = a;

And again: `a -> #2, b -> #2, c -> #2, x -> #5, y -> #5`.

>     // What can you say about a, x, c?
>     // `a` equals to `b` and 'c'
>     // `x` equals to `y`
>     // `c` equals to `a` and `b`
>   }

Yes, they are, but we don't need constraints for that.  We know which values 
are associated with each memory region and we can compare them.  That's all, 
this simple.  That's why assignments and initializations DON'T MATTER for the 
solver, these are simply STORE of some symbolic value into memory.  Solver 
cares only about symbolic values, memory is irrelevant.

> and
>
>   void f(int a, int b, int c, int x, int y){

As before: `a -> #1, b -> #2, c -> #3, x -> #4, y -> #5`.

>   if(a == b) {

Now, the value that we imagine to be in memory region `a` (namely `#1`) is NOT 
GONE!  It is still there, it is being used.  And what we know after this `if`, 
is that on one of the paths after it `#1 == #2` and on the other path `#1 != 
#2`.  These are just values that we know nothing about, but we started to gain 
some knowledge.
So, the store is the same (the values stored in those memory cell are the same 
as they were before): `a -> #1, b -> #2, c -> #3, x -> #4, y -> #5`, but now we 
add constraint `#1 == #2`.

>   if(x == y) {

The same here, store is the same, but constraints are now: `#1 == #2` and `#4 
== #5`.

>   if(c == a) {

`#1 == #2 == #3` and `#4 == #5`

>           // What can you say about a, x, c?
>           // `a` equals to `b` and 'c'
>           // `x` equals to `y`
>           // `c` equals to `a` and `b`
>         }
>       }
>     }
>   }

We can actually figure out that `#2 == #3` because of equality theory. We only 
knew `#1 == #2` and that `#3 == #1`, but equality is a transitive binary 
relation and we reflect that notion in our equivalence classes data-structure.

> They both generates the same relationship between the variables. 
> **Assignment** seemed to me as just a special case in a //true// branch of 
> `==` operator and can be substituted with it in calculations. So, I decided 
> to make some assumptions in this way. Digging more, it opened a lot of gaps 
> and lacks for me.

Relationship between variables are the same, but neither solver nor the 
analyzer reason about things in terms of variables.  It's memory and values 
that the memory carries.  In the first example, we have only 2 values stored in 
5 memory locations.  In the second example, it's 5 values stored in 5 memory 
locations.  On some of the paths we happen to know how those values are related 
(which is not true for other paths), that's constraints.

I hope now it's clear.

> Next time I will investigate deeper to present a finalized solution and not 
> wasting your time explaining me the basics.

We probably should not have discussions like these because they are not 
directly about the patch on review.  If you want to continue this discussion, 
let's bring it to **[cfe-dev]**.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103317/new/

https://reviews.llvm.org/D103317

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D103317: [Analyzer][engine][solver] Simplify complex constraints

Reply via email to