vsavchenko added a comment.

In D103317#2793557 <https://reviews.llvm.org/D103317#2793557>, @ASDenysPetrov 
wrote:

> Returning to the discussion raised in D102696 
> <https://reviews.llvm.org/D102696>, I'd like to share my vision.
> I think we can use much easier approach to use valid constraints at any point 
> of time.
> The main idea is lazy-reasoning of the ranges.
> This approach:
>
> - **doesn't** need to use **canonicalization**;
> - **doesn't** need to **update all** the constraints on **each** 
> `setConstraint` call (using brute-force or any other structure traversals);
> - **add** an additional bindings for **every** simple operand on **each** 
> //assignment// or //initialization//.
> - **remove** an invalid bindings for **every** simple operand on **each** 
> //assignment// or //initialization//.
> - **recursively** get the **range** when and only when needed 
> (lazy-reasoning).
>
> Example:
>
>   a = b + 1;
>   c = a - 5;
>   if (b != 42) return 0;
>   if(c == 38) return 1;
>   return 0;
>
> `a = b + 1;`
>
> 1. LHS `a` lookup. LHS not found.
> 2. Add binding `$a = $b + 1`
> 3. Traverse through RHS `b + 1`.
> 4. `b` lookup. `b` not found.
> 5. Built a new expression `a - 1` for `b`.
> 6. Add binding for `b`: `$b = $a - 1`
> 7. `1` is a constant. Skip.
>
> `c = a - 5;`
>
> 1. LHS `c` lookup. LHS not found.
> 2. Add binding `$c = $a - 5`
> 3. Traverse through RHS `a - 5`.
> 4. `a` lookup. `a` found.
> 5. If LHS not found and `a` found, then skip.
> 6. `5` is a constant. Skip.
>
> `if (b != 42) return;`
>
> 1. Get the range of `$b`. Direct range not found. Add `$b` to the list of 
> visited symbols.
> 2. Substitute bindings. From `$b != 42` to `$a - 1 == 42`.
> 3. Traverse through `$a - 1`.
> 4. Get the range of `$a`. Direct range not found. Add `$a` to the list of 
> visited symbols.
> 5. `1` is a constant. Skip.
> 6. Substitute bindings. From `$a - 1 == 42` to `$b + 1 - 1 == 42`.
> 7. Traverse through `$b + 1 - 1`.
> 8. `$b` is in the list of already visited symbols. No way to find a range.
> 9. Create a range from the type [MIN, MAX].
> 10. Perform comparison and produce a constraint for `b` [42, 42].
>
> `if(c == 38) return;`
>
> 1. Get the range of `$c`. Direct range not found. Add `$c` to the list of 
> visited symbols.
> 2. Substitute bindings. From `$c == 38` to `$a - 5 == 38`.
> 3. Traverse through `$a - 5`.
> 4. Get the range of `$a`. Direct range not found. Add `$a` to the list of 
> visited symbols.
> 5. `5` is a constant. Skip.
> 6. Substitute bindings. From `$a - 5 == 38` to `$b + 1 - 5 == 38`.
> 7. Traverse through `$b + 1 - 5`.
> 8. Get the range of `$b`. Direct range **found** [42, 42].
> 9. `1` is a constant. Skip.
> 10. Calculate(simplificate the expression etc.) a new range for `$c` [38, 38].
> 11. Condition is //true//.
>
> `return 1`
>
> P.S. I've just started thinking of the way to solve this problem. For now 
> it's just a sketch of where I'm going.

Great to know that you are also thinking about this problem!
The more the merrier!

I have a couple of notes about your suggestion.
First is minor, `a = b + 1` and `c = a + 5` won't actually produce any 
constraints. Starting from those statements if you request a symbol for `a` you 
will get `b + 1` and `b + 6` for `c`.
Of course, you can change your code so that we make an assumption that `a == b 
+ 1` and `c == a + 5` and those would be constraints.  That's why I said that 
it's minor.

Now, for the biggest concern.  You cover in a very detailed manner all the 
traversals and caches for visited symbols, but have **Substitute bindings** as 
a sub-step , which is HUGE and absolutely non-trivial.
In your examples, you have a linear chain of related symbolic expressions, i.e. 
something like this `x -> y + 2`, `y -> z + 3`, `z -> 4` so that we have `x -> 
9` after three substitutions. But the problem it is generally not one-to-one 
relationship, so `x -> y1 + 1`, `x -> y2 + 2`, ... , `x -> yN + N`.  Imagine 
that each `yK` is involved with another `M` constraints. So, you would need to 
check `x -> y1 -> z11`, if it's not working - check `x -> y1 -> z12` and so on. 
 When you run out of options for `z1K`, you need to backtrack and try `x -> y2 
-> z21`, etc.  As you can see, this is BRUTE FORCE.

And it couldn't be easier theoretically. So, instead of trying that many 
options every time we need to judge about, we take a bit simpler shortcut.  
It's far from being complete, but honestly we're not pursuing completeness.  
For my example above, whenever we find out that `zKM -> 5`, we simplify 
everything that involves `zKM` and get `yK -> 10` and get `x -> 15`.  It is far 
from being perfect, I don't like that we need to go through all symbols and try 
to simplify every single symbol, and it would be good to optimize this scenario 
(mapping symbols to all symbolic expressions that contain them, for example), 
but essentially it is cheaper in the long run.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103317/new/

https://reviews.llvm.org/D103317

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to