Robert,

> [Robert Stoll]
> Sure, "a" was just an example to illustrate the problem. I figured it would
> not be necessary to say that the value of $b can be completely unknown by the
> static analyser -> could come from user input, from a database, from
> unserialising code etc. (but probably that is what you meant with "this isn't
> the general case" below).
>
> Assuming statically that $a is int or $b is string is erroneous in this
> context.
>
> Another problem to illustrate that a top type or at least some form of union
> type is required:
>
> function foo($x, $y, $z){
>     $a = 1;
>     if($x){
>         $a = "1";
>     }
>     if($y > 10){
>         $a = [];
>     }
>     if($z->foo() < 100){
>         $a = new Exception();
>     }
>     echo $a;
>     return $a;
> }
>
> How do you want to type $a without using a union type?

Actually, this case is reasonably easy to handle. There's a representation
called SSA (Static Single Assignment) that you move code to prior to doing
type analysis. Basically, at a really high level, it would rewrite the code
to this:

function foo($x, $y, $z){
    $a = 1;
    if($x){
        $a1 = "1";
    }
    $a2 = Φ($a, $a1);
    if($y > 10){
        $a3 = [];
    }
    $a4 = Φ($a2, $a3);
    if($z->foo() < 100){
        $a5 = new Exception();
    }
    $a6 = Φ($a4, $a5);
    echo $a6;
    return $a6;
}

Where Φ is a function that chooses the value based on the branch of the
graph that entered it.

There are a few ways to implement it in practice. One would be to generate a
variant. Another would be to generate different code paths. Considering
that $a5 will be the result if $z->foo() < 100 no matter what the prior
conditionals are, you could invert the code to push that check first, making
it:

function foo($x, $y, $z) {
    if ($z->foo() < 100) {
        $a = new Exception();
        echo $a;
        return $a;
    }
    if ($y > 10) {
        echo [];
        return [];
    }
    if ($x) {
        echo "1";
        return "1";
    }
    echo 1;
    return 1;
}

That transform can be done by the compiler, so you never need to do anything
yourself. We still compiled without variants, and the analysis job wasn't
that difficult.

There will be cases of course where this won't work. In those cases we
could either not compile, or generate a variant.

However, I would like to point out something. If you added a return type and
ran that code in strict mode, it would error. A static analyzer can pick up
that error and tell you about it. So really, we're not talking about valid
strict code here (though the same problem does exist inside strict bodies,
and the same techniques can be applied there).

For more info, check out:
https://github.com/google/recki-ct/blob/master/doc/5_phi_resolving.md

>> And hence know **at compile time** that's an error.
>>
>> This isn't the general case, but we can error in that case (from a static
>> analysis perspective at least) and say "this code is too
>> dynamic". In strict mode at least.
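
To make the Φ resolution above a bit more concrete, here is a minimal sketch of how an analyzer might join the types flowing into each Φ node of foo(). This is only an illustration under my own assumptions: phiTypes() is a hypothetical helper, types are modelled as plain strings, and none of this is taken from Recki-CT's actual implementation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not a real library API): join the type sets
 * flowing into a Φ node.
 *
 * @param string[] ...$incoming one type set per incoming branch
 * @return string[] sorted, de-duplicated set of possible types
 */
function phiTypes(array ...$incoming): array
{
    // Union of all incoming sets, without duplicates.
    $joined = array_unique(array_merge(...$incoming));
    // sort() also reindexes the array, so results compare cleanly.
    sort($joined);
    return $joined;
}

// Type sets for the SSA variables in foo() (assumed, for illustration).
$a  = ['int'];        // $a  = 1
$a1 = ['string'];     // $a1 = "1"
$a3 = ['array'];      // $a3 = []
$a5 = ['Exception'];  // $a5 = new Exception()

$a2 = phiTypes($a, $a1);   // Φ($a, $a1)
$a4 = phiTypes($a2, $a3);  // Φ($a2, $a3)
$a6 = phiTypes($a4, $a5);  // Φ($a4, $a5)

// More than one possible type reaches the return: the compiler would have
// to split the code paths, fall back to a variant, or report the function
// as "too dynamic" in strict mode.
$needsVariant = count($a6) > 1;
```

When every branch feeding a Φ agrees on the type, the joined set has a single element and no variant is needed; that is the case a strict return type effectively enforces.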
>
> [Robert Stoll]
> If you go and implement a more conservative type system than the actual
> dynamic type system of PHP well then... you can do whatever you want of
> course.
> But if you do not just want to support just a limited set of PHP then you
> will need to include dynamic checks in many places. Or do you think that is
> not true?

I think with strict type declarations, the 'limitations' are far less than
you'd think. Yes, there will be cases (like variable variables, etc.) where
valid strict code won't be analyzable. But I haven't seen var-vars in the
wild in a while.

So I think my assertion is fair: the majority of *valid* strict-typed code
will be analyzable, whereas the majority of coercive code won't be.

Anthony

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php