Stas,

>> The reason is, with strict typing you only need to deal with 8
>> possible states for every variable (int, float, bool, resource, array,
>> object, callable, null and string). And many of those states you can
>> eliminate based on context (you know an argument passed to an int
>> declared parameter will always be an int).
>
> This is possible with coercive typing too, if the argument is declared
> int then it will be coerced to int.

At the function border, yes. But not beyond that.

>> Without strict typing, you also need to consider virtual states:
>> castable object (internally implementing get()), stringable object
>> (implementing __tostring), numeric string ("123"), kind-of numeric
>> string ("123 abc"), etc. And many of those states *can't* be
>> eliminated based on context.
>
> I don't see why you don't need to consider them anyway - if your
> analyzer cares about ability of the object to be casted to a string,
> you'd have to carry this information anyway - for use in (string), echo,
> "$foo", internal functions, etc. So again no advantage here.
>
>> An example is that ($a + $b) we know can never produce a string. We
>> know it can never produce a bool, resource, array, callable or null.
>> The only things it can produce is an object, an int or a float. So if
>> we see expressions of the form `$c = ($a + $b)`, we can immediately
>> deduce something about $c and therefore reduce the number of possible
>> states.
>> With strict typing, if we then pass $c to a function expecting int, we
>> know that C must be an int. Which means that $a and $b therefore
>
> That seems to be an incorrect assumption - passing something to function
> asking for an int doesn't mean it is an int. It means that inside
> function it would be int or fail happens, but that is achieved with
> coercive typing too, so no advantage to strict typing here. Of course,
> since we already know $c is an int or a float or an object, no
> additional information here is obtained, and we do not know anything
> about $c itself - we only know the "A->B" is true but we don't know
> whether A is true and thus not whether B is true.

Sort-of correct. We don't know if A is true, but we know what conditions
need to exist for A->B to be true. We know that A can be any value. It
could be a float. But that would lead to an error down the road when the
sum is passed to the function typing for an int.
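
A rough sketch of that check, as a toy Python model rather than anything
a real analyzer implements: it only knows int and float, ignores int
overflow, and the names (add_result_type, int_param_accepts,
stable_operand_types) are made up for illustration. The weak-mode rule,
that a float might still be accepted by an int parameter, is an
assumption included only for contrast.

```python
from itertools import product

# Toy model: only two scalar types, not PHP's full type system.
TYPES = ("int", "float")


def add_result_type(a, b):
    """Static result type of `$a + $b` for int/float operands.

    Assumption: int + int stays int (integer overflow is ignored here).
    """
    return "int" if a == "int" and b == "int" else "float"


def int_param_accepts(arg_type, strict):
    """Can a value of arg_type ever reach an int-declared parameter
    without an error?"""
    if arg_type == "int":
        return True
    # Assumption: in weak mode a float may be coerced, so the analyzer
    # cannot rule the float case out from the type alone.
    return arg_type == "float" and not strict


def stable_operand_types(strict):
    """Operand type pairs for which passing ($a + $b) to an int
    parameter can be a non-error state."""
    return [
        (a, b)
        for a, b in product(TYPES, repeat=2)
        if int_param_accepts(add_result_type(a, b), strict)
    ]
```

Under these assumptions, stable_operand_types(strict=True) leaves only
the (int, int) pair, while stable_operand_types(strict=False) leaves all
four pairs.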

Therefore, by knowing $c is passed to a function expecting an int, we
have information about the stable states for $a and $b which would result
in a stable state for $c. With strict mode, we can then work backwards
and see that if $a is set to 3.5, it's not a stable state (because at
best an error will be created). With coercive typing (weak), you can't
work backwards there because 3.5 + 1.5 is 5.0, which would be acceptable
for an int hint. So you have a LOT more possibilities to contend with.

This isn't so much about proving an application correct as it is about
proving information about the stable (non-error) states of the program.
So with coercive typing (weak), the ability to detect errors and prove
stable states of the application is reduced and requires significantly
more computation.

>> **must** be castable to an integer (otherwise an error will be thrown
>> down the road). So we've just reduced $a down to 5 types: int, numeric
>> string, null, bool and castable object (from at least 12). And $b the
>
> Yes, after the expression is done and *if* it did not error out, we can
> deduce some things about $a's value *after* the expression. But we have
> no guarantees the expression would succeed and no advantage to string
> typing which does not come into play here at all.

I'm talking about using the static analyzer to prove properties of the
stable (non-error) state. Yes, at runtime you only know things about $a
after the expression. But at static-analysis time we can prove, based on
what we know of $c, which input states could possibly lead to a stable
state.

Yes, you can generate errors. But the entire point of static analysis is
to detect the possibility of these errors ahead of time rather than at
runtime.

>> same. So our expression has 10 possible valid (non-error-case)
>> permutations. Down from the approximately 432 possibilities before
>> hand (12 for $a, 12 for $b, 3 for $c). And that's with unknown $a and
>> unknown $b.
>
> Still don't see any advantage to strict typing here. Deduction about the
> behavior of + can be made without any strict typing involved, as far as
> I can see.

But we're talking about much more than the behavior of one operation.
We're talking about the behavior of the entire set of function calls
around it (and the flow of type information based on these predictable
operations).

So while + itself gains no advantage from strict types, passing the
result to a function expecting an integer does give you a lot of
information about the non-error possibilities that the + operation can
have.

>> With strictly typed $a and $b, the expression drops to 1 possible
>> permutation. And you can detect if it's a valid one. And many static
>> analysis engines do this.
>
> I didn't see any proposal that proposes strictly types variables. As for
> parameters, both strict and coercive typing provide knowledge about the
> types of parameter inside the function they are defined in, so no
> advantage to strict typing here.

It's not about how data gets in, it's about how data moves once it's in.
What's important is knowing how types change and flow into other
functions, because that lets you determine more about the stable
(non-error) state of the application.

Anthony

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php