This is the first answer that makes sense for the community's needs.

On 23/02/2015 13:01, "Matthew Weier O'Phinney" <matt...@zend.com> wrote:
> I'm writing this as an author and maintainer of a framework and many
> libraries. Caveat, for those who aren't already aware: I work for Zend, and
> report to Zeev. If you feel that will make my points biased, please feel
> free to stop reading, but I do think my points on STH bear some
> consideration.
>
> I've been following the STH proposals off and on. I voted for Andrea's
> proposal, and, behind the scenes, defended it to Zeev. After a lot of
> consideration, and as primarily a _consumer_ and _user_ of the language,
> I'm no longer convinced that a dual-mode proposal makes sense. I worry that
> it will lead to:
>
> - A split within the PHP community, consisting of those that do not use
>   typehints, those who do use typehints, and those who use strict.
> - Poor programming practices and performance degradation by those who adopt
>   strict, due to poor usage of type casting.
>
> Let me explain.
>
> The big problem currently is that the engine behavior around casting can
> lead to data loss quickly. As has been demonstrated elsewhere:
>
>     $value = (int) '100 dogs'; // 100  - non-numeric trailing values are trimmed
>     $value = (int) 'dog100';   // 0    - non-numeric leading values -> 0 ...
>     $value = (int) '-100';     // -100 - ... unless indicating sign.
>     $value = (int) ' 100';     // 100  - space is trimmed; data loss!
>     $value = (int) ' 100 ';    // 100  - space is trimmed; data loss!
>     $value = (int) '100.0';    // 100  - probably correct, but loss of precision
>     $value = (int) '100.7';    // 100  - precision and data loss!
>     $value = (int) 100.7;      // 100  - precision and data loss!
>     $value = (int) 0x1A;       // 26   - hex
>     $value = (int) '0x1A';     // 0    - shouldn't this be 26? why is this different?
>     $value = (int) true;       // 1    - should this be cast?
>     $value = (int) false;      // 0    - should this be cast?
>     $value = (int) null;       // 0    - should this be cast?
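To make the contrast concrete, here is a minimal sketch (mine, not from Matthew's message) that runs a few of the inputs above through both a blind cast and a validating helper. The helper name intOrFail() is hypothetical; it relies only on filter_var() with FILTER_VALIDATE_INT and is meant as an illustration, not as a recommended implementation.

```php
<?php
// Hypothetical helper (not part of the quoted message): return an int only
// when filter_var() accepts the input as an integer; otherwise throw.
function intOrFail($value)
{
    $result = filter_var($value, FILTER_VALIDATE_INT);
    if (false === $result) {
        throw new InvalidArgumentException(
            'Not a valid integer: ' . var_export($value, true)
        );
    }
    return $result;
}

// A few of the string inputs from the list above.
$inputs = ['100', '-100', '100 dogs', 'dog100', '100.7'];

foreach ($inputs as $input) {
    try {
        $validated = var_export(intOrFail($input), true);
    } catch (InvalidArgumentException $e) {
        $validated = 'rejected';
    }
    // The blind cast always yields some integer; the helper either
    // returns an integer or refuses the input.
    printf(
        "%-11s cast: %-5d validated: %s\n",
        var_export($input, true),
        (int) $input,
        $validated
    );
}
```

The blind cast hands back an integer for every input, while the helper refuses anything filter_var() does not recognize as an integer, which is roughly the behavior I would like the language to give us without the extra code.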
>
> Today, without scalar type hints, we end up writing code that has to first
> validate that we have something we can use, and then cast it. This can
> often be done with ext/filter, but it's horribly verbose:
>
>     $value = filter_var(
>         $value,
>         FILTER_VALIDATE_INT,
>         FILTER_FLAG_ALLOW_OCTAL | FILTER_FLAG_ALLOW_HEX
>     );
>     if (false === $value) {
>         // throw an exception?
>     }
>
> Many people skip the validation step entirely for the more succinct:
>
>     $value = (int) $value;
>
> And this is where problems occur, because this is when data loss occurs.
>
> What I've observed in my 15+ years of using PHP is that people _don't_
> validate; they either blindly accept data and assume it's of the correct
> type, or they blindly cast it without validation because writing that
> validation code is boring, verbose, and repetitive (I'm guilty of this
> myself!). Yes, you can offload that to libraries, but why introduce a new
> dependency in something as simple as a value object?
>
> The promise of STH is that the values will be properly coerced, so that if
> I write a function that expects an integer, but pass it something like
> '100' or '0x1A', it will be cast for me; but something that is not an
> integer and cannot be safely cast without data loss will be rejected, and
> an error can bubble up my stack or into my logs.
>
> Both the Dual-Mode and the new Coercive typehints RFCs provide this.
>
> The Dual-Mode, however, can potentially take us back to the same code we
> have today when strict mode is enabled.
>
> Now, you may argue that you won't need to cast the value in the first
> place, because STH! But what if the value you received is from a database?
> Or from a web request you've made? Chances are, the data is in a string,
> but the _value_ may be of another type.
> With weak/coercive mode, you just pass the data as-is, but with strict
> enabled, your choices are to either cast blindly, or to do the same
> validation/casting as before:
>
>     $value = filter_var(
>         $value,
>         FILTER_VALIDATE_INT,
>         FILTER_FLAG_ALLOW_OCTAL | FILTER_FLAG_ALLOW_HEX
>     );
>     if (false === $value) {
>         // throw an exception?
>     }
>
> Interestingly, this adds overhead to your application (more function
> calls), and makes it harder to read and to maintain. Ironically, I foresee
> "strict" as being a new "badge of honor" for many in the language ("my code
> works under strict mode!"), despite these factors.
>
> If I don't enable strict mode on my code, and somebody else turns on strict
> when calling my code, there's the possibility of new errors if I do not
> perform validation or casting on such values. This means that the de facto
> standard will likely be to code to strict (I can already envision the flood
> of PRs against OSS projects for these issues).
>
> You can say, "But, Static Analysis!" all you want, but that doesn't lead to
> me writing less code to accomplish the same thing; it just gives me a tool
> to check the correctness of my code. (Yes, this _is_ important. But we also
> have a ton of tooling around those concerns already, even if they aren't
> proper static analyzers.)
>
> From a developer experience standpoint, I find myself scratching my head:
> what are we gaining with STH if we have a strict mode? I'm still writing
> exactly the same code I am today to validate and/or cast my scalars before
> passing them to functions and methods if I want to be strict.
>
> The new coercive RFC offers much more promise to me as a consumer/user of
> the language. The primary benefit I see is that it provides a path forward
> towards better casting logic in the language, which will ensure that, in
> the future, this:
>
>     $value = (int) $value;
>
> will operate properly, and raise errors when data loss may occur.
> It means that immediately, if I start using STH, I can be assured that
> _if_ my code runs, I have values of the correct type, as they've been
> coerced safely. The lack of a strict mode means I can drop that defensive
> validation/casting code safely.
>
> My point is: I'm sick of writing code like this:
>
>     /**
>      * @param int $code
>      * @param string $reason
>      */
>     public function setStatus($code, $reason = null)
>     {
>         $code = filter_var(
>             $code,
>             FILTER_VALIDATE_INT,
>             FILTER_FLAG_ALLOW_OCTAL | FILTER_FLAG_ALLOW_HEX
>         );
>         if (false === $code) {
>             throw new InvalidArgumentException(
>                 'Code must be an integer'
>             );
>         }
>         if (null !== $reason && ! is_string($reason)) {
>             throw new InvalidArgumentException(
>                 'Reason must be null or a string'
>             );
>         }
>
>         $this->code = $code;
>         $this->reason = $reason;
>     }
>
> I want to be able to write this:
>
>     public function setStatus(int $code, string $reason = null)
>     {
>         $this->code = $code;
>         $this->reason = $reason;
>     }
>
> and _not_ push the burden on consumers to validate/cast their values.
>
> This is what I want from STH, no more, no less: sane casting rules, and the
> ability to code to scalar types safely. While I can see some of the
> benefits of strict mode, I'm concerned about the schism it may create in
> the PHP library ecosystem, and that many of the benefits of the coercive
> portion of that RFC will be lost when working with data from unknown data
> sources.
>
> If you've read this far, thank you for your consideration. I'll stop
> bugging you now.
>
> --
> Matthew Weier O'Phinney
> Principal Engineer
> Project Lead, Zend Framework and Apigility
> matt...@zend.com
> http://framework.zend.com
> http://apigility.org
> PGP key: http://framework.zend.com/zf-matthew-pgp-key.asc