Re: [PHP-DEV] User perspective on STH

Anthony Ferrara Mon, 23 Feb 2015 08:22:56 -0800

Matt,


> The big problem currently is that the engine behavior around casting can lead to
> data loss quickly. As has been demonstrated elsewhere:
>
>     $value = (int) '100 dogs'; // 100  - non-numeric trailing values are trimmed
>     $value = (int) 'dog100';   // 0    - non-numeric leading values -> 0 ...
>     $value = (int) '-100';     // -100 - ... unless indicating sign.
>     $value = (int) ' 100';     // 100  - space is trimmed; data loss!
>     $value = (int) ' 100 ';    // 100  - space is trimmed; data loss!
>     $value = (int) '100.0';    // 100  - probably correct, but loss of precision
>     $value = (int) '100.7';    // 100  - precision and data loss!
>     $value = (int) 100.7;      // 100  - precision and data loss!
>     $value = (int) 0x1A;       // 26   - hex
>     $value = (int) '0x1A';     // 0    - shouldn't this be 26? why is this different?
>     $value = (int) true;       // 1    - should this be cast?
>     $value = (int) false;      // 0    - should this be cast?
>     $value = (int) null;       // 0    - should this be cast?
>
> Today, without scalar type hints, we end up writing code that has to first
> validate that we have something we can use, and then cast it. This can often be
> done with ext/filter, but it's horribly verbose:
>
>     $value = filter_var(
>         $value,
>         FILTER_VALIDATE_INT,
>         FILTER_FLAG_ALLOW_OCTAL | FILTER_FLAG_ALLOW_HEX
>     );
>     if (false === $value) {
>         // throw an exception?
>     }
>
> Many people skip the validation step entirely for the more succinct:
>
>     $value = (int) $value;
>
> And this is where problems occur, because this is when data loss occurs.

And what about other languages that have exactly this behavior, such
as Go, Hack, and Haskell? Do you see casts everywhere? No. You see them
where the conversion needs to be explicit. Otherwise, people just write
code using the correct types.

And it also hand-waves over the fact that the same problem exists with
coercive types. You're going to get the error anyway if you try to
pass "apple" to an int parameter. So if someone was going to cast with
strict, they will cast with coercive.

The difference is that strict tells you ahead of time that there's an
error, whereas coercive tells you at runtime, when your app may blow up
in prod. Perhaps that's what you want, perhaps not.
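
To make the "apple" point concrete, here is a minimal sketch. It assumes
scalar type declarations behave as they eventually shipped in PHP 7; the
names takesInt() and tryCall() are hypothetical and exist only for this
illustration. The file has no strict declare, so its calls are coercive.

```php
<?php
// No declare(strict_types=1) here, so calls made from this file are coercive.

function takesInt(int $n): int
{
    return $n;
}

// Hypothetical helper: report what happens when $input is passed to takesInt().
function tryCall($input): string
{
    try {
        return 'accepted as ' . var_export(takesInt($input), true);
    } catch (TypeError $e) {
        return 'rejected with TypeError';
    }
}

echo tryCall('10'), "\n";    // accepted as 10
echo tryCall('apple'), "\n"; // rejected with TypeError
```

A numeric string is converted, but "apple" is rejected even without strict
mode, so a caller holding unvalidated input has to validate or cast it
either way.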


> If I don't enable strict mode on my code, and somebody else turns on strict when
> calling my code, there's the possibility of new errors if I do not perform
> validation or casting on such values. This means that the de facto standard will
> likely be to code to strict (I can already envision the flood of PRs against OSS
> projects for these issues).

Incorrect. The only person that can turn on strict mode is you, the
author. Now someone can install your library, and edit it to turn on
strict mode (add the declares at the top of the file). But that's very
different from what strict proposes. And that's a problem you have
already today (how many bug reports do you get for "I modified XYZ
class and now it doesn't work").

However, with 2/3 of the options presented in the coercive RFC, you'll
have an INI setting that changes the behavior of your code for you
(the other 1/3 is potentially a significant BC break). How is that
better than a per-file switch? Something you as a library developer
have no control over...

> My point is: I'm sick of writing code like this:
>
>     /**
>      * @param int $code
>      * @param string $reason
>      */
>     public function setStatus($code, $reason = null)
>     {
>         $code = filter_var(
>             $code,
>             FILTER_VALIDATE_INT,
>             FILTER_FLAG_ALLOW_OCTAL | FILTER_FLAG_ALLOW_HEX
>         );
>         if (false === $code) {
>             throw new InvalidArgumentException(
>                 'Code must be an integer'
>             );
>         }
>         if (null !== $reason && ! is_string($reason)) {
>             throw new InvalidArgumentException(
>                 'Reason must be null or a string'
>             );
>         }
>
>         $this->code = $code;
>         $this->reason = $reason;
>     }
>
> I want to be able to write this:
>
>     public function setStatus(int $code, string $reason = null)
>     {
>         $this->code = $code;
>         $this->reason = $reason;
>     }
>
> and _not_ push the burden on consumers to validate/cast their values.

Again, you're completely misunderstanding the dual-mode proposal. Even
if you declared that code in strict mode, the determination of how the
call is made is up to the caller. Not the callee.

So in the exact example you showed, even if you declared strict, I
could call ->setStatus("10", new ObjectImplementingToString()); from
my non-strict code **and it will work fine**. In fact, it's designed
to work that way.
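
A minimal sketch of that call, assuming the behavior as it eventually
shipped in PHP 7. The Response class and the anonymous __toString object
are hypothetical stand-ins, and I've written ?string so the nullable
default is explicit. In a real project the class would live in its own
file, which could declare strict_types=1 without changing the outcome of
this call, because the mode is taken from the calling file.

```php
<?php
// Caller's file: no declare(strict_types=1), so these calls are coercive.

// Hypothetical stand-in for the class under discussion.
final class Response
{
    public $code;
    public $reason;

    public function setStatus(int $code, ?string $reason = null)
    {
        $this->code = $code;
        $this->reason = $reason;
    }
}

// Hypothetical object that can be converted to a string.
$reason = new class {
    public function __toString(): string
    {
        return 'OK';
    }
};

$response = new Response();
$response->setStatus("10", $reason);

var_dump($response->code);   // int(10)
var_dump($response->reason); // string(2) "OK"
```

The callee receives a real int and a real string; the conversion happened
at the call boundary, on the caller's terms.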

> This is what I want from STH, no more no less: sane casting rules, and the
> ability to code to scalar types safely. While I can see some of the benefits of
> strict mode, I'm concerned about the schism it may create in the PHP library
> ecosystem, and that many of the benefits of the coercive portion of that RFC
> will be lost when working with data from unknown data sources.

Considering that strict mode is file-local, it's not all or nothing.
It's up to the author writing code to determine how to handle the
calls (s)he will make.

Anthony

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
