> -----Original Message-----
> From: Theodore Brown [mailto:theodor...@outlook.com]
> Sent: Thursday, February 26, 2015 5:29 PM
> To: internals@lists.php.net
> Subject: [PHP-DEV] A different user perspective on scalar type declarations
>
> I am a full-time PHP developer responsible for maintaining several large
> enterprise applications, as well as a number of libraries and personal apps.
> I have been following the scalar type proposals quite closely, as along with
> return type declarations, scalar types have the potential to reduce errors,
> simplify API documentation, and improve static code analysis.
>
> I am in favor of Anthony's Scalar Type Declarations RFC, for two simple
> reasons:
>
> 1. It doesn't change the behavior of existing weak types.
>
> PHP has long had an emphasis on backwards compatibility, and I'm worried
> that those not in favor of strict types are treating backwards compatibility
> more recklessly than they otherwise would in their fervor to avoid two ways
> of handling scalar types. In my experience dealing with large enterprise apps,
> however, there are hundreds of places where code relies on GET/POST
> parameters being automatically trimmed when passed to a function
> expecting an integer.
> The current coercive proposal would deprecate this and later make it an
> error.
> To avoid these notices/errors when upgrading, developers may take the "easy"
> route of casting any input passed to a function expecting an int or float.
> This is the same "too strict may lead to too lax" problem pointed out by the
> coercive RFC itself. There's a reason that integer handling was actually
> *relaxed* back in PHP 5.1 (see
> http://php.net/manual/en/migration51.integer-parameters.php).
> Why suddenly make the default more strict again?
>
> I am not against tightening up some of the default weak conversions (e.g. to
> not allow "99 bugs" for an int type), but in my opinion this should be done
> very carefully, and separately from any scalar type declaration proposal.
> Major changes to the casting rules have the potential to seriously harm PHP
> 7 adoption, especially in enterprises with large amounts of legacy code. The
> Scalar Type Declarations v0.5 RFC has the advantage here because it "just
> works" when type hints are added to existing code in the default weak mode.

You may have a point there.  As Francois said, he was in favor of allowing
leading and trailing spaces.  I'll definitely reconsider.  Would love to
hear any additional feedback you may have about the conversion rules!
My goal is to balance the 'Just works' aspect with the strict aspect, and
still be able to put it into one rule-set, because I believe this has some
inherent advantages.

> 2. Strict types are important in some cases.
>
> When it comes to authentication and financial calculations (a couple of areas
> I routinely deal with) it is extremely important that errors are caught and
> fixed early in the development process. In financial or security-sensitive
> code, I would *want* any value with the wrong type (even a string like "26")
> to be flagged as an error when passed to a function expecting an integer.


I agree completely.  However, use cases like this are a lot less
common than the situations where you do want sensible coercion to be
allowed.  Not introducing language constructs to support strict typing
doesn't mean I think it's never useful.  I think it's at the level where
it's better to leave it up to (very very simple) custom code, in the form
of if (!is_int($foo)) errorout();, as opposed to introducing a whole 2nd
mode into the language, with the cognitive burden it brings.  When I read
Anthony's comment about the random number generators a couple of days ago:
"I think the case you have to look at here is the target audience. Are you
looking to be all things to all users? Or are you attempting to be an
opinionated tool to help the 99%. Along with password_hash, I think this
random library serves the 99%."
I couldn't help but think the very same could be said about strict type
hints (paraphrasing it myself, "I think the case we have to look at here
is the target audience. Are we looking to be all things to all users? Or
are we attempting to be an opinionated tool to help the 99%. With coercive
types I think we serve the 99%." - whether it's 99% or 95% or 90% is
negotiable - but it doesn't change the takeaway, I think).  Now, the same
can't be said when we use weak types.  Weak type hints are completely
useless for developers who want strict type hints, as their behavior is
completely off from what they expect, and they'd never use them.  But with
the newly proposed coercive type hints - the gap narrows radically.  The
most common real world use cases strict campers brought up in the past as
problematic with weak types - are gone.  We're still left with some useful
use cases for strict, but not at the level where it makes sense to add
language-level support, especially in the form of dual mode, with all its
downsides.
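
To make the "very very simple custom code" option concrete, here is a
minimal sketch of a strict check done in userland.  The names requireInt()
and debitAccount() are hypothetical and only serve as illustration; they
are not part of either RFC.

```php
<?php
// Hypothetical helper: enforce an exact int type in userland,
// without relying on any language-level strict mode.
function requireInt($value): int
{
    if (!is_int($value)) {
        throw new InvalidArgumentException(
            'Expected int, got ' . gettype($value)
        );
    }
    return $value;
}

// Hypothetical security-sensitive function that opts in to the strict check.
// The parameter is deliberately left untyped so that no coercion happens
// before requireInt() inspects the value.
function debitAccount($amountInCents): int
{
    return requireInt($amountInCents);
}
```

With this in place, debitAccount(2600) passes, while debitAccount("26")
throws, even though "26" is a perfectly numeric string.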


> The option for type-based (rather than value-based) validation is equally
> important when it comes to return types. Unless I have missed something,
> the "Coercive Types for Function Arguments" RFC currently doesn't deal with
> return types at all (they aren't mentioned in the RFC). Would it handle scalar
> return types the same way as it does function arguments? If I declare a
> function to return an int, and I return a string instead (even if the string is
> numeric), there are many cases where it would be an unintentional error.
> And if it errors depending on the value, rather than the type, it often
> wouldn't be possible to catch the problem statically.

We'll update the RFC to explicitly mention return.  Yes, return values
will be validated using the same coercive rules as function arguments -
similarly to how they're dealt with in the v0.5 RFC.
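
As a rough illustration of value-based validation on return, the sketch
below relies on the weak-mode behaviour of the v0.5 RFC (which is what
PHP 7 implements); the coercive RFC's exact rule set may differ in the
details.  Both function names are hypothetical.

```php
<?php
// No declare(strict_types=1) here, so the default weak mode applies to this file.

// Hypothetical function: the data layer hands back the ID as a numeric string.
function getCustomerId(): int
{
    return "26"; // numeric string: converted to int(26) on return
}

// Hypothetical function: returns a string that is not numeric at all.
function getBrokenId(): int
{
    return "abc"; // not numeric: raises a TypeError when the function returns
}
```

The first function hands the caller int(26); the second fails at the
return statement, and whether it fails depends on the value, not only on
the type.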

> Here's a simple example of the advantage offered by strict types and static
> analysis in the Scalar Type Declarations v0.5 RFC:
>
> <?php
> declare(strict_types=1);
>
> function getCustomerName(int $customerId): string {
>     // look up customer name from database and return
> }
>
> function getInvoiceByCustomer(int $customerId): Invoice {
>     // retrieve invoice data and return object
> }
>
> $id = filter_input(INPUT_GET, 'customer_id', FILTER_VALIDATE_INT);
>
> if ($id === false) {
>     echo 'Customer ID must be an integer';
> } else {
>     $customer = getCustomerName($id);
>     $invoice = getInvoiceByCustomer($customer);
>     // display invoice
> }
>
> Strict types + static analysis can tell you that this will fail (because it's based
> purely on types, and a string is being passed to a function expecting an
> integer). Coercive typing cannot statically tell you that it will fail, because it
> doesn't know whether the string passed to `getInvoiceByCustomer` is
> acceptable as an integer without also knowing its value.

Correct.  But a static analyzer can tell you it MAY fail, just as easily
as a static analyzer for strict types can tell you it will fail.  Now,
which is better is up for debate.  Personally I think the former is
better, or at the very least just as good.  If, in fact, the string you're
passing is really a numeric string (which if I'm reading your code
correctly, it probably is), then in the strict case, seeing the error in
the static analyzer - or at runtime - you're likely to resort to explicit
casting.  Explicit casting that may hide data loss if - for whatever
reason - what you get (in some error situation or unexpected flow) ends up
being a non-numeric string.  In the coercive case - seeing the warning in
the static analyzer - you're likely to take a look at it and verify that
it's indeed getting the right value, but you'd keep it as-is, and let the
language do a better job at converting the string to an int than an
explicit cast would.  This will actually result in more robust code that,
in the unexpected event that a non-numeric string is received in the
future - would reject it, instead of happily accepting it silently.
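
The difference between the two outcomes can be sketched as follows.
coerceToInt() is a hypothetical stand-in built on filter_var() with
FILTER_VALIDATE_INT; it is not the RFC's actual rule set, just a
validating conversion that rejects non-integer strings.

```php
<?php
// Explicit cast: never fails, so bad input is silently turned into a number.
function castToInt($value): int
{
    return (int) $value;
}

// Validating conversion (hypothetical stand-in for coercive rules):
// returns the int on success, throws on anything that is not an integer string.
function coerceToInt($value): int
{
    $result = filter_var($value, FILTER_VALIDATE_INT);
    if ($result === false) {
        throw new InvalidArgumentException(
            'Not an integer: ' . var_export($value, true)
        );
    }
    return $result;
}
```

castToInt("abc") quietly yields 0 and castToInt("99 bugs") yields 99,
whereas coerceToInt("26") yields 26 and coerceToInt("abc") throws instead
of passing a bogus ID along.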

> Conceptually, the optional
> strict mode proposed in Anthony's RFC is not very different from == vs. ===,
> or `in_array` with the $strict argument set to true. And I certainly am glad
> that PHP offers these options!

Happy you like it :)  But === is very different than strict mode.  When we
added it, it allowed you to do something that was just not possible to do
before - and that was actually a perfect fit for a fairly common use case
(being able to differentiate between NULL and false and 0 in return
values, for instance).  The same cannot be said about strict type hints.
They can be done easily today (using is_int() and friends), and - with the
presence of coercive type hints - they're not nearly as commonly needed as
===.
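
For reference, the kind of return-value check that === made possible
looks roughly like this.  Both helper names are hypothetical; strpos()
is the standard function, which returns int 0 for a match at the first
position and false for no match.

```php
<?php
// Loose check: a match at position 0 compares equal to false,
// so this wrongly reports "not found" for a prefix match.
function containsLoose($haystack, $needle): bool
{
    return strpos($haystack, $needle) != false;
}

// Identity check: only a real boolean false means "not found".
function containsStrict($haystack, $needle): bool
{
    return strpos($haystack, $needle) !== false;
}
```

containsLoose('abc', 'a') returns false even though 'a' is present, while
containsStrict('abc', 'a') returns true.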

> community into separate camps, I would say "It's too late!" The community
> has already been split over this issue for years.

Splitting isn't a binary thing.  Of course, there are already lots of
different camps in the PHP community - procedural vs. OO, frameworks vs.
lean, etc.  This would add *additional* fragmentation - as it doesn't
cleanly map onto any of the existing splits.

Thanks for the feedback!  It took me a while to answer this, but I'm
definitely leaning towards accepting leading and trailing whitespace for
numeric strings now :)

Zeev

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
