Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling

Nicolas Grekas Mon, 12 Sep 2022 12:46:32 -0700

> On 9/8/22 19:06, Nicolas Grekas wrote:
> >  From what I understand, there are two things in the RFC:
> >
> >     1. a proposal to wrap any throwables thrown during unserialization
> >     inside an UnserializationFailedException
>
> Correct.
>
> >     2. a discussion to figure out a way to turn these notices+warnings
> into
> >     exceptions.
>
> Partly correct:
>
> The RFC primarily proposes unifying the existing non-Exception errors
> (which are currently split between E_WARNING and E_NOTICE with no
> apparent pattern). To *what* exactly (E_WARNING or some Exception) they
> are unified is up for discussion/vote.
>


Thanks for clarifying. Our accepted release process policy
<https://wiki.php.net/rfc/releaseprocess> says "Backward compatibility must
be respected with the same major releases". I think it's critical to stick
to this policy as much as possible and also to plan for the smoothest
upgrade path when preparing the next major (deprecations FTW.)
That's what I have in mind when I review the RFC and where I don't see the
match yet. I know we're only at the beginning of the discussion and we're
still looking for what would work best.



> > About 1., I read that you think "this is not considered an issue", but to
> > me, changing the kind of exceptions that a piece of code might throw is a
> > non negligible BC break. There needs to be serious justification for it.
> I
> > tried looking for one, but I'm not convinced there is one: replacing the
> > existing catch(Throwable) by catch(UnserializationFailedException) won't
> > provide much added value to userland, if any. And "reliably include more
> > than just the call unserialize() in the try" is not worth the BC break
> IMHO.
>
> unserialize() is a generic function that will also call arbitrary
> callbacks. You already have no guarantees whatsoever about the kind of
> exception that is thrown from it. I am unable to think of a situation
> where the input data is well-defined enough to reliably throw a specific
> type of exception, but not well-defined enough to successfully unserialize.
>
> So on the contrary wrapping any Exceptions with a well-defined Exception
> allows you to rely on a specific and stable Exception to catch in the
> first place.
>
> As you specifically mention catch(Throwable), I'd like to point out that
> this will continue to work, because UnserializationFailedException
> implements Throwable.
>

You might have misunderstood my point so let me give two examples:

Example 1.
set_error_handler(fn ($t, $m) => throw new ErrorException($m));
try {
   $ser = serialize($value);
catch (ErrorException $e) {
  // deal with $e
finally {
  restore_error_handler();
}

Example 2.
try {
   $ser = @serialize($value);
catch (Exception $e) {
  // deal with $e but don't catch Error on purpose, to let them bubble down
}

Changing serialize to wrap any throwables into an
UnserializationFailedException would break both examples. That's what I
mean when I say this breaks BC. As any BC-break, this might cause big
troubles for the community and this should be avoided. The release process
policy I mentioned above is a wise document.



> > About 2., unserialize() accepts a second argument to give it options. Did
> > you consider adding a 'throw_on_error' option to opt-in into the behavior
> > you're looking for? I think that would be a nice replacement for the
> > current logics based on set_error_handler(). If we want to make PHP 9
> throw
> > by default, we could then decide to deprecate *not* passing this option.
>
> I did not consider this when writing the RFC. I now considered it, and I
> do not believe adding a flag to control this a good thing.
>
> 1. "No one" is going to set that flag, because it requires additional
> effort. I strongly believe that the easiest solution should also the
> correct solution for the majority of use cases. The flag fails this
> "test", because the correct solution should be "don't fail silently".
>

Unless we deprecate calling the method without the argument as I suggested.
I agree, it might not be ideal to ask the community to rewrite every call
to serialize($v) to serialize($v, ['throw_on_error' => true/false]) but
that's still way better than a BC break.
Maybe there's another way that doesn't break BC. I'm looking forward to one.

2. If you actually want to set that option in e.g. a library, then you
> break compatibility with older PHP versions for no real gain. If you go
> all the way and remember to add that extra flag, then you can also write
> an unserialize wrapper that does the set_error_handler dance I've shown
> in the introduction of the RFC. Similar effort to adding the flag, but
> better compatibility.
>

I think my proposal provides BC with older versions of php: passing extra
(unused) options is allowed and polyfilling UnserializationFailedException
is trivial, so one could set_error_handler(fn (...) => throw new
UnserializationFailedException(...)) and be both BC and FC.

About writing a wrapper, that's what we all do right now, each with our own
wrapping logic. You're looking for unifying those logics and I support that
goal.

But this leads to another idea : what about adding a new function, let's
name it unserialize2() for now, and deprecate unserialize()? That'd provide
the upgrade path we need.



> 3. It does literally nothing for users that use a throwing error
> handler, which to my understanding includes the vast majority of
> framework code.
>

I'm not sure what you mean here. The flag would allow removing those error
handlers, which is what you want to achieve in the end I believe (well,
except when BC with older versions is desired of course, see my previous
comment.)



> 4. Even for users that do not use a throwing error handler, omitting the
> option literally does nothing, because unserialize() might already throw
> depending on what the unserialize handlers of unserialized objects do.
>

Sure. I don't see how that's a problem. Throwing is what almost any lines
of PHP code can do. We're used to dealing with that and I don't see why
serialize() needs special care. I might have missed the point you want to
make here though.



> > Lastly I'd like to add a 3. to your proposal, because there is one more
> > thing that makes unserialization annoying: the unserialize_callback_func
> > <
> https://www.php.net/manual/var.configuration.php#ini.unserialize-callback-func
> >
> > ini setting. It would be great to be able to pass a 'callback_func'
> option
> > to unserialize to provide it, instead of calling ini_set() as we have to
> > quite often right now.
> >
> > Would that make sense to you?
> >
>
> TIL about that ini setting. Can you clarify where this callback comes in
> helpful? What can it do for you what your autoloader can't do? I've
> attempted to search GitHub to find out about the use cases, but almost
> all of the results are copies of php-src that match the .phpt tests.
>

Sure: that callback is called when PHP tries to create an object of class
Foo but the autoloader fails to load said class. When there is no handler,
PHP creates a __PHP_Incomplete_Class object, but I regularly use this
callback to throw instead when I don't want those strange incomplete
objects (note that I also have seen use cases where generating a
__PHP_Incomplete_Class *is* the desired behavior.)


However as of now I do not believe that it is appropriate to include in
> my RFC, because it is only indirectly related to error handling. I'd
> like to keep the RFC focused. This makes it easier for readers to
> understand the proposal, allowing voters to make an educated vote and
> serving as longer-term documentation if the vote passes.
>

I proposed that because in many situations unserializing a payload to
something that contains __PHP_Incomplete_Class objects is not desired and
should be leveled up to an error instead. That's the relation with error
handling and thus with the RFC.

Nicolas

Re: [PHP-DEV] RFC [Discussion]: Improve unserialize() error handling

Reply via email to