On Tue, Dec 29, 2020 at 5:58 PM Máté Kocsis <kocsismat...@gmail.com> wrote:

> Hi Internals,
>
> I think this will be my last proposal for quite a while :)
> But this time, I'd like to propose bundling the
> https://github.com/crazyxman/simdjson_php extension
> with some major modifications.
>
> The proposed OO API is included in the description of the
> PR that I've just created: https://github.com/php/php-src/pull/6551
>
> The main motivation behind this RFC is two-fold:
> - the underlying simdjson library (https://github.com/simdjson/simdjson)
> which is used by ext/simdjson provides huge performance gains
> compared to ext/json (see some benchmark results in the PR)
> - we can support new use-cases, most notably the so-called "on-demand"
> parsing: https://github.com/simdjson/simdjson/blob/master/doc/ondemand.md
> (This is not implemented currently)
>
> Originally, I planned to include the new API in ext/json, but unfortunately,
> simdjson is written in C++, so it would make C++ a hard dependency,
> which was not the case so far. That's why I opted for creating
> ext/simdjson.
>
> Please let me know if you have any feedback.
>
> Regards:
> Máté
>

Same as the others, I don't think it makes sense to bundle this at this
point in time -- though I'm also pretty skeptical about bundling it at all.
The end result would be that we have two JSON APIs, one that is always
available but slower, and another that is optional but faster. That is
really not great. It would make a bit more sense if simdjson were just a
backend that could be optionally enabled to speed up the normal JSON API.

I'd also be concerned about compatibility -- JSON parsing is a bit of a
minefield and many parsers have subtle differences. I remember this
comparison page http://seriot.ch/json/pruned_results.png which showed that
PHP 7 (which is when we switched to jsond) had one of the few fully
conforming JSON implementations. It doesn't seem unlikely that simdjson
will have differences somewhere, and it would be unfortunate if there were
two JSON APIs with subtly different behavior.

I think the main value proposition (even though it is not part of your
initial proposal) here would be a streaming JSON API, that does not require
parsing or even loading the whole JSON document at once. I think the
addition of such an API would be valuable -- but wouldn't it be possible to
introduce this based on our existing JSON parser?

Nikita
