On Tue, Dec 29, 2020 at 5:58 PM Máté Kocsis <kocsismat...@gmail.com> wrote:
> Hi Internals,
>
> I think this will be my last proposal for quite some while :)
> But this time, I'd like to propose bundling the
> https://github.com/crazyxman/simdjson_php extension
> with some major modifications.
>
> The proposed OO API is included in the description of the
> PR that I've just created: https://github.com/php/php-src/pull/6551
>
> The main motivation behind this RFC is two-fold:
> - the underlying simdjson library (https://github.com/simdjson/simdjson)
> which is used by ext/simdjson provides huge performance gains
> compared to ext/json (see some benchmark results in the PR)
> - we can support new use-cases, most notably the so called "on-demand"
> parsing: https://github.com/simdjson/simdjson/blob/master/doc/ondemand.md
> (This is not implemented currently)
>
> Originally, I planned to include the new API in ext/json, but
> unfortunately, simdjson is written in C++, so it would make C++ a hard
> dependency, which was not the case so far. That's why I opted for
> creating ext/simdjson.
>
> Please let me know if you have any feedback.
>
> Regards:
> Máté
>

Same as the others, I don't think it makes sense to bundle this at this point in time -- though I'm also pretty skeptical about bundling it at all.

The end result would be that we have two JSON APIs, one that is always available but slower, and another that is optional but faster. That is really not great. It would make a bit more sense if simdjson was just a backend that could be optionally enabled to speed up the normal JSON API.

I'd also be concerned about compatibility -- JSON parsing is a bit of a minefield and many parsers have subtle differences. I remember this comparison page http://seriot.ch/json/pruned_results.png which showed that PHP 7 (which is when we switched to jsond) had one of the only fully conforming JSON implementations.
It doesn't seem unlikely that simdjson will have differences somewhere, and it would be unfortunate if there were two JSON APIs with subtly different behavior.

I think the main value proposition (even though it is not part of your initial proposal) here would be a streaming JSON API, that does not require parsing or even loading the whole JSON document at once. I think the addition of such an API would be valuable -- but wouldn't it be possible to introduce this based on our existing JSON parser?

Nikita