I'm not a voter on RFCs so my input may be largely irrelevant here but for
discussion purposes:

I remain unconvinced by the justification for this proposal. I'm not
saying there's a strong reason NOT to implement it, but I'm not convinced
it will really be a significant benefit to many people at all.

I agree that the number of userland implementations of an "is_valid_json"
type function, including in some widely used frameworks and systems,
indicates some degree of demand in the ecosystem for validating a
JSON string.
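For reference, most such userland helpers boil down to something like the
following (a hypothetical sketch of the common pattern, not taken from any
particular framework; the function name is my own):

```php
<?php

// Typical userland validator: decode the string, discard the result,
// and report only whether decoding succeeded. The decoded value is
// built in memory even though it is never used.
function is_valid_json(string $value): bool
{
    json_decode($value);

    return json_last_error() === JSON_ERROR_NONE;
}

var_dump(is_valid_json('{"a": 1}')); // bool(true)
var_dump(is_valid_json('{"a": 1'));  // bool(false)
```

The wasted allocation for the discarded decoded value is precisely the
overhead the RFC proposes to eliminate.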

But the more salient question is whether there is significant demand for
whatever memory and speed benefit a new core ext_json function would
deliver; that is, has it been established that json_decode or common
userland solutions are in practice not good enough?
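Establishing that would require numbers. A rough way to gather them (a
sketch only; the payload is synthetic and the memory figure is
approximate) would be:

```php
<?php

// Rough sketch of how one might measure whether json_decode-based
// validation is actually a bottleneck for a given payload size.
// The sample payload here is synthetic, chosen purely for illustration.
$payload = json_encode(array_fill(0, 10000, ['id' => 1, 'name' => 'x']));

$before = memory_get_usage();
$start  = hrtime(true);

json_decode($payload);
$valid = json_last_error() === JSON_ERROR_NONE;

$elapsedMs = (hrtime(true) - $start) / 1e6;
// Approximation: peak usage minus the baseline taken before decoding.
$peakBytes = memory_get_peak_usage() - $before;

printf("valid: %s, time: %.2f ms, extra peak memory: %d bytes\n",
    $valid ? 'yes' : 'no', $elapsedMs, $peakBytes);
```

Without measurements of this kind against realistic payloads, "faster and
lighter" remains an abstract claim.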

There are many examples of userland code that would be faster and more
memory-efficient if written in C and compiled in, so the mere fact that
this proposal may offer a somewhat faster way of validating a JSON
string than decoding it is not necessarily sufficient reason to include
it.

Are there examples of issues raised against frameworks or systems saying
they need to validate some JSON but the only existing solutions available
to them are causing memory limit errors, or taking too long? The Stack
Overflow question linked from the RFC says "I need a really, really fast
method of checking if a string is JSON or not."

"Really, really fast" is subjective. No context or further information is
given about what that person would regard as an acceptable time to
validate a blob of valid or invalid JSON of a given size, or why. Indeed,
that same page offers a userland solution that only falls through to
json_decode when some much simpler checks on the input are inconclusive
for validation purposes. I haven't tested it personally, but no doubt in
the vast majority of cases it is sufficiently performant.

In most real-world use cases [that I've encountered over the years] JSON
blobs tend to be quite small. I have dealt with much, much larger JSON
blobs, up to a few hundred MB, and in those cases I've used a streaming
parser. For JSON of that size, a streaming parser is the only realistic
answer - you probably don't want to drop a 300MB string into this RFC's
new function either, if performance and memory efficiency are your
concern.

So I'm curious whether a real-world example can be given where the
efficiency difference between json_decode and a new json_validate
function would matter to the system - whether anyone has encountered a
scenario where this would have made a real difference to them.
