Re: Towards a JSON API for the JDK

Brian Goetz Fri, 16 May 2025 06:17:52 -0700

At first, we were hopeful that we could jump right to JSON5, which appears at first glance to be a strictly lexical, more permissive grammar for JSON (supporting comments, trailing commas, more flexible quoting, white space, etc.). If that were actually true, this would have been a slam dunk, since all these lexical niceties don't have an impact on the parsed results. And JSON5 has been gaining some traction, so it probably could have been a justifiable move to jump right to that.

But then we discovered that JSON5 also sneaks in some semantics, by also supporting the exotic numeric values (NaN, infinities, signed zero), which now has consequences for "what is a number", the numeric representation, the API for unpacking numeric values, etc. (Having multiple parsers is one thing; having multiple parsers that produce different semantics is entirely another.) And inventing a new "JSON5 but not quite" subset would be doing no one any favors.
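To make the point about exotic numeric values concrete, here is a minimal Java sketch. The class and method names (`ExoticNumbers`, `fitsBigDecimal`) are made up for illustration and are not part of any proposed API; the sketch only uses standard `Double` and `BigDecimal` behavior to show that NaN, infinities and signed zero do not fit every numeric representation a parser might pick.

```java
import java.math.BigDecimal;

class ExoticNumbers {
    // Hypothetical helper: can this double be carried by a BigDecimal at all?
    static boolean fitsBigDecimal(double d) {
        try {
            new BigDecimal(d);
            return true;
        } catch (NumberFormatException e) {
            return false; // BigDecimal rejects NaN and the infinities
        }
    }

    public static void main(String[] args) {
        System.out.println(Double.NaN == Double.NaN);                  // false
        System.out.println(0.0 == -0.0);                               // true
        System.out.println(Double.valueOf(0.0).equals(-0.0));          // false
        System.out.println(fitsBigDecimal(Double.POSITIVE_INFINITY));  // false
        System.out.println(new BigDecimal("-0").signum());             // 0, the sign is lost
    }
}
```

So a grammar that admits these values forces a decision about the numeric type on the API side, which is exactly the kind of semantic spillover described above.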

Jsonc seems to be entirely a MS-ecosystem thing; it does not have broad enough traction to be the "one grammar" we accept. So pure JSON, as specification-challenged as it is, is the logical, though sad, conclusion (for now).




On 5/16/2025 9:02 AM, Lars Bruun-Hansen wrote:



Great work.

I feel the elephant in the room needs to be addressed: JSON comments. I haven't tested the proposed lib but I cannot see it mentioned so I'm assuming that comments are not supported.

For better or worse, the use of jsonc (JSON with comments) is everywhere in some ecosystems. Unsurprisingly this happens often when JSON is used as a config file format. Looking at you, Microsoft.

It would be nice if the JDK's built-in JSON parser at least recognized this.

I'm well aware that comments are frowned upon in JSON and are part of neither the spec at www.json.org nor RFC 8259.

Yet, I advocate the JDK JSON library should optionally allow comments to be ignored when PARSING. This should be an opt-in feature that would technically treat comments as whitespace during the parsing process.
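As a rough illustration of what "treat comments as whitespace" could mean, here is a minimal, stdlib-only Java sketch. `JsonCommentStripper` and `strip` are hypothetical names, not part of the proposed java.util.json API or of any existing library. The sketch blanks out `//` and `/* ... */` comments that occur outside string literals, so the result can be handed to a strict JSON parser with character offsets preserved.

```java
class JsonCommentStripper {
    // Replaces line and block comments outside string literals with spaces.
    // Assumption: an unterminated block comment is blanked to end of input;
    // a real parser would more likely report an error.
    static String strip(String src) {
        StringBuilder out = new StringBuilder(src.length());
        int i = 0, n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"') {
                // Copy the string literal verbatim, honoring backslash escapes.
                out.append(c);
                i++;
                while (i < n) {
                    char s = src.charAt(i);
                    out.append(s);
                    i++;
                    if (s == '\\' && i < n) {
                        out.append(src.charAt(i));
                        i++;
                    } else if (s == '"') {
                        break;
                    }
                }
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '/') {
                // Line comment: blank out up to, but not including, the newline.
                while (i < n && src.charAt(i) != '\n') {
                    out.append(' ');
                    i++;
                }
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '*') {
                // Block comment: blank out through the closing delimiter, keeping newlines.
                int end = src.indexOf("*/", i + 2);
                int stop = (end < 0) ? n : end + 2;
                for (; i < stop; i++) {
                    out.append(src.charAt(i) == '\n' ? '\n' : ' ');
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String jsonc = "{ \"url\": \"http://x\", // note\n \"n\": 1 /* why */ }";
        System.out.println(strip(jsonc));
    }
}
```

Replacing comments with spaces rather than deleting them keeps line and column numbers stable, so any error positions reported by the downstream parser still point into the original file.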

This would also be in line with what many other parsers do. For example, Jackson has the "ALLOW_COMMENTS" feature [1]. By comparison, the built-in parser in the .NET world, known as System.Text.Json, also supports this [2].

The "discoverer" of JSON, Douglas Crockford, had this to say [3] on the topic:



[QUOTE]

I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't.

Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.


 [/QUOTE]

By not having the ability to ignore comments when parsing we would effectively force users to use another parser first or a minifier. I doubt beginners would appreciate that.



BTW: The test suite already has tests for comments.


/Lars

[1]: https://www.javadoc.io/static/com.fasterxml.jackson.core/jackson-core/2.19.0/com/fasterxml/jackson/core/JsonParser.Feature.html#ALLOW_COMMENTS

[2]: https://learn.microsoft.com/en-us/dotnet/api/system.text.json.jsonreaderoptions?view=net-9.0#properties


[3]: https://plus.google.com/118095276221607585885/posts/RK8qyGVaGSr




On 16/05/2025 01.44, Ethan McCue wrote:

I present for your consideration the library I made when spiraling about this problem space a few years ago


https://github.com/bowbahdoe/json

https://javadoc.io/doc/dev.mccue/json/latest/dev.mccue.json/dev/mccue/json/package-summary.html

Notably missing during the design process here were patterns, hence the JsonDecoder design. I haven't been able to evaluate how patterns affect that on account of them not being out.

I will more thoroughly peruse the draft of java.util.json at a later date, but my initial observations/comments:


* I am not sure having JsonValue be distinct from Json has value.

* toUntyped feels a little strange to me - the only type information presumably lost is the sealed-ness of the hierarchy. The interplay between that and toNumber is also a little unnerving.

* One notion that I found helpful was that a class could be "json encodable," meaning there is a method to call to obtain a canonical json representation.


record Person(String name) implements JsonEncodable {
    @Override
    public Json toJson() {
        return Json.objectBuilder()
            .put("name", name)
            .build();
    }
}

Which helper methods like Json#of(List<? extends JsonEncodable>) could make use of. Json itself (JsonValue in your prototype) could then have a vacuous implementation.

* Terminology wise - I went with reading/writing for the actual parsing/generation of json and encoding/decoding for the mapping of those representations to/from specific classes. The merits are not top of mind, just noting the difference. read/write vs parse/toString+toDisplayString

* One thing I did was make the helper methods in Json null tolerant and the ones in the specific subtypes like JsonString not. This was because from what I saw of usages of javax.json/jakarta.json that nullability was a footgun and correcting for it required changes to code structure (breaking up builder chains with if (x != null) checks)

* The functionality you want from JsonNumber could be achieved by making it just extend Number (https://github.com/bowbahdoe/json/blob/main/src/main/java/dev/mccue/json/JsonNumber.java) instead of a bespoke toNumber. You need the extra methods to go to big decimal and co, but it's just an extension to the behavior of Number at that point.

* JsonObject and JsonArray could implement Map<String, Json> and List<Json> respectively. This lowers the need for toUntyped() - since presumably one of the use cases for that is turning the json tree into something that more generic map/list traversal code can handle. It also complicates any lazy loading somewhat.

* Assuming patterns can be placed on interfaces, you might want to consider something similar to JsonDecoder, but with a pattern instead of a method that throws an exception.

// Where here fromJson would box up the logic for testing and extracting from each element in the array.

List<Person> people = array(json, Person::fromJson);

* I don't think there is sufficient cause for anything to be non-sealed at this point.

* JsonBoolean and JsonNull do not have reasonable alternative implementations - as far as I can imagine, maybe I'm wrong - so maybe those can just be final classes?

* If you seal up the whole hierarchy then it's pretty trivial to make it serializable (https://github.com/bowbahdoe/json/blob/main/src/main/java/dev/mccue/json/serialization/JsonSerializationProxy.java)
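The suggestion above that a JSON number type could simply extend Number can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code of the linked library and not the proposed java.util.json type: `JsonNumberSketch`, `bigDecimalValue` and `longValueExact` are hypothetical names, and the constructor relies on `BigDecimal` parsing without validating the JSON number grammar.

```java
import java.math.BigDecimal;

// Hypothetical number node that plugs into code written against java.lang.Number.
final class JsonNumberSketch extends Number {
    private final BigDecimal value;

    JsonNumberSketch(String text) {
        // Assumption: text is already a syntactically valid JSON number.
        this.value = new BigDecimal(text);
    }

    // The four abstract methods of Number; each may silently lose precision.
    @Override public int intValue() { return value.intValue(); }
    @Override public long longValue() { return value.longValue(); }
    @Override public float floatValue() { return value.floatValue(); }
    @Override public double doubleValue() { return value.doubleValue(); }

    // Extra accessors for callers that need exactness.
    BigDecimal bigDecimalValue() { return value; }

    // Throws ArithmeticException if the value has a fraction or overflows long.
    long longValueExact() { return value.longValueExact(); }

    public static void main(String[] args) {
        Number n = new JsonNumberSketch("30");
        System.out.println(n.intValue()); // 30
    }
}
```

The trade-off is that the lossy Number accessors become the default path, so exact access has to be an explicit opt-in.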





On Thu, May 15, 2025 at 11:29 PM Remi Forax <fo...@univ-mlv.fr> wrote:

    Hi Paul,
    yes, not having a simple JSON API in Java is an issue for beginners.

    It's not clear to me why JsonArray (for example) has to be an
    interface instead of a record ?

    I understand why Json.parse() only works on String and char[] but
    the API makes it too easy to have many performance issues.
    I think you need versions using a Reader and a Path.
    Bonus point, if there is a method walk() that also returns a
    JsonValue but the List/Map inside JsonArray/JsonObject are
    populated lazily.

    Minor point: Json.toDisplayString() should take a second
    parameter indicating the number of spaces used for the
    indentation (like JSON.stringify in JS).

    regards,
    Rémi
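To make the record-versus-interface question above concrete, here is a minimal sketch of what a record-based model might look like. All names (`RecordModelSketch`, `JValue`, `JArray`, and so on) are hypothetical and chosen to avoid confusion with the prototype's types. The sketch uses only Java 17 features: a sealed interface, records, and type patterns with instanceof.

```java
import java.util.List;
import java.util.Map;

class RecordModelSketch {
    // A closed sum of products: every JSON value is exactly one of these records.
    sealed interface JValue permits JArray, JObject, JString, JNumber, JBoolean, JNull {}
    record JArray(List<JValue> values) implements JValue {}
    record JObject(Map<String, JValue> members) implements JValue {}
    record JString(String value) implements JValue {}
    record JNumber(Number value) implements JValue {}
    record JBoolean(boolean value) implements JValue {}
    record JNull() implements JValue {}

    // Traverses a document of the shape {"name": <string>, "age": <number>}.
    static String describe(JValue doc) {
        if (doc instanceof JObject o
                && o.members().get("name") instanceof JString s
                && o.members().get("age") instanceof JNumber n) {
            return s.value() + " is " + n.value().intValue();
        }
        return "unexpected shape";
    }

    public static void main(String[] args) {
        JValue doc = new JObject(Map.of(
                "name", new JString("John"),
                "age", new JNumber(30L)));
        System.out.println(describe(doc)); // John is 30
    }
}
```

Records fix the representation to their components, which is simple but leaves little room for the lazily populated lists and maps mentioned above; interfaces with hidden implementation classes keep that option open.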

    ----- Original Message -----
    > From: "Paul Sandoz" <paul.san...@oracle.com>
    > To: "core-libs-dev" <core-libs-dev@openjdk.org>
    > Sent: Thursday, May 15, 2025 10:30:42 PM
    > Subject: Towards a JSON API for the JDK

    > Hi,
    >
    > We would like to share with you our thoughts and plans towards
    a JSON API for
    > the JDK.
    > Please see the document below.
    >
    > -
    >
    > We have had the pleasure of using a clone of this API in some
    experiments we are
    > conducting with
    > ONNX and code reflection [1]. Using the API we were able to
    quickly write code
    > to ingest and convert
    > a JSON document representing ONNX operation schema into
    instances of records
    > modeling the schema
    > (see here [2]).
    >
    > The overall out-of-box experience with such a minimal
    "batteries included" API
    > has so far been positive.
    >
    > Thanks,
    > Paul.
    >
    > [1] https://openjdk.org/projects/babylon/
    > [2] https://github.com/openjdk/babylon/blob/code-reflection/cr-examples/onnx/opgen/src/main/java/oracle/code/onnx/opgen/OpSchemaParser.java#L87
    >
    > # Towards a JSON API for the JDK
    >
    > One of the most common requests for the JDK is an API for
    parsing and generating
    > JSON. While JSON originated as a text-based serialization
    format for JavaScript
    > objects ("JSON" stands for "JavaScript Object Notation"),
    because of its simple
    > and flexible syntax, it eventually found use outside the
    JavaScript ecosystem as
    > a general data interchange format, such as framework
    configuration files and web
    > service requests/response formats.
    >
    > While the JDK cannot, and should not, provide libraries for
    every conceivable
    > file format or protocol, the JDK philosophy is one of
    "batteries included",
    > which is to say we should be able to write basic programs that
    use common
    > protocols such as HTTP, without having to appeal to third party
    libraries.
    > The Java ecosystem already has plenty of JSON libraries, so
    inclusion in
    > the JDK is largely meant to be a convenience, rather than
    needing to be the "one
    > true" JSON library to meet the needs of all users. Users with
    specific needs
    > are always free to select one of the existing third-party
    libraries.
    >
    > ## Goals and requirements
    >
    > Our primary goal is that the library be simple to use for
    parsing, traversing,
    > and generating conformant JSON documents. Advanced features,
    such as data
    > binding or path-based traversal should be possible to implement
    as layered
    > features, but for simplicity are not included in the core API.
    We adopt a goal
    > that the performance should be "good enough", but where performance
    > considerations conflict with simplicity and usability, we will
    choose in favor
    > of the latter.
    >
    > ## API design approach
    >
    > The description of JSON at `https://json.org`
    describes a JSON document using
    > the familiar "railroad diagram":
    > ![image](https://www.json.org/img/value.png)
    >
    > This diagram describes an algebraic data type (a sum of
    products), which we
    > model directly with a set of Java interfaces:
    >
    > ```
    > interface JsonValue { }
    > interface JsonArray extends JsonValue { List<JsonValue> values(); }
    > interface JsonObject extends JsonValue { Map<String, JsonValue>
    members(); }
    > interface JsonNumber extends JsonValue { Number toNumber(); }
    > interface JsonString extends JsonValue { String value(); }
    > interface JsonBoolean extends JsonValue  { boolean value(); }
    > interface JsonNull extends JsonValue { }
    > ```
    >
    > These interfaces have (hidden) companion implementation classes
    that admit
    > greater flexibility of implementation than modeling them
    directly with records
    > would permit.
    > Further, these interfaces are unsealed. We compromise on the
    sealed sum of
    > products to enable
    > alternative implementations, for example to support alternative
    formats that
    > encode the same information in a JSON document but in a more
    efficient form than
    > text.
    >
    > The API has static methods for parsing strings into a
    `JsonValue`, conversion to
    > and from purely untyped representations (lists and maps), and
    factory methods
    > for building JSON documents. We apply composition consistently,
    e.g, a
    > JsonString has a string, a JsonObject has a map of string to
    JsonValue, as
    > opposed to extension for structural JSON values.
    >
    > It turns out that this simple API is almost all we need for
    traversal. It gives
    > us an immutable representation of a document, and we can use
    pattern matching to
    > answer the myriad questions that will come up (Does this object
    have key X? Does
    > it map to a number? Is that number representable as an
    integer?) when going
    > from an untyped format like JSON to a more strongly typed
    domain model.
    > Given a simple document like:
    >
    > ```
    >    {
    >        "name": "John",
    >        "age": 30
    >    }
    > ```
    >
    > we can parse and traverse the document as follows:
    >
    > ```
    > JsonValue doc = Json.parse(inputString);
    > if (doc instanceof JsonObject o
    >    && o.members().get("name") instanceof JsonString s
    >    && s.value() instanceof String name
    >    && o.members().get("age") instanceof JsonNumber n
    >    && n.toNumber() instanceof Long l && l instanceof int age) {
    >            // use "name" and "age"
    >        }
    > ```
    >
    > Later, when the language acquires the ability to expose
    deconstruction patterns
    > for arbitrary interfaces (similar to today's record patterns, see
    > https://openjdk.org/projects/amber/design-notes/patterns/towards-member-patterns),
    > this will be simplifiable to:
    >
    > ```
    > JsonValue doc = Json.parse(inputString);
    > if (doc instanceof JsonObject(var members)
    >    && members.get("name") instanceof JsonString(String name)
    >    && members.get("age") instanceof JsonNumber(int age)) {
    >            // use "name" and "age"
    >        }
    > ```
    >
    > So, over time, as more pattern matching features are introduced
    we anticipate
    > improved use of the API. This is a primary reason why the API
    is so minimal.
    > Convenience methods we add today, such as a method that
    accesses a JSON
    > object component as say a JSON string or throws an exception,
    will become
    > redundant in the future.
    >
    > ## JSON numbers
    >
    > The specification of JSON number makes no explicit distinction
    between integral
    > and decimal numbers, nor specifies limits on the size of those
    numbers.
    > This is a common source of interoperability issues when
    consuming JSON
    > documents. Generally users cannot always but often do assume
    JSON numbers are
    > parsable, without loss of precision, to IEEE double-precision
    floating point
    > numbers or 64-bit signed integers.
    >
    > In this respect the API provides three means to operate on the
    JSON number,
    > giving the user full control:
    >
    > 1. Underlying string representation can be obtained, if
    preserving syntactic
    >   details such as leading or trailing zeros is important.
    > 2. The string representation can be parsed to an instance of
    `BigDecimal`, using
    >   `toBigDecimal` if preserving decimal numbers is important.
    > 3. The string representation can be parsed into an instance of
    `Long`, `Double`,
    >   `BigInteger`, or `BigDecimal`, using `toNumber`. The result
    of this method
    >   depends on how the representation can be parsed, possibly
    losing precision,
    >   choosing a suitably convenient numeric type that can then be
    pattern
    >   matched on.
    >
    > Primitive pattern matching will help as will further pattern
    matching features
    > enabling the user to partially match.
    >
    > ## Prototype implementation
    >
    > The prototype implementation is currently located in the JDK
    sandbox
    > repository
    > under the `json` branch, see
    > here
    > https://github.com/openjdk/jdk-sandbox/tree/json/src/java.base/share/classes/java/util/json
    > The prototype API javadoc generated from the repository is also
    available at
    > https://cr.openjdk.org/~naoto/json/javadoc/api/java.base/java/util/json/package-summary.html
    >
    > ### Testing and conformance
    >
    > The prototype implementation passes all conformance test cases
    but two,
    > available
    > on https://github.com/nst/JSONTestSuite. The two exceptions are
    the ones which
    > the
    > prototype specifically prohibits, i.e, duplicated names in JSON
    objects
    > (https://cr.openjdk.org/~naoto/json/conformance/results/parsing.html#35).
    >
    > ### Performance
    >
    > Our main focus so far has been on the API design and a functional
    > implementation.
    > Hence, there has been less focus on performance even though we
    know there are a
    > number of performance enhancements we can make eventually.
    > We are reasonably happy with the current performance. The
    > implementation performs well when compared to other JSON
    implementations
    > parsing from string instances and traversing documents.
    >
    > An example of where we may choose simplicity and usability over
    performance
    > is the rejection of JSON documents containing objects that in
    turn contain
    > members
    > with duplicate names. That may increase the cost of parsing,
    but simplifies the
    > user
    > experience for the majority of cases since if we reasonably
    assume JsonObjects
    > are
    > map-like, what should the user do with such members, pick
    the last one?
    > merge
    > the values? or reject?
    >
    > ## A JSON JEP?
    >
    > We plan to draft a JEP when we are ready. Attentive readers will
    observe that
    > a JEP already exists, JEP 198: Light-Weight JSON API
    > (https://openjdk.org/jeps/198). We will
    > either update this JEP, or withdraw it and draft a new one.
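The "JSON numbers" section of the quoted proposal says toNumber picks among Long, Double, BigInteger and BigDecimal depending on how the text parses. The exact rules are not spelled out in the thread, so the following is only a sketch of one plausible policy, with hypothetical names (`ToNumberSketch`, `toNumber`); it should not be read as the prototype's actual behavior.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class ToNumberSketch {
    // Assumed policy: integral text becomes Long, or BigInteger when it overflows;
    // anything with a fraction or exponent becomes Double, or BigDecimal when the
    // double would overflow to infinity. Assumes text is a valid JSON number.
    static Number toNumber(String text) {
        boolean integral = text.indexOf('.') < 0
                && text.indexOf('e') < 0
                && text.indexOf('E') < 0;
        if (integral) {
            try {
                return Long.parseLong(text);
            } catch (NumberFormatException e) {
                return new BigInteger(text); // too large for 64 bits
            }
        }
        double d = Double.parseDouble(text); // may round, i.e. lose precision
        if (Double.isInfinite(d)) {
            return new BigDecimal(text);
        }
        return d;
    }

    public static void main(String[] args) {
        // The caller recovers the concrete type with a type pattern.
        if (toNumber("30") instanceof Long age) {
            System.out.println("age = " + age); // age = 30
        }
    }
}
```

Whatever the real rules are, the shape is the same: the caller gets a Number whose runtime class reflects how the text could be parsed, and then narrows it with instanceof.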
