[DISCUSS] Json libraries for Kafka

Ismael Juma Mon, 13 Jul 2015 14:06:21 -0700

Hi all,

Kafka currently use scala.util.parsing.json.JSON as its json parser and it
has a number of issues:


* It encourages unsafe casts (returns `Option[Any]`)
* It's slow (it relies on parser combinators under the hood)
* It's not thread-safe (so external locks are needed to use it in a
concurrent environment)
* It's deprecated (it should have never been included in the standard
library in the first place)

KAFKA-1595[1] has been filed to track this issue.

I initially proposed a change using spray-json's AST with the jawn
parser[2]. Gwen expressed some reservations about the choice (a previous
discussion had concluded that Jackson should be used instead) and asked me
to raise the issue in the mailing list[3].

In order to have a fair comparison, I implemented the change using Jackson
as well[4]. I paste part of the commit message:

"A thin wrapper over Jackson's Tree Model API is used as the replacement.
This wrapper
increases safety while providing a simple, but powerful API through the
usage of the
`DecodeJson` type class. Even though this has a maintenance cost, it makes
the API
much more convenient from Scala. A number of tests were added to verify the
behaviour of this wrapper. The Scala module for Jackson doesn't provide any
help for our current usage, so we don't
depend on it."

A comparison between the two approaches as I see it:

Similarities:

   1. The code for users of the JSON library is similar
   2. No third-party dependencies
   3. Good performance

In favour of using Jackson:

   1. Same library for client and broker
   2. Widely used

In favour of using spray-json and jawn:

   1. Simple type class based API is included and it has a number of nice
   features:
      1. Support for parsing into case classes (we don't use this yet, but
      we could use it to make the code safer and more readable in some
cases)[5].
      2. Very little reflection used (only for retrieving case classes
      field names).
      3. Write support (could replace our `Json.encode` method).
   2. Less code to maintain (ie we don't need a wrapper to make it nice to
   use from Scala)
   3. No memory overhead from wrapping the Jackson classes (probably not a
   big deal)

I am happy to go either way as both approaches have been implemented and I
am torn between the options.

What do you think?

Best,
Ismael

[1] https://issues.apache.org/jira/browse/KAFKA-1595
[2]
https://github.com/ijuma/kafka/commit/80974afefc00eb6313a7357e7942d5d86ffce84d
[3]
https://issues.apache.org/jira/browse/KAFKA-1595?focusedCommentId=14512881&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14512881
[4]
https://github.com/ijuma/kafka/commit/4ca0f1111eb37e8be2d388b60efacc19bc6788b6
[5] The Scala module for Jackson (which is not being used in the commit
above) also supports this, but it uses a reflection-based approach instead
of type classes.

[DISCUSS] Json libraries for Kafka

Reply via email to