Hi,
In the context of the ongoing migration from Log4j 1.x/Reload4j
to Log4j Core 2.x[1], I believe that the choice of configuration format
used by the Kafka binary distribution deserves particular attention.
Log4j Core 2.x supports four native configuration formats (XML, JSON,
YAML and Java Properties[2]). The version 1.x XML and Java Properties
configuration file formats are incompatible with the new formats, but
they can be converted at runtime, using the `log4j-1.2-api` artifact[3].
This is of course a transitional option, since the old formats are not
extensible and do not offer most of the features of Log4j Core 2.x.
While the 2.x Java Properties configuration format might seem like the
natural migration path for the current Apache Kafka configuration, I
would strongly advise against this choice. The Log4j Core 2.x runtime
has a hierarchical structure, which can be easily reflected by formats
like XML, JSON or YAML, but not so much by Java Properties. For this
reason the `*.properties` configuration format:
* is very verbose,
* contains a lot of quirks meant to make it less verbose[4].
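As a rough illustration of the verbosity, here is a minimal sketch of a
2.x Java Properties configuration with one console appender and one
logger. The appender name `CONSOLE`, the logger name `org.apache.kafka`
and the pattern are placeholders chosen for this example; they are not
taken from the actual Kafka configuration.

```properties
# Every nested component is flattened into a dotted key, so the
# "appender.0." prefix is repeated on each line of the same appender.
appender.0.type = Console
appender.0.name = CONSOLE
appender.0.layout.type = PatternLayout
appender.0.layout.pattern = [%d] %p %m (%c)%n

# Root logger, written in the fully expanded form.
rootLogger.level = INFO
rootLogger.appenderRef.0.ref = CONSOLE
# One of the quirks from [4]: recent 2.x releases also accept the
# shorthand below in place of the two lines above.
# rootLogger = INFO, CONSOLE

# The "0" identifiers are arbitrary; they only group the keys that
# belong to the same component.
logger.0.name = org.apache.kafka
logger.0.level = INFO
```

In the hierarchical formats each of these components becomes a single
nested element, so the repeated prefixes disappear.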
If we exclude Java Properties, only three choices remain:
* The default XML format, which has no dependencies (if we exclude the
JPMS `java.xml` module) and has a schema[5] that can be used to validate
the configurations. This might, however, strongly contrast with the
other Kafka configuration files that are maintained as Java Properties.
* The JSON format has a dependency on `jackson-databind`, which is
already present in the Kafka binary distribution. It is a matter of
personal taste, but I find it even more verbose than the Java Properties
format (although it does not have quirks). In Log4j Core 3.x the
dependency on `jackson-databind` has been replaced with an in-house parser.
* My favorite would be the YAML format, which would require the addition
of `jackson-dataformat-yaml` (and its `snakeyaml` transitive dependency)
to the Kafka runtime. The advantage, however, would be that it is
probably the least verbose of the available formats.
What do you think? Which of the configuration formats available in
Log4j Core 2.x should Kafka use by default?
Piotr
[1] https://github.com/apache/kafka/pull/17373
[2]
https://logging.apache.org/log4j/2.x/manual/configuration.html#configuration-factories
[3]
https://logging.apache.org/log4j/2.x/migrate-from-log4j1.html#ConfigurationCompatibility
[4]
https://logging.apache.org/log4j/2.x/manual/configuration.html#java-properties-features
[5] https://logging.apache.org/xml/ns/