Hello Pulsar community,

I would like to start a discussion about PIP 197: Add Schema hash and equals to public API. The proposal is available at https://github.com/apache/pulsar/issues/16959 and is also pasted below.
Looking forward to hearing your thoughts,
Alex

## Motivation

Currently, the `Schema` interface in the public client-api does not provide access to a sensible hash function. The fallback to Java’s identity-based `equals` and `hashCode` inherited from `Object` makes it unfit for use in most hash-based collections. For example, it prevents using a schema as a key in a cache. Further, the lack of a reliable `equals` function means that there is no way to tell whether two schemas are the same.

## Goal

The goal of this proposal is to provide a sensible `hashCode` and `equals` implementation for `Schema` as part of the public API. Currently, pulsar-common contains `SchemaHash`, a wrapper class that exists to solve the aforementioned problems. However, `SchemaHash` is not part of the public API, so users should not depend on it.

## API Changes

No change is required beyond moving `SchemaHash` from pulsar-common to the public API. The only additional change worth considering is renaming the class, since the wrapper offers more than just a schema's hash.

## Implementation

Move `SchemaHash` from the `org.apache.pulsar.common.protocol.schema` package in pulsar-common to the `org.apache.pulsar.common.schema` package in pulsar-client-api.

## Rejected Alternatives

Providing default methods for `equals` and `hashCode` directly on the `Schema` interface is not possible because Java does not allow an interface to declare default methods that override methods of `Object`.

Another option would be to provide the `hashCode` and `equals` functionality through similarly named default methods that any `Schema` implementation could use. The drawback of this idea is that every implementation would still have to override `equals` and `hashCode` to delegate to the provided methods, and the extra methods would pollute the interface.
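To illustrate the problem from the Motivation section and the wrapper approach from the Goal section, here is a minimal, self-contained sketch. It does not use the Pulsar API: `DemoSchema` and `SchemaKey` are hypothetical stand-ins invented for this example, and comparing the schema type plus the raw definition bytes is an assumption about what "the same schema" means, not a description of how `SchemaHash` is implemented.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a schema. It deliberately does NOT override
// equals/hashCode, so it inherits identity-based behavior from Object.
final class DemoSchema {
    final String type;        // e.g. "JSON" or "AVRO" (illustrative values)
    final byte[] definition;  // serialized schema definition

    DemoSchema(String type, byte[] definition) {
        this.type = type;
        this.definition = definition.clone();
    }
}

// Hypothetical wrapper that supplies content-based equals/hashCode,
// so that equal schemas can be used as keys in hash-based collections.
final class SchemaKey {
    private final String type;
    private final byte[] definition;

    private SchemaKey(String type, byte[] definition) {
        this.type = type;
        this.definition = definition;
    }

    static SchemaKey of(DemoSchema schema) {
        // Defensive copy keeps the key immutable.
        return new SchemaKey(schema.type, schema.definition.clone());
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SchemaKey)) {
            return false;
        }
        SchemaKey other = (SchemaKey) o;
        return type.equals(other.type)
                && Arrays.equals(definition, other.definition);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals: same type and bytes give the same hash.
        return 31 * type.hashCode() + Arrays.hashCode(definition);
    }
}

final class SchemaKeyDemo {
    public static void main(String[] args) {
        byte[] def = "{\"type\":\"string\"}".getBytes(StandardCharsets.UTF_8);
        DemoSchema a = new DemoSchema("JSON", def);
        DemoSchema b = new DemoSchema("JSON", def); // same content, different object

        // Keyed by the schema object itself: lookup with an equal schema misses.
        Map<DemoSchema, String> byIdentity = new HashMap<>();
        byIdentity.put(a, "cached");
        System.out.println(byIdentity.get(b)); // prints "null"

        // Keyed by the wrapper: lookup with an equal schema hits.
        Map<SchemaKey, String> byContent = new HashMap<>();
        byContent.put(SchemaKey.of(a), "cached");
        System.out.println(byContent.get(SchemaKey.of(b))); // prints "cached"
    }
}
```

The wrapper keeps the schema type untouched, which is the main appeal of the approach proposed here: callers opt in to content-based comparison only where they need it, and no existing implementation has to change.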