Hi Hong,

Thanks for the detailed reply.

It seems that there are roughly 32 public classes in flink-avro that are
not marked with @PublicEvolving. Some of them are explicitly marked
as @Internal. Are you suggesting to mark all these classes as
@PublicEvolving?

I agree it would be nice to mark more classes as @PublicEvolving so that
projects outside apache/flink* can take advantage of these APIs. On the
other hand, it is probably useful to make sure that we only expose the
minimal set of APIs for our target use cases and that all public APIs are
well-documented.

Typically public APIs are exposed via interface rather than concrete class.
So it is not clear to me whether we can just mark all those classes
as @PublicEvolving and still achieve the high-level goal described above.
Maybe we should follow the typical FLIP process by listing all APIs that
need to be exposed as public in a FLIP so that other developers will help
double check those APIs and vote for it, if we decide to add more public
APIs in Flink.

On the other hand, I still think it is OK for flink-connector-aws (or any
projects managed by Flink community) to use Internal APIs, for the reasons
described in my reply to Chesnay's email. We can still try to make more
APIs public. I just don't think it has to block the effort to
externalize AWS formats.

What do you think?

Regards,
Dong



On Tue, Jan 31, 2023 at 2:18 AM Teoh, Hong <lian...@amazon.co.uk> wrote:

> Hi @Chesnay @Dong Lin,
>
> Thanks for flagging up the concerns.
>
> I agree that we should make clear which APIs are public in flink-avro so
> that we can make the connector code more independent. IMO that moves in the
> right direction for Flink in general!
>
>
> 1. Public classes/interfaces in flink-avro
> - I took a look at the classes in flink-avro that are used in
> flink-avro-glue-schema-registry, and it turns out those classes are already
> exposed as part of the public interface in flink-avro-glue-schema-registry
> and flink-avro-confluent-registry. (They are already inherently
> @PublicEvolving (RegistryAvroDeserializationSchema, SchemaCoder,
> RegistryAvroSerializationSchema), so I recommend that we should just mark
> them as so for clarity.
> - The only class that isn't inherently exposed as @PublicEvolving is
> MutableByteArrayInputStream, but I took a look, and I think we can remove
> this dependency from flink-avro-glue-schema-registry
> - Also, while looking, I found that there are a couple of classes that are
> recommended for users in our doc [1] but are not marked as @PublicEvolving
> . We can probably also fix this by adding the @PublicEvolving annotation
>
> 2. Our public API depends on Apache Avro classes
> - I agree, ideally we don't have this dependency, but since we already
> inherently expose it to users, we should maintain this for backwards
> compatibility.
> - Also, some of the @PublicEvolving classes in flink-avro already have
> dependencies on Avro classes [2]
>
> As such, I propose that we:
> - Mark the inherently public classes in flink-avro as @PublicEvolving
> - Proceed with externalizing flink-avro-glue-schema-registry
>
>
> Let me know what you think!
>
> Regards,
> Hong
>
> [1] AvroInputFormat
> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/formats/avro/
> [2] AvroFormatOptions
> https://github.com/apache/flink/blob/f75cf38e28143bb36acbc2ee6a5ea9c8852916dd/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/AvroFormatOptions.java
>
>
> On 30/01/2023, 12:53, "Chesnay Schepler" <ches...@apache.org> wrote:
>
>     CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender and
> know the content is safe.
>
>
>
>     On 30/01/2023 13:24, Dong Lin wrote:
>     > it should be OK for
>     > apache/flink-connector-aws to use non-public API, similar to how
> classes
>     > inside Flink can use APIs marked with @Internal.
>     >
>     > We can mark related classes with @Internal as appropriate, without
>     > requiring any new FLIP.
>
>     It's not though, because these APIs don't offer any compatibility
>     guarantees.
>
>     We want to arrive at a state where connectors are fully independent of
>     Flink releases. Using non-public APIs runs counter to that.
>
>
>

Reply via email to