Hi @Dong Lin @Chesnay Schepler, 

We have deviated from the original purpose of this thread.
Let's keep this thread to the original proposal of externalising 
flink-avro-glue-schema-registry and flink-json-glue-schema-registry formats to 
the flink-connector-aws repository. 

The way I see it, @Chesnay Schepler raised a potential issue regarding the 
public APIs in flink-avro + their stability. 
So my answer to that is to create a JIRA to tidy up the public API in 
flink-avro, but we will not block this work since the classes used in 
flink-avro-glue-schema-registry, namely ( RegistryAvroDeserializationSchema, 
SchemaCoder and RegistryAvroSerializationSchema) are effectively already Public 
(Even though they are not labelled as such), since they are extended by classes 
that are @PublicEvolving. This means that whatever work we do to solidify the 
public API needs to ensure it is backwards compatible anyways, since it is 
already @PublicEvolving.

@Chesnay Schepler are you happy with this followup?


Regarding the other questions you had, @Dong Lin, see answers below.

> It seems that there are roughly 32 public classes in flink-avro that are
not marked with @PublicEvolving. Some of them are explicitly marked
as @Internal. Are you suggesting to mark all these classes as
@PublicEvolving?

No, as @Danny Cranmer clarified, I'm just suggesting that we make the classes 
that are already inherently "public", because they are extended by 
"@PublicEvolving" classes, specifically RegistryAvroDeserializationSchema, 
SchemaCoder and RegistryAvroSerializationSchema. We do need a more detailed 
look at the flink-avro Public classes to expose only a minimal set of APIs, we 
can do that as part of the followup JIRA I will create.

> Typically public APIs are exposed via interface rather than concrete class.

Yes, but the point is that here we have already made these classes public (even 
if they are not annotated with @PublicEvolving). So the work to "tidy up" the 
Public API for flink-avro does not need to block the externalisation of the 
flink-avro-glue-schema-registry module, because we need to ensure backwards 
compatibility with our work (i.e. the shape should not change incompatibly).

Feel free to start another thread on discussing Public APIs in general if there 
are more questions!


Regards,
Hong


On 31/01/2023, 09:27, "Chesnay Schepler" <ches...@apache.org> wrote:

    CAUTION: This email originated from outside of the organization. Do not 
click links or open attachments unless you can confirm the sender and know the 
content is safe.



     > Existing APIs in Flink which are marked with @Internal also don't
    offer any compatibility guarantee. Yet we are OK to have classes in
    Flink use them.

    Well of course, it's the same codebase. If we change the API we can
    change all usages as well. Hence, there's no problem using internal APIs.

     > After all, externalized connectors are Flink project's code, not
    Flink users' code, and thus can be considered internal code by default

    That's not how API compatibility works. Being "part of the Public API"
    means that we intend other software projects to make use of these APIs.
    This also includes other Flink software projects, like statefun,
    connectors, flink-ml and so on.

     > By "fully independent", I assume you mean that we want to be able to
    bump up the version of Flink in externalized connectors without having
    to change those connectors.

    What I mean is that a connector build for 1.x continues to work for 1.y.

     > Strictly speaking, we can not be fully independent because even APIs
    marked with @Public can still have backward incompatible changes across
    the major release.

    Of course, that's what makes a major release a major release. It's
    expected that things might not work across 1.x and 2.y.

    On 31/01/2023 01:06, Dong Lin wrote:
    > On Mon, Jan 30, 2023 at 8:51 PM Chesnay Schepler <ches...@apache.org> 
wrote:
    >
    >> On 30/01/2023 13:24, Dong Lin wrote:
    >>> it should be OK for
    >>> apache/flink-connector-aws to use non-public API, similar to how classes
    >>> inside Flink can use APIs marked with @Internal.
    >>>
    >>> We can mark related classes with @Internal as appropriate, without
    >>> requiring any new FLIP.
    >> It's not though, because these APIs don't offer any compatibility
    >> guarantees.
    >>
    > Existing APIs in Flink which are marked with @Internal also don't offer 
any
    > compatibility guarantee. Yet we are OK to have classes in Flink use them.
    > This is because we (the Flink community developers) have full control of
    > those code in apache/flink and are free to update these code to work with
    > any backward incompatible changes made to those internal API. After all,
    > externalized connectors are Flink project's code, not Flink users' code,
    > and thus can be considered internal code by default
    >
    > I think the same logic apply to those code in apache/flink-connector-aws.
    > Can you explain why it is not OK?
    >
    >
    >> We want to arrive at a state where connectors are fully independent of
    >> Flink releases. Using non-public APIs runs counter to that.
    >>
    > By "fully independent", I assume you mean that we want to be able to bump
    > up the version of Flink in externalized connectors without having to 
change
    > those connectors.
    >
    > Strictly speaking, we can not be fully independent because even APIs 
marked
    > with @Public can still have backward incompatible changes across the major
    > release. Yes we can minimize the changes needed to externalized connectors
    > by using only public PAI. Though it is a nice property, I don't think this
    > is a strict requirement due to the reasons explained above.
    >
    > Maybe we should discuss this in a separate thread to reach an agreement on
    > this topic?
    >


Reply via email to