Hi, Ran Tao.
Thanks for bringing it up.
TBH, to me, it's not that confusing.
Is the problem that applyReadableMetadata and applyProjection both pass a
producedDataType, and the source connector developer needs to choose which
one to use as the final output type?

As the Javadoc of applyReadableMetadata says, use the producedDataType passed
to this method instead of the one passed to applyProjection.

For your question, I think the responsibilities of these two interfaces are
quite independent. What kind of independence are you expecting?

Btw, to avoid confusion, I think we may need to specify in the Javadoc of
applyReadableMetadata that applyProjection will be called before
applyReadableMetadata.
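
To make the call order concrete, here is a minimal sketch. It uses simplified
stand-ins, not the real Flink interfaces: the produced type is modeled as a
list of column names rather than a DataType, and ProjectionPushDown,
ReadingMetadata, SketchSource and PushDownOrderSketch are hypothetical names
made up for illustration. It only assumes what is described in this thread:
applyProjection is called first, and the producedDataType passed to
applyReadableMetadata is the one to keep.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins, NOT the real Flink interfaces: the produced type is
// modeled as a list of column names instead of a Flink DataType.
interface ProjectionPushDown {
    void applyProjection(int[][] projectedFields, List<String> producedDataType);
}

interface ReadingMetadata {
    void applyReadableMetadata(List<String> metadataKeys, List<String> producedDataType);
}

// Hypothetical source that implements both abilities.
class SketchSource implements ProjectionPushDown, ReadingMetadata {
    final List<String> calls = new ArrayList<>();
    List<String> producedDataType = Arrays.asList("id", "name", "price");

    @Override
    public void applyProjection(int[][] projectedFields, List<String> producedDataType) {
        calls.add("applyProjection");
        // Covers the projected physical columns only.
        this.producedDataType = producedDataType;
    }

    @Override
    public void applyReadableMetadata(List<String> metadataKeys, List<String> producedDataType) {
        calls.add("applyReadableMetadata");
        // Called last: this type already reflects the projection and adds the
        // metadata columns, so it replaces whatever applyProjection stored.
        this.producedDataType = producedDataType;
    }
}

class PushDownOrderSketch {
    // Mimics the call order described in this thread (an assumption of the sketch).
    static SketchSource plan() {
        SketchSource source = new SketchSource();
        source.applyProjection(new int[][] {{0}, {2}}, Arrays.asList("id", "price"));
        source.applyReadableMetadata(
                Arrays.asList("timestamp"), Arrays.asList("id", "price", "timestamp"));
        return source;
    }

    public static void main(String[] args) {
        SketchSource source = plan();
        System.out.println(source.calls);
        System.out.println(source.producedDataType);
    }
}
```

In this sketch the value stored by applyProjection is simply replaced, which
is why the type from applyReadableMetadata should be treated as the final one.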


Best regards,
Yuxia

----- Original Message -----
From: "Ran Tao" <chucheng...@gmail.com>
To: "dev" <dev@flink.apache.org>
Sent: Wednesday, February 8, 2023, 8:46:45 PM
Subject: Confusion about some overlapping functionality of 
SupportsProjectionPushDown and SupportsReadingMetadata

Currently we use SupportsProjectionPushDown to push down physical columns,
and SupportsReadingMetadata is used to read metadata columns.
There is no problem when implementing one of the interfaces alone. If both
interfaces are implemented at the same time, the semantics become
confusing.

For example, if we update the schema or producedDataType in both
SupportsProjectionPushDown#applyProjection and
SupportsReadingMetadata#applyReadableMetadata, the update made in the
former has no effect, because the former is called first and the latter
then overwrites it.

The interface documentation has some usage notes about this, but it is
still very confusing. In this case, you only need to put the logic in
SupportsReadingMetadata#applyReadableMetadata (SupportsProjectionPushDown
is still implemented, but its overridden method is left empty), and the
rule matching logic for SupportsReadingMetadata will push down both the
physical columns and the metadata columns, generate the producedDataType,
and pass it to the method.

At this point SupportsProjectionPushDown is more like a marker interface.
In addition, if the implementation of SupportsReadingMetadata relies on
member variables that are also updated in SupportsProjectionPushDown,
unexpected problems may occur. Developers have to read the implementations
of both interfaces carefully to understand this, and the overlapping
functionality imposes a development cost on connector developers
(normally the two interfaces should be isolated, so that developers can
tell what each one does from its name).

I wonder if the community has considered making the responsibilities of
these two interfaces more independent and clear in subsequent updates.
Maybe my understanding is incomplete; I look forward to your
opinions.

-- 
Best Regards,
Ran Tao
https://github.com/chucheng92
