Hi Jark,

thanks for your feedback. Let me answer some of your comments:

1) Since we decided to build an entire new stack, we can also introduce better names for columns, constraints, and watermark spec. My goal was to shorten the names during this refactoring. Therefore, `TableSchema` becomes `Schema` and `TableColumn` becomes `Column`. This also fits better to a `CatalogView` that has a schema but is actually not a table but a view. So `Column` is very generic. What do you think?

2) `ComputedColumn` and `WatermarkSpec` of the new generation will store `ResolvedExpression`.

3) I adopted most of the methods from `TableSchema` in `ResolvedSchema`. However, I skipped `getColumnDataTypes()` because the behavior is not clear to me. Should it include computed columns or virtual metadata columns? I think we should force users to think about what they require. Otherwise we implicitly introduce bugs.

Regards,
Timo

On 09.02.21 10:56, Jark Wu wrote:
Hi Timo,

The messy TableSchema confuses many developers.
It's great to see we can finally come up with a clean interface hierarchy
and still backward compatible.

Thanks for preparing the nice FLIP. It looks good to me. I have some minor
comments:

1) Should `ResolvedSchema#getColumn(int)` returns `TableColumn` instead of
`Column`?

2) You mentioned ResolvedSchema should store ResolvedExpression, should we
extend
   `ComputedColumn` and `WatermarkSpec` to allow `ResolvedExpression`?

3) `ResolvedSchema` aims to replace `TableSchema`, it would be better to
add un-deprecated
methods of `TableSchema` into `ResolvedSchema`
(e.g. `getColumnDataTypes()`).
Then users can have a smooth migration.

Best,
Jark

On Mon, 8 Feb 2021 at 20:21, Dawid Wysakowicz <[email protected]>
wrote:

Hi Timo,

 From my perspective the proposed changes look good. I agree it is an
important step towards FLIP-129 and FLIP-136. Personally I feel
comfortable voting on the document.

Best,

Dawid

On 05/02/2021 16:09, Timo Walther wrote:
Hi everyone,

you might have seen that we discussed a better schema API in past as
part of FLIP-129 and FLIP-136. We also discussed this topic during
different releases:

https://issues.apache.org/jira/browse/FLINK-17793

Jark and I had an offline discussion how we can finally fix this
shortcoming and maintain backwards compatibile for a couple of
releases to give people time to update their code.

I would like to propose the following FLIP:


https://cwiki.apache.org/confluence/display/FLINK/FLIP-164%3A+Improve+Schema+Handling+in+Catalogs


The FLIP updates the class hierarchy to achieve the following goals:

- make it visible whether a schema is resolved or unresolved and when
the resolution happens
- offer a unified API for FLIP-129, FLIP-136, and catalogs
- allow arbitrary data types and expressions in the schema for
watermark spec or columns
- have access to other catalogs for declaring a data type or
expression via CatalogManager
- a cleaned up TableSchema
- remain backwards compatible in the persisted properties and API

Looking forward to your feedback.

Thanks,
Timo




Reply via email to