Re: Resolving a CatalogTable

Timo Walther Fri, 28 Jan 2022 06:38:12 -0800

Hi Balazs,

you are right, the new APIs only allow the serialization of resolved instances. This ensures that only validated, correct instances are put into the persistent storage such as a database. The framework will always provide resolved instances and call the corresponding methods with those. They should be easily serializable.

However, when reading from a persistent storage such as a database, the framework needs to validate the input and resolve expressions and data types (e.g. from a string representation).

The new design reflects the reality better. A catalog implementation does not need to be symmetric. It follows the principle:


- "Resolved" into the catalog (with all information if implementers need it)

- "Unresolved" out of the catalog (let the framework deal with the resolution, also with cross references to other catalogs)

Use ResolvedCatalogTable#toProperties for putting basic info into your database.


Use CatalogTable#fromProperties to restore the table.

This is especially important for expression resolution of computed columns and watermark strategies. Functions could come from other catalogs as well.

So for implementers it is usually not important to resolve the `CatalogTable` manually.
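
To illustrate the asymmetry, here is a minimal, self-contained sketch of the "resolved in, unresolved out" round trip. It is a toy model only: `UnresolvedTable`, `ResolvedTable`, `InMemoryCatalog`, `resolve` and `KNOWN_TYPES` are hypothetical stand-ins and not Flink classes. They only mirror the roles that ResolvedCatalogTable#toProperties and CatalogTable#fromProperties play, with a map standing in for the database.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model of the "resolved in, unresolved out" contract.
// Every type here is a hypothetical stand-in, not a Flink class.
final class CatalogRoundTripSketch {

    // Stand-in for the data types the framework knows how to validate.
    static final Set<String> KNOWN_TYPES = Set.of("INT", "STRING", "TIMESTAMP(3)");

    // Unresolved table: column types are still plain, unvalidated strings.
    static final class UnresolvedTable {
        final Map<String, String> columns;

        UnresolvedTable(Map<String, String> columns) {
            this.columns = new LinkedHashMap<>(columns);
        }

        // Plays the role of CatalogTable#fromProperties: no validation here.
        static UnresolvedTable fromProperties(Map<String, String> properties) {
            return new UnresolvedTable(properties);
        }
    }

    // Resolved table: only resolve() can create one, so it is always validated.
    static final class ResolvedTable {
        final Map<String, String> columns;

        private ResolvedTable(Map<String, String> columns) {
            this.columns = columns;
        }

        // Plays the role of ResolvedCatalogTable#toProperties.
        Map<String, String> toProperties() {
            return new LinkedHashMap<>(columns);
        }
    }

    // Framework side: validates every column type before producing a resolved table.
    static ResolvedTable resolve(UnresolvedTable table) {
        for (Map.Entry<String, String> column : table.columns.entrySet()) {
            if (!KNOWN_TYPES.contains(column.getValue())) {
                throw new IllegalArgumentException(
                        "Unknown type '" + column.getValue() + "' for column " + column.getKey());
            }
        }
        return new ResolvedTable(new LinkedHashMap<>(table.columns));
    }

    // Catalog side: a map stands in for the database.
    static final class InMemoryCatalog {
        private final Map<String, Map<String, String>> database = new LinkedHashMap<>();

        // Into the catalog: only resolved tables are accepted.
        void createTable(String name, ResolvedTable table) {
            database.put(name, table.toProperties());
        }

        // Out of the catalog: unresolved, so the framework resolves it again.
        UnresolvedTable getTable(String name) {
            return UnresolvedTable.fromProperties(database.get(name));
        }
    }
}
```

In a real catalog the two methods named above would take the place of the toy `toProperties` and `fromProperties`, and the framework would do the resolution step; the exact signatures should be checked against the Flink version in use.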


If it is important for you, maybe you can elaborate a bit on your use case?

Regards,
Timo


On 26.01.22 12:18, Balázs Varga wrote:

Hi everyone,
I'm trying to migrate from the old set of CatalogTable related APIs (CatalogTableImpl, TableSchema, DescriptorProperties) to the new ones (CatalogBaseTable, Schema and ResolvedSchema, CatalogPropertiesUtil), in a custom catalog.
The catalog stores table definitions, and the current logic involves persisting the schema from a CatalogBaseTable to a database. When we get a table, its definition is read from the database and the CatalogTable is built up and returned.
For this, we currently serialize the schema like this:
descriptorProperties.putTableSchema(Schema.SCHEMA, catalogBaseTable.getSchema());
The new API seems to intentionally only allow the serialization of the Resolved version of objects (e.g. ResolvedCatalogTable, ResolvedSchema).
1. Could you please clarify why this limitation was put into place? It seems to me that it would be sufficient to resolve the CatalogTables once we are actually trying to pass the table to the DynamicTableFactory.
2. What additional information is gained during the resolution of a CatalogTable, and where does that information come from? Are there some references to things in other catalogs?
3. Is it possible to "manually" resolve a CatalogTable? (invoke something like what the internal DefaultSchemaResolver does). What context is required?
Thanks,
Balazs
