Hi Balazs,

You are right, the new APIs only allow the serialization of resolved instances. This ensures that only validated, correct instances are put into persistent storage such as a database. The framework always provides resolved instances and calls the corresponding methods with those. Resolved instances should be easy to serialize.

However, when reading from persistent storage such as a database, the framework needs to validate the input and resolve expressions and data types (e.g. from a string representation).

The new design reflects the reality better. A catalog implementation does not need to be symmetric. It follows the principle:

- "Resolved" into the catalog (with all information if implementers need it)
- "Unresolved" out of the catalog (let the framework deal with the resolution, also with cross references to other catalogs)


Use ResolvedCatalogTable#toProperties for putting basic info into your database.

Use CatalogTable#fromProperties to restore the table.
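
To make the asymmetry concrete, here is a minimal sketch of the "resolved in, unresolved out" pattern. It deliberately does not depend on Flink: `UnresolvedTable`, `ResolvedTable` and `InMemoryCatalog` are hypothetical stand-ins, a `HashMap` plays the role of the database, and the property keys are only illustrative. The two methods marked in the comments play the roles of `ResolvedCatalogTable#toProperties` and `CatalogTable#fromProperties`.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in types only: none of these classes are Flink's real API.
class CatalogSketch {

    /** Unresolved table: data types and expressions are still plain strings. */
    static final class UnresolvedTable {
        final Map<String, String> properties;

        private UnresolvedTable(Map<String, String> properties) {
            this.properties = properties;
        }

        // Plays the role of CatalogTable#fromProperties.
        static UnresolvedTable fromProperties(Map<String, String> properties) {
            return new UnresolvedTable(new HashMap<>(properties));
        }
    }

    /** Resolved table: assumed to have been validated by the framework already. */
    static final class ResolvedTable {
        private final Map<String, String> validated;

        ResolvedTable(Map<String, String> validated) {
            this.validated = new HashMap<>(validated);
        }

        // Plays the role of ResolvedCatalogTable#toProperties.
        Map<String, String> toProperties() {
            return new HashMap<>(validated);
        }
    }

    /** Hypothetical catalog; the HashMap stands in for the database. */
    static final class InMemoryCatalog {
        private final Map<String, Map<String, String>> database = new HashMap<>();

        // "Resolved" into the catalog: persist only the serialized properties.
        void createTable(String name, ResolvedTable table) {
            database.put(name, table.toProperties());
        }

        // "Unresolved" out of the catalog: the caller (the framework) resolves it.
        UnresolvedTable getTable(String name) {
            Map<String, String> stored = database.get(name);
            if (stored == null) {
                throw new IllegalArgumentException("Unknown table: " + name);
            }
            return UnresolvedTable.fromProperties(stored);
        }
    }

    public static void main(String[] args) {
        // Illustrative keys, not necessarily the ones Flink writes.
        Map<String, String> props = new HashMap<>();
        props.put("schema.0.name", "ts");
        props.put("schema.0.data-type", "TIMESTAMP(3)");

        InMemoryCatalog catalog = new InMemoryCatalog();
        catalog.createTable("orders", new ResolvedTable(props));

        UnresolvedTable restored = catalog.getTable("orders");
        System.out.println(restored.properties.get("schema.0.data-type"));
    }
}
```

In a real catalog the same two steps would presumably sit in your createTable and getTable implementations, with the incoming table being the resolved instance the framework hands over.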

Letting the framework resolve the table is especially important for computed columns and watermark strategies, whose expressions may reference functions from other catalogs.

So for implementers it is usually not important to resolve the `CatalogTable` manually.

If it is important for you, maybe you can elaborate a bit on your use case?

Regards,
Timo


On 26.01.22 12:18, Balázs Varga wrote:
Hi everyone,

I'm trying to migrate from the old set of CatalogTable related APIs (CatalogTableImpl, TableSchema, DescriptorProperties) to the new ones (CatalogBaseTable, Schema and ResolvedSchema, CatalogPropertiesUtil), in a custom catalog.

The catalog stores table definitions, and the current logic involves persisting the schema from a CatalogBaseTable to a database. When we get a table, its definition is read from the database and the CatalogTable is built up and returned.

For this, we currently serialize the schema like this:
descriptorProperties.putTableSchema(Schema.SCHEMA, catalogBaseTable.getSchema());

The new API seems to intentionally only allow the serialization of the Resolved version of objects (e.g. ResolvedCatalogTable, ResolvedSchema).

1. Could you please clarify why this limitation was put into place? It seems to me that it would be sufficient to resolve the CatalogTables once we are actually trying to pass the table to the DynamicTableFactory.

2. What additional information is gained during the resolution of a CatalogTable, and where does that information come from? Are there some references to things in other catalogs?

3. Is it possible to "manually" resolve a CatalogTable? (invoke something like what the internal DefaultSchemaResolver does). What context is required?

Thanks,
Balazs

