[ https://issues.apache.org/jira/browse/FLINK-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296435#comment-16296435 ]
Haohui Mai commented on FLINK-8240:
-----------------------------------

It seems like a great use case for layered table sources / converters, so I'm not fully sure that all tables should be built using {{TableFactory}} yet.

Popping up one level, I have a related question -- assuming that we need to implement the {{CREATE EXTERNAL TABLE}} statement, what would the statement look like? Here is an example of Hive's {{CREATE EXTERNAL TABLE}} statement:

{code}
CREATE EXTERNAL TABLE weatherext (
  wban INT,
  date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/hive/data/weatherext';
{code}

It seems that the combination of {{ROW FORMAT}} and {{LOCATION}} is effectively the same as what you proposed -- but it does not seem to force all table sources to be aware of the composition of connector / converter (i.e., {{TableFactory}}), at least at the API level. Thoughts?

> Create unified interfaces to configure and instantiate TableSources
> -------------------------------------------------------------------
>
>                 Key: FLINK-8240
>                 URL: https://issues.apache.org/jira/browse/FLINK-8240
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>
> At the moment every table source has different ways for configuration and
> instantiation. Some table sources are tailored to a specific encoding (e.g.,
> {{KafkaAvroTableSource}}, {{KafkaJsonTableSource}}) or only support one
> encoding for reading (e.g., {{CsvTableSource}}). Each of them might implement
> a builder or support table source converters for external catalogs.
> The table sources should have a unified interface for discovery, defining
> common properties, and instantiation. The {{TableSourceConverters}} provide
> similar functionality but use an external catalog. We might generalize this
> interface.
> In general a table source declaration depends on the following parts:
> {code}
> - Source
>   - Type (e.g. Kafka, Custom)
>   - Properties (e.g. topic, connection info)
> - Encoding
>   - Type (e.g. Avro, JSON, CSV)
>   - Schema (e.g. Avro class, JSON field names/types)
> - Rowtime descriptor/Proctime
>   - Watermark strategy and Watermark properties
>   - Time attribute info
> - Bucketization
> {code}
> This issue needs a design document before implementation. Any discussion is
> very welcome.
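
For illustration, here is a minimal sketch of what a Flink counterpart to the Hive statement above could look like if the connector and the encoding were declared as flat string properties that a generic factory matches on. The {{WITH}} clause, the table name, and every property key below are assumptions made up for this discussion, not an agreed-upon syntax:

{code}
-- Hypothetical Flink DDL sketch: the WITH clause carries flat
-- "connector.*" (source) and "format.*" (encoding) properties;
-- all key names here are invented for illustration only.
CREATE EXTERNAL TABLE taxi_rides (
  rideId BIGINT,
  rideTime TIMESTAMP
) WITH (
  'connector.type' = 'kafka',                    -- source: type
  'connector.topic' = 'taxi-rides',              -- source: properties
  'connector.startup-mode' = 'earliest-offset',  -- source: properties
  'format.type' = 'json',                        -- encoding: type
  'format.derive-schema' = 'true'                -- encoding: schema
);
{code}

Under such a syntax, no table source would need to hard-code a connector/encoding pair the way {{KafkaJsonTableSource}} does today; the statement itself composes the two halves.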
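
The time-attribute parts of the list in the description ({{Rowtime descriptor/Proctime}}, watermark strategy and properties) could be flattened into the same property space. Again a sketch under the same assumed syntax; none of these keys exist anywhere yet:

{code}
-- Hypothetical: the rowtime.* keys flatten the "Rowtime descriptor /
-- Watermark strategy" parts of the list above; keys are illustrative.
CREATE EXTERNAL TABLE taxi_rides_with_rowtime (
  rideId BIGINT,
  rideTime TIMESTAMP
) WITH (
  'connector.type' = 'kafka',
  'connector.topic' = 'taxi-rides',
  'format.type' = 'json',
  'rowtime.timestamps.from' = 'rideTime',          -- time attribute info
  'rowtime.watermarks.type' = 'periodic-bounded',  -- watermark strategy
  'rowtime.watermarks.delay' = '60000'             -- watermark property (ms)
);
{code}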