[ https://issues.apache.org/jira/browse/FLINK-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296435#comment-16296435 ]
Haohui Mai commented on FLINK-8240:
-----------------------------------

It seems like a great use case for layered table sources / converters, so I'm not fully sure that all tables should be built using {{TableFactory}} yet.

Popping up one level, I have a related question -- assuming that we need to implement the {{CREATE EXTERNAL TABLE}} statement, what would the statement look like? Here is an example of Hive's {{CREATE EXTERNAL TABLE}} statement:

{code}
CREATE EXTERNAL TABLE weatherext (
  wban INT,
  date STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/hive/data/weatherext';
{code}

It seems that the combination of {{ROW FORMAT}} and {{LOCATION}} is effectively the same as what you proposed -- but it does not seem to force all table sources to be aware of the composition of connector / converter (i.e., {{TableFactory}}), at least at the API level. Thoughts?

> Create unified interfaces to configure and instantiate TableSources
> -------------------------------------------------------------------
>
>                 Key: FLINK-8240
>                 URL: https://issues.apache.org/jira/browse/FLINK-8240
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>
> At the moment every table source has different ways for configuration and
> instantiation. Some table sources are tailored to a specific encoding (e.g.,
> {{KafkaAvroTableSource}}, {{KafkaJsonTableSource}}) or only support one
> encoding for reading (e.g., {{CsvTableSource}}). Each of them might implement
> a builder or support table source converters for external catalogs.
> The table sources should have a unified interface for discovery, defining
> common properties, and instantiation. The {{TableSourceConverters}} provide
> similar functionality but use an external catalog. We might generalize this
> interface.
> In general a table source declaration depends on the following parts:
> {code}
> - Source
>   - Type (e.g. Kafka, Custom)
>   - Properties (e.g. topic, connection info)
> - Encoding
>   - Type (e.g. Avro, JSON, CSV)
>   - Schema (e.g. Avro class, JSON field names/types)
> - Rowtime descriptor/Proctime
>   - Watermark strategy and Watermark properties
>   - Time attribute info
> - Bucketization
> {code}
> This issue needs a design document before implementation. Any discussion is
> very welcome.
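
For illustration, here is a minimal sketch of what a Flink counterpart to the Hive statement above could look like if the connector and the encoding were declared as flat string properties that a generic factory matches on. The {{WITH}} clause, the table name, and every property key below are assumptions made up for this discussion, not an agreed-upon syntax:

{code}
-- Hypothetical Flink DDL sketch: the WITH clause carries flat
-- "connector.*" (source) and "format.*" (encoding) properties;
-- all key names here are invented for illustration only.
CREATE EXTERNAL TABLE taxi_rides (
  rideId BIGINT,
  rideTime TIMESTAMP
) WITH (
  'connector.type' = 'kafka',                    -- source: type
  'connector.topic' = 'taxi-rides',              -- source: properties
  'connector.startup-mode' = 'earliest-offset',  -- source: properties
  'format.type' = 'json',                        -- encoding: type
  'format.derive-schema' = 'true'                -- encoding: schema
);
{code}

Under such a syntax, no table source would need to hard-code a connector/encoding pair the way {{KafkaJsonTableSource}} does today; the statement itself composes the two halves.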
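
The time-attribute parts of the list in the description ({{Rowtime descriptor/Proctime}}, watermark strategy and properties) could be flattened into the same property space. Again a sketch under the same assumed syntax; none of these keys exist anywhere yet:

{code}
-- Hypothetical: the rowtime.* keys flatten the "Rowtime descriptor /
-- Watermark strategy" parts of the list above; keys are illustrative.
CREATE EXTERNAL TABLE taxi_rides_with_rowtime (
  rideId BIGINT,
  rideTime TIMESTAMP
) WITH (
  'connector.type' = 'kafka',
  'connector.topic' = 'taxi-rides',
  'format.type' = 'json',
  'rowtime.timestamps.from' = 'rideTime',          -- time attribute info
  'rowtime.watermarks.type' = 'periodic-bounded',  -- watermark strategy
  'rowtime.watermarks.delay' = '60000'             -- watermark property (ms)
);
{code}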