[ 
https://issues.apache.org/jira/browse/FLINK-9813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545012#comment-16545012
 ] 

Fabian Hueske commented on FLINK-9813:
--------------------------------------

Hi [~flacombe],

I am not sure if I understand the proposal correctly.

IMO, CSV and Avro are two different data formats and serialization schemas. CSV 
stores rows with a flat schema as plain text by separating values by commas 
(although our {{CsvTableSource}} also supports different delimiters). Avro 
supports nested structures and serializes rows in a binary format. Hence, I 
don't see how we could build a {{CsvTableSource}} that supports Avro.
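
To illustrate the mismatch, here is a minimal plain-Java sketch (no Flink or 
Avro classes involved). The class {{FlatVersusNestedSketch}}, its methods, and 
the dotted column naming are assumptions made up for this example. It only 
shows that a nested record has to be flattened by some convention before it 
fits into a single CSV row.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration only: not part of Flink or Avro.
final class FlatVersusNestedSketch {

    /** Flattens nested maps into dotted column names, e.g. "address.city". */
    static Map<String, Object> flatten(String prefix, Map<String, Object> record) {
        Map<String, Object> flat = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : record.entrySet()) {
            String name = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            if (entry.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) entry.getValue();
                flat.putAll(flatten(name, nested));
            } else {
                flat.put(name, entry.getValue());
            }
        }
        return flat;
    }

    /** Joins the flattened values with commas (no quoting or escaping). */
    static String toCsvLine(Map<String, Object> flat) {
        StringJoiner line = new StringJoiner(",");
        for (Object value : flat.values()) {
            line.add(String.valueOf(value));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zip", "75001");

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 1);
        record.put("address", address); // nested structure, fine for Avro

        Map<String, Object> flat = flatten("", record);
        System.out.println(flat.keySet());   // column names after flattening
        System.out.println(toCsvLine(flat)); // one flat CSV row
    }
}
```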

Are you suggesting a {{TableSource}} that reads Avro files?
Btw., we are currently separating connectors (file system, Kafka, Kinesis) 
from formats (Avro, CSV, JSON, ORC, Parquet) to make them easier to combine, 
e.g., to support Avro files by combining the file system connector with the 
Avro format (see FLINK-8558).
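
The idea behind that separation can be sketched in plain Java. This is only an 
illustration under assumptions: {{Connector}}, {{Format}}, {{readTable}}, and 
the helper factories are hypothetical names, not the interfaces being 
developed in FLINK-8558. The connector only fetches raw records, the format 
only decodes them, and any connector can be paired with any format.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these types are Flink classes.
final class ConnectorFormatSketch {

    /** A connector only knows how to fetch raw records from a storage system. */
    interface Connector {
        List<byte[]> readRecords();
    }

    /** A format only knows how to decode one raw record into a row of fields. */
    interface Format {
        List<String> decode(byte[] record);
    }

    /** Pairs an arbitrary connector with an arbitrary format. */
    static List<List<String>> readTable(Connector connector, Format format) {
        List<List<String>> rows = new ArrayList<>();
        for (byte[] record : connector.readRecords()) {
            rows.add(format.decode(record));
        }
        return rows;
    }

    /** Stand-in for a file system connector: serves in-memory lines as bytes. */
    static Connector inMemoryLines(String... lines) {
        return () -> {
            List<byte[]> records = new ArrayList<>();
            for (String line : lines) {
                records.add(line.getBytes(StandardCharsets.UTF_8));
            }
            return records;
        };
    }

    /** Naive delimiter-separated text format (no quoting or escaping). */
    static Format delimited(String delimiter) {
        return record -> Arrays.asList(
                new String(record, StandardCharsets.UTF_8)
                        .split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        // Swapping the format (or the connector) leaves the other side untouched.
        List<List<String>> rows =
                readTable(inMemoryLines("1,alice", "2,bob"), delimited(","));
        System.out.println(rows);
    }
}
```

With such a split, reading Avro files would mean plugging a binary Avro 
decoder into the {{Format}} slot while reusing the same file system connector.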

Best, Fabian


> Build xTableSource from Avro schemas
> ------------------------------------
>
>                 Key: FLINK-9813
>                 URL: https://issues.apache.org/jira/browse/FLINK-9813
>             Project: Flink
>          Issue Type: Wish
>          Components: Table API & SQL
>    Affects Versions: 1.5.0
>            Reporter: François Lacombe
>            Priority: Trivial
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> As Avro provides an efficient formalism for data schemas, it would be great 
> to be able to build Flink table sources from such schema files.
> More info about Avro schemas: 
> [https://avro.apache.org/docs/1.8.1/spec.html#schemas]
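> For instance, with CsvTableSource:
> Schema.Parser schemaParser = new Schema.Parser();
> // parse(String) expects the schema JSON text itself, so read the file via parse(File)
> Schema tableSchema = schemaParser.parse(new File("avro.json"));
> // proposed API: a schema(Schema) method on the CsvTableSource builder
> CsvTableSource.Builder bld = CsvTableSource.builder().schema(tableSchema);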
>  
> This would give me a fully usable CsvTableSource with the columns defined in 
> avro.json.
> It may be possible to do the same for every TableSource, since the Avro 
> format is really common and versatile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
