Hi Jun,

Sorry for the late reply. I shared my thoughts on StreamingFileSink in FLIP-63 
[1], and I don't recommend using StreamingFileSink to support partitioning in 
Table:
1. The bucket concept in StreamingFileSink and SQL's bucket concept [2] are in serious conflict.
2. In Table, we need to support single-partition writing, grouped multi-partition writing, and non-grouped multi-partition writing (see the sketch after this list).
3. We need a global role to commit files to the metastore.
4. We need an abstraction that supports both streaming and batch mode.
5. Table partitioning is simpler than StreamingFileSink: partitions can only reference table fields, rather than being as flexible as the runtime's bucketing.
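
To make point 2 concrete, here is a rough sketch of what single-partition (static) and multi-partition (dynamic) writing could look like, roughly following the Hive-style INSERT ... PARTITION syntax discussed in FLIP-63. USER_T is the table from the DDL below; source_t and its column names are only illustrative:

-- single-partition (static) writing: every row goes into one known partition
INSERT INTO USER_T PARTITION (`date` = '2019-09-22', country = 'US')
SELECT a, b, c FROM source_t;

-- multi-partition (dynamic) writing: the partition values come from the data,
-- so rows may spread across many partitions (grouped or non-grouped at runtime);
-- the last two selected columns map to the partition columns `date` and country
INSERT INTO USER_T
SELECT a, b, c, log_date, user_country FROM source_t;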

The DDL could look like this:
CREATE TABLE USER_T (
  a INT,
  b STRING,
  c DOUBLE
) PARTITIONED BY (`date` STRING, country STRING)
WITH (
  'connector.type' = 'filesystem',
  'connector.path' = 'hdfs:///tmp/xxx',
  'format.type' = 'csv',
  'update-mode' = 'append',
  'partition-support' = 'true'
)
In the SQL world, we only support row inputs. 
The only difference from the previous FileSystem connector is the new required 
partition-support attribute. We can use this flag to indicate that the new 
connector supports partitions without changing the previous connector. 
All other attributes stay the same, and we can add Parquet, ORC and other 
formats incrementally later.
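
As a sketch of the expected result on the file system (the file names below are just an assumption; I assume the usual Hive-style key=value directory layout derived from connector.path plus the partition columns):

hdfs:///tmp/xxx/date=2019-09-22/country=US/part-0-0.csv
hdfs:///tmp/xxx/date=2019-09-22/country=CN/part-0-1.csv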

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-63-Rework-table-partition-support-td32770.html
[2] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL+BucketedTables


------------------------------------------------------------------
From: Jun Zhang <825875...@qq.com>
Send Time: Sunday, September 22, 2019, 23:18
To: dev <dev@flink.apache.org>
Cc: Kurt Young <ykt...@gmail.com>; fhue...@gmail.com <fhue...@gmail.com>
Subject: [DISCUSS] Add Bucket File System Connector

Hi everyone,

In the current Flink system, using Flink SQL to read data and then write it to 
the file system in various formats is not supported; the current File System 
Connector is only experimental [1], so I have developed a new File System 
Connector.

Thanks to the suggestions of Kurt and Fabian, I carefully studied the design 
document of FLIP-63, redesigned this feature, enriched the functionality of 
the existing File System Connector, and added partition support. Users can add 
this File System Connector through code or DDL, and then use Flink SQL to 
write data to the file system.

We can treat it as a sub-task of FLIP-63. I wrote a design document and put it 
in Google Docs [2].

I hope everyone will give me some more suggestions, thank you very much.

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/connect.html#file-system-connector
[2] https://docs.google.com/document/d/1R5K_tKgy1MhqhQmolGD_hKnEAKglfeHRDa2f4tB-xew/edit?usp=sharing
