Hi, Yuxia
Thank you for your reply.
We can tell whether a CatalogTable supports atomic CTAS by checking its
type in DynamicTableFactory/DynamicTableSink, for example:
boolean isAtomicCtas = context.getCatalogTable().getOrigin() instanceof TwoPhaseCatalogTable;
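
As a rough, self-contained illustration of that check, here is a minimal sketch. It does not use the real Flink classes: CatalogBaseTable, SketchResolvedTable, SketchContext, the stagingTableName() accessor and the resolveWriteTarget() helper are all hypothetical stand-ins that I made up so the example compiles on its own. Only the instanceof test on getOrigin() mirrors the line above.

```java
// Hypothetical stand-ins so the sketch compiles without Flink on the classpath.
// They only mirror names used in this thread and are NOT the real Flink API.
interface CatalogBaseTable {}

// Stand-in for the TwoPhaseCatalogTable proposed in FLIP-305.
interface TwoPhaseCatalogTable extends CatalogBaseTable {
    String stagingTableName(); // hypothetical accessor, not taken from the FLIP
}

// Stand-in for the resolved table returned by context.getCatalogTable().
final class SketchResolvedTable {
    private final CatalogBaseTable origin;
    private final String tableName;

    SketchResolvedTable(CatalogBaseTable origin, String tableName) {
        this.origin = origin;
        this.tableName = tableName;
    }

    CatalogBaseTable getOrigin() { return origin; }

    String getTableName() { return tableName; }
}

// Stand-in for DynamicTableFactory.Context.
final class SketchContext {
    private final SketchResolvedTable table;

    SketchContext(SketchResolvedTable table) { this.table = table; }

    SketchResolvedTable getCatalogTable() { return table; }
}

final class AtomicCtasSketch {
    // Same check as in the mail: the origin's type tells us whether this is atomic CTAS.
    static boolean isAtomicCtas(SketchContext context) {
        return context.getCatalogTable().getOrigin() instanceof TwoPhaseCatalogTable;
    }

    // Hypothetical helper: pick the table the sink should write to.
    static String resolveWriteTarget(SketchContext context) {
        CatalogBaseTable origin = context.getCatalogTable().getOrigin();
        if (origin instanceof TwoPhaseCatalogTable) {
            // Atomic CTAS: write to the staging (temp) table first.
            return ((TwoPhaseCatalogTable) origin).stagingTableName();
        }
        // Normal case: write directly to the declared table.
        return context.getCatalogTable().getTableName();
    }

    public static void main(String[] args) {
        CatalogBaseTable plainOrigin = new CatalogBaseTable() {};
        TwoPhaseCatalogTable stagedOrigin = () -> "orders_tmp";

        SketchContext normal = new SketchContext(new SketchResolvedTable(plainOrigin, "orders"));
        SketchContext atomic = new SketchContext(new SketchResolvedTable(stagedOrigin, "orders"));

        System.out.println(isAtomicCtas(normal) + " -> " + resolveWriteTarget(normal));
        System.out.println(isAtomicCtas(atomic) + " -> " + resolveWriteTarget(atomic));
    }
}
```

In a real connector the branch would live in the factory or sink, and the information needed for the temp table would come from whatever the TwoPhaseCatalogTable implementation actually exposes.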
I've also updated the FLIP.
This is my PoC commit:
https://github.com/Tartarus0zm/flink/commit/ca82b6a816491df5a251b410f4c614436402d2dc
Looking forward to more feedback.

--

Best regards,
Mang Zhang

At 2023-04-14 19:46:08, "yuxia" <luoyu...@alumni.sjtu.edu.cn> wrote:
>Hi, Mang.
>+1 for completing the support for atomicity of CTAS. This is very useful in
>batch scenarios and when integrating with data lakes that support transactions.
>
>I just have one question. IIUC, the DynamicTableSink will need to know whether
>it is handling the normal case or atomic CTAS, along with the necessary context.
>Take the JDBC catalog as an example: if it is CTAS with atomicity support, the
>JDBC DynamicTableSink will write to the temp table defined in the
>TwoPhaseCatalogTable, which is different from the normal case.
>
>How can the DynamicTableSink get this information? Could you give some
>explanation or an example in the FLIP?
>
>
>Best regards,
>Yuxia
>
>----- Original Message -----
>From: "zhangmang1" <zhangma...@163.com>
>To: "dev" <dev@flink.apache.org>, "ron9 liu" <ron9....@gmail.com>, "lincoln 
>86xy" <lincoln.8...@gmail.com>
>Sent: Friday, April 14, 2023 2:50:40 PM
>Subject: Re:Re: [DISCUSS] FLIP-305: Support atomic for CREATE TABLE AS SELECT(CTAS) 
>statement
>
>Hi, Lincoln and Ron
>
>Thank you for your reply.
>I agree with the naming suggestions; they will keep future extensions more
>uniform. I have updated the FLIP.
>
>Regarding atomic CTAS support for Hive: Hive has a rich set of usage
>scenarios, which can be divided into three cases: 1. writing Hive tables
>2. writing Hive tables with speculative execution 3. writing Hive tables
>with small-file merging
>
>The main purpose of FLIP-305 is to implement support for CTAS atomicity in the
>Flink framework,
>so my PoC only verifies the first scenario, writing to a Hive table, and
>we can later split out sub-tasks to support the other two scenarios.
>
>--
>
>Best regards,
>Mang Zhang
>
>At 2023-04-13 12:27:24, "Lincoln Lee" <lincoln.8...@gmail.com> wrote:
>>Hi, Mang
>>
>>+1 for completing the support for atomicity of CTAS, this is very useful in
>>batch scenarios.
>>
>>I have two questions:
>>1. naming wise:
>>  a) can we rename the `Catalog#getTwoPhaseCommitCreateTable` to
>>`Catalog#twoPhaseCreateTable` (and we may add
>>twoPhaseReplaceTable/twoPhaseCreateOrReplaceTable later)
>>  b) for the `TwoPhaseCommitCatalogTable`, would it be better to use
>>`TwoPhaseCatalogTable`?
>>  c) `TwoPhaseCommitCatalogTable#beginTransaction`: the word 'transaction'
>>in the method name may suggest transaction support to users
>>(however, it is not strictly so), so I suggest changing it to
>>`begin`
>>2. Has this design been validated by any relevant PoC on Hive or other
>>catalogs?
>>
>>Best,
>>Lincoln Lee
>>
>>
>>On Thu, Apr 13, 2023 at 10:17, liu ron <ron9....@gmail.com> wrote:
>>
>>> Hi, Mang
>>> Atomicity is very important for CTAS, especially for batch jobs. This FLIP
>>> is a continuation of FLIP-218, which is valuable for CTAS.
>>> I just have one question. In the Motivation part of FLIP-218, we mentioned
>>> three levels of atomicity semantics. Can the current design do the same as
>>> Spark's DataSource V2, which can guarantee both atomicity and isolation?
>>> For example, can this be achieved when writing to Hive tables using CTAS?
>>>
>>> Best,
>>> Ron
>>>
>>> On Mon, Apr 10, 2023 at 11:03, Mang Zhang <zhangma...@163.com> wrote:
>>>
>>> > Hi, everyone
>>> >
>>> > I'd like to start a discussion about FLIP-305: Support atomic for CREATE
>>> > TABLE AS SELECT(CTAS) statement [1].
>>> >
>>> > The CREATE TABLE AS SELECT (CTAS) statement is already supported, but it
>>> > is not atomic. The table is created before the job runs. If the job
>>> > execution fails or is cancelled, the table is not dropped.
>>> >
>>> > So I want Flink to support atomic CTAS, where the table is created only
>>> > when the job succeeds. This would improve the user experience.
>>> >
>>> > Looking forward to your feedback.
>>> >
>>> > [1]
>>> >
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-305%3A+Support+atomic+for+CREATE+TABLE+AS+SELECT%28CTAS%29+statement
>>> >
>>> > --
>>> >
>>> > Best regards,
>>> > Mang Zhang
>>>
