Hi Mang, I have two questions/remarks:
1. The FLIP mentions that if the user doesn't specify the WITH option part in the query of the sink table, it will be assumed that the user wants to create a managed table. What will happen if the user doesn't have Table Store configured/installed? Will we throw an error?
2. Will there be support included for FLIP-190 (version upgrades)?

Best regards,

Martijn

On Wed, 29 Jun 2022 at 05:18, Mang Zhang <zhangma...@163.com> wrote:

> Hi everyone,
> Thank you to all those who participated in the discussion. We have gone through many rounds, and the proposal has been gradually revised and improved. We look forward to further feedback and will launch a vote in the next day or two.
>
> --
>
> Best regards,
> Mang Zhang
>
> At 2022-06-28 22:23:16, "Mang Zhang" <zhangma...@163.com> wrote:
> >Hi Yuxia,
> >Thank you very much for your reply.
> >
> >>1: Also, the mixture of CTAS and RTAS confuses me, as the FLIP says nothing about RTAS but suddenly refers to it in the configuration. If we're not going to implement RTAS in this FLIP, it may be better not to refer to it, and `rtas` shouldn't be exposed to users as a configuration.
> >RTAS is not supported currently because its semantics are hard to unify between streaming and batch mode, and the specific business scenarios are not very clear. We will support it in the future; if we only renamed the option once RTAS is supported, users would bear the cost of changing their configuration.
> >>2: How will the CTASJobStatusHook be passed to StreamGraph as a hook? Could you please explain it? Some pseudocode would be even better, if possible. I'm lost in this part.
> >
> >This part is too much of an implementation detail, and of course we had to make some changes to achieve this. The FLIP focuses on semantic consistency between streaming and batch mode, and can provide optional atomicity support.
> >
> >>3: The name `AtomicCatalog` confuses me.
> >>It seems the background for the naming is to implement atomicity for CTAS: we propose an interface for catalogs to support serialization, and then name it `AtomicCatalog`. At least, the interface is for the atomicity of CTAS. But if we want to implement other features like isolation, which may also require a serializable catalog in the future, should we introduce a new interface named `IsolateCatalog`? Have you considered other names like `SerializableCatalog`? As it's a public interface, maybe we should be careful about the name.
> >Regarding the Catalog name, we also discussed `SerializableCatalog`, but it is too specific and does not relate to the atomic functionality we want to express. For CTAS/RTAS to support atomicity, the Catalog needs to implement `AtomicCatalog`, so that name is more straightforward to understand.
> >
> >Hope this answers your question.
> >
> >--
> >
> >Best regards,
> >Mang Zhang
> >
> >At 2022-06-28 11:36:51, "yuxia" <luoyu...@alumni.sjtu.edu.cn> wrote:
> >>Thanks for updating. The FLIP looks generally good to me. I have only minor questions:
> >>
> >>1: Also, the mixture of CTAS and RTAS confuses me, as the FLIP says nothing about RTAS but suddenly refers to it in the configuration. If we're not going to implement RTAS in this FLIP, it may be better not to refer to it, and `rtas` shouldn't be exposed to users as a configuration.
> >>
> >>2: How will the CTASJobStatusHook be passed to StreamGraph as a hook? Could you please explain it? Some pseudocode would be even better, if possible. I'm lost in this part.
> >>
> >>3: The name `AtomicCatalog` confuses me. It seems the background for the naming is to implement atomicity for CTAS: we propose an interface for catalogs to support serialization, and then name it `AtomicCatalog`. At least, the interface is for the atomicity of CTAS.
> >>But if we want to implement other features like isolation, which may also require a serializable catalog in the future, should we introduce a new interface named `IsolateCatalog`? Have you considered other names like `SerializableCatalog`? As it's a public interface, maybe we should be careful about the name.
> >>
> >>Best regards,
> >>Yuxia
> >>
> >>----- Original Message -----
> >>From: "Mang Zhang" <zhangma...@163.com>
> >>To: "dev" <dev@flink.apache.org>
> >>Cc: imj...@gmail.com
> >>Sent: Monday, 27 June 2022, 5:43:50 PM
> >>Subject: Re:Re: Re:Re: Re: Re: Re: [DISCUSS] FLIP-218: Support SELECT clause in CREATE TABLE(CTAS)
> >>
> >>Hi Jark,
> >>First of all, thank you for your very good advice!
> >>The RTAS point you mentioned is a good one, and we should support it as well.
> >>However, by investigating the semantics of RTAS and how RTAS is used within the company, I found that:
> >>1. The semantics of RTAS say that if the table exists, the old data must be deleted and replaced with the new data.
> >>This is easier to implement in batch mode; for example, if the target table is a Hive table, the old data files can be deleted directly.
> >>But in streaming mode, the target table is probably a Kafka topic, and we can't delete the data.
> >>So the semantics in streaming and batch scenarios cannot easily be guaranteed to be consistent.
> >>2. I checked the big data SQL used in the company over the last week and found that RTAS was not used.
> >>No users in the company have mentioned the need for RTAS yet, so this application scenario is not very clear.
> >>
> >>It is not clear what kind of semantics RTAS should provide in streaming mode, and the user's business scenarios are not very clear.
> >>Maybe we don't have to support RTAS soon, but we can leave the possibility of supporting RTAS in the future in the interface definition.
> >>What do you think? Looking forward to your response!
> >>
> >>By the way, the other points raised have been updated. Thanks.
> >>
> >>--
> >>
> >>Best regards,
> >>Mang Zhang
> >>
> >>At 2022-06-26 11:56:53, "Jark Wu" <imj...@gmail.com> wrote:
> >>>Thanks for the update, Mang and Ron,
> >>>
> >>>The new proposal looks good to me in general, especially keeping the behavior consistent between batch and streaming mode by default. This is how we do it in the previous "table.dml-sync" option on ML [1].
> >>>
> >>>Besides that, I just have some final minor comments regarding some interfaces.
> >>>
> >>>1) table.ctas-or-rtas.atomicity-enabled
> >>>The "OR" keyword sounds like this configuration can only take effect on one of CTAS and RTAS.
> >>>What about "table.ctas-and-rtas" or "table.ctas-rtas"?
> >>>
> >>>2) In the FLIP, you have mentioned RTAS many times, but have no plan to support it.
> >>>RTAS is another widely used statement similar to CTAS. It seems there is not much difference between CTAS and RTAS. Considering we are introducing RTAS configurations, is it possible to support RTAS in this FLIP as well?
> >>>
> >>>3) connector.type
> >>>"connector.type" has been deprecated since FLIP-95, could you replace them with 'connector'?
> >>>
> >>>4) SupportsAtomicCatalog
> >>>I have some concerns about using "Supports.." prefix which is known as the ability extension for DynamicTableSource and DynamicTableSink. Maybe "AtomicCatalog" is enough?
> >>>
> >>>Best,
> >>>Jark
> >>>
> >>>[1]: https://lists.apache.org/thread/78r8ybh4q3hkxf935vzjkb7782hqzcj2
> >>>
> >>>On Fri, 24 Jun 2022 at 22:51, Mang Zhang <zhangma...@163.com> wrote:
> >>>
> >>>> Hi all,
> >>>> Thank you to all those who participated in the discussion and made suggestions!
> >>>> After several rounds of online and offline discussions, the solution in FLIP has been updated.
> >>>> Looking forward to more feedback from everyone.
> >>>>
> >>>> --
> >>>>
> >>>> Best regards,
> >>>> Mang Zhang
> >>>>
> >>>> At 2022-06-24 21:58:01, "Mang Zhang" <zhangma...@163.com> wrote:
> >>>> >Hi godfrey and ron,
> >>>> >Thank you very much for your replies and suggestions.
> >>>> >Special thanks to ron for helping to review and improve the FLIP.
> >>>> >Looking forward to further feedback from others.
> >>>> >
> >>>> >--
> >>>> >
> >>>> >Best regards,
> >>>> >Mang Zhang
> >>>> >
> >>>> >At 2022-06-24 19:52:58, "ron" <ld...@zju.edu.cn> wrote:
> >>>> >>Thanks, Godfrey, for the further feedback. Your suggestions look very good to me, and the FLIP has been updated accordingly. It would be great if you could take another look.
> >>>> >>
> >>>> >>Also looking forward to further feedback from others.
> >>>> >>
> >>>> >>> ----- Original Message -----
> >>>> >>> From: "godfrey he" <godfre...@gmail.com>
> >>>> >>> Sent: 2022-06-24 17:00:51 (Friday)
> >>>> >>> To: dev <dev@flink.apache.org>
> >>>> >>> Cc: "Yun Gao" <yungao...@aliyun.com>
> >>>> >>> Subject: Re: Re: Re: [DISCUSS] FLIP-218: Support SELECT clause in CREATE TABLE(CTAS)
> >>>> >>>
> >>>> >>> Hi all,
> >>>> >>>
> >>>> >>> Sorry for the late reply.
> >>>> >>>
> >>>> >>> >table.cor-table-as-select.atomicity-enabled
> >>>> >>> Regarding `cor`, this abbreviation is not commonly used.
> >>>> >>>
> >>>> >>> >Create Table As Select(CTAS) feature depends on the serializability of the catalog. To quickly see if the catalog supports CTAS, we need to try to serialize the catalog when compiling SQL in the planner, and if it fails, an exception will be thrown to indicate to the user that the catalog does not support CTAS because it cannot be serialized.
> >>>> >>> This behavior is too cryptic, and will break the current catalog behavior when using 1.16.
> >>>> >>> I suggest we introduce a new interface for atomic catalogs which implements Serializable.
> >>>> >>> The existing catalogs can choose whether to implement the new catalog interface.
> >>>> >>>
> >>>> >>> > Catalog#inferTableOptions
> >>>> >>> I strongly recommend not introducing this feature now, because the behavior is unclear.
> >>>> >>> 1) If the catalog supports managed tables, the connector option is empty. But if the user forgets to set the connector option for a CTAS statement, the created table will be a managed table.
> >>>> >>> 2) The options and their values for the catalog and for the connector may be different, so using the catalog options may cause unexpected errors.
> >>>> >>>
> >>>> >>> > StreamGraph#addJobStatusHook
> >>>> >>> I prefer `registerJobStatusHook`
> >>>> >>>
> >>>> >>> Best,
> >>>> >>> Godfrey
> >>>> >>>
> >>>> >>> On Mon, 13 Jun 2022 at 16:43, Mang Zhang <zhangma...@163.com> wrote:
> >>>> >>> >
> >>>> >>> > Hi Yun,
> >>>> >>> > Thanks for your reply!
> >>>> >>> > After offline communication with Dalong, I added the JobStatusHook part to the FLIP. Looking forward to your feedback.
> >>>> >>> >
> >>>> >>> > --
> >>>> >>> >
> >>>> >>> > Best regards,
> >>>> >>> > Mang Zhang
> >>>> >>> >
> >>>> >>> > At 2022-05-31 14:34:25, "Yun Gao" <yungao...@aliyun.com.INVALID> wrote:
> >>>> >>> > >Hi,
> >>>> >>> > >
> >>>> >>> > >Regarding the drop operation, after some offline discussion with Dalong and Zhu, we think that listening on the client side might be problematic, since the client would exit after submitting the jobs in detached mode; thus the operation might need to be on the JobMaster side.
> >>>> >>> > >
> >>>> >>> > >For the listener interface, currently JobListener only resides in the client side and contains unsuitable methods like onJobSubmitted for this scenario, and the internal JobStatusListener is designed to be used inside JM and is not serializable, thus we tend to add a new interface JobStatusHook, which could be attached to the JobGraph and executed in the JobMaster.
> >>>> >>> > >The interface will also be marked as Internal.
> >>>> >>> > >
> >>>> >>> > >Best,
> >>>> >>> > >Yun
> >>>> >>> > >
> >>>> >>> > >------------------------------------------------------------------
> >>>> >>> > >From:Mang Zhang <zhangma...@163.com>
> >>>> >>> > >Send Time:2022 May 25 (Wed.) 10:24
> >>>> >>> > >To:dev <dev@flink.apache.org>
> >>>> >>> > >Subject:Re:Re: [DISCUSS] FLIP-218: Support SELECT clause in CREATE TABLE(CTAS)
> >>>> >>> > >
> >>>> >>> > >Hi, Martijn
> >>>> >>> > >Thanks for your reply!
> >>>> >>> > >I looked at the SQL standard, CTAS is part of the SQL standard.
> >>>> >>> > >Feature T172 is "AS subquery clause in table definition".
> >>>> >>> > >
> >>>> >>> > >--
> >>>> >>> > >
> >>>> >>> > >Best regards,
> >>>> >>> > >Mang Zhang
> >>>> >>> > >
> >>>> >>> > >At 2022-05-04 21:49:00, "Martijn Visser" <martijnvis...@apache.org> wrote:
> >>>> >>> > >>Hi everyone,
> >>>> >>> > >>
> >>>> >>> > >>Can we identify if this proposed syntax is part of the SQL standard?
> >>>> >>> > >>
> >>>> >>> > >>Best regards,
> >>>> >>> > >>
> >>>> >>> > >>Martijn Visser
> >>>> >>> > >>https://twitter.com/MartijnVisser82
> >>>> >>> > >>https://github.com/MartijnVisser
> >>>> >>> > >>
> >>>> >>> > >>On Fri, 29 Apr 2022 at 11:19, yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> >>>> >>> > >>
> >>>> >>> > >>> Thanks for driving this work; it will be a useful feature.
> >>>> >>> > >>> About FLIP-218, I have some questions.
> >>>> >>> > >>>
> >>>> >>> > >>> 1: Does our CTAS syntax support specifying the target table's schema, including column names and data types? I think it may be a useful feature in case we want to change the data types in the target table instead of always copying the source table's schema. It'll be more flexible with this feature.
> >>>> >>> > >>> Btw, MySQL's "CREATE TABLE ... SELECT Statement"[1] supports this feature.
> >>>> >>> > >>>
> >>>> >>> > >>> 2: It seems this will require the sink to implement a public interface to drop the table, so what will the interface look like?
> >>>> >>> > >>>
> >>>> >>> > >>> [1] https://dev.mysql.com/doc/refman/8.0/en/create-table-select.html
> >>>> >>> > >>>
> >>>> >>> > >>> Best regards,
> >>>> >>> > >>> Yuxia
> >>>> >>> > >>>
> >>>> >>> > >>> ----- Original Message -----
> >>>> >>> > >>> From: "Mang Zhang" <zhangma...@163.com>
> >>>> >>> > >>> To: "dev" <dev@flink.apache.org>
> >>>> >>> > >>> Sent: Thursday, 28 April 2022, 4:57:24 PM
> >>>> >>> > >>> Subject: [DISCUSS] FLIP-218: Support SELECT clause in CREATE TABLE(CTAS)
> >>>> >>> > >>>
> >>>> >>> > >>> Hi, everyone
> >>>> >>> > >>>
> >>>> >>> > >>> I would like to open a discussion on supporting the SELECT clause in CREATE TABLE (CTAS).
> >>>> >>> > >>> With the development of business and the enhancement of Flink SQL capabilities, queries become more and more complex.
> >>>> >>> > >>> Now the user needs to use the Create Table statement to create the target table first, and then execute the insert statement.
> >>>> >>> > >>> However, the target table may have many columns, which brings the user a lot of work outside the business logic.
> >>>> >>> > >>> At the same time, the user must ensure that the schema of the created target table is consistent with the schema of the query result.
> >>>> >>> > >>> Using a CTAS syntax like Hive/Spark can make this much easier for the user.
> >>>> >>> > >>>
> >>>> >>> > >>> You can find more details in FLIP-218[1]. Looking forward to your feedback.
> >>>> >>> > >>>
> >>>> >>> > >>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-218%3A+Support+SELECT+clause+in+CREATE+TABLE(CTAS)
> >>>> >>> > >>>
> >>>> >>> > >>> --
> >>>> >>> > >>>
> >>>> >>> > >>> Best regards,
> >>>> >>> > >>> Mang Zhang
> >>>> >>
> >>>> >>------------------------------
> >>>> >>Best,
> >>>> >>Ron
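
For readers following the thread, the two ideas discussed above (Yun Gao's serializable hook that is attached to the job graph and executed in the JobMaster, and Godfrey's catalog interface that implements Serializable) can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration: `JobStatusHookSketch`, `InMemoryCatalogSketch`, `CtasHookSketch`, and `HookShipping` are hypothetical names, not Flink APIs, and the callback set is a guess rather than the FLIP's actual interface. The serialization round trip only stands in for shipping the hook from the client to the JobMaster.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

// Hypothetical hook interface, NOT a Flink API. It is Serializable so that it
// could be attached to a job graph on the client and run elsewhere.
interface JobStatusHookSketch extends Serializable {
    void onCreated();
    void onFinished();
    void onFailed(Throwable cause);
    void onCanceled();
}

// Hypothetical serializable catalog, standing in for the role the thread
// discusses under the name AtomicCatalog.
class InMemoryCatalogSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Set<String> tables = new HashSet<>();

    void createTable(String name) { tables.add(name); }
    void dropTable(String name) { tables.remove(name); }
    boolean tableExists(String name) { return tables.contains(name); }
}

// Hypothetical CTAS hook: creates the target table when the job is created
// and drops it again if the job fails or is canceled.
class CtasHookSketch implements JobStatusHookSketch {
    private static final long serialVersionUID = 1L;
    private final InMemoryCatalogSketch catalog;
    private final String tableName;

    CtasHookSketch(InMemoryCatalogSketch catalog, String tableName) {
        this.catalog = catalog;
        this.tableName = tableName;
    }

    @Override public void onCreated() { catalog.createTable(tableName); }
    @Override public void onFinished() { /* job succeeded: keep the table */ }
    @Override public void onFailed(Throwable cause) { catalog.dropTable(tableName); }
    @Override public void onCanceled() { catalog.dropTable(tableName); }

    InMemoryCatalogSketch catalog() { return catalog; }
}

// Simulates shipping an object to another process via Java serialization.
class HookShipping {
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            // A non-serializable catalog inside the hook would surface here.
            throw new IllegalStateException("value is not serializable", e);
        }
    }
}
```

In this sketch the hook carries the catalog with it, which is why the catalog itself has to be serializable: calling `HookShipping.roundTrip(new CtasHookSketch(catalog, "t"))` fails if the catalog is not. That is one plausible reading of why the thread ties atomic CTAS to a serializable catalog interface; the real design may differ.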