Re: [DISCUSS] FLIP-302: Support TRUNCATE TABLE statement

2023-04-07 Thread Jing Ge
Hi yuxia, Thanks for raising this topic. It is indeed a useful feature. +1 for having it in Flink. I have some small questions and it would be great if related information could be described in the FLIP. 1. Speaking of data warehouse use cases, what is the benefit of using TRUNCATE table ove

Re: [DISCUSS] EXACTLY_ONCE delivery semantics for upsert-kafka connector

2023-04-07 Thread Alexander Sorokoumov
Hi Jark, To my knowledge, Kafka's EXACTLY_ONCE transactions together with idempotent producers prevent duplicated records[1], at least in the cases when upstream does not produce them intentionally and across checkpoints. Could you please elaborate or point me to the docs that explain the reason

[jira] [Created] (FLINK-31755) ROW function can not work with RewriteIntersectAllRule

2023-04-07 Thread Aitozi (Jira)
Aitozi created FLINK-31755: Summary: ROW function can not work with RewriteIntersectAllRule Key: FLINK-31755 URL: https://issues.apache.org/jira/browse/FLINK-31755 Project: Flink Issue Type: Bug

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Jing Ge
Hi, Jingsong, Yanfei, please check whether you can view the doc. Thanks. Best regards, Jing On Fri, Apr 7, 2023 at 2:19 PM Zakelly Lan wrote: > Hi Yanfei, > > Thanks for your comments. > > > Does this result in a larger space amplification? Maybe a more > suitable value can be determined through s

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Zakelly Lan
Hi Yanfei, Thanks for your comments. > Does this result in a larger space amplification? Maybe a more suitable value can be determined through some experimental statistics after we implement this feature. Yes, it results in larger space amplification for shared states. I will do more tests and i

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Zakelly Lan
Hi @Piotr and @Jingsong Li, I have read access to the document, but I'm not sure whether the owner of this document wants to make it public. Actually, the doc is for FLINK-23342 and there is a candidate design very similar to this FLIP, but only for the shared state. As Yun said, the previous des

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Zakelly Lan
Hi Piotr, Thanks for your comments! (1) Sorry for the confusion, let me make it clearer. It is a concurrent checkpoint scenario. Yes, the assumption you mentioned needs to be followed, but the state handles here refer to the original SST files, not the underlying file. In this FLIP when checkpoint

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Zakelly Lan
Hi Yun, Thanks for your suggestions! I have read FLINK-23342 and the design doc you provided. First of all, the goals of this FLIP and the doc are similar, and the design of this FLIP is pretty much like option 3. The main difference is that we imply the concept of 'epoch' in the folder path

[jira] [Created] (FLINK-31754) Build flink master error with Error in ASM processing class org/apache/calcite/sql/validate/SqlValidatorImpl$NavigationExpander.class: 19

2023-04-07 Thread zhenlong dong (Jira)
zhenlong dong created FLINK-31754: Summary: Build flink master error with Error in ASM processing class org/apache/calcite/sql/validate/SqlValidatorImpl$NavigationExpander.class: 19 Key: FLINK-31754 URL: https:/

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Yanfei Lei
Thanks for your explanation Zakelly. (1) Keeping these merging granularities for different types of files as presets that are not configurable is a good idea to prevent performance degradation. (2) > For the third option, 64MB is an acceptable target size. The RocksDB state > backend in Flink als

Re: [DISCUSS] EXACTLY_ONCE delivery semantics for upsert-kafka connector

2023-04-07 Thread Jark Wu
Hi Alexander, I’m not sure I fully understand the reasons. I left my comments inline. > 1. There might be other non-Flink topic consumers that would rather not have duplicated records. Exactly once can’t avoid producing duplicated records. Because the upstream produces duplicated records intent

[jira] [Created] (FLINK-31753) Support DataStream CoGroup in stream Mode with similar performance as DataSet CoGroup

2023-04-07 Thread Dong Lin (Jira)
Dong Lin created FLINK-31753: Summary: Support DataStream CoGroup in stream Mode with similar performance as DataSet CoGroup Key: FLINK-31753 URL: https://issues.apache.org/jira/browse/FLINK-31753 Project

Re: [External] [DISCUSS] FLIP-292: Support configuring state TTL at operator level for Table API & SQL programs

2023-04-07 Thread Jane Chan
Hi, devs, Thanks for all the feedback. Based on the discussion [1], we seem to have a consensus so far, so I would like to start a vote on FLIP-292 [2], which begins on the following Monday (Apr. 10th at 10:00 AM GMT). If you have any questions or concerns, please don't hesitate to follow up on

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Jingsong Li
Hi Yun, It looks like this doc needs permission to read? [1] [1] https://docs.google.com/document/d/1NJJQ30P27BmUvD7oa4FChvkYxMEgjRPTVdO1dHLl_9I/edit# Best, Jingsong On Fri, Apr 7, 2023 at 4:34 PM Piotr Nowojski wrote: > > Hi, > > +1 To what Yun Tang wrote. We don't seem to have access to the

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-07 Thread Piotr Nowojski
Hi, +1 to what Yun Tang wrote. We don't seem to have access to the design doc. Could you make it publicly visible or copy out its content to another document? Thanks for your answers, Zakelly. (1) Yes, the current mechanism introduced in FLINK-24611 allows checkpoint N to only re-use shared

[jira] [Created] (FLINK-31752) SourceOperatorStreamTask increments numRecordsOut twice

2023-04-07 Thread Weihua Hu (Jira)
Weihua Hu created FLINK-31752: Summary: SourceOperatorStreamTask increments numRecordsOut twice Key: FLINK-31752 URL: https://issues.apache.org/jira/browse/FLINK-31752 Project: Flink Issue Type:

[jira] [Created] (FLINK-31751) array return type SpecificTypeStrategies.ARRAY and ifThenElse return type is not correct

2023-04-07 Thread jackylau (Jira)
jackylau created FLINK-31751: Summary: array return type SpecificTypeStrategies.ARRAY and ifThenElse return type is not correct Key: FLINK-31751 URL: https://issues.apache.org/jira/browse/FLINK-31751 Proj

Re: [DISCUSS] FLIP 295: Support persistence of Catalog configuration and asynchronous registration

2023-04-07 Thread Feng Jin
Hi Shammon, thank you for your response, and I completely agree with your point of view. Initially, I may have overcomplicated the whole issue. First and foremost, we need to consider the persistence of the Catalog's Configuration. If we only need to provide persistence for Catalog Configuration,