Re: [DISCUSS] FLIP-415: Introduce a new join operator to support minibatch

2024-01-11 Thread Jingsong Li
Hi all, This is a relatively large optimization that may pose a significant risk of bugs, so I would like to keep it from being enabled by default for now. Best, Jingsong On Fri, Jan 12, 2024 at 3:01 PM shuai xu wrote: > > Suppose we currently have a job that joins two CDC sources after > de-duplica

Re: Re: Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-11 Thread Kurt Yang
+1 (binding) Best, Kurt On Fri, Jan 12, 2024 at 2:21 PM Hequn Cheng wrote: > +1 (binding) > > Thanks, > Hequn > > On Fri, Jan 12, 2024 at 2:19 PM godfrey he wrote: > > > +1 (binding) > > > > Thanks, > > Godfrey > > > > Zhu Zhu wrote on Fri, Jan 12, 2024 at 14:10: > > > > > > +1 (binding) > > > > > > Thank

Re: [DISCUSS] FLIP-415: Introduce a new join operator to support minibatch

2024-01-11 Thread shuai xu
Suppose we currently have a job that joins two CDC sources after de-duplicating them and the output is available for audit analysis, and the user turns off the parameter "table.exec.deduplicate.mini-batch.compact-changes-enabled" to ensure that it does not lose update details. If we don't introd
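To make the concern about losing update details concrete, here is a rough sketch of what folding changes inside one mini-batch can look like. It is only a toy model under stated assumptions, not the operator proposed in FLIP-415 or any Flink code: the function name compact_changes and the (kind, key, value) record layout are hypothetical, and the kinds +I, -U, +U and -D follow the usual changelog notation.

```python
def compact_changes(batch):
    """Toy model: fold the changelog records of one mini-batch per key.

    Each record is a (kind, key, value) tuple, where kind is one of
    "+I", "-U", "+U", "-D". Only the net effect per key is emitted,
    so intermediate updates inside the batch are dropped.
    """
    first = {}   # key -> first record seen in this batch
    last = {}    # key -> last record seen in this batch
    order = []   # keys in first-seen order, to keep output deterministic
    for kind, key, value in batch:
        if key not in first:
            first[key] = (kind, key, value)
            order.append(key)
        last[key] = (kind, key, value)

    out = []
    for key in order:
        f, l = first[key], last[key]
        existed_before = f[0] in ("-U", "-D")  # row was present before the batch
        exists_after = l[0] in ("+I", "+U")    # row is present after the batch
        if existed_before and exists_after:
            out.append(("-U", key, f[2]))
            out.append(("+U", key, l[2]))
        elif existed_before:
            out.append(("-D", key, f[2]))
        elif exists_after:
            out.append(("+I", key, l[2]))
        # Inserted and deleted within the same batch: nothing is emitted.
    return out


# Five input records for one key collapse into a single insert.
batch = [("+I", "k1", 1), ("-U", "k1", 1), ("+U", "k1", 2),
         ("-U", "k1", 2), ("+U", "k1", 3)]
print(compact_changes(batch))  # [('+I', 'k1', 3)]
```

In this model the intermediate values 1 and 2 never reach the downstream, which is the kind of detail an audit use case would miss when changes are compacted.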

Re: [DISCUSS] FLIP-417: Expose JobManagerOperatorMetrics via REST API

2024-01-11 Thread Hang Ruan
Hi, Mason. Thanks for driving this FLIP. The JobManagerOperatorQueryScopeInfo has three fields: jobID, vertexID and operatorName. So we should use the operator name in the API. If you think we should use the operator id, more changes would be needed. Regarding Xuyang's questions, we add b

Re: [VOTE] FLIP-405: Migrate string configuration key to ConfigOption

2024-01-11 Thread Zhu Zhu
+1 (binding) Thanks, Zhu Xuannan Su wrote on Fri, Jan 12, 2024 at 14:24: > Hi all, > > I would like to clarify the statement regarding the first improvement > from the previous email, as it was incomplete. To be more specific, we > will also deprecate the getClass(String key, Class > defaultValue, ClassLoade

Re: [VOTE] FLIP-405: Migrate string configuration key to ConfigOption

2024-01-11 Thread Xuannan Su
Hi all, I would like to clarify the statement regarding the first improvement from the previous email, as it was incomplete. To be more specific, we will also deprecate the getClass(String key, Class defaultValue, ClassLoader classLoader) and setClass(String key, Class klazz), as they are intended

Re: Re: Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-11 Thread Hequn Cheng
+1 (binding) Thanks, Hequn On Fri, Jan 12, 2024 at 2:19 PM godfrey he wrote: > +1 (binding) > > Thanks, > Godfrey > > Zhu Zhu wrote on Fri, Jan 12, 2024 at 14:10: > > > > +1 (binding) > > > > Thanks, > > Zhu > > > > Hangxiang Yu wrote on Thu, Jan 11, 2024 at 14:26: > > > > > +1 (non-binding) > > > > > > On Thu, Jan 11,

Re: Re: Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-11 Thread jincheng sun
+1 (binding) Best, Jincheng Sun Zhu Zhu wrote on Fri, Jan 12, 2024 at 14:11: > +1 (binding) > > Thanks, > Zhu > > Hangxiang Yu wrote on Thu, Jan 11, 2024 at 14:26: > > > +1 (non-binding) > > > > On Thu, Jan 11, 2024 at 11:19 AM Xuannan Su > wrote: > > > > > +1 (non-binding) > > > > > > Best, > > > Xuannan > > > > > > O

Re: Re: Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-11 Thread godfrey he
+1 (binding) Thanks, Godfrey Zhu Zhu wrote on Fri, Jan 12, 2024 at 14:10: > > +1 (binding) > > Thanks, > Zhu > > Hangxiang Yu wrote on Thu, Jan 11, 2024 at 14:26: > > > +1 (non-binding) > > > > On Thu, Jan 11, 2024 at 11:19 AM Xuannan Su wrote: > > > > > +1 (non-binding) > > > > > > Best, > > > Xuannan > > > > > > On Thu

Re: Re: Re: [VOTE] Accept Flink CDC into Apache Flink

2024-01-11 Thread Zhu Zhu
+1 (binding) Thanks, Zhu Hangxiang Yu wrote on Thu, Jan 11, 2024 at 14:26: > +1 (non-binding) > > On Thu, Jan 11, 2024 at 11:19 AM Xuannan Su wrote: > > > +1 (non-binding) > > > > Best, > > Xuannan > > > > On Thu, Jan 11, 2024 at 10:28 AM Xuyang wrote: > > > > > > +1 (non-binding)-- > > > > > > Best! >

Re:Re: [DISCUSS] FLIP-415: Introduce a new join operator to support minibatch

2024-01-11 Thread Xuyang
Hi, Xu Shuai. Thanks for driving this FLIP. The CDC message amplification of cascade join has always been a problem for users. Judging from the nexmark results, this optimization is very meaningful. I just have the same doubts as Benchao: why can't we use minibatch join as the default behavio

Re: [VOTE] FLIP-405: Migrate string configuration key to ConfigOption

2024-01-11 Thread Xuannan Su
Hi all, During voting, we identified two improvements we'd like to make to the FLIP: - We will mark the getBytes(String key, byte[] defaultValue) and setBytes(String key, byte[] bytes) methods as @Internal, as they are intended for internal use only. - In addition to marking all getXxx(ConfigOpti

Re: [FLIP-412] Add the time-consuming span of each stage when starting the Flink job to TraceReporter

2024-01-11 Thread Rui Fan
The permission is added by Piotr, thank you Piotr. Best, Rui On Thu, Jan 11, 2024 at 9:15 PM Eason Qin wrote: > Hi all, > > Currently, I am working on the FLIP-412: Add the time-consuming span of > each stage when starting the Flink job to TraceReporter[1], but I have no > permission to update

[jira] [Created] (FLINK-34067) Fix javacc warnings in flink-sql-parser

2024-01-11 Thread Jim Hughes (Jira)
Jim Hughes created FLINK-34067: -- Summary: Fix javacc warnings in flink-sql-parser Key: FLINK-34067 URL: https://issues.apache.org/jira/browse/FLINK-34067 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-34066) LagFunction throw NPE when input argument are not null

2024-01-11 Thread Yunhong Zheng (Jira)
Yunhong Zheng created FLINK-34066: - Summary: LagFunction throw NPE when input argument are not null Key: FLINK-34066 URL: https://issues.apache.org/jira/browse/FLINK-34066 Project: Flink Issu

Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for improved state compatibility on parallelism change

2024-01-11 Thread Zhanghao Chen
Thanks for the input, Piotr. It might still be possible to make it compatible with the old snapshots, following the direction of FLINK-5290 suggested by Yu. I'll discuss more details with Yu. Best, Zhanghao Chen

[jira] [Created] (FLINK-34065) Design AbstractAutoscalerStateStore to support serialize State to String

2024-01-11 Thread Rui Fan (Jira)
Rui Fan created FLINK-34065: --- Summary: Design AbstractAutoscalerStateStore to support serialize State to String Key: FLINK-34065 URL: https://issues.apache.org/jira/browse/FLINK-34065 Project: Flink

Re:[DISCUSS] FLIP-417: Expose JobManagerOperatorMetrics via REST API

2024-01-11 Thread Xuyang
Hi, Mason. Thanks for driving this FLIP. I think it's important for external systems to be able to perceive the metrics of the operator coordinator. +1 for it. I just have the following minor questions and am looking forward to your reply. Please forgive me if I have some misunderstandings. 1.

[ANNOUNCE] Apache Flink-shaded 18.0 released

2024-01-11 Thread Sergey Nuyanzin
The Apache Flink community is very happy to announce the release of Apache Flink-shaded 18.0. The flink-shaded project contains a number of shaded dependencies for Apache Flink. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and acc

[DISCUSS] FLIP-417: Expose JobManagerOperatorMetrics via REST API

2024-01-11 Thread Mason Chen
Hi Devs, I'm opening this thread to discuss a short FLIP for exposing JobManagerOperatorMetrics via REST API [1]. The current set of REST APIs makes it impossible to query coordinator metrics. This FLIP proposes a new REST API to query the JobManagerOperatorMetrics. [1] https://cwiki.apache.org/c

[jira] [Created] (FLINK-34064) Expose JobManagerOperatorMetrics via REST API

2024-01-11 Thread Mason Chen (Jira)
Mason Chen created FLINK-34064: -- Summary: Expose JobManagerOperatorMetrics via REST API Key: FLINK-34064 URL: https://issues.apache.org/jira/browse/FLINK-34064 Project: Flink Issue Type: Improve

Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for improved state compatibility on parallelism change

2024-01-11 Thread Piotr Nowojski
Hi, Using unaligned checkpoints is orthogonal to this FLIP. Yes, unaligned checkpoints are not supported for pointwise connections, so most of the cases go away anyway. It is possible to switch from unchained to chained subtasks by removing a keyBy exchange, and this would be a problem, but that'

Re: [DISCUSS] FLIP-414: Support Retry Mechanism in RocksDBStateDataTransfer

2024-01-11 Thread Piotr Nowojski
Hi, Thanks for the proposal. I second Hangxiang's suggestions. I think this might be valuable. Instead of retrying the whole checkpoint, it will be more resource efficient to retry the upload of a single file. Regarding re-using configuration options, a while back we introduced `taskmanager.netw
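The idea of retrying a single file upload rather than the whole checkpoint could be sketched roughly as follows. This is a minimal illustration under stated assumptions and not the design in FLIP-414: upload_with_retry, its parameters, and the choice of OSError as the transient failure type are all hypothetical, and the linear backoff is just one possible policy.

```python
import time


def upload_with_retry(upload, path, max_attempts=3, backoff_seconds=0.0,
                      sleep=time.sleep):
    """Call upload(path), retrying only this one file on transient errors.

    upload is any callable that raises OSError on a transient failure.
    The last error is re-raised once max_attempts is exhausted, so the
    caller can still fail the checkpoint as a whole.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return upload(path)
        except OSError as err:
            last_error = err
            if attempt < max_attempts:
                # Linear backoff between attempts (hypothetical policy).
                sleep(backoff_seconds * attempt)
    raise last_error
```

The sleep function is passed in as a parameter so that the wait can be replaced in tests; a caller that uploads many files would wrap each file transfer in this function and leave the checkpoint-level failure handling unchanged.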

[jira] [Created] (FLINK-34063) When snapshot compression is enabled, rescaling of a source operator leads to some splits getting lost

2024-01-11 Thread Ivan Burmistrov (Jira)
Ivan Burmistrov created FLINK-34063: --- Summary: When snapshot compression is enabled, rescaling of a source operator leads to some splits getting lost Key: FLINK-34063 URL: https://issues.apache.org/jira/browse/F

Re: [DISCUSS] FLIP-389: Annotate SingleThreadFetcherManager and FutureCompletingBlockingQueue as PublicEvolving

2024-01-11 Thread Becket Qin
Hi Qingsheng, Thanks for the comment. I think the initial idea is to hide the queue completely from the users, i.e. make FutureCompletingBlockingQueue class internal. If it is OK to expose the class to the users, then just returning the queue sounds reasonable to me. Thanks, Jiangjie (Becket) Qi

[FLIP-412] Add the time-consuming span of each stage when starting the Flink job to TraceReporter

2024-01-11 Thread Eason Qin
Hi all, Currently, I am working on the FLIP-412: Add the time-consuming span of each stage when starting the Flink job to TraceReporter[1], but I have no permission to update the Flink Improvement Proposals space. Can any PMC help me add permissions? My Jira account is easonqin and my email is qi

Re: [VOTE] FLIP-407: Improve Flink Client performance in interactive scenarios

2024-01-11 Thread Rui Fan
+1 binding Best, Rui On Thu, 11 Jan 2024 at 19:45, xiangyu feng wrote: > Hi all, > > I would like to start the vote for FLIP-407: Improve Flink Client > performance in interactive scenarios[1]. > This FLIP was discussed in this thread [2]. > > The vote will be open for at least 72 hours unless

Re: [VOTE] Release flink-connector-hive, release candidate #1

2024-01-11 Thread Sergey Nuyanzin
Great that it is resolved and thanks a lot for checking On Thu, Jan 11, 2024 at 8:31 AM Hang Ruan wrote: > Hi, Sergey. > > Thanks for the quick reply. > > I tried to package it on another PC with JDK 8 and it succeeded. Please ignore > it. It seems there were some errors in my environment. > > Best, > Hang

[VOTE] FLIP-407: Improve Flink Client performance in interactive scenarios

2024-01-11 Thread xiangyu feng
Hi all, I would like to start the vote for FLIP-407: Improve Flink Client performance in interactive scenarios[1]. This FLIP was discussed in this thread [2]. The vote will be open for at least 72 hours unless there is an objection or insufficient votes. Regards, Xiangyu [1] https://cwiki.apach

[jira] [Created] (FLINK-34062) Propagate in the surefire-plugin configuration for Java 21

2024-01-11 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34062: - Summary: Propagate in the surefire-plugin configuration for Java 21 Key: FLINK-34062 URL: https://issues.apache.org/jira/browse/FLINK-34062 Project: Flink

Re: [Discuss] FLIP-407: Improve Flink Client performance in interactive scenarios

2024-01-11 Thread xiangyu feng
Hi devs, Thanks for all the feedback. If there are no more comments, I would like to start a vote for this FLIP, thanks again! Best, Xiangyu Feng Weihua Hu wrote on Tue, Jan 9, 2024 at 14:45: > Thanks for proposing this FLIP. > > Experiments have shown that it significantly enhances the real-time query > ex

[jira] [Created] (FLINK-34061) Add explicit exclusion of JDK-related excluded groups in the surefire-plugin config

2024-01-11 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34061: - Summary: Add explicit exclusion of JDK-related excluded groups in the surefire-plugin config Key: FLINK-34061 URL: https://issues.apache.org/jira/browse/FLINK-34061

Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for improved state compatibility on parallelism change

2024-01-11 Thread Yu Chen
Hi Zhanghao, Actually, Stefan has done similar compatibility work in the early FLINK-5290[1], where he introduced the legacyStreamGraphHashers list for hasher backward compatibility. We have attempted to implement a similar feature in the internal version of FLINK and tried to include the new

[DISCUSS] [connectors] FileSystem connector - restore from historical checkpoint

2024-01-11 Thread Сергей Парышев
Hi devs! I have a question about the filesystem (Parquet) sink connector. When compaction is enabled and the job is restored from a historical checkpoint, the job is canceled with a FileNotFoundException: it can't find the old .uncompacted file. When compaction is disabled and restoring from historical checkpoint fres

Re: [DISCUSS] FLIP-414: Support Retry Mechanism in RocksDBStateDataTransfer

2024-01-11 Thread Hangxiang Yu
Thanks for driving this. A retry mechanism is common when we get or put data over the network. So I think it will help when a checkpoint fails due to temporary network problems, although it may add some overhead for other reasons. Some comments and suggestions: 1. Since Flink has a che

[jira] [Created] (FLINK-34060) Migrate UserDefinedTableAggFunctions to JavaUserDefinedTableAggFunctions

2024-01-11 Thread Jane Chan (Jira)
Jane Chan created FLINK-34060: - Summary: Migrate UserDefinedTableAggFunctions to JavaUserDefinedTableAggFunctions Key: FLINK-34060 URL: https://issues.apache.org/jira/browse/FLINK-34060 Project: Flink

Re: [DISCUSS] FLIP-415: Introduce a new join operator to support minibatch

2024-01-11 Thread Benchao Li
> the change might not be supposed for the downstream of the job which requires > details of changelog Could you elaborate on this a bit? I've never come across this kind of requirement before, and I'm curious what scenario requires it. shuai xu wrote on Thu, Jan 11, 2024 at 13:08: > > Thanks for your re