Re: [DISCUSS] FLIP-307: Flink connector Redshift

2023-05-30 Thread Leonard Xu
Thanks @Samrat for bringing this discussion. It makes sense to me to introduce AWS Redshift connector for Apache Flink, and I’m glad to help review the design as well as the code review. About the implementation phases, How about prioritizing support for the Datastream Sink API and TableSink AP

Re: [DISCUSS] Hive dialect shouldn't fall back to Flink's default dialect

2023-05-30 Thread Jingsong Li
+1, the fallback looks weird now, it is outdated. But, it is good to provide an option. I don't know if there are some users who depend on this fallback. Best, Jingsong On Tue, May 30, 2023 at 1:47 PM Rui Li wrote: > > +1, the fallback was just intended as a temporary workaround to run > catal

Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-05-30 Thread João Boto
Hi Lijie, I left the two options to use whatever you want, but I can clean the FLIP to have only one.. Updated the FLIP Regards On 2023/05/23 07:23:45 Lijie Wang wrote: > Hi Joao, > > I noticed the FLIP currently contains the following 2 methods about type > serializer: > > (1) TypeSerializ

Fwd: Parquet fille sink to azure blob

2023-05-30 Thread Eli Golin
I have defined the following sink: *object ParquetSink { def parquetFileSink[A <: Message: ClassTag]( assigner: A => String, config: Config )(implicit lc: LoggingConfigs): FileSink[A] = {val bucketAssigner = new BucketAssigner[A, String] { override

Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-05-30 Thread Piotr Nowojski
Hi again, Thanks Dong, yes I think your concerns are valid, and that's why I have previously refined my idea to use one of the backpressure measuring metrics that we already have. Either simply `isBackPressured == true` check [1], or `backPressuredTimeMsPerSecond >= N` (where `N ~= 990`) [2]. That

Re: [DISCUSS] FLIP-311: Support Call Stored Procedure

2023-05-30 Thread yuxia
Hi, Jingsong. Thanks for your feedback. > Does this need to be a function call? Do you have some example? I think it'll be useful to support function call when user call procedure. The following example is from iceberg:[1] CALL catalog_name.system.migrate('spark_catalog.db.sample', map('foo', 'bar

Re: [DISCUSS] FLIP-307: Flink connector Redshift

2023-05-30 Thread Samrat Deb
Hi Leonard, > and I’m glad to help review the design as well as the code review. Thank you so much. It would be really great and helpful to bring flink-connector-redshift for flink users :) . I have divided the implementation in 3 phases in the `Scope` Section[1]. 1st phase is to - Integrate

Re: [DISCUSS] FLIP-307: Flink connector Redshift

2023-05-30 Thread Samrat Deb
[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift [note] Missed the trailing link for previous mail On Tue, May 30, 2023 at 2:43 PM Samrat Deb wrote: > Hi Leonard, > > > and I’m glad to help review the design as well as the code review. > Thank you so

[jira] [Created] (FLINK-32217) Retain metric store can cause NPE

2023-05-30 Thread Junrui Li (Jira)
Junrui Li created FLINK-32217: - Summary: Retain metric store can cause NPE Key: FLINK-32217 URL: https://issues.apache.org/jira/browse/FLINK-32217 Project: Flink Issue Type: Bug Compon

Re: [Discussion] - Release major Flink version to support JDK 17 (LTS)

2023-05-30 Thread Tamir Sagi
Hey Chesnay, I'm sending a follow up email regarding JDK 17 support. I see the Epic[1] is in progress and frequently updated. I'm curios if there is an ETA or any plan to release a major release with JDK 17 support that is not backward compatible. Thanks, Tamir [1] https://issues.apache.org/

[SUMMARY] Flink 1.18 Release Sync 05/30/2023

2023-05-30 Thread Qingsheng Ren
Hi devs and users, I'd like to share some highlights from the release sync of 1.18 on May 30. 1. @developers please update the progress of your features on 1.18 release wiki page [1] ! That will help us a lot to have an overview of the entire release cycle. 2. We found a JIRA issue (FLINK-18356)

Re: [DISCUSS] FLIP-311: Support Call Stored Procedure

2023-05-30 Thread Jingsong Li
Thanks for your explanation. We can support Iterable in future. Current design looks good to me. Best, Jingsong On Tue, May 30, 2023 at 4:56 PM yuxia wrote: > > Hi, Jingsong. > Thanks for your feedback. > > > Does this need to be a function call? Do you have some example? > I think it'll be use

[jira] [Created] (FLINK-32218) Implement support for parent/child shard ordering

2023-05-30 Thread Hong Liang Teoh (Jira)
Hong Liang Teoh created FLINK-32218: --- Summary: Implement support for parent/child shard ordering Key: FLINK-32218 URL: https://issues.apache.org/jira/browse/FLINK-32218 Project: Flink Issue

Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-05-30 Thread Dong Lin
Hi Piotr, Thank you for providing those details. I understand you suggested using the existing "isBackPressured" signal to determine whether we should use the less frequent checkpointing interval. I followed your thoughts and tried to make it work. Below are the issues that I am not able to addre

Re: [DISCUSS] Status of Statefun Project

2023-05-30 Thread Galen Warren
Getting to a resolution here would be great and much appreciated, yes. On Sat, May 27, 2023 at 1:03 AM Salva Alcántara wrote: > Hey Galen, > > I took a look at StateFun some time ago; not using it in production but I > agree that it would be a pity to abandon it. > > As Martijn said, let's be cl

Re: [SUMMARY] Flink 1.18 Release Sync 05/30/2023

2023-05-30 Thread Jing Ge
Thanks Qingsheng for driving it! @Devs As you might already be aware of, there are many externalizations and new releases of Flink connectors. Once a connector has been externalized successfully, i.e. the related module has been removed in the Flink repo, we will not set a priority higher than maj

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Weihua Hu
Thanks Paul for the proposal. +1 for this. It is valuable in improving ease of use. I have a few questions. - Is SQLRunner the better name? We use this to run a SQL Job. (Not strong, the SQLDriver is fine for me) - Could we run SQL jobs using SQL in strings? Otherwise, we need to prepare a SQL fi

Re: Job stuck in CREATED state with scheduling failures

2023-05-30 Thread Matthias Pohl
Hi Gyula, Could you share the logs in the ML? Or is there a Jira issue I missed? Matthias On Wed, May 17, 2023 at 9:33 PM Gyula Fóra wrote: > Hey Devs! > > I am bumping this thread to see if someone has any ideas how to go about > solving this. > > Yang Wang earlier had this comment but I am no

Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-05-30 Thread Jing Ge
Hi Piotr, > But why do we need to have two separate mechanisms, if the dynamic > adjustment based on the backpressure/backlog would > achieve basically the same goal as your proposal and would solve both of > the problems? Having two independent solutions > in the same codebase, in the docs, that

Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-05-30 Thread Piotr Nowojski
Hi Dong, First of all we don't need to send any extra signal from source (or non source) operators. All of the operators are already reporting backpressured metrics [1] and all of the metrics are already sent to JobManager. We would only need to pass some accessor to the metrics to the `Checkpoint

Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-05-30 Thread Tzu-Li (Gordon) Tai
Hi, > I think we can get the serializer directly in InitContextImpl through `getOperatorConfig().getTypeSerializerIn(0, getUserCodeClassloader()).duplicate()`. This should work, yes. +1 to the updated FLIP so far. Thank you, Joao, for being on top of this! Thanks, Gordon On Tue, May 30, 2023 a

Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener

2023-05-30 Thread Shammon FY
Thanks Feng, the catalog modification listener is only used to report read-only ddl information to other components or systems. > 1. Will an exception thrown by the listener affect the normal execution process? Users need to handle the exception in the listener themselves. Many DDLs such as drop

Re: [DISCUSS] Hive dialect shouldn't fall back to Flink's default dialect

2023-05-30 Thread liu ron
Thanks for your proposal. I even don't notice this fallback behavior, +1. Best, Ron Jingsong Li 于2023年5月30日周二 15:23写道: > +1, the fallback looks weird now, it is outdated. > > But, it is good to provide an option. I don't know if there are some > users who depend on this fallback. > > Best, > Ji

Re: [DISCUSS] FLIP-307: Flink connector Redshift

2023-05-30 Thread liu ron
Hi, Samrat Thanks for driving this FLIP. It looks like supporting flink-connector-redshift is very useful to Flink. I have two question: 1. Regarding the `read.mode` and `write.mode`, you say here provides two modes, respectively, jdbc and `unload or copy`, What is the default value for `read.mod

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Shammon FY
Thanks Paul for driving this proposal. I found the sql driver has no config related options. If I understand correctly, the sql driver can be used to submit sql jobs in a 'job submission service' such as sql-gateway. In general, in addition to the default config for Flink cluster which includes k8

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Shengkai Fang
Thanks for the proposal. The Application mode is very important to Flink SQL. But I have some questions about the FLIP: 1. The FLIP does not specify the kind of SQL that will be submitted with the application mode. I believe only a portion of the SQL will be delegated to the SqlRunner. 2. Will the

[jira] [Created] (FLINK-32219) sql client would be pending after executing plan of inserting

2023-05-30 Thread Shuai Xu (Jira)
Shuai Xu created FLINK-32219: Summary: sql client would be pending after executing plan of inserting Key: FLINK-32219 URL: https://issues.apache.org/jira/browse/FLINK-32219 Project: Flink Issue

Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener

2023-05-30 Thread liu ron
Hi, Shammon Thanks for driving this FLIP, It will enforce the Flink metadata capability from the platform produce perspective. The overall design looks good to me, I just have some small question: 1. Regarding CatalogModificationListenerFactory#createListener method, I think it would be better to

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Paul Lam
Hi Weihua, Thanks a lot for your input! Please see my comments inline. > - Is SQLRunner the better name? We use this to run a SQL Job. (Not strong, > the SQLDriver is fine for me) I’ve thought about SQL Runner but picked SQL Driver for the following reasons FYI: 1. I have a PythonDriver doing

Re: [DISCUSS] FLIP-308: Support Time Travel In Batch Mode

2023-05-30 Thread liu ron
Hi, Feng Thanks for driving this FLIP, Time travel is very useful for Flink integrate with data lake system. I have one question why the implementation of TimeTravel is delegated to Catalog? Assuming that we use Flink to query Hudi table with the time travel syntax, but we don't use the HudiCatalo

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Paul Lam
Hi Shammon, Thanks a lot for your input! I thought SQL Driver could act as a general-purpose default main class for Flink SQL. It could be used in Flink CLI submission, web submission, or SQL Client/Gateway submission. For SQL Client/Gateway submission, we use it implicitly if needed, and for the

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Biao Geng
Thanks Paul for the proposal!I believe it would be very useful for flink users. After reading the FLIP, I have some questions: 1. Scope: is this FLIP only targeted for non-interactive Flink SQL jobs in Application mode? More specifically, if we use SQL client/gateway to execute some interactive SQL

Re: [DISCUSS] FLIP-316: Introduce SQL Driver

2023-05-30 Thread Paul Lam
Sorry for the typo. I mean “We already have a PythonDriver doing the same job for PyFlink." Best, Paul Lam > 2023年5月31日 11:49,Paul Lam 写道: > > 1. I have a PythonDriver doing the same job for PyFlink [1]

Re: [DISCUSS] FLIP 295: Support persistence of Catalog configuration and asynchronous registration

2023-05-30 Thread liu ron
Hi, Feng Thanks for driving this FLIP, this proposal is very useful for catalog management. I have some small questions: 1. Regarding the CatalogStoreFactory#createCatalogStore method, do we need to provide a default implementation? 2. If we get Catalog from CatalogStore, after initializing it, w

Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener

2023-05-30 Thread yuxia
Thanks Shammon for driving it. The FLIP generally looks good to me. I only have one question. WRT AlterDatabaseEvent, IIUC, it'll contain the origin database name and the new CatalogDatabase after modified. Is it enough only pass the origin database name? Will it be better to contain the origin

Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL

2023-05-30 Thread liu ron
Hi, Jinsong Thanks for your valuable suggestions. Best, Ron Jingsong Li 于2023年5月30日周二 13:22写道: > Thanks Ron for your information. > > I suggest that it can be written in the Motivation of FLIP. > > Best, > Jingsong > > On Tue, May 30, 2023 at 9:57 AM liu ron wrote: > > > > Hi, Jingsong > > >

[jira] [Created] (FLINK-32220) Improving the adaptive local hash agg code to avoid get value from RowData repeatedly

2023-05-30 Thread dalongliu (Jira)
dalongliu created FLINK-32220: - Summary: Improving the adaptive local hash agg code to avoid get value from RowData repeatedly Key: FLINK-32220 URL: https://issues.apache.org/jira/browse/FLINK-32220 Proje

Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener

2023-05-30 Thread Shammon FY
Hi ron, Thanks for your feedback. > 1. Regarding CatalogModificationListenerFactory#createListener method, I think it would be better to pass Context as its parameter instead of two specific Object. In this way, we can easily extend it in the future and there will be no compatibility problems. I

Re: [DISCUSS] FLIP-313 Add support of User Defined AsyncTableFunction

2023-05-30 Thread Aitozi
Hi Jing, What do you think about it? Can we move forward this feature? Thanks, Aitozi. Aitozi 于2023年5月29日周一 09:56写道: > Hi Jing, > > "Do you mean to support the AyncTableFunction beyond the > LookupTableSource?" > Yes, I mean to support the AyncTableFunction beyond the LookupTableSource.

Re: [DISCUSS] FLIP-294: Support Customized Job Meta Data Listener

2023-05-30 Thread Shammon FY
Hi yuxia Thanks for your input. The `AlterDatabaseEvent` extends `DatabaseModificationEvent` which has the original database. Best, Shammon FY On Wed, May 31, 2023 at 2:24 PM yuxia wrote: > Thanks Shammon for driving it. > The FLIP generally looks good to me. I only have one question. > WRT Al

Re: [DISCUSS] FLIP-309: Enable operators to trigger checkpoints dynamically

2023-05-30 Thread Dong Lin
Hi Piotr, Thanks for the reply. Please see my comments inline. On Wed, May 31, 2023 at 12:58 AM Piotr Nowojski wrote: > Hi Dong, > > First of all we don't need to send any extra signal from source (or non > source) operators. All of the operators are already reporting backpressured > metrics [1