Re: [VOTE] FLIP-471: Fixing watermark idleness timeout accounting
+1 (binding) Best, Stefan > On 31. Jul 2024, at 04:56, Zakelly Lan wrote: > > +1 (binding) > > > Best, > Zakelly > > On Wed, Jul 31, 2024 at 12:07 AM Piotr Nowojski > wrote: > >> Hi all! >> >> I would like to open the vote for FLIP-471 [1]. It has been discussed here >> [2]. >> >> The vote will remain open for at least 72 hours. >> >> Best, >> Piotrek >> >> [1] >> https://cwiki.apache.org/confluence/x/oQvOEg >> [2] >> https://lists.apache.org/thread/byj1l2236rfx3mcl3v4374rcbkq4rf85 >>
Re: [VOTE] Release 1.20.0, release candidate #2
+1 (binding) - Built from source - Reviewed web PR and release note - Verified checksum and signature - Checked GitHub release tag - Tested submitting SQL job with SQL client reading and writing Kafka Best, Qingsheng On Tue, Jul 30, 2024 at 2:26 PM Xintong Song wrote: > +1 (binding) > > - reviewed flink-web PR > - verified checksum and signature > - verified source archives don't contain binaries > - built from source > - tried example jobs on a standalone cluster, and everything looks fine > > Best, > > Xintong > > > > On Tue, Jul 30, 2024 at 12:13 AM Jing Ge > wrote: > > > Thanks Weijie! > > > > +1 (binding) > > > > - verified signatures > > - verified checksums > > - checked Github release tag > > - reviewed the PRs > > - checked the repo > > - started a local cluster, tried with WordCount, everything was fine. > > > > Best regards, > > Jing > > > > > > On Mon, Jul 29, 2024 at 1:47 PM Samrat Deb > wrote: > > > > > Thank you Weijie for driving 1.20 release > > > > > > +1 (non-binding) > > > > > > - Verified checksums and sha512 > > > - Verified signatures > > > - Verified Github release tags > > > - Build from source > > > - Start the flink cluster locally run few jobs (Statemachine and word > > > Count) > > > > > > Bests, > > > Samrat > > > > > > > > > On Mon, Jul 29, 2024 at 3:15 PM Ahmed Hamdy > > wrote: > > > > > > > Thanks Weijie for driving > > > > > > > > +1 (non-binding) > > > > > > > > - Verified checksums > > > > - Verified signature matches Rui Fan's > > > > - Verified tag exists on Github > > > > - Build from source > > > > - Verified no binaries in source archive > > > > - Reviewed release notes PR (some nits) > > > > > > > > Best Regards > > > > Ahmed Hamdy > > > > > > > > > > > > On Thu, 25 Jul 2024 at 12:21, weijie guo > > > > wrote: > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > Please review and vote on the release candidate #2 for the version > > > > 1.20.0, > > > > > > > > > > as follows: > > > > > > > > > > > > > > > [ ] +1, Approve the release > > > > > > > > > > [ ] -1, Do not approve the release (please provide specific > comments) > > > > > > > > > > > > > > > The complete staging area is available for your review, which > > includes: > > > > > > > > > > * JIRA release notes [1], and the pull request adding release note > > for > > > > > users [2] > > > > > > > > > > * the official Apache source release and binary convenience > releases > > to > > > > be > > > > > > > > > > deployed to dist.apache.org [3], which are signed with the key > with > > > > > > > > > > fingerprint B2D64016B940A7E0B9B72E0D7D0528B28037D8BC [4], > > > > > > > > > > * all artifacts to be deployed to the Maven Central Repository [5], > > > > > > > > > > * source code tag "release-1.20.0-rc2" [6], > > > > > > > > > > * website pull request listing the new release and adding > > announcement > > > > blog > > > > > > > > > > post [7]. > > > > > > > > > > > > > > > The vote will be open for at least 72 hours. It is adopted by > > majority > > > > > > > > > > approval, with at least 3 PMC affirmative votes. 
> > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354210 > > > > > > > > > > [2] https://github.com/apache/flink/pull/25091 > > > > > > > > > > [3] https://dist.apache.org/repos/dist/dev/flink/flink-1.20.0-rc2/ > > > > > > > > > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS > > > > > > > > > > [5] > > > > > > > > > https://repository.apache.org/content/repositories/orgapacheflink-1752/ > > > > > > > > > > [6] > https://github.com/apache/flink/releases/tag/release-1.20.0-rc2 > > > > > > > > > > [7] https://github.com/apache/flink-web/pull/751 > > > > > > > > > > > > > > > Best, > > > > > > > > > > Robert, Rui, Ufuk, Weijie > > > > > > > > > > > > > > >
[jira] [Created] (FLINK-35937) Helm RBAC cleanup
Tim created FLINK-35937: --- Summary: Helm RBAC cleanup Key: FLINK-35937 URL: https://issues.apache.org/jira/browse/FLINK-35937 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Affects Versions: 1.19.0 Environment: Kubernetes 1.29.4 Reporter: Tim This is a follow-up ticket to [https://issues.apache.org/jira/projects/FLINK/issues/FLINK-35310] In further research and testing with Kyverno I found that some apiGroups seem to be invalid, and I removed them with this PR. It seems that the "extensions" apiGroup does not exist on our current cluster (Kubernetes 1.29.4). I'm not sure, but it might be related to these deprecation notices: https://kubernetes.io/docs/reference/using-api/deprecation-guide/#deployment-v116 The same holds for the "finalizers" resources. They do not seem to exist anymore and lead to problems with our deployment, so I also removed them. To complete the verb list I also added "deleteCollections" where applicable. Please take a look and check whether this makes sense to you. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[DISCUSS] Remove deprecated dataset based API from State Processor API
Hi Devs, There is common agreement that the DataSet API is going to be removed in 2.0. We're planning to invest quite some time in the State Processor API[1] in the near future. Along the way we've realized that there are several deprecated APIs in this area which are based on the DataSet API (for example Savepoint, ExistingSavepoint, NewSavepoint, etc...) [2]. Based on the DataSet removal agreement these are going to be removed anyway, but in order to ease our efforts it would be good to remove them now. The question is whether it's fine to do so, or whether there are reasons not to? Here are the collected facts about the APIs to help form an opinion: * Example class intended for removal: Savepoint (others share the same characteristics) * API marker: PublicEvolving * Marked deprecated in: FLINK-24912 [3] * Deprecation date: 22/12/2021 * Deprecation in Flink version: 1.15.0 * Eligible for removal according to our deprecation process [4]: 1.16.0+ * New replacement APIs: SavepointReader, SavepointWriter Please share your thoughts on this. [1] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/libs/state_processor_api/ [2] https://github.com/apache/flink/blob/82b628d4730eef32b2f7a022e3b73cb18f950e6e/flink-libraries/flink-state-processing-api/src/main/java/org/apache/flink/state/api/Savepoint.java#L47-L49 [3] https://issues.apache.org/jira/browse/FLINK-24912 [4] https://cwiki.apache.org/confluence/display/FLINK/FLIP-321%3A+Introduce+an+API+deprecation+process BR, G
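For reference, a minimal sketch of the migration path from the deprecated classes to the replacement APIs available since Flink 1.15; the savepoint path, uid, and state name below are illustrative.
{code:java}
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.runtime.state.hashmap.HashMapStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.SavepointReader;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MigrationSketch {
    public static void main(String[] args) throws Exception {
        // Deprecated, DataSet-based API (candidate for removal):
        ExecutionEnvironment batchEnv = ExecutionEnvironment.getExecutionEnvironment();
        ExistingSavepoint legacy =
                Savepoint.load(batchEnv, "hdfs:///savepoints/sp-1", new HashMapStateBackend());
        DataSet<Long> legacyState = legacy.readListState("my-uid", "list-state", Types.LONG);

        // Replacement, DataStream-based API (SavepointReader/SavepointWriter):
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        SavepointReader reader =
                SavepointReader.read(env, "hdfs:///savepoints/sp-1", new HashMapStateBackend());
        DataStream<Long> state = reader.readListState("my-uid", "list-state", Types.LONG);
    }
}
{code}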
Re: [DISCUSS] FLIP-XXX Amazon SQS Source Connector
Hi Ahmed, Thank you very much for the detailed, valuable review. Please find our responses below: - In the FLIP you mention the split is going to be 1 sqs Queue, does this mean we would support reading from multiple queues? This is also not clear in the implementation of `addSplitsBack` whether we are planning to support multiple sqs topics or not. *Our current implementation assumes that each source reads from a single SQS queue. If you need to read from multiple SQS queues, you can define multiple sources accordingly. We believe this approach is clearer and more organized compared to having a single source switch between multiple queues. This design choice is based on weighing the benefits, but we can support multiple queues per source if the need arises.* - Regarding Client creation, there has been some effort in the common `aws-util` module like createAwsSyncClient, we should reuse that for `SqsClient` creation. *Thank you for bringing this to our attention. Yes, we will utilize the existing createClient methods available in the libraries. Our goal is to avoid any code duplication on our end.* - On the same point for clients, Is there a reason the FLIP suggests async clients? sync clients have proven more stable and the source threading model already guarantees no blocking by sync clients. *We were not aware of this, and we have been using async clients for our in-house use cases. However, since we already have sync clients in the aws-util that ensure no blocking, we are in a good position. We will use these sync clients during our development and testing efforts, and we will share the results and keep the community updated.* - On mentioning threading, the FLIP doesn’t mention the fetcher manager. Is it going to be `SingleThreadFetcherManager`? Would it be better to make the source reader extend the SingleThreadedMultiplexReaderBase or are we going to implement a simpler version? *Yes, we are considering implementing SingleThreadMultiplexSourceReaderBase for the Reader. We have included the implementation snippet in the FLIP for reference.* - The FLIP doesn’t mention schema deserialization or the recordEmitter implementation, Are we going to use `deserializationSchema` or some sort of string to element converter? It is also not clear from the builder example provided. *Yes, we plan to use the deserializationSchema and recordEmitter implementations. We have included sample code for these in the FLIP for reference.* - Are the values mentioned in getSqsClientProperties recommended defaults? If so we should highlight that. *The defaults are not decided yet; these are just illustrative sample values.* - Most importantly I am a bit skeptical regarding enforcing exactly-once semantics with side effects especially with dependency on checkpointing configuration, could we add a flag to disable it, disabled by default if checkpointing is not enabled? *During our initial design phase, we intended to enforce exactly-once semantics via checkpoints. However, you raise a valid point, and we will make this a configurable feature for users. They can choose to disable exactly-once semantics, accepting some duplicate processing (at-least-once) as a trade-off. We have updated the FLIP to include support for this feature.* - I am not 100% convinced we should block the FLIP itself on the FLIP-438 implementation but I echo the fact that there might be some reusable code between the 2 submodules we should make use of. *Yes we echo the same. 
We fully understand the concern and are closely examining the SQS Sink implementation. We will ensure there is no duplication of work or submodules. If any issues arise, we will address them promptly.* Thank you for your valuable feedback on the FLIP. Your input is helping us refine and improve it significantly. [Main Proposal doc] - https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit#heading=h.ci1rrcgbsvkl Regards Saurabh & Abhi On Sun, Jul 28, 2024 at 3:04 AM Ahmed Hamdy wrote: > Hi Saurabh > I think this is going to be a valuable addition which is needed. > I have a couple of comments > - In the FLIP you mention the split is going to be 1 sqs Queue, does this > mean we would support reading from multiple queues? The builder example > seems to support a single queue > SqsSource.builder.setSqsUrl(" > https://sqs.us-east-1.amazonaws.com/23145433/sqs-test";) > - This is also not clear in the implementation of `addSplitsBack` whether > we are planning to support multiple sqs topics or not. > - Regarding Client creation, there has been some effort in the common > `aws-util` module like createAwsSyncClient , we should reuse that for > `SqsClient` creation. > - On the same point for clients, Is there a reason the FLIP suggests async > clients? sync clients have proven more stable and the source threading > model already guarantees no b
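To make the builder discussion above concrete, here is a hypothetical sketch of what the proposed source usage could look like. `SqsSource` and its setters are FLIP proposals rather than a released API, and the queue URL is the example from the thread.
{code:java}
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SqsSourceSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical builder mirroring the FLIP discussion: one SQS queue per source.
        SqsSource<String> source =
                SqsSource.<String>builder()
                        .setSqsUrl("https://sqs.us-east-1.amazonaws.com/23145433/sqs-test")
                        .setDeserializationSchema(new SimpleStringSchema())
                        .build();

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Reading from multiple queues would mean defining multiple sources like this one.
        DataStream<String> events =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "sqs-source");
        env.execute("sqs-example");
    }
}
{code}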
Re: [VOTE] FLIP-471: Fixing watermark idleness timeout accounting
+1 (binding) Thanks for fixing this critical bug. Regards, Timo On 31.07.24 09:51, Stefan Richter wrote: +1 (binding) Best, Stefan On 31. Jul 2024, at 04:56, Zakelly Lan wrote: +1 (binding) Best, Zakelly On Wed, Jul 31, 2024 at 12:07 AM Piotr Nowojski wrote: Hi all! I would like to open the vote for FLIP-471 [1]. It has been discussed here [2]. The vote will remain open for at least 72 hours. Best, Piotrek [1] https://cwiki.apache.org/confluence/x/oQvOEg [2] https://lists.apache.org/thread/byj1l2236rfx3mcl3v4374rcbkq4rf85
Re: [VOTE] Release 1.20.0, release candidate #2
+1 (binding) - checked Github release tag - verified signatures and hash sums - building from source code succeeded - reviewed the web PR, left a minor comment - checked release notes; minor: there are some issues that need their Fix Version updated [1] - started the SQL Client, used the MySQL CDC connector to read the changelog from a database; the result is as expected Best, Leonard [1] https://issues.apache.org/jira/projects/FLINK/versions/12354210 > On Jul 31, 2024, at 4:24 PM, Qingsheng Ren wrote: > > +1 (binding) > > - Built from source > - Reviewed web PR and release note > - Verified checksum and signature > - Checked GitHub release tag > - Tested submitting SQL job with SQL client reading and writing Kafka > > Best, > Qingsheng > > On Tue, Jul 30, 2024 at 2:26 PM Xintong Song wrote: > >> +1 (binding) >> >> - reviewed flink-web PR >> - verified checksum and signature >> - verified source archives don't contain binaries >> - built from source >> - tried example jobs on a standalone cluster, and everything looks fine >> >> Best, >> >> Xintong >> >> >> >> On Tue, Jul 30, 2024 at 12:13 AM Jing Ge >> wrote: >> >>> Thanks Weijie! >>> >>> +1 (binding) >>> >>> - verified signatures >>> - verified checksums >>> - checked Github release tag >>> - reviewed the PRs >>> - checked the repo >>> - started a local cluster, tried with WordCount, everything was fine. >>> >>> Best regards, >>> Jing >>> >>> >>> On Mon, Jul 29, 2024 at 1:47 PM Samrat Deb >> wrote: >>> Thank you Weijie for driving 1.20 release +1 (non-binding) - Verified checksums and sha512 - Verified signatures - Verified Github release tags - Build from source - Start the flink cluster locally run few jobs (Statemachine and word Count) Bests, Samrat On Mon, Jul 29, 2024 at 3:15 PM Ahmed Hamdy >>> wrote: > Thanks Weijie for driving > > +1 (non-binding) > > - Verified checksums > - Verified signature matches Rui Fan's > - Verified tag exists on Github > - Build from source > - Verified no binaries in source archive > - Reviewed release notes PR (some nits) > > Best Regards > Ahmed Hamdy > > > On Thu, 25 Jul 2024 at 12:21, weijie guo > wrote: > >> Hi everyone, >> >> >> Please review and vote on the release candidate #2 for the version > 1.20.0, >> >> as follows: >> >> >> [ ] +1, Approve the release >> >> [ ] -1, Do not approve the release (please provide specific >> comments) >> >> >> The complete staging area is available for your review, which >>> includes: >> >> * JIRA release notes [1], and the pull request adding release note >>> for >> users [2] >> >> * the official Apache source release and binary convenience >> releases >>> to > be >> >> deployed to dist.apache.org [3], which are signed with the key >> with >> >> fingerprint B2D64016B940A7E0B9B72E0D7D0528B28037D8BC [4], >> >> * all artifacts to be deployed to the Maven Central Repository [5], >> >> * source code tag "release-1.20.0-rc2" [6], >> >> * website pull request listing the new release and adding >>> announcement > blog >> >> post [7]. >> >> >> The vote will be open for at least 72 hours. It is adopted by >>> majority >> >> approval, with at least 3 PMC affirmative votes. 
>> >> >> [1] >> >> > >>> >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354210 >> >> [2] https://github.com/apache/flink/pull/25091 >> >> [3] https://dist.apache.org/repos/dist/dev/flink/flink-1.20.0-rc2/ >> >> [4] https://dist.apache.org/repos/dist/release/flink/KEYS >> >> [5] >> >> https://repository.apache.org/content/repositories/orgapacheflink-1752/ >> >> [6] >> https://github.com/apache/flink/releases/tag/release-1.20.0-rc2 >> >> [7] https://github.com/apache/flink-web/pull/751 >> >> >> Best, >> >> Robert, Rui, Ufuk, Weijie >> > >>> >>
[jira] [Created] (FLINK-35938) Avoid committing the same datafile again in Paimon Sink.
LvYanquan created FLINK-35938: - Summary: Avoid committing the same datafile again in Paimon Sink. Key: FLINK-35938 URL: https://issues.apache.org/jira/browse/FLINK-35938 Project: Flink Issue Type: Technical Debt Components: Flink CDC Affects Versions: cdc-3.1.0 Reporter: LvYanquan Fix For: cdc-3.2.0 Attachments: image-2024-07-31-19-45-14-153.png [Flink will re-commit committables|https://github.com/apache/flink/blob/82b628d4730eef32b2f7a022e3b73cb18f950e6e/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/connector/sink2/GlobalCommitterOperator.java#L148] when a job restarts from failure. This may cause the same datafile to be added twice in the current PaimonCommitter. !image-2024-07-31-19-45-14-153.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
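A minimal sketch of the kind of idempotent-commit guard this ticket implies, assuming hypothetical helper names; the actual Paimon-side fix may look different.
{code:java}
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

public final class IdempotentCommitSketch {
    // Hypothetical: on restore, skip data files that a previous attempt already
    // committed, so re-delivered committables are not added to the table twice.
    static void commitOnce(
            List<String> restoredCommittables,
            Set<String> alreadyCommitted,
            Consumer<String> commitDataFile) {
        for (String dataFile : restoredCommittables) {
            // add() returns false if the file was committed by a previous attempt.
            if (alreadyCommitted.add(dataFile)) {
                commitDataFile.accept(dataFile);
            }
        }
    }
}
{code}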
[RESULT][VOTE] Release 1.20.0, release candidate #2
Hi everyone, I'm happy to announce that we have unanimously approved this release. There are 7 approving votes, 5 of which are binding: - Rui Fan(binding) - Ahmed Hamdy(non-binding) - Samrat Deb(non-binding) - Jing Ge(binding) - Xintong Song (binding) - Qingsheng Ren(binding) - Leonard Xu(binding) There are no disapproving votes. Thank you for verifying the release candidate. We will now proceed to finalize the release and announce it once everything is published. Best, Robert, Rui, Ufuk, Weijie
[jira] [Created] (FLINK-35939) Do not set empty config values via ConfigUtils#encodeCollectionToConfig
Ferenc Csaky created FLINK-35939: Summary: Do not set empty config values via ConfigUtils#encodeCollectionToConfig Key: FLINK-35939 URL: https://issues.apache.org/jira/browse/FLINK-35939 Project: Flink Issue Type: Improvement Affects Versions: 1.19.1 Reporter: Ferenc Csaky Fix For: 2.0.0 The {{ConfigUtils#encodeCollectionToConfig}} function only skips setting a given {{ConfigOption}} value if that value is null. If the passed collection is empty, it will set that empty collection. I think this behavior is less logical and can cause undesired situations; it would make more sense to only set a value if it is not empty AND not null. Furthermore, the method's [javadoc|https://github.com/apache/flink/blob/82b628d4730eef32b2f7a022e3b73cb18f950e6e/flink-core/src/main/java/org/apache/flink/configuration/ConfigUtils.java#L73] describes the logic I just mentioned above, which is in conflict with the actual implementation and tests, which set an empty collection. -- This message was sent by Atlassian Jira (v8.20.10#820010)
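A minimal sketch of the proposed guard, assuming a signature close to the current utility (simplified here for illustration):
{code:java}
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import javax.annotation.Nullable;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.WritableConfig;

public final class ConfigUtilsSketch {
    // Proposed behavior: skip empty collections as well as null ones,
    // matching what the existing javadoc already describes.
    public static <IN, OUT> void encodeCollectionToConfig(
            WritableConfig configuration,
            ConfigOption<List<OUT>> key,
            @Nullable Collection<IN> collection,
            Function<IN, OUT> transformer) {
        if (collection != null && !collection.isEmpty()) {
            configuration.set(
                    key, collection.stream().map(transformer).collect(Collectors.toList()));
        }
    }
}
{code}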
Re: [VOTE] FLIP-471: Fixing watermark idleness timeout accounting
+1 (binding) Best, Rui On Wed, Jul 31, 2024 at 17:47, Timo Walther wrote: > +1 (binding) > > Thanks for fixing this critical bug. > > Regards, > Timo > > On 31.07.24 09:51, Stefan Richter wrote: > > > > +1 (binding) > > > > Best, > > Stefan > > > > > > > >> On 31. Jul 2024, at 04:56, Zakelly Lan wrote: > >> > >> +1 (binding) > >> > >> > >> Best, > >> Zakelly > >> > >> On Wed, Jul 31, 2024 at 12:07 AM Piotr Nowojski > >> wrote: > >> > >>> Hi all! > >>> > >>> I would like to open the vote for FLIP-471 [1]. It has been discussed > here > >>> [2]. > >>> > >>> The vote will remain open for at least 72 hours. > >>> > >>> Best, > >>> Piotrek > >>> > >>> [1] > https://cwiki.apache.org/confluence/x/oQvOEg > >>> [2] > https://lists.apache.org/thread/byj1l2236rfx3mcl3v4374rcbkq4rf85 > >>> > > > > > >
Re: [DISCUSS] Remove deprecated dataset based API from State Processor API
Hi! Thanks Gabor for looking into this. +1 for removing the DataSet based APIs from the state processor in the next Flink version, I don't think we should wait until 2.0. This will also reduce the 2.0 scope overall which is always good :) Cheers, Gyula On Wed, Jul 31, 2024 at 10:30 AM Gabor Somogyi wrote: > Hi Devs, > > There is a common agreement that the dataset API going to be removed in > 2.0. > > We're planning to invest quite some time in the State Processor API[1] in > the near future. > On this road we've realized that there are several deprecated APIs in this > area which are based > on the dataset API (for example Savepoint, ExistingSavepoint, NewSavepoint, > etc...) [2]. > > Based on the dataset removal agreement these are going to be removed but in > order to ease > our efforts it would be good to remove them now. The question is whether > it's fine to do it > or there are some reasons not to do so? > > Here are the collected facts about the APIs to help forming an opinion: > * Example intended to remove class: Savepoint (others share the same > characteristics) > * API marker: PublicEvolving > * Marked deprecated in: FLINK-24912 [3] > * Deprecation date: 22/12/2021 > * Deprecation in Flink version: 1.15.0 > * Eligible for removal according to our deprecation process [4]: 1.16.0+ > * New replacement APIs: SavepointReader, SavepointWriter > > Please share your thoughts on this. > > [1] > > https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/libs/state_processor_api/ > [2] > > https://github.com/apache/flink/blob/82b628d4730eef32b2f7a022e3b73cb18f950e6e/flink-libraries/flink-state-processing-api/src/main/java/org/apache/flink/state/api/Savepoint.java#L47-L49 > [3] https://issues.apache.org/jira/browse/FLINK-24912 > [4] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-321%3A+Introduce+an+API+deprecation+process > > BR, > G >
[jira] [Created] (FLINK-35940) Upgrade log4j dependencies to 2.23.1
Daniel Burrell created FLINK-35940: -- Summary: Upgrade log4j dependencies to 2.23.1 Key: FLINK-35940 URL: https://issues.apache.org/jira/browse/FLINK-35940 Project: Flink Issue Type: Improvement Components: API / Core Affects Versions: 1.19.1 Environment: N/A Reporter: Daniel Burrell There is a need to upgrade the log4j dependencies, as the current version has a vulnerability. I propose upgrading to 2.23.1, which is the latest 2.x version. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35941) Add CompiledPlan annotations to BatchExecLimit
Jim Hughes created FLINK-35941: -- Summary: Add CompiledPlan annotations to BatchExecLimit Key: FLINK-35941 URL: https://issues.apache.org/jira/browse/FLINK-35941 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes In addition to the annotations, implement the BatchCompiledPlan test for this operator. Additionally, tests for the TableSource operator will be pulled into this work. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35942) Add CompiledPlan annotations to BatchExecSort
Jim Hughes created FLINK-35942: -- Summary: Add CompiledPlan annotations to BatchExecSort Key: FLINK-35942 URL: https://issues.apache.org/jira/browse/FLINK-35942 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes In addition to the annotations, implement the BatchCompiledPlan test for this operator. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35943) Add CompiledPlan annotations to BatchExecHashJoin and BatchExecNestedLoopJoin
Jim Hughes created FLINK-35943: -- Summary: Add CompiledPlan annotations to BatchExecHashJoin and BatchExecNestedLoopJoin Key: FLINK-35943 URL: https://issues.apache.org/jira/browse/FLINK-35943 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes In addition to the annotations, implement the BatchCompiledPlan test for these two operators. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35944) Add CompiledPlan annotations to BatchExecUnion
Jim Hughes created FLINK-35944: -- Summary: Add CompiledPlan annotations to BatchExecUnion Key: FLINK-35944 URL: https://issues.apache.org/jira/browse/FLINK-35944 Project: Flink Issue Type: Sub-task Reporter: Jim Hughes Assignee: Jim Hughes In addition to the annotations, implement the BatchCompiledPlan test for this operator. For this operator, the BatchExecHashAggregate operator must be annotated as well. -- This message was sent by Atlassian Jira (v8.20.10#820010)
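For context across these sibling tickets, adding CompiledPlan support means annotating the exec node with the planner's {{@ExecNodeMetadata}}. A rough sketch of the shape this takes follows; the name, version, transformation id, and Flink versions shown are illustrative, not the values actually chosen in the tickets.
{code:java}
import org.apache.flink.FlinkVersion;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase;
import org.apache.flink.table.planner.plan.nodes.exec.ExecNodeMetadata;

// Illustrative values only; the real metadata is decided in the tickets above.
@ExecNodeMetadata(
        name = "batch-exec-union",
        version = 1,
        producedTransformations = "union",
        minPlanVersion = FlinkVersion.v1_20,
        minStateVersion = FlinkVersion.v1_20)
public class BatchExecUnion extends ExecNodeBase<RowData> {
    // ... existing implementation unchanged
}
{code}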
[jira] [Created] (FLINK-35945) Consumer group information cannot be browsed in Kafka when Flink consumes Kafka data sources without checkpointing enabled
任铭睿 created FLINK-35945: --- Summary: Consumer group information cannot be browsed in Kafka when Flink consumes Kafka data sources without checkpointing enabled Key: FLINK-35945 URL: https://issues.apache.org/jira/browse/FLINK-35945 Project: Flink Issue Type: Improvement Components: API / DataStream Reporter: 任铭睿 When Flink consumes Kafka data sources without checkpointing enabled, consumer group information cannot be browsed in Kafka. -- This message was sent by Atlassian Jira (v8.20.10#820010)
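For context, the Kafka source commits consumer group offsets on checkpoints by default, so without checkpointing no group information shows up in Kafka. A sketch of the commonly suggested workaround via Kafka's own auto-commit; the broker address, topic, group id, and interval are illustrative.
{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class KafkaAutoCommitSketch {
    public static void main(String[] args) {
        // Without checkpointing, fall back to the Kafka client's auto-commit so the
        // consumer group and its offsets become visible to Kafka tooling.
        KafkaSource<String> source =
                KafkaSource.<String>builder()
                        .setBootstrapServers("broker:9092")
                        .setTopics("input-topic")
                        .setGroupId("my-group")
                        .setStartingOffsets(OffsetsInitializer.earliest())
                        .setValueOnlyDeserializer(new SimpleStringSchema())
                        .setProperty("enable.auto.commit", "true")
                        .setProperty("auto.commit.interval.ms", "5000")
                        .build();
    }
}
{code}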
Re: [DISCUSS] FLIP-470: Support Adaptive Broadcast Join
Hi Lincoln, Thanks for your detailed explanation. I understand your concern. Introducing configurations with redundant semantics can indeed confuse users, and the engine should minimize user exposure to these details. Based on this premise, while also ensuring that users can choose to enable the broadcast hash join optimization at either compile time or runtime, I think we can introduce a new configuration `table.optimizer.adaptive-broadcast-join.strategy`, and reuse the existing configuration `table.optimizer.join.broadcast-threshold` as a unified threshold for determining broadcast hash join optimization. The `table.optimizer.adaptive-broadcast-join.strategy` configuration would be of an enumeration type with three options: AUTO: Flink will autonomously select the optimal timing for the optimization. RUNTIME_ONLY: The broadcast hash join optimization will only be performed at runtime. NONE: The broadcast hash join optimization will only be performed in the compile phase. AUTO will be the default option. I have also updated this information in the FLIP, PTAL. Best, Xia On Tue, Jul 30, 2024 at 23:39, Lincoln Lee wrote: > Thanks Xia for your explanation! > > I can understand your concern, but considering the design of this FLIP, > which already covers compile-time inaccurate optimization for runtime > de-optimization, is it necessary to make the user manually turn off > 'table.optimizer.join.broadcast-threshold' and set the new > 'table.optimizer.adaptive.join.broadcast-threshold' again? Another option > is that users only need to focus on the existing broadcast size threshold, > and accept the reality that 100% accurate optimization cannot be done > at compile time, and adopt the new capability of dynamic optimization at > runtime, and ultimately, users will trust that flink will always optimize > accurately, and from this point of view, I would prefer a generic parameter > 'table.optimizer.adaptive-optimization.enabled', which would allow for > more dynamic optimization in the future, not limited to broadcast join > scenarios and will not continuously bring more new options, WDYT? > > > Best, > Lincoln Lee > > > On Tue, Jul 30, 2024 at 11:27, Xia Sun wrote: > > > Hi Lincoln, > > > > Thank you for your input and participation in the discussion! > > > > Compared to introducing the 'table.optimizer.adaptive-join.enabled' > option, > > introducing the "table.optimizer.adaptive.join.broadcast-threshold" can > > also cover the need to disable static broadcast optimization while only > > enabling dynamic broadcast optimization. From this perspective, > introducing > > a new threshold configuration might be more appropriate. What do you > think? > > > > Best, > > Xia > > > > On Mon, Jul 29, 2024 at 23:12, Lincoln Lee wrote: > > > > > +1 for this useful optimization! > > > > > > I have a question about the new option, do we really need two broadcast > > > join thresholds? IIUC, this adaptive broadcast join is a complement to > > > compile-time optimization, there is no need for the user to configure > two > > > different thresholds (not the off represented by -1), so we just want > to > > > control the adaptive optimization itself, should we provide a > > configuration > > > option like 'table.optimizer.adaptive-join.enabled' or a more general > one > > > 'table.optimizer.adaptive-optimization.enabled' for such related > > > optimizations? > > > > > > > > > Best, > > > Lincoln Lee > > > > > > > > > On Fri, Jul 26, 2024 at 11:59, Ron Liu wrote: > > > > Hi, Xia > > > > > > > > Thanks for your reply. It looks good to me. 
> > > > > > > > > > > > > > > Best, > > > > Ron > > > > On Fri, Jul 26, 2024 at 10:49, Xia Sun wrote: > > > > > Hi Ron, > > > > > > > > > > Thanks for your feedback! > > > > > > > > > > -> creation of the join operators until runtime > > > > > > > > > > > > > > > That means when creating the AdaptiveJoinOperatorFactory, we will > not > > > > > immediately create the JoinOperator. Instead, we only pass in the > > > > necessary > > > > > parameters for creating the JoinOperator. The appropriate > > JoinOperator > > > > will > > > > > be created during the StreamGraphOptimizationStrategy optimization > > > phase. > > > > > > > > > > You mentioned that the runtime's visibility into the table planner > is > > > > > indeed an issue. It includes two aspects, > > > > > (1) we plan to place both implementations of the > > > > > AdaptiveBroadcastJoinOptimizationStrategy and > > > AdaptiveJoinOperatorFactory > > > > > in the table layer. During the runtime phase, we will obtain the > > > > > AdaptiveBroadcastJoinOptimizationStrategy through class loading. > > > > Therefore, > > > > > the flink-runtime does not need to be aware of the table layer's > > > > > implementation. > > > > > (2) Since the dynamic codegen in the AdaptiveJoinOperatorFactory > > needs > > > to > > > > > be aware of the table planner, we will consider placing the > > > > > AdaptiveJoinOperatorFactory in the table planner module as well. > > > > > > > > > > > > > > > -> When did you
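To illustrate the outcome of this discussion, a hypothetical sketch of how the agreed options might be set; `table.optimizer.adaptive-broadcast-join.strategy` is a FLIP-470 proposal rather than a released option, and the values are made up (the threshold is in bytes).
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class AdaptiveBroadcastJoinConfigSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());
        // Existing option, reused as the unified broadcast threshold (bytes).
        tEnv.getConfig().set("table.optimizer.join.broadcast-threshold", "1048576");
        // Proposed option from the FLIP discussion: AUTO | RUNTIME_ONLY | NONE.
        tEnv.getConfig().set("table.optimizer.adaptive-broadcast-join.strategy", "AUTO");
    }
}
{code}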
[jira] [Created] (FLINK-35946) Finalize release 1.20.0
Weijie Guo created FLINK-35946: -- Summary: Finalize release 1.20.0 Key: FLINK-35946 URL: https://issues.apache.org/jira/browse/FLINK-35946 Project: Flink Issue Type: New Feature Reporter: Weijie Guo Assignee: lincoln lee Fix For: 1.20.0 Once the release candidate has been reviewed and approved by the community, the release should be finalized. This involves the final deployment of the release candidate to the release repositories, merging of the website changes, etc. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35947) CLONE - Deploy Python artifacts to PyPI
Weijie Guo created FLINK-35947: -- Summary: CLONE - Deploy Python artifacts to PyPI Key: FLINK-35947 URL: https://issues.apache.org/jira/browse/FLINK-35947 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee The release manager should create a PyPI account and ask the PMC to add this account to the pyflink collaborator list with the Maintainer role (The PyPI admin account info can be found here. NOTE, only visible to PMC members) to deploy the Python artifacts to PyPI. The artifacts can be uploaded using twine ([https://pypi.org/project/twine/]). To install twine, just run: {code:java} pip install --upgrade twine==1.12.0 {code} Download the Python artifacts from dist.apache.org and upload them to pypi.org: {code:java} svn checkout https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM} cd flink-${RELEASE_VERSION}-rc${RC_NUM} cd python #upload wheels for f in *.whl; do twine upload --repository-url https://upload.pypi.org/legacy/ $f $f.asc; done #upload source packages twine upload --repository-url https://upload.pypi.org/legacy/ apache-flink-libraries-${RELEASE_VERSION}.tar.gz apache-flink-libraries-${RELEASE_VERSION}.tar.gz.asc twine upload --repository-url https://upload.pypi.org/legacy/ apache-flink-${RELEASE_VERSION}.tar.gz apache-flink-${RELEASE_VERSION}.tar.gz.asc {code} If the upload failed or is incorrect for some reason (e.g. a network transmission problem), you need to delete the uploaded release package of the same version (if it exists) and rename the artifact to \{{{}apache-flink-${RELEASE_VERSION}.post0.tar.gz{}}}, then re-upload. (!) Note: re-uploading to pypi.org must be avoided as much as possible because it will cause some irreparable problems. If that happens, users cannot install the apache-flink package by explicitly specifying the package version, i.e. the following command "pip install apache-flink==${RELEASE_VERSION}" will fail. Instead they have to run "pip install apache-flink" or "pip install apache-flink==${RELEASE_VERSION}.post0" to install the apache-flink package. h3. Expectations * Python artifacts released and indexed in the [PyPI|https://pypi.org/project/apache-flink/] Repository -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35949) CLONE - Create Git tag and mark version as released in Jira
Weijie Guo created FLINK-35949: -- Summary: CLONE - Create Git tag and mark version as released in Jira Key: FLINK-35949 URL: https://issues.apache.org/jira/browse/FLINK-35949 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Create and push a new Git tag for the released version by copying the tag for the final release candidate, as follows: {code:java} $ git tag -s "release-${RELEASE_VERSION}" refs/tags/${TAG}^{} -m "Release Flink ${RELEASE_VERSION}" $ git push <remote> refs/tags/release-${RELEASE_VERSION} {code} In JIRA, inside [version management|https://issues.apache.org/jira/plugins/servlet/project-config/FLINK/versions], hover over the current release and a settings menu will appear. Click Release, and select today’s date. (Note: Only PMC members have access to the project administration. If you do not have access, ask on the mailing list for assistance.) If PRs have been merged to the release branch after the last release candidate was tagged, make sure that the corresponding Jira tickets have the correct Fix Version set. h3. Expectations * Release tagged in the source code repository * Release version finalized in JIRA. (Note: Not all committers have administrator access to JIRA. If you end up getting permissions errors ask on the mailing list for assistance) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35948) CLONE - Deploy artifacts to Maven Central Repository
Weijie Guo created FLINK-35948: -- Summary: CLONE - Deploy artifacts to Maven Central Repository Key: FLINK-35948 URL: https://issues.apache.org/jira/browse/FLINK-35948 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Use the [Apache Nexus repository|https://repository.apache.org/] to release the staged binary artifacts to the Maven Central repository. In the Staging Repositories section, find the relevant release candidate orgapacheflink-XXX entry and click Release. Drop all other release candidates that are not being released. h3. Deploy source and binary releases to dist.apache.org Copy the source and binary releases from the dev repository to the release repository at [dist.apache.org|http://dist.apache.org/] using Subversion. {code:java} $ svn move -m "Release Flink ${RELEASE_VERSION}" https://dist.apache.org/repos/dist/dev/flink/flink-${RELEASE_VERSION}-rc${RC_NUM} https://dist.apache.org/repos/dist/release/flink/flink-${RELEASE_VERSION} {code} (Note: Only PMC members have access to the release repository. If you do not have access, ask on the mailing list for assistance.) h3. Remove old release candidates from [dist.apache.org|http://dist.apache.org/] Remove the old release candidates from [https://dist.apache.org/repos/dist/dev/flink] using Subversion. {code:java} $ svn checkout https://dist.apache.org/repos/dist/dev/flink --depth=immediates $ cd flink $ svn remove flink-${RELEASE_VERSION}-rc* $ svn commit -m "Remove old release candidates for Apache Flink ${RELEASE_VERSION}" {code} h3. Expectations * Maven artifacts released and indexed in the [Maven Central Repository|https://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.flink%22] (usually takes about a day to show up) * Source & binary distributions available in the release repository of [https://dist.apache.org/repos/dist/release/flink/] * Dev repository [https://dist.apache.org/repos/dist/dev/flink/] is empty * Website contains links to new release binaries and sources in download page * (for minor version updates) the front page references the correct new major release version and directs to the correct link -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35950) CLONE - Publish the Dockerfiles for the new release
Weijie Guo created FLINK-35950: -- Summary: CLONE - Publish the Dockerfiles for the new release Key: FLINK-35950 URL: https://issues.apache.org/jira/browse/FLINK-35950 Project: Flink Issue Type: Sub-task Reporter: Weijie Guo Assignee: lincoln lee Note: the official Dockerfiles fetch the binary distribution of the target Flink version from an Apache mirror. After publishing the binary release artifacts, mirrors can take some hours to start serving the new artifacts, so you may want to wait to do this step until you are ready to continue with the "Promote the release" steps in the follow-up Jira. Follow the [release instructions in the flink-docker repo|https://github.com/apache/flink-docker#release-workflow] to build the new Dockerfiles and send an updated manifest to Docker Hub so the new images are built and published. h3. Expectations * Dockerfiles in [flink-docker|https://github.com/apache/flink-docker] updated for the new Flink release and pull request opened on the Docker official-images with an updated manifest -- This message was sent by Atlassian Jira (v8.20.10#820010)