Re: [PR] [docs] Polish quickstart guide & migrate maven links from ververica to apache [flink-cdc]

2024-06-04 Thread via GitHub


gtk96 commented on PR #3343:
URL: https://github.com/apache/flink-cdc/pull/3343#issuecomment-2146793364

   cc @Jiabao-Sun @GOODBOY008 





[jira] [Commented] (FLINK-35473) FLIP-457: Improve Table/SQL Configuration for Flink 2.0

2024-06-04 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851938#comment-17851938
 ] 

Jane Chan commented on FLINK-35473:
---

[~lincoln.86xy] Thank you for the reminder. I'm planning to open a PR in the 
coming days, and it would be great if you could help review it.

> FLIP-457: Improve Table/SQL Configuration for Flink 2.0
> ---
>
> Key: FLINK-35473
> URL: https://issues.apache.org/jira/browse/FLINK-35473
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Affects Versions: 1.20.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
> Fix For: 1.20.0
>
>
> This is the parent task for 
> [FLIP-457|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=307136992].





Re: [PR] [FLINK-34559] Limit Global & Local Aggregation buffers [flink]

2024-06-04 Thread via GitHub


twalthr commented on code in PR #24869:
URL: https://github.com/apache/flink/pull/24869#discussion_r1625562520


##
flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java:
##
@@ -276,6 +277,16 @@ public long getLatencyTrackingInterval() {
 return configuration.get(MetricOptions.LATENCY_INTERVAL).toMillis();
 }
 
+@PublicEvolving
+public Optional getGlobalAggregationBufferSize() {

Review Comment:
   You are mixing layers here. Global aggregation is a concept of the SQL layer 
and thus should only change classes in the `flink-table` module. Configuration 
doesn't need to be added to `ExecutionConfig`. You can pass parameters to the 
runtime class during planning in the corresponding ExecNode.
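
   For illustration, a minimal sketch of that direction (all names below are hypothetical, not this PR's actual code): define the option on the table layer and hand the resolved value to the runtime operator when the ExecNode is translated, instead of extending `ExecutionConfig`:

```java
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

// Hypothetical table-layer option; key and default are placeholders.
public static final ConfigOption<Integer> GLOBAL_AGG_BUFFER_SIZE =
        ConfigOptions.key("table.exec.global-agg.buffer-size")
                .intType()
                .defaultValue(1000)
                .withDescription("Illustrative only: buffer size for global aggregation.");

// During planning, inside the corresponding ExecNode (sketch):
// int bufferSize = config.get(GLOBAL_AGG_BUFFER_SIZE);
// operatorFactory = new GlobalAggOperatorFactory(bufferSize); // hypothetical runtime class
```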






[PR] Update ttlConfig in state.md [flink]

2024-06-04 Thread via GitHub


lizhigong opened a new pull request, #24884:
URL: https://github.com/apache/flink/pull/24884

   `Duration.ofSeconds(1)` is no longer accepted by the API; use `Time.seconds(1)` instead.
   
   ## What is the purpose of the change
   
   Fix a typo in the documentation.
   
   ## Brief change log
   
   In the documentation of [Working With State](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/#state-time-to-live-ttl), the code uses `Duration.ofSeconds(1)`; however, the accepted type is `org.apache.flink.api.common.time.Time`, so it should be changed to `Time.seconds(1)`.
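   
   For reference, the corrected snippet would read roughly as follows (a minimal sketch assuming a Flink version where `StateTtlConfig.newBuilder` accepts `Time`, matching the surrounding docs example):

```java
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.time.Time;

StateTtlConfig ttlConfig = StateTtlConfig
        .newBuilder(Time.seconds(1)) // TTL of one second, as in the docs example
        .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
        .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
        .build();
```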
   
   





Re: [PR] Update ttlConfig in state.md [flink]

2024-06-04 Thread via GitHub


lizhigong closed pull request #24884: Update ttlConfig in state.md
URL: https://github.com/apache/flink/pull/24884





[jira] [Created] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-35517:
--

 Summary: CI pipeline triggered by pull request seems unstable
 Key: FLINK-35517
 URL: https://issues.apache.org/jira/browse/FLINK-35517
 Project: Flink
  Issue Type: Bug
  Components: Build System / CI
Affects Versions: 1.20.0
Reporter: Weijie Guo


The Flink CI pipeline triggered by pull requests seems somewhat unstable.

For example, https://github.com/apache/flink/pull/24883 was filed 15 hours ago, 
but its CI report is still UNKNOWN.





[jira] [Updated] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35517:
---
Priority: Blocker  (was: Major)

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> The Flink CI pipeline triggered by pull requests seems somewhat unstable.
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours
> ago, but its CI report is still UNKNOWN.





Re: [PR] [FLINK-35464] Fixes operator state backwards compatibility from CDC 3.0.x [flink-cdc]

2024-06-04 Thread via GitHub


PatrickRen commented on code in PR #3369:
URL: https://github.com/apache/flink-cdc/pull/3369#discussion_r1625602137


##
flink-cdc-migration-tests/pom.xml:
##
@@ -19,13 +19,10 @@ limitations under the License.
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
-

Review Comment:
   Actually, you can keep this section here. Removing `flink-cdc-migration-tests` 
from the `<modules>` section in the parent pom should be enough.



##
pom.xml:
##
@@ -40,7 +40,6 @@ limitations under the License.
         <module>flink-cdc-connect</module>
         <module>flink-cdc-runtime</module>
         <module>flink-cdc-e2e-tests</module>
-        <module>flink-cdc-migration-tests</module>

Review Comment:
   Is it possible to use Maven profiles to control whether the 
`flink-cdc-migration-tests` module is built?






Re: [PR] [FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink [flink-cdc]

2024-06-04 Thread via GitHub


PatrickRen merged PR #3233:
URL: https://github.com/apache/flink-cdc/pull/3233





Re: [PR] [FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink [flink-cdc]

2024-06-04 Thread via GitHub


PatrickRen commented on PR #3233:
URL: https://github.com/apache/flink-cdc/pull/3233#issuecomment-2146945078

   @loserwang1024 Could you cherry-pick this commit to the release-3.1 branch? 
Thanks!





Re: [PR] [FLINK-35282][python] Upgrade pyarrow and beam [flink]

2024-06-04 Thread via GitHub


hlteoh37 merged PR #24875:
URL: https://github.com/apache/flink/pull/24875





[jira] [Commented] (FLINK-35149) Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink

2024-06-04 Thread Qingsheng Ren (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851955#comment-17851955
 ] 

Qingsheng Ren commented on FLINK-35149:
---

flink-cdc master: 33891869a9fffa2abf8b8ae03915d0ddccdaf5ec

> Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not 
> TwoPhaseCommittingSink
> ---
>
> Key: FLINK-35149
> URL: https://issues.apache.org/jira/browse/FLINK-35149
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Hongshun Wang
>Assignee: Hongshun Wang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> Currently, when the sink is not an instance of TwoPhaseCommittingSink,
> input.transform is used rather than stream, which means the pre-write
> topology will be ignored.
> {code:java}
> private void sinkTo(
>         DataStream<Event> input,
>         Sink<Event> sink,
>         String sinkName,
>         OperatorID schemaOperatorID) {
>     DataStream<Event> stream = input;
>     // Pre-write topology
>     if (sink instanceof WithPreWriteTopology) {
>         stream = ((WithPreWriteTopology<Event>) sink).addPreWriteTopology(stream);
>     }
>     if (sink instanceof TwoPhaseCommittingSink) {
>         addCommittingTopology(sink, stream, sinkName, schemaOperatorID);
>     } else {
>         input.transform(
>                 SINK_WRITER_PREFIX + sinkName,
>                 CommittableMessageTypeInfo.noOutput(),
>                 new DataSinkWriterOperatorFactory<>(sink, schemaOperatorID));
>     }
> } {code}
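
For context, a minimal sketch of the fix implied by the description: the else branch should transform stream (which may already carry the pre-write topology) rather than the raw input:
{code:java}
} else {
    // Use `stream`, possibly extended by WithPreWriteTopology#addPreWriteTopology,
    // instead of the raw `input`, so the pre-write topology is not dropped.
    stream.transform(
            SINK_WRITER_PREFIX + sinkName,
            CommittableMessageTypeInfo.noOutput(),
            new DataSinkWriterOperatorFactory<>(sink, schemaOperatorID));
}
{code}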





[jira] [Assigned] (FLINK-35149) Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink

2024-06-04 Thread Qingsheng Ren (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qingsheng Ren reassigned FLINK-35149:
-

Assignee: Hongshun Wang

> Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not 
> TwoPhaseCommittingSink
> ---
>
> Key: FLINK-35149
> URL: https://issues.apache.org/jira/browse/FLINK-35149
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Hongshun Wang
>Assignee: Hongshun Wang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> Currently, when the sink is not an instance of TwoPhaseCommittingSink,
> input.transform is used rather than stream, which means the pre-write
> topology will be ignored.
> {code:java}
> private void sinkTo(
>         DataStream<Event> input,
>         Sink<Event> sink,
>         String sinkName,
>         OperatorID schemaOperatorID) {
>     DataStream<Event> stream = input;
>     // Pre-write topology
>     if (sink instanceof WithPreWriteTopology) {
>         stream = ((WithPreWriteTopology<Event>) sink).addPreWriteTopology(stream);
>     }
>     if (sink instanceof TwoPhaseCommittingSink) {
>         addCommittingTopology(sink, stream, sinkName, schemaOperatorID);
>     } else {
>         input.transform(
>                 SINK_WRITER_PREFIX + sinkName,
>                 CommittableMessageTypeInfo.noOutput(),
>                 new DataSinkWriterOperatorFactory<>(sink, schemaOperatorID));
>     }
> } {code}





[jira] [Resolved] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh resolved FLINK-35282.
-
Resolution: Fixed

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see, PyFlink still requires Apache Beam >= 2.43.0 and <= 2.49.0,
> which in turn results in a requirement of PyArrow <= 12.0.0. That keeps us
> exposed to [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not familiar enough with the PyFlink code base to understand why Apache
> Beam's upper dependency limit can't be lifted. None of the existing issues
> seem to address this, so I created one now.





[jira] [Updated] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh updated FLINK-35282:

Fix Version/s: (was: 1.19.1)

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see, PyFlink still requires Apache Beam >= 2.43.0 and <= 2.49.0,
> which in turn results in a requirement of PyArrow <= 12.0.0. That keeps us
> exposed to [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not familiar enough with the PyFlink code base to understand why Apache
> Beam's upper dependency limit can't be lifted. None of the existing issues
> seem to address this, so I created one now.





[jira] [Commented] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Hong Liang Teoh (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851954#comment-17851954
 ] 

Hong Liang Teoh commented on FLINK-35282:
-

Merged commit [{{91a9e06}}|https://github.com/apache/flink/commit/91a9e06d3cb611e048274088e56b2c110cd29926] into apache:master.

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see, PyFlink still requires Apache Beam >= 2.43.0 and <= 2.49.0,
> which in turn results in a requirement of PyArrow <= 12.0.0. That keeps us
> exposed to [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not familiar enough with the PyFlink code base to understand why Apache
> Beam's upper dependency limit can't be lifted. None of the existing issues
> seem to address this, so I created one now.





[jira] [Updated] (FLINK-35149) Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink

2024-06-04 Thread Qingsheng Ren (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qingsheng Ren updated FLINK-35149:
--
Fix Version/s: cdc-3.1.1

> Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not 
> TwoPhaseCommittingSink
> ---
>
> Key: FLINK-35149
> URL: https://issues.apache.org/jira/browse/FLINK-35149
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Hongshun Wang
>Assignee: Hongshun Wang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.2.0, cdc-3.1.1
>
>
> Currently, when the sink is not an instance of TwoPhaseCommittingSink,
> input.transform is used rather than stream, which means the pre-write
> topology will be ignored.
> {code:java}
> private void sinkTo(
>         DataStream<Event> input,
>         Sink<Event> sink,
>         String sinkName,
>         OperatorID schemaOperatorID) {
>     DataStream<Event> stream = input;
>     // Pre-write topology
>     if (sink instanceof WithPreWriteTopology) {
>         stream = ((WithPreWriteTopology<Event>) sink).addPreWriteTopology(stream);
>     }
>     if (sink instanceof TwoPhaseCommittingSink) {
>         addCommittingTopology(sink, stream, sinkName, schemaOperatorID);
>     } else {
>         input.transform(
>                 SINK_WRITER_PREFIX + sinkName,
>                 CommittableMessageTypeInfo.noOutput(),
>                 new DataSinkWriterOperatorFactory<>(sink, schemaOperatorID));
>     }
> } {code}





[jira] [Assigned] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh reassigned FLINK-35282:
---

Assignee: Antonio Vespoli

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Assignee: Antonio Vespoli
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see, PyFlink still requires Apache Beam >= 2.43.0 and <= 2.49.0,
> which in turn results in a requirement of PyArrow <= 12.0.0. That keeps us
> exposed to [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not familiar enough with the PyFlink code base to understand why Apache
> Beam's upper dependency limit can't be lifted. None of the existing issues
> seem to address this, so I created one now.





Re: [PR] [FLINK-35129][cdc][postgres] Add checkpoint cycle option for commit offsets [flink-cdc]

2024-06-04 Thread via GitHub


leonardBang merged PR #3349:
URL: https://github.com/apache/flink-cdc/pull/3349





[jira] [Commented] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851962#comment-17851962
 ] 

Weijie Guo commented on FLINK-35517:


I have reached out to [~lorenzo.affetti] to see whether this is related to the 
Ververica machine.

cc [~jingge]

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> The Flink CI pipeline triggered by pull requests seems somewhat unstable.
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours
> ago, but its CI report is still UNKNOWN.





[jira] [Resolved] (FLINK-35129) Postgres source commits the offset after every multiple checkpoint cycles.

2024-06-04 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu resolved FLINK-35129.

Resolution: Implemented

Implemented via master: 5b28d1a579919b29acac6acded46d9bee5596bde

> Postgres source commits the offset after every multiple checkpoint cycles.
> --
>
> Key: FLINK-35129
> URL: https://issues.apache.org/jira/browse/FLINK-35129
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Hongshun Wang
>Assignee: Muhammet Orazov
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> After entering the stream phase, the offset consumed by the global slot is
> committed upon the completion of each checkpoint. This prevents log files
> from piling up without being recycled, which could otherwise lead to
> insufficient disk space.
> However, the job can only restart from the latest checkpoint or savepoint; if
> it is restored from an earlier state, the WAL may already have been recycled.
>  
> The solution is to commit the offset only after every N checkpoint cycles.
> The number of checkpoint cycles is determined by a connector option, and
> the default value is 3.
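
For illustration, a minimal sketch of the commit-every-N-checkpoints idea (the class and helper names here are hypothetical, not the connector's actual code):
{code:java}
import org.apache.flink.api.common.state.CheckpointListener;

// Hypothetical fragment: commit the replication-slot offset only after every
// N completed checkpoints; N defaults to 3 per the description above.
class OffsetCommitPolicy implements CheckpointListener {
    private final int commitCheckpointCycles = 3;
    private int completedCheckpoints = 0;

    @Override
    public void notifyCheckpointComplete(long checkpointId) {
        completedCheckpoints++;
        if (completedCheckpoints % commitCheckpointCycles == 0) {
            commitOffsetToPostgres(); // hypothetical helper that confirms the LSN
        }
    }

    private void commitOffsetToPostgres() {
        // placeholder: flush the confirmed LSN back to the PostgreSQL slot
    }
}
{code}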





[jira] [Created] (FLINK-35518) CI Bot doesn't run on PRs

2024-06-04 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-35518:
--

 Summary: CI Bot doesn't run on PRs
 Key: FLINK-35518
 URL: https://issues.apache.org/jira/browse/FLINK-35518
 Project: Flink
  Issue Type: Bug
  Components: Build System / CI
Affects Versions: 1.20.0
Reporter: Piotr Nowojski


The CI bot doesn't want to run on my PR/branch. I tried force-pushes, rebases, 
asking the flink bot to run, and closing and opening a new PR, but nothing helped:
https://github.com/apache/flink/pull/24868
https://github.com/apache/flink/pull/24883

I've heard others have been having similar problems recently.





[jira] [Updated] (FLINK-35518) CI Bot doesn't run on PRs - status UNKNOWN

2024-06-04 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski updated FLINK-35518:
---
Summary: CI Bot doesn't run on PRs - status UNKNOWN  (was: CI Bot doesn't 
run on PRs)

> CI Bot doesn't run on PRs - status UNKNOWN
> --
>
> Key: FLINK-35518
> URL: https://issues.apache.org/jira/browse/FLINK-35518
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Piotr Nowojski
>Priority: Critical
>
> The CI bot doesn't want to run on my PR/branch. I tried force-pushes, rebases,
> asking the flink bot to run, and closing and opening a new PR, but nothing helped:
> https://github.com/apache/flink/pull/24868
> https://github.com/apache/flink/pull/24883
> I've heard others have been having similar problems recently.





[jira] [Commented] (FLINK-35518) CI Bot doesn't run on PRs - status UNKNOWN

2024-06-04 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851969#comment-17851969
 ] 

Weijie Guo commented on FLINK-35518:


Yes, I have created a ticket (FLINK-35517) that is also intended to track this issue.

> CI Bot doesn't run on PRs - status UNKNOWN
> --
>
> Key: FLINK-35518
> URL: https://issues.apache.org/jira/browse/FLINK-35518
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Piotr Nowojski
>Priority: Critical
>
> The CI bot doesn't want to run on my PR/branch. I tried force-pushes, rebases,
> asking the flink bot to run, and closing and opening a new PR, but nothing helped:
> https://github.com/apache/flink/pull/24868
> https://github.com/apache/flink/pull/24883
> I've heard others have been having similar problems recently.





[jira] [Assigned] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge reassigned FLINK-35517:
---

Assignee: Jing Ge

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
>
> The Flink CI pipeline triggered by pull requests seems somewhat unstable.
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours
> ago, but its CI report is still UNKNOWN.





Re: [PR] [FLINK-35464] Fixes operator state backwards compatibility from CDC 3.0.x [flink-cdc]

2024-06-04 Thread via GitHub


yuxiqian commented on PR #3369:
URL: https://github.com/apache/flink-cdc/pull/3369#issuecomment-2147045294

   Thanks for the tips! I've addressed your comments and simplified the pom files.





[jira] [Created] (FLINK-35519) Flink Job fails with SingleValueAggFunction received more than one element

2024-06-04 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-35519:


 Summary: Flink Job fails with SingleValueAggFunction received more 
than one element
 Key: FLINK-35519
 URL: https://issues.apache.org/jira/browse/FLINK-35519
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Dawid Wysakowicz


When running a query:
{code}
SELECT
    (SELECT t.id
     FROM raw_pagerduty_users, UNNEST(teams) AS t(id, type, summary, self, html_url))
FROM raw_pagerduty_users;
{code}
it is translated to:

{code}
Sink(table=[default_catalog.default_database.sink], fields=[EXPR$0])
+- Calc(select=[$f0 AS EXPR$0])
   +- Join(joinType=[LeftOuterJoin], where=[true], select=[c, $f0], 
leftInputSpec=[NoUniqueKey], rightInputSpec=[JoinKeyContainsUniqueKey])
  :- Exchange(distribution=[single])
  :  +- Calc(select=[c])
  : +- TableSourceScan(table=[[default_catalog, default_database, 
raw_pagerduty_users, project=[c, teams], metadata=[]]], fields=[c, 
teams])(reuse_id=[1])
  +- Exchange(distribution=[single])
 +- GroupAggregate(select=[SINGLE_VALUE(id) AS $f0])
+- Exchange(distribution=[single])
   +- Calc(select=[id])
  +- Correlate(invocation=[$UNNEST_ROWS$1($cor0.teams)], 
correlate=[table($UNNEST_ROWS$1($cor0.teams))], 
select=[c,teams,id,type,summary,self,html_url], rowType=[RecordType(BIGINT c, 
RecordType:peek_no_expand(VARCHAR(2147483647) id, VARCHAR(2147483647) type, 
VARCHAR(2147483647) summary, VARCHAR(2147483647) self, VARCHAR(2147483647) 
html_url) ARRAY teams, VARCHAR(2147483647) id, VARCHAR(2147483647) type, 
VARCHAR(2147483647) summary, VARCHAR(2147483647) self, VARCHAR(2147483647) 
html_url)], joinType=[INNER])
 +- Reused(reference_id=[1])
{code}

and it fails with:

{code}
java.lang.RuntimeException: SingleValueAggFunction received more than one element.
    at GroupAggsHandler$150.accumulate(Unknown Source)
    at org.apache.flink.table.runtime.operators.aggregate.GroupAggFunction.processElement(GroupAggFunction.java:151)
    at org.apache.flink.table.runtime.operators.aggregate.GroupAggFunction.processElement(GroupAggFunction.java:43)
    at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
    at org.apache.flink.streaming.runtime.io.RecordProcessorUtils.lambda$getRecordProcessor$0(RecordProcessorUtils.java:60)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:237)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:146)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:571)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:900)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:849)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
    at java.base/java.lang.Thread.run(Thread.java:829)
{code}





Re: [PR] [FLINK-35305]Amazon SQS Sink Connector [flink-connector-aws]

2024-06-04 Thread via GitHub


hlteoh37 commented on code in PR #141:
URL: 
https://github.com/apache/flink-connector-aws/pull/141#discussion_r1625636303


##
docs/content.zh/docs/connectors/datastream/sqs.md:
##
@@ -0,0 +1,134 @@
+---
+title: DynamoDB
+weight: 5
+type: docs
+aliases:
+- /zh/dev/connectors/dynamodb.html

Review Comment:
   These are wrong and should be changed to `sqs.html`



##
docs/content.zh/docs/connectors/datastream/sqs.md:
##
@@ -0,0 +1,134 @@
+---
+title: DynamoDB
+weight: 5
+type: docs
+aliases:
+- /zh/dev/connectors/dynamodb.html
+---
+
+
+# Amazon SQS Sink
+
+The SQS sink writes to [Amazon SQS](https://aws.amazon.com/sqs) using the [AWS 
v2 SDK for 
Java](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/home.html).
 Follow the instructions from the [Amazon SQS Developer 
Guide](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/welcome.html)
+to setup a SQS.
+
+To use the connector, add the following Maven dependency to your project:
+
+{{< connector_artifact flink-connector-sqs sqs >}}
+
+{{< tabs "ec24a4ae-6a47-11ed-a1eb-0242ac120002" >}}
+{{< tab "Java" >}}
+```java
+Properties sinkProperties = new Properties();
+// Required
+sinkProperties.put(AWSConfigConstants.AWS_REGION, "eu-west-1");

Review Comment:
   Is there a reason we need to specify this? Can we figure this out via the 
SQS URL?



##
flink-connector-aws/flink-connector-sqs/src/main/java/org.apache.flink.connector.sqs/sink/SqsConfigConstants.java:
##
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.sqs.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+
+/** Defaults for {@link SqsSinkWriter}. */
+@PublicEvolving
+public class SqsConfigConstants {
+
+public static final String BASE_SQS_USER_AGENT_PREFIX_FORMAT =

Review Comment:
   Could we use `ConfigOption` instead of strings here? Example: 
https://github.com/apache/flink-connector-aws/blob/38aafb44d3a8200e4ff41d87e0780338f40da258/flink-connector-aws/flink-connector-aws-kinesis-streams/src/main/java/org/apache/flink/connector/kinesis/source/config/KinesisStreamsSourceConfigConstants.java#L41-L46
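
   For illustration, a minimal sketch of the suggested `ConfigOption` style (the key and default below are placeholders, not the connector's actual values):

```java
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

// Hypothetical replacement for the String constant above.
public static final ConfigOption<String> BASE_SQS_USER_AGENT_PREFIX_FORMAT =
        ConfigOptions.key("sqs.client.user-agent-prefix")
                .stringType()
                .defaultValue("Apache Flink SQS Connector")
                .withDescription("User-agent prefix used by the SQS client.");
```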



##
docs/content.zh/docs/connectors/datastream/sqs.md:
##
@@ -0,0 +1,134 @@
+---
+title: DynamoDB
+weight: 5
+type: docs
+aliases:
+- /zh/dev/connectors/dynamodb.html
+---
+
+
+# Amazon SQS Sink
+
+The SQS sink writes to [Amazon SQS](https://aws.amazon.com/sqs) using the [AWS 
v2 SDK for 
Java](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/home.html).
 Follow the instructions from the [Amazon SQS Developer 
Guide](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/welcome.html)
+to setup a SQS.

Review Comment:
   nit: set up an SQS message queue.



##
flink-connector-aws/flink-connector-sqs/src/main/java/org.apache.flink.connector.sqs/sink/SqsSinkWriter.java:
##
@@ -0,0 +1,257 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.sqs.sink;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.connector.sink2.Sink;
+import org.apache.flink.connector.aws.sink.throwable.AWSExceptionHandler;
+import org.apache.flink.connector.aws.util.AWSClientUtil;
+import org.apache.flink.connector.aws.util.AWSGeneralUtil;
+import org.apache.flink.connector.base.sink.thr

[jira] [Created] (FLINK-35520) Nightly build can't compile as problems were detected from NoticeFileChecker

2024-06-04 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-35520:
--

 Summary: Nightly build can't compile as problems were detected 
from NoticeFileChecker
 Key: FLINK-35520
 URL: https://issues.apache.org/jira/browse/FLINK-35520
 Project: Flink
  Issue Type: Bug
  Components: Build System / CI
Affects Versions: 1.20.0
Reporter: Weijie Guo








[jira] [Updated] (FLINK-35520) Nightly build can't compile as problems were detected from NoticeFileChecker

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Description: 
09:26:55,177 ERROR org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - 
Jar file 
/tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
 contains a LICENSE file in an unexpected location: /LICENSE

 

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]

  was:
[] - Jar file 
/tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
 contains a LICENSE file in an unexpected location: /LICENSE

 

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805


> Nightly build can't compile as problems were detected from NoticeFileChecker
> 
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> 09:26:55,177 ERROR org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - 
> Jar file 
> /tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
>  contains a LICENSE file in an unexpected location: /LICENSE
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]





Re: [PR] [FLINK-35305]Amazon SQS Sink Connector [flink-connector-aws]

2024-06-04 Thread via GitHub


hlteoh37 commented on PR #141:
URL: 
https://github.com/apache/flink-connector-aws/pull/141#issuecomment-2147097522

   We seem to have quite a few `.` separators in the class folder names. Can we 
change them to `/` instead?
   e.g. 
`flink-connector-aws/flink-connector-sqs/src/test/java/org.apache.flink/connector.sqs/sink/SqsExceptionClassifiersTest.java`





[jira] [Updated] (FLINK-35520) Nightly build can't compile as problems were detected from NoticeFileChecker

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Description: 
[] - Jar file 
/tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
 contains a LICENSE file in an unexpected location: /LICENSE

 

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805

> Nightly build can't compile as problems were detected from NoticeFileChecker
> 
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> [] - Jar file 
> /tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
>  contains a LICENSE file in an unexpected location: /LICENSE
>  
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805





[jira] [Updated] (FLINK-35520) Nightly build can't compile

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Summary: Nightly build can't compile  (was: Nightly build can't compile as 
problems were detected from NoticeFileChecker)

> Nightly build can't compile
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> 09:26:55,177 ERROR org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - 
> Jar file 
> /tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
>  contains a LICENSE file in an unexpected location: /LICENSE
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]





Re: [PR] [FLINK-35305]Amazon SQS Sink Connector [flink-connector-aws]

2024-06-04 Thread via GitHub


hlteoh37 commented on code in PR #141:
URL: 
https://github.com/apache/flink-connector-aws/pull/141#discussion_r1625705602


##
flink-connector-aws/flink-connector-sqs/src/main/java/org.apache.flink.connector.sqs/sink/SqsSinkElementConverter.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.connector.sqs.sink;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+import org.apache.flink.api.connector.sink2.Sink;
+import org.apache.flink.api.connector.sink2.SinkWriter;
+import org.apache.flink.connector.base.sink.writer.ElementConverter;
+import org.apache.flink.metrics.MetricGroup;
+import org.apache.flink.metrics.groups.UnregisteredMetricsGroup;
+import org.apache.flink.util.FlinkRuntimeException;
+import org.apache.flink.util.Preconditions;
+import org.apache.flink.util.SimpleUserCodeClassLoader;
+import org.apache.flink.util.UserCodeClassLoader;
+
+import software.amazon.awssdk.services.sqs.model.SendMessageBatchRequestEntry;
+
+import java.nio.charset.StandardCharsets;
+import java.util.UUID;
+
+/**
+ * An implementation of the {@link ElementConverter} that uses the AWS SQS SDK 
v2. The user only
+ * needs to provide a {@link SerializationSchema} of the {@code InputT} to 
transform it into a
+ * {@link SendMessageBatchRequestEntry} that may be persisted.
+ */
+@Internal
+public class SqsSinkElementConverter<InputT>
+        implements ElementConverter<InputT, SendMessageBatchRequestEntry> {
+
+    /** A serialization schema to specify how the input element should be serialized. */
+    private final SerializationSchema<InputT> serializationSchema;
+
+    private SqsSinkElementConverter(SerializationSchema<InputT> serializationSchema) {
+        this.serializationSchema = serializationSchema;
+    }
+
+    @Override
+    public SendMessageBatchRequestEntry apply(InputT element, SinkWriter.Context context) {
+        final byte[] messageBody = serializationSchema.serialize(element);
+        return SendMessageBatchRequestEntry.builder()
+                .id(UUID.randomUUID().toString())
+                .messageBody(new String(messageBody, StandardCharsets.UTF_8))
+                .build();
+    }
+
+    @Override
+    public void open(Sink.InitContext context) {
+        try {
+            serializationSchema.open(
+                    new SerializationSchema.InitializationContext() {
+                        @Override
+                        public MetricGroup getMetricGroup() {
+                            return new UnregisteredMetricsGroup();
+                        }
+
+                        @Override
+                        public UserCodeClassLoader getUserCodeClassLoader() {
+                            return SimpleUserCodeClassLoader.create(
+                                    SqsSinkElementConverter.class.getClassLoader());
+                        }
+                    });
+        } catch (Exception e) {
+            throw new FlinkRuntimeException("Failed to initialize serialization schema.", e);
+        }
+    }
+
+    public static <InputT> Builder<InputT> builder() {
+        return new Builder<>();
+    }
+
+    /** A builder for the SqsSinkElementConverter. */
+    public static class Builder<InputT> {
+
+        private SerializationSchema<InputT> serializationSchema;
+
+        public Builder<InputT> setSerializationSchema(
+                SerializationSchema<InputT> serializationSchema) {
+            this.serializationSchema = serializationSchema;
+            return this;
+        }
+
+        public SqsSinkElementConverter<InputT> build() {
+            Preconditions.checkNotNull(
+                    serializationSchema,
+                    "No SerializationSchema was supplied to the " + "SQS Sink builder.");

Review Comment:
   nit: `+` not needed here
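
   For reference, a minimal usage sketch of the builder shown in this diff (`SimpleStringSchema` is Flink's built-in serializer for `String` elements; the surrounding usage is illustrative):

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;

// Illustrative only: build an element converter for String records.
SqsSinkElementConverter<String> converter =
        SqsSinkElementConverter.<String>builder()
                .setSerializationSchema(new SimpleStringSchema())
                .build();
```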






[jira] [Updated] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Summary: Nightly build can't compile as license check failed  (was: Nightly 
build can't compile)

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> 09:26:55,177 ERROR org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - 
> Jar file 
> /tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
>  contains a LICENSE file in an unexpected location: /LICENSE
>  
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]





[jira] [Updated] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Description: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45807|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]
  (was: 09:26:55,177 ERROR 
org.apache.flink.tools.ci.licensecheck.JarFileChecker [] - Jar file 
/tmp/flink-validation-deployment/org/apache/flink/flink-python/1.20-SNAPSHOT/flink-python-1.20-20240604.084637-1.jar
 contains a LICENSE file in an unexpected location: /LICENSE

 

[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805])

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45807|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805]





[jira] [Updated] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Description: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808
  (was: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45807|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45805])

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808





[jira] [Commented] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851983#comment-17851983
 ] 

Weijie Guo commented on FLINK-35520:


Hi [~antoniovespoli], would you mind taking a look? I found this issue after 
91a9e06d was merged.

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808





[jira] [Comment Edited] (FLINK-35520) Nightly build can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851983#comment-17851983
 ] 

Weijie Guo edited comment on FLINK-35520 at 6/4/24 9:53 AM:


Hi [~antoniovespoli], would you mind taking a look? I found this issue after 
FLINK-35282 was merged.


was (Author: weijie guo):
Hi [~antoniovespoli], would you mind taking a look? I found this issue after 
91a9e06d merged.

> Nightly build can't compile as license check failed
> ---
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808





[jira] [Updated] (FLINK-35520) master can't compile as license check failed

2024-06-04 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-35520:
---
Summary: master can't compile as license check failed  (was: Nightly build 
can't compile as license check failed)

> master can't compile as license check failed
> 
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808





Re: [PR] [FLINK-35435] Add timeout Configuration to Async Sink [flink]

2024-06-04 Thread via GitHub


vahmed-hamdy commented on PR #24839:
URL: https://github.com/apache/flink/pull/24839#issuecomment-2147156792

   @flinkbot run azure





[jira] [Commented] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Jing Ge (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851994#comment-17851994
 ] 

Jing Ge commented on FLINK-35517:
-

It should work again. [~Weijie Guo], please let me know if you still have any 
issues. Thanks!

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
>
> The Flink CI pipeline triggered by pull requests seems somewhat unstable.
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours
> ago, but its CI report is still UNKNOWN.





[jira] [Commented] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851995#comment-17851995
 ] 

Rui Fan commented on FLINK-35517:
-

Hi [~jingge], may I know whether all old PRs can be recovered as well, or do 
only new PRs work now?

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
>
> The Flink CI pipeline triggered by pull requests seems somewhat unstable.
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours
> ago, but its CI report is still UNKNOWN.





Re: [PR] [FLINK-20398][e2e] Migrate test_batch_sql.sh to Java e2e tests framework [flink]

2024-06-04 Thread via GitHub


JingGe commented on code in PR #24471:
URL: https://github.com/apache/flink/pull/24471#discussion_r1624540081


##
flink-end-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java:
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.sql.tests;
+
+import org.apache.flink.connector.testframe.source.FromElementsSource;
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.types.Row;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneOffset;
+import java.util.ArrayList;
+import java.util.List;
+
+class Generator implements FromElementsSource.ElementsSupplier {
+private static final long serialVersionUID = -8455653458083514261L;
+private final List elements;
+
+static Generator create(
+int numKeys, float rowsPerKeyAndSecond, int durationSeconds, int 
offsetSeconds) {
+final int stepMs = (int) (1000 / rowsPerKeyAndSecond);
+final long durationMs = durationSeconds * 1000L;
+final long offsetMs = offsetSeconds * 2000L;
+final List elements = new ArrayList<>();
+int keyIndex = 0;
+long ms = 0;
+while (ms < durationMs) {
+elements.add(createRow(keyIndex++, ms, offsetMs));

Review Comment:
   The original implementation was only for batch in `BatchSQLTestProgram`. 
This PR is for the migration that should not implicitly bring big change for 
the data generation that might cause performance issue later. In addition, the 
new implementation is still in the `flink-batch-sql-test` module which should 
be used only for batch. 
   Not sure if there are already similar generators in the stream-sql-test. If 
not, a new jira task could be created and add the new generator implementation 
to the stream sql test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-20398][e2e] Migrate test_batch_sql.sh to Java e2e tests framework [flink]

2024-06-04 Thread via GitHub


JingGe commented on code in PR #24471:
URL: https://github.com/apache/flink/pull/24471#discussion_r1624540081


##
flink-end-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java:
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.sql.tests;
+
+import org.apache.flink.connector.testframe.source.FromElementsSource;
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.types.Row;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneOffset;
+import java.util.ArrayList;
+import java.util.List;
+
+class Generator implements FromElementsSource.ElementsSupplier<RowData> {
+private static final long serialVersionUID = -8455653458083514261L;
+private final List<RowData> elements;
+
+static Generator create(
+int numKeys, float rowsPerKeyAndSecond, int durationSeconds, int 
offsetSeconds) {
+final int stepMs = (int) (1000 / rowsPerKeyAndSecond);
+final long durationMs = durationSeconds * 1000L;
+final long offsetMs = offsetSeconds * 2000L;
+final List<RowData> elements = new ArrayList<>();
+int keyIndex = 0;
+long ms = 0;
+while (ms < durationMs) {
+elements.add(createRow(keyIndex++, ms, offsetMs));

Review Comment:
   The original implementation was only for batch in `BatchSQLTestProgram`. 
This PR is about the migration and should not implicitly introduce big changes 
to the data generation that might cause performance issues later. In addition, 
the new implementation is still in the `flink-batch-sql-test` module, which 
should be used only for batch. 
   Not sure if there are already similar generators in the stream-sql-test. If 
not, a new Jira task could be created to add the new implementation to the 
stream SQL test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851995#comment-17851995
 ] 

Rui Fan edited comment on FLINK-35517 at 6/4/24 10:52 AM:
--

Hi [~jingge] , thanks for the quick fix. :)

May I know whether all old PRs can be recovered as well? Or do only new PRs 
work now?


was (Author: fanrui):
Hi [~jingge] , may I know whether all old PRs can be recovered as well? Or do 
only new PRs work now?

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
>
> Flink CI pipeline triggered by pull request seems sort of unstable. 
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours 
> ago, but CI report is UNKNOWN.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-32229) Implement metrics and logging for Initial implementation

2024-06-04 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh reassigned FLINK-32229:
---

Assignee: Burak Ozakinci

> Implement metrics and logging for Initial implementation
> 
>
> Key: FLINK-32229
> URL: https://issues.apache.org/jira/browse/FLINK-32229
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Hong Liang Teoh
>Assignee: Burak Ozakinci
>Priority: Major
>
> Add/Ensure Kinesis specific metrics for MillisBehindLatest/numRecordsIn are 
> published.
> More metrics here: 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin reopened FLINK-35282:
-

I noticed that it was reverted by [~hong] in 
[caa68c2481f7c483d0364206d36654af26f2074f|https://github.com/apache/flink/commit/caa68c2481f7c483d0364206d36654af26f2074f],
so it would make sense to reopen this issue as well.


BTW I have a fix for the license check issue at 
https://github.com/snuyanzin/flink/commit/5a4f4d0eb785050552c73fbfc74214f85ee278b0
We could try it once the build becomes more stable.

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Assignee: Antonio Vespoli
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see PyFlink still has the requirement of Apache Beam => 2.43.0 
> and <= 2.49.0 which subsequently results in a requirement of PyArrow <= 
> 12.0.0. That keeps us exposed to 
> [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not deep enough familiar with the PyFlink code base to understand why 
> Apache Beam's upper dependency limit can't be lifted. From all the existing 
> issues I haven't seen one addressing this. Therefore I created one now. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32229) Implement metrics and logging for Initial implementation

2024-06-04 Thread Hong Liang Teoh (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851998#comment-17851998
 ] 

Hong Liang Teoh commented on FLINK-32229:
-

[~chalixar] sorry I didn't manage to see this, and [~burakoz] has requested to 
work on it. Could this be worked out between you and [~burakoz]? Maybe we can 
collaborate by reviewing the PR?

> Implement metrics and logging for Initial implementation
> 
>
> Key: FLINK-32229
> URL: https://issues.apache.org/jira/browse/FLINK-32229
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Hong Liang Teoh
>Assignee: Burak Ozakinci
>Priority: Major
>
> Add/Ensure Kinesis specific metrics for MillisBehindLatest/numRecordsIn are 
> published.
> More metrics here: 
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-33%3A+Standardize+Connector+Metrics]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-35518) CI Bot doesn't run on PRs - status UNKNOWN

2024-06-04 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge closed FLINK-35518.
---
Resolution: Duplicate

> CI Bot doesn't run on PRs - status UNKNOWN
> --
>
> Key: FLINK-35518
> URL: https://issues.apache.org/jira/browse/FLINK-35518
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Piotr Nowojski
>Assignee: Jing Ge
>Priority: Critical
>
> Doesn't want to run on my PR/branch. I was doing force-pushes, rebases, 
> asking flink bot to run, closed and opened new PR, but nothing helped
> https://github.com/apache/flink/pull/24868
> https://github.com/apache/flink/pull/24883
> I've heard others were having similar problems recently.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35520) master can't compile as license check failed

2024-06-04 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852001#comment-17852001
 ] 

Sergey Nuyanzin commented on FLINK-35520:
-

So far it was reverted, as mentioned in the comments of FLINK-35282.
PS: I have a fix for this at 
https://github.com/snuyanzin/flink/commit/5a4f4d0eb785050552c73fbfc74214f85ee278b0
which could be tried once the build is more stable.

> master can't compile as license check failed
> 
>
> Key: FLINK-35520
> URL: https://issues.apache.org/jira/browse/FLINK-35520
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Priority: Blocker
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=60024&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=45808



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-35518) CI Bot doesn't run on PRs - status UNKNOWN

2024-06-04 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge reassigned FLINK-35518:
---

Assignee: Jing Ge

> CI Bot doesn't run on PRs - status UNKNOWN
> --
>
> Key: FLINK-35518
> URL: https://issues.apache.org/jira/browse/FLINK-35518
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Piotr Nowojski
>Assignee: Jing Ge
>Priority: Critical
>
> Doesn't want to run on my PR/branch. I was doing force-pushes, rebases, 
> asking flink bot to run, closed and opened new PR, but nothing helped
> https://github.com/apache/flink/pull/24868
> https://github.com/apache/flink/pull/24883
> I've heard others were having similar problems recently.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Jing Ge (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Ge resolved FLINK-35517.
-
Fix Version/s: 1.20.0
   Resolution: Fixed

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
> Fix For: 1.20.0
>
>
> Flink CI pipeline triggered by pull request seems sort of unstable. 
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours 
> ago, but CI report is UNKNOWN.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35517) CI pipeline triggered by pull request seems unstable

2024-06-04 Thread Jing Ge (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852002#comment-17852002
 ] 

Jing Ge commented on FLINK-35517:
-

[~fanrui] all PRs will be recovered.

> CI pipeline triggered by pull request seems unstable
> 
>
> Key: FLINK-35517
> URL: https://issues.apache.org/jira/browse/FLINK-35517
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.20.0
>Reporter: Weijie Guo
>Assignee: Jing Ge
>Priority: Blocker
> Fix For: 1.20.0
>
>
> Flink CI pipeline triggered by pull request seems sort of unstable. 
> For example, https://github.com/apache/flink/pull/24883 was filed 15 hours 
> ago, but CI report is UNKNOWN.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35512) ArtifactFetchManagerTest unit tests fail

2024-06-04 Thread Elphas Toringepi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852005#comment-17852005
 ] 

Elphas Toringepi commented on FLINK-35512:
--

[~hong] +1, I reproduced the error by running
{noformat}
./mvnw clean package{noformat}

> ArtifactFetchManagerTest unit tests fail
> 
>
> Key: FLINK-35512
> URL: https://issues.apache.org/jira/browse/FLINK-35512
> Project: Flink
>  Issue Type: Technical Debt
>Affects Versions: 1.19.1
>Reporter: Hong Liang Teoh
>Priority: Major
> Fix For: 1.19.1
>
>
> The below three tests from *ArtifactFetchManagerTest* seem to fail 
> consistently:
>  * ArtifactFetchManagerTest.testFileSystemFetchWithAdditionalUri
>  * ArtifactFetchManagerTest.testMixedArtifactFetch
>  * ArtifactFetchManagerTest.testHttpFetch
> The error printed is
> {code:java}
> java.lang.AssertionError:
> Expecting actual not to be empty
>     at 
> org.apache.flink.client.program.artifact.ArtifactFetchManagerTest.getFlinkClientsJar(ArtifactFetchManagerTest.java:248)
>     at 
> org.apache.flink.client.program.artifact.ArtifactFetchManagerTest.testMixedArtifactFetch(ArtifactFetchManagerTest.java:146)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>     at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>     at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) 
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35516][Connector/Files] Update the Experimental annotation for files connector [flink]

2024-06-04 Thread via GitHub


RocMarshal opened a new pull request, #24885:
URL: https://github.com/apache/flink/pull/24885

   
   
   
   
   ## What is the purpose of the change
   
   Update the Experimental annotation for files connector
   
   
   ## Brief change log
   
   Update the Experimental annotation for files connector
   
   
   ## Verifying this change
   
   
   
   This change is already covered by existing tests
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35516) Update the Experimental annotation to PublicEvolving for files connector

2024-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35516:
---
Labels: pull-request-available  (was: )

> Update the Experimental annotation to PublicEvolving for files connector 
> -
>
> Key: FLINK-35516
> URL: https://issues.apache.org/jira/browse/FLINK-35516
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / FileSystem
>Reporter: RocMarshal
>Assignee: RocMarshal
>Priority: Minor
>  Labels: pull-request-available
>
> as described in https://issues.apache.org/jira/browse/FLINK-35496
> We should update the annotations for the stable APIs in files connector.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35516][Connector/Files] Update the Experimental annotation for files connector [flink]

2024-06-04 Thread via GitHub


flinkbot commented on PR #24885:
URL: https://github.com/apache/flink/pull/24885#issuecomment-2147260985

   
   ## CI report:
   
   * d303ba9557f4cbef16b70565a97ba35b829f9bc6 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-26050] Manually compact small SST files [flink]

2024-06-04 Thread via GitHub


rkhachatryan merged PR #24880:
URL: https://github.com/apache/flink/pull/24880


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-35506) disable kafka auto-commit and rely on flink’s checkpointing if both are enabled

2024-06-04 Thread elon_X (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

elon_X closed FLINK-35506.
--
Resolution: Not A Problem

> disable kafka auto-commit and rely on flink’s checkpointing if both are 
> enabled
> ---
>
> Key: FLINK-35506
> URL: https://issues.apache.org/jira/browse/FLINK-35506
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka
>Affects Versions: 1.16.1
>Reporter: elon_X
>Priority: Major
> Attachments: image-2024-06-03-23-39-28-270.png
>
>
> When I use KafkaSource for consuming topics and set the Kafka parameter 
> {{{}enable.auto.commit=true{}}}, while also enabling checkpointing for the 
> task, I notice that both will commit offsets. Should Kafka's auto-commit be 
> disabled when enabling Flink checkpointing, similar to how it's done with 
> FlinkKafkaConsumer?
>  
> *How to reproduce*
>  
> {code:java}
> Properties kafkaParams = new Properties();
> kafkaParams.put("enable.auto.commit", "true");
> kafkaParams.put("auto.offset.reset", "latest");
> kafkaParams.put("fetch.min.bytes", "4096");
> kafkaParams.put("sasl.mechanism", "PLAIN");
> kafkaParams.put("security.protocol", "SASL_PLAINTEXT");
> kafkaParams.put("bootstrap.servers", bootStrap);
> kafkaParams.put("group.id", expoGroupId);
> kafkaParams.put("sasl.jaas.config","org.apache.kafka.common.security.plain.PlainLoginModule
>  required username=\"" + username + "\" password=\"" + password + "\";");
> KafkaSource<String> source = KafkaSource
> .<String>builder()
> .setBootstrapServers(bootStrap)
> .setProperties(kafkaParams)
> .setGroupId(expoGroupId)
> .setTopics(Arrays.asList(expoTopic))
> .setValueOnlyDeserializer(new SimpleStringSchema())
> .setStartingOffsets(OffsetsInitializer.latest())
> .build();
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
> .filter(r ->  true);
> env.enableCheckpointing(3000 * 1000);
> env.getCheckpointConfig().setMinPauseBetweenCheckpoints(1000 * 1000);
> env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.AT_LEAST_ONCE);
> env.getCheckpointConfig().setCheckpointTimeout(1000 * 300);
> env.execute("kafka-consumer"); {code}
>  
>  
> the kafka client's 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator continuously 
> committing offsets.
> !image-2024-06-03-23-39-28-270.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
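
For readers landing here, a minimal sketch (not from the ticket) of the usual 
setup: switch off the Kafka client's own auto-commit and let Flink's 
checkpoints drive offset committing. "commit.offsets.on.checkpoint" is the 
KafkaSource property key controlling checkpoint-time commits (true by 
default); the broker, group id, and topic below are placeholders.

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

import java.util.Properties;

class CheckpointDrivenCommitExample {
    static KafkaSource<String> build() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "false");          // no Kafka-client commit thread
        props.put("commit.offsets.on.checkpoint", "true"); // commit on successful checkpoints

        return KafkaSource.<String>builder()
                .setBootstrapServers("broker:9092")        // placeholder
                .setGroupId("my-group")                    // placeholder
                .setTopics("my-topic")                     // placeholder
                .setProperties(props)
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .setStartingOffsets(OffsetsInitializer.latest())
                .build();
    }
}
{code}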


Re: [PR] [FLINK-32081][checkpoint] Compatibility between file-merging on and off across job runs [flink]

2024-06-04 Thread via GitHub


ljz2051 commented on code in PR #24873:
URL: https://github.com/apache/flink/pull/24873#discussion_r1625978087


##
flink-tests/src/test/java/org/apache/flink/test/checkpointing/SnapshotFileMergingCompatibilityITCase.java:
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.test.checkpointing;
+
+import org.apache.flink.configuration.CheckpointingOptions;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
+import org.apache.flink.core.execution.RestoreMode;
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.flink.util.TestLogger;
+
+import org.junit.jupiter.api.io.TempDir;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.MethodSource;
+
+import java.nio.file.Path;
+import java.util.Arrays;
+import java.util.Collection;
+
+import static 
org.apache.flink.test.checkpointing.ResumeCheckpointManuallyITCase.runJobAndGetExternalizedCheckpoint;
+import static org.assertj.core.api.Assertions.assertThat;
+
+/**
+ * FileMerging Compatibility IT case which tests recovery from a checkpoint created in different
+ * fileMerging modes (i.e. fileMerging enabled/disabled).
+ */
+public class SnapshotFileMergingCompatibilityITCase extends TestLogger {
+
+public static Collection<Object[]> parameters() {
+return Arrays.asList(
+new Object[][] {
+{RestoreMode.CLAIM, true},
+{RestoreMode.CLAIM, false},
+{RestoreMode.NO_CLAIM, true},
+{RestoreMode.NO_CLAIM, false}
+});
+}
+
+@ParameterizedTest(name = "RestoreMode = {0}, fileMergingAcrossBoundary = 
{1}")
+@MethodSource("parameters")
+public void testSwitchFromDisablingToEnablingFileMerging(
+RestoreMode restoreMode, boolean fileMergingAcrossBoundary, 
@TempDir Path checkpointDir)
+throws Exception {
+testSwitchingFileMerging(
+checkpointDir, false, true, restoreMode, 
fileMergingAcrossBoundary);
+}
+
+@ParameterizedTest(name = "RestoreMode = {0}, fileMergingAcrossBoundary = 
{1}")
+@MethodSource("parameters")
+public void testSwitchFromEnablingToDisablingFileMerging(
+RestoreMode restoreMode, boolean fileMergingAcrossBoundary, 
@TempDir Path checkpointDir)
+throws Exception {
+testSwitchingFileMerging(
+checkpointDir, true, false, restoreMode, 
fileMergingAcrossBoundary);
+}
+
+private void testSwitchingFileMerging(
+Path checkpointDir,
+boolean firstFileMergingSwitch,
+boolean secondFileMergingSwitch,
+RestoreMode restoreMode,
+boolean fileMergingAcrossBoundary)
+throws Exception {
+final Configuration config = new Configuration();
+config.set(CheckpointingOptions.CHECKPOINTS_DIRECTORY, 
checkpointDir.toUri().toString());
+config.set(CheckpointingOptions.INCREMENTAL_CHECKPOINTS, true);
+config.set(CheckpointingOptions.FILE_MERGING_ACROSS_BOUNDARY, 
fileMergingAcrossBoundary);
+config.set(CheckpointingOptions.FILE_MERGING_ENABLED, 
firstFileMergingSwitch);
+MiniClusterWithClientResource firstCluster =
+new MiniClusterWithClientResource(
+new MiniClusterResourceConfiguration.Builder()
+.setConfiguration(config)
+.setNumberTaskManagers(2)
+.setNumberSlotsPerTaskManager(2)
+.build());
+EmbeddedRocksDBStateBackend stateBackend1 = new 
EmbeddedRocksDBStateBackend();
+stateBackend1.configure(config, 
Thread.currentThread().getContextClassLoader());
+firstCluster.before();
+String externalCheckpoint;
+try {
+externalCheckpoint =
+runJobAndGetExternalizedCheckpoint(
+stateBackend1, null, firstCluster, restoreM

Re: [PR] [FLINK-35353][docs-zh]Translate "Profiler" page into Chinese [flink]

2024-06-04 Thread via GitHub


drymatini commented on PR #24822:
URL: https://github.com/apache/flink/pull/24822#issuecomment-2147503847

   Hi @JingGe , thank you for the review. I have made the adjustments based on 
your suggestions; please check my commit again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35475][runtime] Introduce isInternalSorterSupport to OperatorAttributes [flink]

2024-06-04 Thread via GitHub


jeyhunkarimov commented on PR #24874:
URL: https://github.com/apache/flink/pull/24874#issuecomment-2147546699

   Thanks @Sxnan LGTM!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [hotfix][docs] Fix typo in Log Example Given on upgrade.md File [flink-kubernetes-operator]

2024-06-04 Thread via GitHub


nacisimsek opened a new pull request, #835:
URL: https://github.com/apache/flink-kubernetes-operator/pull/835

   The ID in the savepoint folder name that is passed as a parameter and the 
ID in the folder name given in the log example do NOT match. 
   
   Expected ID in the log example: `aec3dd08e76d`
   Actual ID given in the log example: `2f40a9c8e4b9`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35501] Use common IO thread pool for RocksDB data transfer [flink]

2024-06-04 Thread via GitHub


rkhachatryan commented on PR #24882:
URL: https://github.com/apache/flink/pull/24882#issuecomment-2147612702

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-20398][e2e] Migrate test_batch_sql.sh to Java e2e tests framework [flink]

2024-06-04 Thread via GitHub


XComp commented on code in PR #24471:
URL: https://github.com/apache/flink/pull/24471#discussion_r1626077906


##
flink-end-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java:
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.sql.tests;
+
+import org.apache.flink.connector.testframe.source.FromElementsSource;
+import org.apache.flink.table.data.GenericRowData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.types.Row;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneOffset;
+import java.util.ArrayList;
+import java.util.List;
+
+class Generator implements FromElementsSource.ElementsSupplier<RowData> {
+private static final long serialVersionUID = -8455653458083514261L;
+private final List<RowData> elements;
+
+static Generator create(
+int numKeys, float rowsPerKeyAndSecond, int durationSeconds, int 
offsetSeconds) {
+final int stepMs = (int) (1000 / rowsPerKeyAndSecond);
+final long durationMs = durationSeconds * 1000L;
+final long offsetMs = offsetSeconds * 2000L;
+final List<RowData> elements = new ArrayList<>();
+int keyIndex = 0;
+long ms = 0;
+while (ms < durationMs) {
+elements.add(createRow(keyIndex++, ms, offsetMs));

Review Comment:
   Jing actually has a good point on the memory consumption. I missed that one. 
👍 We should continue generating the records on-the-fly to be closer to what the 
original test did.
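
   A minimal sketch (not from the PR) of what index-based, on-the-fly 
generation could look like. The numElements()/get() method names on 
ElementsSupplier and the one-row-per-key-per-step layout are assumptions for 
illustration; createRow(...) stands in for the helper from the diff above:

   import org.apache.flink.connector.testframe.source.FromElementsSource;
   import org.apache.flink.table.data.RowData;

   class OnTheFlyGenerator implements FromElementsSource.ElementsSupplier<RowData> {
       private static final long serialVersionUID = 1L;
       private final int numKeys;
       private final int stepMs;
       private final long durationMs;
       private final long offsetMs;

       OnTheFlyGenerator(int numKeys, float rowsPerKeyAndSecond, int durationSeconds, int offsetSeconds) {
           this.numKeys = numKeys;
           this.stepMs = (int) (1000 / rowsPerKeyAndSecond);
           this.durationMs = durationSeconds * 1000L;
           this.offsetMs = offsetSeconds * 2000L;
       }

       // Total row count, computed instead of materialized.
       public int numElements() {
           return (int) (durationMs / stepMs) * numKeys;
       }

       // Derive the i-th row from its index, so memory stays constant
       // no matter how long the configured duration is.
       public RowData get(int index) {
           int keyIndex = index % numKeys;               // assumption: keys cycle within a step
           long ms = (long) (index / numKeys) * stepMs;  // assumption: one row per key per step
           return createRow(keyIndex, ms, offsetMs);
       }

       private RowData createRow(int keyIndex, long ms, long offsetMs) {
           // Stub for self-containment; see createRow(...) in the diff above.
           throw new UnsupportedOperationException("see the PR's createRow(...)");
       }
   }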



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Hong Liang Teoh (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852074#comment-17852074
 ] 

Hong Liang Teoh commented on FLINK-35282:
-

Yes. Reopened the JIRA.

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Assignee: Antonio Vespoli
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see PyFlink still has the requirement of Apache Beam => 2.43.0 
> and <= 2.49.0 which subsequently results in a requirement of PyArrow <= 
> 12.0.0. That keeps us exposed to 
> [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not deep enough familiar with the PyFlink code base to understand why 
> Apache Beam's upper dependency limit can't be lifted. From all the existing 
> issues I haven't seen one addressing this. Therefore I created one now. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (FLINK-35282) PyFlink Support for Apache Beam > 2.49

2024-06-04 Thread Hong Liang Teoh (Jira)


[ https://issues.apache.org/jira/browse/FLINK-35282 ]


Hong Liang Teoh deleted comment on FLINK-35282:
-

was (Author: hong):
Yes. Reopened the JIRA.

> PyFlink Support for Apache Beam > 2.49
> --
>
> Key: FLINK-35282
> URL: https://issues.apache.org/jira/browse/FLINK-35282
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Affects Versions: 1.19.0, 1.18.1
>Reporter: APA
>Assignee: Antonio Vespoli
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> From what I see PyFlink still has the requirement of Apache Beam => 2.43.0 
> and <= 2.49.0 which subsequently results in a requirement of PyArrow <= 
> 12.0.0. That keeps us exposed to 
> [https://nvd.nist.gov/vuln/detail/CVE-2023-47248]
> I'm not deep enough familiar with the PyFlink code base to understand why 
> Apache Beam's upper dependency limit can't be lifted. From all the existing 
> issues I haven't seen one addressing this. Therefore I created one now. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34487][ci] Adds Python Wheels nightly GHA workflow [flink]

2024-06-04 Thread via GitHub


XComp commented on code in PR #24426:
URL: https://github.com/apache/flink/pull/24426#discussion_r1626117295


##
.github/workflows/nightly.yml:
##
@@ -94,3 +94,46 @@ jobs:
   s3_bucket: ${{ secrets.IT_CASE_S3_BUCKET }}
   s3_access_key: ${{ secrets.IT_CASE_S3_ACCESS_KEY }}
   s3_secret_key: ${{ secrets.IT_CASE_S3_SECRET_KEY }}
+
+  build_python_wheels:
+name: "Build Python Wheels on ${{ matrix.os_name }}"
+runs-on: ${{ matrix.os }}
+strategy:
+  fail-fast: false
+  matrix:
+include:
+  - os: ubuntu-latest
+os_name: linux
+  - os: macos-latest
+os_name: macos
+steps:
+  - name: "Checkout the repository"
+uses: actions/checkout@v4
+with:
+  fetch-depth: 0
+  persist-credentials: false
+  - name: "Stringify workflow name"
+uses: "./.github/actions/stringify"
+id: stringify_workflow
+with:
+  value: ${{ github.workflow }}
+  - name: "Build python wheels for ${{ matrix.os_name }}"
+uses: pypa/cibuildwheel@v2.16.5

Review Comment:
   Looks fine from my side. But I am not familiar with the whole buildwheel 
logic. @HuangXingBo can you do another pass over it and approve the changes 
once more?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix] A couple of hotfixes [flink]

2024-06-04 Thread via GitHub


JingGe commented on code in PR #24883:
URL: https://github.com/apache/flink/pull/24883#discussion_r1626162425


##
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java:
##
@@ -563,7 +563,7 @@ private InflightDataGateOrPartitionRescalingDescriptor gate(
 }
 
 @Test
-void testChannelStateAssignmentTwoGatesPartiallyDownscaling()
+public void testChannelStateAssignmentTwoGatesPartiallyDownscaling()

Review Comment:
   NIT: With JUnit 5, test methods are not required to be public.
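
   For illustration, a package-private class and test method compile and run 
fine under JUnit 5 (the names below are made up, not the real test):

   import org.junit.jupiter.api.Test;

   import static org.assertj.core.api.Assertions.assertThat;

   class SomeOperationTest {          // package-private class is enough
       @Test
       void testSomething() {         // package-private test method is enough
           assertThat(1 + 1).isEqualTo(2);
       }
   }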



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35501] Use common IO thread pool for RocksDB data transfer [flink]

2024-06-04 Thread via GitHub


rkhachatryan commented on PR #24882:
URL: https://github.com/apache/flink/pull/24882#issuecomment-2147786034

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix] A couple of hotfixes [flink]

2024-06-04 Thread via GitHub


pnowojski commented on code in PR #24883:
URL: https://github.com/apache/flink/pull/24883#discussion_r1626244851


##
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperationTest.java:
##
@@ -563,7 +563,7 @@ private InflightDataGateOrPartitionRescalingDescriptor gate(
 }
 
 @Test
-void testChannelStateAssignmentTwoGatesPartiallyDownscaling()
+public void testChannelStateAssignmentTwoGatesPartiallyDownscaling()

Review Comment:
   Ahhh, that explains my confusion when I was cherry-picking some things  
between branches. Anyway, next time I will just skip this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [hotfix] A couple of hotfixes [flink]

2024-06-04 Thread via GitHub


pnowojski merged PR #24883:
URL: https://github.com/apache/flink/pull/24883


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-31533) CREATE TABLE AS SELECT should support to define partition

2024-06-04 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-31533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852110#comment-17852110
 ] 

Sergio Peña commented on FLINK-31533:
-

Hi [~luoyuxia] [~aitozi], I'd like to contribute to Flink by extending the 
CTAS statement to allow a custom schema definition (columns, partition & 
primary keys, watermarks, etc.). I noticed this task and FLINK-31534 are meant 
for a subset of those changes. Are you still considering working on this at 
some point? I see these comments were made a year ago, but I don't want to 
step on other people's work if there is progress or interest in it. Are you OK 
with me taking this task?

I'm going to write a FLIP proposing the semantics for the schema definition, 
which would be similar to MySQL's CTAS. 

> CREATE TABLE AS SELECT should support to define partition
> -
>
> Key: FLINK-31533
> URL: https://issues.apache.org/jira/browse/FLINK-31533
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: luoyuxia
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35521) Flink FileSystem SQL Connector Generating SUCCESS File Multiple Times

2024-06-04 Thread EMERSON WANG (Jira)
EMERSON WANG created FLINK-35521:


 Summary: Flink FileSystem SQL Connector Generating SUCCESS File 
Multiple Times
 Key: FLINK-35521
 URL: https://issues.apache.org/jira/browse/FLINK-35521
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / FileSystem
Affects Versions: 1.18.1
 Environment: Our PyFlink SQL jobs are running in AWS EKS environment.
Reporter: EMERSON WANG


Our Flink table SQL job received data from Kafka streams and then sank all 
partitioned data into the associated parquet files under the same S3 folder 
through the filesystem SQL connector.

For the S3 filesystem SQL connector, sink.partition-commit.policy.kind was set 
to 'success-file' and sink.partition-commit.trigger was set to 
'partition-time'. We found that the _SUCCESS file in the S3 folder was 
regenerated after each partition commit, i.e. multiple times.

Because all partitioned parquet files and the _SUCCESS file are in the same S3 
folder, and the _SUCCESS file is used to trigger the downstream application, 
we would really like the _SUCCESS file to be generated only once, after all 
partitions are committed and all parquet files are ready to be processed. That 
way, one _SUCCESS file triggers the downstream application exactly once 
instead of multiple times.

We knew we could set sink.partition-commit.trigger to 'process-time' to 
generate the _SUCCESS file only once in the S3 folder; however, 'process-time' 
would not meet our business requirements.

We'd like the FileSystem SQL connector to support the following new use case: 
even if sink.partition-commit.trigger is set to 'partition-time', the _SUCCESS 
file is generated only once, after all partitions are committed and all output 
files are ready to be processed, and triggers the downstream application only 
once instead of multiple times.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
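
For context, a minimal sketch (not from the report) of the sink definition 
under discussion. Table, column, and path names are placeholders, and the 
watermark plus partition-time extractor that 'partition-time' also relies on 
are omitted:

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

class SuccessFileSinkExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Filesystem sink that commits partitions on partition time and writes a
        // _SUCCESS marker per committed partition, which is the multiple-generation
        // behavior described above.
        tEnv.executeSql(
                "CREATE TABLE s3_sink ("
                        + " id STRING,"
                        + " ts TIMESTAMP(3),"
                        + " dt STRING"
                        + ") PARTITIONED BY (dt) WITH ("
                        + " 'connector' = 'filesystem',"
                        + " 'path' = 's3://my-bucket/my-folder',"
                        + " 'format' = 'parquet',"
                        + " 'sink.partition-commit.trigger' = 'partition-time',"
                        + " 'sink.partition-commit.policy.kind' = 'success-file'"
                        + ")");
    }
}
{code}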


[jira] [Updated] (FLINK-35515) Upgrade hive version to 4.0.0

2024-06-04 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-35515:
---
Fix Version/s: (was: 1.18.2)

> Upgrade hive version to 4.0.0
> -
>
> Key: FLINK-35515
> URL: https://issues.apache.org/jira/browse/FLINK-35515
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.18.1
>Reporter: vikasap
>Priority: Major
>
> Hive version 4.0.0 was released recently. However, none of the major Flink 
> versions work with it. Filing this so that the major Flink versions' 
> flink-sql and Table API will be able to work with the new version of the 
> Hive metastore.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35515) Upgrade hive version to 4.0.0

2024-06-04 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-35515:
---
Issue Type: New Feature  (was: Improvement)

> Upgrade hive version to 4.0.0
> -
>
> Key: FLINK-35515
> URL: https://issues.apache.org/jira/browse/FLINK-35515
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.18.1
>Reporter: vikasap
>Priority: Major
>
> Hive version 4.0.0 was released recently. However, none of the major Flink 
> versions work with it. Filing this so that the major Flink versions' 
> flink-sql and Table API will be able to work with the new version of the 
> Hive metastore.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24605) org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.NoOffsetForPartitionException: Undefined offset with no reset policy for partitions

2024-06-04 Thread EMERSON WANG (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852128#comment-17852128
 ] 

EMERSON WANG commented on FLINK-24605:
--

We got the same exception when scan.startup.mode was set to 'group-offsets' 
and properties.auto.offset.reset was set to 'latest'.

We had to work around it as follows: to start the Flink job for the first 
time, we set scan.startup.mode to 'latest', let the job run for a few minutes, 
then stopped the job, reset scan.startup.mode to 'group-offsets', and finally 
restarted the job.

We'd appreciate it very much if you could resolve this ticket as soon as 
possible.
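
For KafkaSource-based jobs, a minimal sketch of an alternative that avoids 
that restart dance: the builder accepts a committed-offsets initializer with 
an explicit fallback strategy, so a brand-new group id falls back to earliest 
instead of failing with auto.offset.reset = none. Connection values below are 
placeholders.

{code:java}
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

import org.apache.kafka.clients.consumer.OffsetResetStrategy;

class GroupOffsetsFallbackExample {
    static KafkaSource<String> build() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("broker:9093")        // placeholder
                .setGroupId("ss7gsm-signaling-event-T5")
                .setTopics("ss7gsm-signaling-event")
                // Use committed offsets when the group exists; fall back to
                // EARLIEST otherwise, instead of the default reset strategy
                // "none" that throws NoOffsetForPartitionException.
                .setStartingOffsets(
                        OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}
{code}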

> org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.NoOffsetForPartitionException:
>  Undefined offset with no reset policy for partitions
> ---
>
> Key: FLINK-24605
> URL: https://issues.apache.org/jira/browse/FLINK-24605
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.14.0
>Reporter: Abhijit Talukdar
>Priority: Major
>
> Getting the issue below when using 'scan.startup.mode' = 'group-offsets'.
>  
> WITH (
>  'connector' = 'kafka',
>  'topic' = 'ss7gsm-signaling-event',
>  'properties.bootstrap.servers' = '**:9093',
>  'properties.group.id' = 'ss7gsm-signaling-event-T5',
>  'value.format' = 'avro-confluent',
>  'value.avro-confluent.schema-registry.url' = 'https://***:9099',
>  'scan.startup.mode' = 'group-offsets',
>  'properties.auto.offset.reset' = 'earliest',
>  'properties.security.protocol'= 'SASL_SSL',
>  'properties.ssl.truststore.location'= '/*/*/ca-certs.jks',
>  'properties.ssl.truststore.password'= '*',
>  'properties.sasl.kerberos.service.name'= 'kafka'
> )
>  
> 'ss7gsm-signaling-event-T5' is a new group id. If the group id is present in 
> ZK then it works; otherwise we get the exception below. The 
> 'properties.auto.offset.reset' property is ignored.
>  
> 2021-10-20 22:18:28,267 INFO  
> org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.ConsumerConfig
>  [] - ConsumerConfig values: 
> allow.auto.create.topics = false
> auto.commit.interval.ms = 5000
> auto.offset.reset = none
> bootstrap.servers = [.xxx.com:9093]
>  
>  
> Exception:
>  
> 2021-10-20 22:18:28,620 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Source: 
> KafkaSource-hiveSignaling.signaling_stg.ss7gsm_signaling_event_flink_k -> 
> Sink: Collect table sink (1/1) (89b175333242fab8914271ad7638ba92) switched 
> from INITIALIZING to RUNNING.
> 2021-10-20 22:18:28,621 INFO  
> org.apache.flink.connector.kafka.source.enumerator.KafkaSourceEnumerator [] - 
> Assigning splits to readers \{0=[[Partition: ss7gsm-signaling-event-2, 
> StartingOffset: -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-8, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808], [Partition: ss7gsm-signaling-event-7, StartingOffset: 
> -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-9, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808], [Partition: ss7gsm-signaling-event-5, StartingOffset: 
> -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-6, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808], [Partition: ss7gsm-signaling-event-0, StartingOffset: 
> -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-4, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808], [Partition: ss7gsm-signaling-event-1, StartingOffset: 
> -3, StoppingOffset: -9223372036854775808], [Partition: 
> ss7gsm-signaling-event-3, StartingOffset: -3, StoppingOffset: 
> -9223372036854775808]]}
> 2021-10-20 22:18:28,716 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Source: 
> KafkaSource-hiveSignaling.signaling_stg.ss7gsm_signaling_event_flink_k -> 
> Sink: Collect table sink (1/1) (89b175333242fab8914271ad7638ba92) switched 
> from RUNNING to FAILED on xx.xxx.xxx.xxx:42075-d80607 @ xx.xxx.com 
> (dataPort=34120).
> java.lang.RuntimeException: One or more fetchers have encountered exception at 
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkError

[jira] [Commented] (FLINK-35502) compress the checkpoint metadata generated by ZK/ETCD HA Services

2024-06-04 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852132#comment-17852132
 ] 

Mingliang Liu commented on FLINK-35502:
---

In my day-to-day work I don't see user requests that need to recover from an 
arbitrary point in the past few days; recovering from recent checkpoints works 
just fine. And compressing is a good improvement as the data is getting large.
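
If the write path gzips and Base64-encodes as sketched in the quoted 
description below, the read path just reverses it; a minimal decode helper 
(the method name is illustrative):
{code:java}
private static String decodeAndDecompress(String base64) throws IOException {
    byte[] compressed = Base64.getDecoder().decode(base64);
    try (GZIPInputStream gzipIn =
                    new GZIPInputStream(new ByteArrayInputStream(compressed));
            ByteArrayOutputStream out = new ByteArrayOutputStream()) {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = gzipIn.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
{code}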

> compress the checkpoint metadata generated by ZK/ETCD HA Services
> -
>
> Key: FLINK-35502
> URL: https://issues.apache.org/jira/browse/FLINK-35502
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ying Z
>Priority: Major
>
> In the implementation of Flink HA, the metadata of checkpoints is stored in 
> either Zookeeper (ZK HA) or ETCD (K8S HA), such as:
> {code:java}
> checkpointID-0036044: 
> checkpointID-0036045: 
> ...
> ... {code}
> However, neither of these are designed to store excessive amounts of data. If 
> the 
> [state.checkpoints.num-retained]([https://nightlies.apache.org/flink/flink-docs-release-1.19/zh/docs/deployment/config/#state-checkpoints-num-retained])
>  setting is set too large, it can easily cause abnormalities in ZK/ETCD. 
> The error log when set state.checkpoints.num-retained to 1500:
> {code:java}
> Caused by: org.apache.flink.util.SerializedThrowable: 
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
> PUT at: 
> https://xxx/api/v1/namespaces/default/configmaps/xxx-jobmanager-leader. 
> Message: ConfigMap "xxx-jobmanager-leader" is invalid: []:
> Too long: must have at most 1048576 bytes. Received status: 
> Status(apiVersion=v1, code=422, 
> details=StatusDetails(causes=[StatusCause(field=[], message=Too long: must 
> have at most 1048576 bytes, reason=FieldValueTooLong, 
> additionalProperties={})], group=null, kind=ConfigMap, 
> name=xxx-jobmanager-leader, retryAfterSeconds=null, uid=null, 
> additionalProperties={}), kind=Status, message=ConfigMap 
> "xxx-jobmanager-leader" is invalid: []: Too long: must have at most 1048576 
> bytes, metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), 
> reason=Invalid, status=Failure, additionalProperties={}).
> In Flink's code, all checkpoint metadata is updated at the same time, and 
> the checkpoint metadata contains many repeated bytes, so it can achieve a 
> very good compression ratio.
> Therefore, I suggest compressing the data when writing checkpoints and 
> decompressing it when reading, to reduce storage pressure and improve IO 
> efficiency.
> Here is the sample code, which reduces the metadata size from 1M bytes to 30K.
> {code:java}
> // Map -> JSON
> ObjectMapper objectMapper = new ObjectMapper();
> String checkpointJson = objectMapper.writeValueAsString(checkpointMap);
> // compress and Base64-encode
> String compressedBase64 = compressAndEncode(checkpointJson);
> compressedData.put("checkpoint-all", compressedBase64);{code}
> {code:java}
> private static String compressAndEncode(String data) throws IOException {
>     ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
>     try (GZIPOutputStream gzipOutputStream = new GZIPOutputStream(outputStream)) {
>         gzipOutputStream.write(data.getBytes(StandardCharsets.UTF_8));
>     }
>     byte[] compressedData = outputStream.toByteArray();
>     return Base64.getEncoder().encodeToString(compressedData);
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-34172] Add support for altering a distribution via ALTER TABLE [flink]

2024-06-04 Thread via GitHub


jnh5y opened a new pull request, #24886:
URL: https://github.com/apache/flink/pull/24886

   ## What is the purpose of the change
   
   This PR implements the SQL parser changes for ALTER TABLE to support ADD, 
MODIFY, and DROP DISTRIBUTION statements.
   
   ## Brief change log
   The SQL Parser has been updated.
   The `AlterSchemaConverter` has been updated to pass the changes in a 
DISTRIBUTION through to the `Operation`.
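
   For context, a minimal Table API sketch of the statement shapes this PR 
parses (table and column names are illustrative; the exact clause grammar is 
the one defined by the parser changes, following FLIP-376):
   
   ```java
   import org.apache.flink.table.api.EnvironmentSettings;
   import org.apache.flink.table.api.TableEnvironment;
   
   TableEnvironment tEnv =
           TableEnvironment.create(EnvironmentSettings.inStreamingMode());
   // 't' is a hypothetical table registered elsewhere.
   tEnv.executeSql("ALTER TABLE t ADD DISTRIBUTION BY HASH(uid) INTO 4 BUCKETS");
   tEnv.executeSql("ALTER TABLE t MODIFY DISTRIBUTION BY HASH(uid) INTO 8 BUCKETS");
   tEnv.executeSql("ALTER TABLE t DROP DISTRIBUTION");
   ```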
   
   ## Verifying this change
   
   This change added tests covering the new ALTER TABLE DISTRIBUTION syntax.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-34172) Add support for altering a distribution via ALTER TABLE

2024-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34172:
---
Labels: pull-request-available  (was: )

> Add support for altering a distribution via ALTER TABLE 
> 
>
> Key: FLINK-34172
> URL: https://issues.apache.org/jira/browse/FLINK-34172
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Jim Hughes
>Assignee: Jim Hughes
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34172] Add support for altering a distribution via ALTER TABLE [flink]

2024-06-04 Thread via GitHub


flinkbot commented on PR #24886:
URL: https://github.com/apache/flink/pull/24886#issuecomment-2148475342

   
   ## CI report:
   
   * 4081f6d00def12b386970d94681364dd99d089a6 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35504] Improve Elasticsearch 8 connector observability [flink-connector-elasticsearch]

2024-06-04 Thread via GitHub


liuml07 commented on PR #106:
URL: 
https://github.com/apache/flink-connector-elasticsearch/pull/106#issuecomment-2148526213

   @reswqa Could you help review and merge? Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread xingbe (Jira)
xingbe created FLINK-35522:
--

 Summary: The source task may get stuck after a failover occurs in 
batch jobs
 Key: FLINK-35522
 URL: https://issues.apache.org/jira/browse/FLINK-35522
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.18.1, 1.19.0, 1.17.2, 1.20.0
Reporter: xingbe
 Fix For: 1.20.0


If the source task does not get assigned a split because the SplitEnumerator 
has no more splits, and a failover occurs during the closing process, the 
SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
started source task, causing the source vertex to remain stuck indefinitely. 
This case may only occur in batch jobs where speculative execution has been 
enabled.
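
For reference, the NoMoreSplits signal is delivered through the enumerator 
context; a simplified sketch of the enumerator-side handshake (MySplit, 
pendingSplits, and context are illustrative fields, not the actual coordinator 
code):
{code:java}
// Simplified SplitEnumerator fragment: when a reader requests a split and
// none remain, the enumerator tells that subtask there will be no more.
@Override
public void handleSplitRequest(int subtaskId, String requesterHostname) {
    MySplit split = pendingSplits.poll();
    if (split != null) {
        context.assignSplit(split, subtaskId);
    } else {
        // The reported bug: after the reader fails over, this signal is not
        // re-sent to the restarted execution attempt, so the new task can
        // wait for splits indefinitely.
        context.signalNoMoreSplits(subtaskId);
    }
}
{code}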



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread xingbe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852240#comment-17852240
 ] 

xingbe commented on FLINK-35522:


I have a solution to fix it. Could you please assign this ticket to me? 
[~zhuzh] 

> The source task may get stuck after a failover occurs in batch jobs
> ---
>
> Key: FLINK-35522
> URL: https://issues.apache.org/jira/browse/FLINK-35522
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: xingbe
>Priority: Major
> Fix For: 1.20.0
>
>
> If the source task does not get assigned a split because the SplitEnumerator 
> has no more splits, and a failover occurs during the closing process, the 
> SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
> started source task, causing the source vertex to remain stuck indefinitely. 
> This case may only occur in batch jobs where speculative execution has been 
> enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35354] Support host mapping in Flink tikv cdc [flink-cdc]

2024-06-04 Thread via GitHub


Mrart commented on PR #3336:
URL: https://github.com/apache/flink-cdc/pull/3336#issuecomment-2148692911

   @leonardBang Can you help me review it again?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-35522:
---

Assignee: xingbe

> The source task may get stuck after a failover occurs in batch jobs
> ---
>
> Key: FLINK-35522
> URL: https://issues.apache.org/jira/browse/FLINK-35522
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: xingbe
>Assignee: xingbe
>Priority: Major
> Fix For: 1.20.0
>
>
> If the source task does not get assigned a split because the SplitEnumerator 
> has no more splits, and a failover occurs during the closing process, the 
> SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
> started source task, causing the source vertex to remain stuck indefinitely. 
> This case may only occur in batch jobs where speculative execution has been 
> enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852241#comment-17852241
 ] 

Zhu Zhu commented on FLINK-35522:
-

Thanks for reporting this problem and volunteering to fix it! [~xiasun]
The ticket is assigned to you.

> The source task may get stuck after a failover occurs in batch jobs
> ---
>
> Key: FLINK-35522
> URL: https://issues.apache.org/jira/browse/FLINK-35522
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: xingbe
>Assignee: xingbe
>Priority: Major
> Fix For: 1.20.0
>
>
> If the source task does not get assigned a split because the SplitEnumerator 
> has no more splits, and a failover occurs during the closing process, the 
> SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
> started source task, causing the source vertex to remain stuck indefinitely. 
> This case may only occur in batch jobs where speculative execution has been 
> enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35512) ArtifactFetchManagerTest unit tests fail

2024-06-04 Thread Rob Young (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852243#comment-17852243
 ] 

Rob Young commented on FLINK-35512:
---

The additional artifact looks arbitrary from the ArtifactFetchManager's point 
of view, so instead of depending on the output of the build, maybe we should 
use a file controlled by the test, like:
{code:java}
-    private File getFlinkClientsJar() throws IOException {
-        return TestingUtils.getFileFromTargetDir(
-                ArtifactFetchManager.class,
-                p ->
-                        org.apache.flink.util.FileUtils.isJarFile(p)
-                                && p.toFile().getName().startsWith("flink-clients")
-                                && !p.toFile().getName().contains("test-utils"));
+    private File createArbitraryArtifact() throws IOException {
+        Path tempFile = Files.createTempFile(tempDir, "arbitrary", ".jar");
+        Files.write(tempFile, UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
+        return tempFile.toFile();
+    }
{code}
Usages of `File sourceFile = TestingUtils.getClassFile(getClass());` could also 
be replaced with this, so that all test inputs are generated by the test.

 

> ArtifactFetchManagerTest unit tests fail
> 
>
> Key: FLINK-35512
> URL: https://issues.apache.org/jira/browse/FLINK-35512
> Project: Flink
>  Issue Type: Technical Debt
>Affects Versions: 1.19.1
>Reporter: Hong Liang Teoh
>Priority: Major
> Fix For: 1.19.1
>
>
> The below three tests from *ArtifactFetchManagerTest* seem to fail 
> consistently:
>  * ArtifactFetchManagerTest.testFileSystemFetchWithAdditionalUri
>  * ArtifactFetchManagerTest.testMixedArtifactFetch
>  * ArtifactFetchManagerTest.testHttpFetch
> The error printed is
> {code:java}
> java.lang.AssertionError:
> Expecting actual not to be empty
>     at 
> org.apache.flink.client.program.artifact.ArtifactFetchManagerTest.getFlinkClientsJar(ArtifactFetchManagerTest.java:248)
>     at 
> org.apache.flink.client.program.artifact.ArtifactFetchManagerTest.testMixedArtifactFetch(ArtifactFetchManagerTest.java:146)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
>     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>     at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>     at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
>     at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) 
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-06-04 Thread via GitHub


liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1626831685


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java:
##
@@ -1723,6 +1724,83 @@ private Stream<TestSetSpec> arrayExceptTestCases() {
                                 + "ARRAY_EXCEPT(<COMMON>, <COMMON>)"));
     }
 
+    private Stream<TestSetSpec> arrayIntersectTestCases() {
+        return Stream.of(
+                TestSetSpec.forFunction(BuiltInFunctionDefinitions.ARRAY_INTERSECT)
+                        .onFieldsWithData(
+                                new Integer[] {1, 1, 2},
+                                null,
+                                new Row[] {Row.of(true, 1), Row.of(true, 2), null},
+                                new Integer[] {null, null, 1},
+                                new Map[] {
+                                    CollectionUtil.map(entry(1, "a"), entry(2, "b")),
+                                    CollectionUtil.map(entry(3, "c"), entry(4, "d"))
+                                },
+                                new Integer[][] {new Integer[] {1, 2, 3}})
+                        .andDataTypes(
+                                DataTypes.ARRAY(DataTypes.INT()),
+                                DataTypes.ARRAY(DataTypes.INT()),
+                                DataTypes.ARRAY(
+                                        DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.INT())),
+                                DataTypes.ARRAY(DataTypes.INT()),
+                                DataTypes.ARRAY(DataTypes.MAP(DataTypes.INT(), DataTypes.STRING())),
+                                DataTypes.ARRAY(DataTypes.ARRAY(DataTypes.INT())))
+                        // ARRAY<INT>
+                        .testResult(
+                                $("f0").arrayIntersect(new Integer[] {1, null, 4}),
+                                "ARRAY_INTERSECT(f0, ARRAY[1, NULL, 4])",
+                                new Integer[] {1, 1},

Review Comment:
   @snuyanzin  fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-06-04 Thread via GitHub


liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1626831821


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java:
##
@@ -1723,6 +1724,83 @@ private Stream<TestSetSpec> arrayExceptTestCases() {
                                 + "ARRAY_EXCEPT(<COMMON>, <COMMON>)"));
     }
 
+    private Stream<TestSetSpec> arrayIntersectTestCases() {
+        return Stream.of(
+                TestSetSpec.forFunction(BuiltInFunctionDefinitions.ARRAY_INTERSECT)
+                        .onFieldsWithData(
+                                new Integer[] {1, 1, 2},
+                                null,
+                                new Row[] {Row.of(true, 1), Row.of(true, 2), null},
+                                new Integer[] {null, null, 1},
+                                new Map[] {
+                                    CollectionUtil.map(entry(1, "a"), entry(2, "b")),
+                                    CollectionUtil.map(entry(3, "c"), entry(4, "d"))
+                                },
+                                new Integer[][] {new Integer[] {1, 2, 3}})
+                        .andDataTypes(
+                                DataTypes.ARRAY(DataTypes.INT()),
+                                DataTypes.ARRAY(DataTypes.INT()),
+                                DataTypes.ARRAY(
+                                        DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.INT())),
+                                DataTypes.ARRAY(DataTypes.INT()),
+                                DataTypes.ARRAY(DataTypes.MAP(DataTypes.INT(), DataTypes.STRING())),
+                                DataTypes.ARRAY(DataTypes.ARRAY(DataTypes.INT())))
+                        // ARRAY<INT>
+                        .testResult(
+                                $("f0").arrayIntersect(new Integer[] {1, null, 4}),
+                                "ARRAY_INTERSECT(f0, ARRAY[1, NULL, 4])",
+                                new Integer[] {1, 1},
+                                DataTypes.ARRAY(DataTypes.INT()))
+                        .testResult(
+                                $("f0").arrayIntersect(new Integer[] {3, 4}),
+                                "ARRAY_INTERSECT(f0, ARRAY[3, 4])",
+                                new Integer[] {},
+                                DataTypes.ARRAY(DataTypes.INT()))
+                        .testResult(
+                                $("f1").arrayIntersect(new Integer[] {1, null, 4}),
+                                "ARRAY_INTERSECT(f1, ARRAY[1, NULL, 4])",
+                                null,
+                                DataTypes.ARRAY(DataTypes.INT()))
+                        // ARRAY<ROW<BOOLEAN, INT>>
+                        .testResult(
+                                $("f2").arrayIntersect(
+                                        new Row[] {
+                                            null, Row.of(true, 2),
+                                        }),
+                                "ARRAY_INTERSECT(f2, ARRAY[NULL, ROW(TRUE, 2)])",
+                                new Row[] {Row.of(true, 2), null},
+                                DataTypes.ARRAY(
+                                        DataTypes.ROW(DataTypes.BOOLEAN(), DataTypes.INT())))
+                        // arrayOne contains null elements
+                        .testResult(
+                                $("f3").arrayIntersect(new Integer[] {null, 42}),
+                                "ARRAY_INTERSECT(f3, ARRAY[null, 42])",
+                                new Integer[] {null, null},

Review Comment:
   fixed @snuyanzin 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35522) The source task may get stuck after a failover occurs in batch jobs

2024-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35522:
---
Labels: pull-request-available  (was: )

> The source task may get stuck after a failover occurs in batch jobs
> ---
>
> Key: FLINK-35522
> URL: https://issues.apache.org/jira/browse/FLINK-35522
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.2, 1.19.0, 1.18.1, 1.20.0
>Reporter: xingbe
>Assignee: xingbe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> If the source task does not get assigned a split because the SplitEnumerator 
> has no more splits, and a failover occurs during the closing process, the 
> SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
> started source task, causing the source vertex to remain stuck indefinitely. 
> This case may only occur in batch jobs where speculative execution has been 
> enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35522][runtime] Fix the issue that the source task may get stuck in speculative execution mode. [flink]

2024-06-04 Thread via GitHub


SinBex opened a new pull request, #24887:
URL: https://github.com/apache/flink/pull/24887

   ## What is the purpose of the change
   
   If the source task does not get assigned a split because the SplitEnumerator 
has no more splits, and a failover occurs during the closing process, the 
SourceCoordinatorContext will not resend the NoMoreSplit event to the newly 
started source task, causing the source vertex to remain stuck indefinitely. 
   This case may only occur in batch jobs where speculative execution has been 
enabled.
   
   
   ## Brief change log
   
 - fix the issue that the source task may get stuck in speculative 
execution mode.
   
   
   ## Verifying this change
   This change added tests and can be verified as follows:
   
 - *Added an IT case to verify the fix.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-06-04 Thread via GitHub


liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2148729746

   @snuyanzin fixed per your review, thanks very much


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [BP-3.1.1][FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink (#3233) [flink-cdc]

2024-06-04 Thread via GitHub


loserwang1024 opened a new pull request, #3386:
URL: https://github.com/apache/flink-cdc/pull/3386

   BP https://issues.apache.org/jira/browse/FLINK-35149 to 3.1.1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [BP-3.1.1][FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink (#3233) [flink-cdc]

2024-06-04 Thread via GitHub


loserwang1024 closed pull request #3386: [BP-3.1.1][FLINK-35149][cdc-composer] 
Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not 
TwoPhaseCommittingSink (#3233)
URL: https://github.com/apache/flink-cdc/pull/3386


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [BP-3.1.1][FLINK-35149][cdc-composer] Fix DataSinkTranslator#sinkTo ignoring pre-write topology if not TwoPhaseCommittingSink (#3233) [flink-cdc]

2024-06-04 Thread via GitHub


loserwang1024 opened a new pull request, #3387:
URL: https://github.com/apache/flink-cdc/pull/3387

   BP https://issues.apache.org/jira/browse/FLINK-35149 to 3.1.1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


