[jira] [Updated] (FLINK-34545) Add OceanBase pipeline connector to Flink CDC

2024-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34545:
---
Labels: pull-request-available  (was: )

> Add OceanBase pipeline connector to Flink CDC
> -
>
> Key: FLINK-34545
> URL: https://issues.apache.org/jira/browse/FLINK-34545
> Project: Flink
>  Issue Type: New Feature
>  Components: Flink CDC
>Reporter: He Wang
>Priority: Major
>  Labels: pull-request-available
>
> Flink CDC supports end-to-end jobs starting from version 3.0.0, but currently 
> only supports sinking to Doris and StarRocks. OceanBase has made a lot of 
> improvements on AP (analytical processing) recently, so supporting writes to 
> OceanBase via Flink CDC would be helpful to users.
> OceanBase maintains a sink connector that supports multi-table sink and DDL 
> sink. We can use it to build the pipeline connector. 
> [https://github.com/oceanbase/flink-connector-oceanbase]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33892][runtime] Support Job Recovery from JobMaster Failures for Batch Jobs. [flink]

2024-05-24 Thread via GitHub


zhuzhurk commented on code in PR #24771:
URL: https://github.com/apache/flink/pull/24771#discussion_r1612975806


##
flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/BatchJobRecoveryTest.java:
##
@@ -0,0 +1,1217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler.adaptivebatch;
+
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.api.common.eventtime.WatermarkAlignmentParams;
+import org.apache.flink.api.connector.source.Boundedness;
+import org.apache.flink.api.connector.source.mocks.MockSource;
+import org.apache.flink.api.connector.source.mocks.MockSourceSplit;
+import org.apache.flink.api.connector.source.mocks.MockSplitEnumerator;
+import org.apache.flink.configuration.BatchExecutionOptions;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.runtime.clusterframework.types.ResourceID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.DefaultExecutionGraph;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionGraph;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.executiongraph.IntermediateResultPartition;
+import org.apache.flink.runtime.executiongraph.InternalExecutionGraphAccessor;
+import org.apache.flink.runtime.executiongraph.ResultPartitionBytes;
+import org.apache.flink.runtime.executiongraph.TestingComponentMainThreadExecutor;
+import org.apache.flink.runtime.executiongraph.failover.FixedDelayRestartBackoffTimeStrategy;
+import org.apache.flink.runtime.io.network.partition.JobMasterPartitionTracker;
+import org.apache.flink.runtime.io.network.partition.JobMasterPartitionTrackerImpl;
+import org.apache.flink.runtime.io.network.partition.PartitionNotFoundException;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.jobgraph.JobVertex;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobgraph.OperatorID;
+import org.apache.flink.runtime.jobmaster.event.ExecutionVertexFinishedEvent;
+import org.apache.flink.runtime.jobmaster.event.FileSystemJobEventStore;
+import org.apache.flink.runtime.jobmaster.event.JobEvent;
+import org.apache.flink.runtime.jobmaster.event.JobEventManager;
+import org.apache.flink.runtime.jobmaster.event.JobEventStore;
+import org.apache.flink.runtime.jobmaster.utils.TestingJobMasterGateway;
+import org.apache.flink.runtime.jobmaster.utils.TestingJobMasterGatewayBuilder;
+import org.apache.flink.runtime.operators.coordination.EventReceivingTasks;
+import org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder;
+import org.apache.flink.runtime.operators.coordination.RecreateOnResetOperatorCoordinator;
+import org.apache.flink.runtime.operators.coordination.TestingOperatorCoordinator;
+import org.apache.flink.runtime.scheduler.DefaultSchedulerBuilder;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingResultPartition;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingTopology;
+import org.apache.flink.runtime.shuffle.DefaultShuffleMetrics;
+import org.apache.flink.runtime.shuffle.JobShuffleContextImpl;
+import org.apache.flink.runtime.shuffle.NettyShuffleDescriptor;
+import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
+import org.apache.flink.runtime.shuffle.PartitionWithMetrics;
+import org.apache.flink.runtime.shuffle.ShuffleDescriptor;
+import org.apache.flink.runtime.shuffle.ShuffleMaster;
+import org.apache.fli

Re: [PR] [FLINK-33892][runtime] Support Job Recovery from JobMaster Failures for Batch Jobs. [flink]

2024-05-24 Thread via GitHub


zhuzhurk commented on code in PR #24771:
URL: https://github.com/apache/flink/pull/24771#discussion_r1612986632


##
flink-tests/src/test/java/org/apache/flink/test/scheduling/JMFailoverITCase.java:
##
@@ -98,7 +98,7 @@
 import java.util.stream.IntStream;
 
 import static org.apache.flink.util.Preconditions.checkState;
-import static org.junit.Assert.assertEquals;
+import static org.assertj.core.api.Assertions.assertThat;
 
 /** ITCase for JM failover. */
 public class JMFailoverITCase {

Review Comment:
   JUnit 5 should be used.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33892][runtime] Support Job Recovery from JobMaster Failures for Batch Jobs. [flink]

2024-05-24 Thread via GitHub


JunRuiLee commented on code in PR #24771:
URL: https://github.com/apache/flink/pull/24771#discussion_r1612997227


##
flink-tests/src/test/java/org/apache/flink/test/scheduling/JMFailoverITCase.java:
##
@@ -98,7 +98,7 @@
 import java.util.stream.IntStream;
 
 import static org.apache.flink.util.Preconditions.checkState;
-import static org.junit.Assert.assertEquals;
+import static org.assertj.core.api.Assertions.assertThat;
 
 /** ITCase for JM failover. */
 public class JMFailoverITCase {

Review Comment:
   This case utilizes the TestExecutorResource, which is compatible with JUnit 
4.
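
   For reference, a possible JUnit 5 migration path (a sketch assuming Flink's `TestExecutorExtension` as the JUnit 5 counterpart of the `TestExecutorResource` rule; names to be verified against flink-test-utils):

   ```java
   import java.util.concurrent.Executors;
   import java.util.concurrent.ScheduledExecutorService;

   import org.apache.flink.testutils.executor.TestExecutorExtension;

   import org.junit.jupiter.api.extension.RegisterExtension;

   class JMFailoverITCase {
       // Would replace the JUnit 4 @ClassRule TestExecutorResource.
       @RegisterExtension
       static final TestExecutorExtension<ScheduledExecutorService> EXECUTOR_EXTENSION =
               new TestExecutorExtension<>(Executors::newSingleThreadScheduledExecutor);
   }
   ```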



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-34380) Strange RowKind and records about intermediate output when using minibatch join

2024-05-24 Thread xuyang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849209#comment-17849209
 ] 

xuyang commented on FLINK-34380:


Hi, [~rovboyko]. +1 to fix the wrong order. I'll take a look at your PR later.

> Strange RowKind and records about intermediate output when using minibatch 
> join
> ---
>
> Key: FLINK-34380
> URL: https://issues.apache.org/jira/browse/FLINK-34380
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: xuyang
>Priority: Major
> Fix For: 1.20.0
>
>
> {code:java}
> // Add it in CalcItCase
> @Test
>   def test(): Unit = {
> env.setParallelism(1)
> val rows = Seq(
>   changelogRow("+I", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("-U", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("+U", java.lang.Integer.valueOf(1), "99"),
>   changelogRow("-D", java.lang.Integer.valueOf(1), "99")
> )
> val dataId = TestValuesTableFactory.registerData(rows)
> val ddl =
>   s"""
>  |CREATE TABLE t1 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl)
> val ddl2 =
>   s"""
>  |CREATE TABLE t2 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl2)
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ENABLED, Boolean.box(true))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ALLOW_LATENCY, Duration.ofSeconds(5))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_SIZE, Long.box(3L))
> println(tEnv.sqlQuery("SELECT * from t1 join t2 on t1.a = t2.a").explain())
> tEnv.executeSql("SELECT * from t1 join t2 on t1.a = t2.a").print()
>   } {code}
> Output:
> {code:java}
> ++-+-+-+-+
> | op |           a |               b |          a0 |      b0 |
> ++-+-+-+-+
> | +U |           1 |               1 |           1 |      99 |
> | +U |           1 |              99 |           1 |      99 |
> | -U |           1 |               1 |           1 |      99 |
> | -D |           1 |              99 |           1 |      99 |
> ++-+-+-+-+{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35444) Paimon Pipeline Connector support changing column names to lowercase for Hive metastore

2024-05-24 Thread LvYanquan (Jira)
LvYanquan created FLINK-35444:
-

 Summary: Paimon Pipeline Connector support changing column names 
to lowercase for Hive metastore
 Key: FLINK-35444
 URL: https://issues.apache.org/jira/browse/FLINK-35444
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Affects Versions: cdc-3.2.0
Reporter: LvYanquan
 Fix For: cdc-3.2.0


Hive metastore requires column names to be lowercase, but field names from 
upstream tables may not meet this requirement. 
We can add a configuration parameter to the sink that converts all column names 
to lowercase.
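
A minimal sketch of the idea (a hypothetical helper, not the actual Paimon connector API):
{code:java}
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical illustration: rewrite every upstream column name to
// lowercase before creating the table in the Hive metastore.
final class LowercaseColumnNames {
    static List<String> apply(List<String> upstreamColumns) {
        return upstreamColumns.stream()
                .map(name -> name.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }
}
{code}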



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1613069596


##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.functions.scalar;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.table.data.ArrayData;
+import org.apache.flink.table.data.GenericArrayData;
+import org.apache.flink.table.functions.BuiltInFunctionDefinitions;
+import org.apache.flink.table.functions.FunctionContext;
+import org.apache.flink.table.functions.SpecializedFunction;
+import org.apache.flink.table.runtime.util.EqualityAndHashcodeProvider;
+import org.apache.flink.table.runtime.util.ObjectContainer;
+import org.apache.flink.table.types.CollectionDataType;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.util.FlinkRuntimeException;
+
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+/** Implementation of {@link BuiltInFunctionDefinitions#ARRAY_INTERSECT}. */
+@Internal
+public class ArrayIntersectFunction extends BuiltInScalarFunction {
+    private final ArrayData.ElementGetter elementGetter;
+    private final EqualityAndHashcodeProvider equalityAndHashcodeProvider;
+
+    public ArrayIntersectFunction(SpecializedFunction.SpecializedContext context) {
+        super(BuiltInFunctionDefinitions.ARRAY_INTERSECT, context);
+        final DataType dataType =
+                ((CollectionDataType) context.getCallContext().getArgumentDataTypes().get(0))
+                        .getElementDataType()
+                        .toInternal();
+        elementGetter = ArrayData.createElementGetter(dataType.toInternal().getLogicalType());
+        this.equalityAndHashcodeProvider = new EqualityAndHashcodeProvider(context, dataType);
+    }
+
+    @Override
+    public void open(FunctionContext context) throws Exception {
+        equalityAndHashcodeProvider.open(context);
+    }
+
+    public @Nullable ArrayData eval(ArrayData arrayOne, ArrayData arrayTwo) {
+        try {
+            if (arrayOne == null || arrayTwo == null) {
+                return null;
+            }
+
+            List<Object> list = new ArrayList<>();
+            Set<ObjectContainer> set = new HashSet<>();
+            for (int pos = 0; pos < arrayTwo.size(); pos++) {
+                final Object element = elementGetter.getElementOrNull(arrayTwo, pos);
+                final ObjectContainer objectContainer = createObjectContainer(element);
+                set.add(objectContainer);
+            }
+            for (int pos = 0; pos < arrayOne.size(); pos++) {
+                final Object element = elementGetter.getElementOrNull(arrayOne, pos);
+                final ObjectContainer objectContainer = createObjectContainer(element);
+                if (set.contains(objectContainer)) {
+                    list.add(element);
+                }
+            }
+            return new GenericArrayData(list.toArray());
+        } catch (Throwable t) {
+            throw new FlinkRuntimeException(t);

Review Comment:
   From the discussions in 
[https://github.com/apache/flink/pull/24773](https://github.com/apache/flink/pull/24773)
 I wonder if we should return null on a runtime exception. For this case, null 
seems to be what is returned if we cannot do an intersect, so it would make 
sense to return null if there is a runtime error. I therefore suggest 
catching RuntimeException, logging, and returning null.
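
   A rough sketch of that suggestion (`LOG` is an assumed slf4j logger and `evalInternal` a hypothetical extraction of the existing try-block body; neither is in the PR):

   ```java
   public @Nullable ArrayData eval(ArrayData arrayOne, ArrayData arrayTwo) {
       try {
           return evalInternal(arrayOne, arrayTwo); // the current try-block logic
       } catch (RuntimeException e) {
           // Treat a runtime failure as "no computable intersection".
           LOG.warn("ARRAY_INTERSECT could not be evaluated, returning null", e);
           return null;
       }
   }
   ```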



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1613069596


##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.functions.scalar;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.table.data.ArrayData;
+import org.apache.flink.table.data.GenericArrayData;
+import org.apache.flink.table.functions.BuiltInFunctionDefinitions;
+import org.apache.flink.table.functions.FunctionContext;
+import org.apache.flink.table.functions.SpecializedFunction;
+import org.apache.flink.table.runtime.util.EqualityAndHashcodeProvider;
+import org.apache.flink.table.runtime.util.ObjectContainer;
+import org.apache.flink.table.types.CollectionDataType;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.util.FlinkRuntimeException;
+
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+/** Implementation of {@link BuiltInFunctionDefinitions#ARRAY_INTERSECT}. */
+@Internal
+public class ArrayIntersectFunction extends BuiltInScalarFunction {
+    private final ArrayData.ElementGetter elementGetter;
+    private final EqualityAndHashcodeProvider equalityAndHashcodeProvider;
+
+    public ArrayIntersectFunction(SpecializedFunction.SpecializedContext context) {
+        super(BuiltInFunctionDefinitions.ARRAY_INTERSECT, context);
+        final DataType dataType =
+                ((CollectionDataType) context.getCallContext().getArgumentDataTypes().get(0))
+                        .getElementDataType()
+                        .toInternal();
+        elementGetter = ArrayData.createElementGetter(dataType.toInternal().getLogicalType());
+        this.equalityAndHashcodeProvider = new EqualityAndHashcodeProvider(context, dataType);
+    }
+
+    @Override
+    public void open(FunctionContext context) throws Exception {
+        equalityAndHashcodeProvider.open(context);
+    }
+
+    public @Nullable ArrayData eval(ArrayData arrayOne, ArrayData arrayTwo) {
+        try {
+            if (arrayOne == null || arrayTwo == null) {
+                return null;
+            }
+
+            List<Object> list = new ArrayList<>();
+            Set<ObjectContainer> set = new HashSet<>();
+            for (int pos = 0; pos < arrayTwo.size(); pos++) {
+                final Object element = elementGetter.getElementOrNull(arrayTwo, pos);
+                final ObjectContainer objectContainer = createObjectContainer(element);
+                set.add(objectContainer);
+            }
+            for (int pos = 0; pos < arrayOne.size(); pos++) {
+                final Object element = elementGetter.getElementOrNull(arrayOne, pos);
+                final ObjectContainer objectContainer = createObjectContainer(element);
+                if (set.contains(objectContainer)) {
+                    list.add(element);
+                }
+            }
+            return new GenericArrayData(list.toArray());
+        } catch (Throwable t) {
+            throw new FlinkRuntimeException(t);

Review Comment:
   From the discussions in 
[https://github.com/apache/flink/pull/24773](https://github.com/apache/flink/pull/24773)
 I wonder if we should return null on a runtime exception. For this case, null 
is returned if we cannot do an intersect, so it would make sense to return 
null if there is a runtime error. I therefore suggest catching 
RuntimeException, logging, and returning null.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35437] Rewrite BlockStatementGrouper so that it uses less memory [flink]

2024-05-24 Thread via GitHub


dawidwys merged PR #24834:
URL: https://github.com/apache/flink/pull/24834


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (FLINK-35437) BlockStatementGrouper uses lots of memory

2024-05-24 Thread Dawid Wysakowicz (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz resolved FLINK-35437.
--
Resolution: Fixed

Fixed in 3d40bd7dd197b12b7b156bd758b4129148e885d1

> BlockStatementGrouper uses lots of memory
> -
>
> Key: FLINK-35437
> URL: https://issues.apache.org/jira/browse/FLINK-35437
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>
> For deeply nested {{if else}} statements, {{BlockStatementGrouper}} uses loads 
> of memory and quickly fails with an OOM.
> When running JMs with around 400 MB of memory, a query like:
> {code}
> select case when orderid = 0 then 1 when orderid = 1 then 2 when orderid
> = 2 then 3 when orderid = 3 then 4 when orderid = 4 then 5 when orderid = 
> 5 then
> 6 when orderid = 6 then 7 when orderid = 7 then 8 when orderid = 8 then 9 
> when
> orderid = 9 then 10 when orderid = 10 then 11 when orderid = 11 then 12 
> when orderid
> = 12 then 13 when orderid = 13 then 14 when orderid = 14 then 15 when 
> orderid
> = 15 then 16 when orderid = 16 then 17 when orderid = 17 then 18 when 
> orderid
> = 18 then 19 when orderid = 19 then 20 when orderid = 20 then 21 when 
> orderid
> = 21 then 22 when orderid = 22 then 23 when orderid = 23 then 24 when 
> orderid
> = 24 then 25 when orderid = 25 then 26 when orderid = 26 then 27 when 
> orderid
> = 27 then 28 when orderid = 28 then 29 when orderid = 29 then 30 when 
> orderid
> = 30 then 31 when orderid = 31 then 32 when orderid = 32 then 33 when 
> orderid
> = 33 then 34 when orderid = 34 then 35 when orderid = 35 then 36 when 
> orderid
> = 36 then 37 when orderid = 37 then 38 when orderid = 38 then 39 when 
> orderid
> = 39 then 40 when orderid = 40 then 41 when orderid = 41 then 42 when 
> orderid
> = 42 then 43 when orderid = 43 then 44 when orderid = 44 then 45 when 
> orderid
> = 45 then 46 when orderid = 46 then 47 when orderid = 47 then 48 when 
> orderid
> = 48 then 49 when orderid = 49 then 50 when orderid = 50 then 51 when 
> orderid
> = 51 then 52 when orderid = 52 then 53 when orderid = 53 then 54 when 
> orderid
> = 54 then 55 when orderid = 55 then 56 when orderid = 56 then 57 when 
> orderid
> = 57 then 58 when orderid = 58 then 59 when orderid = 59 then 60 when 
> orderid
> = 60 then 61 when orderid = 61 then 62 when orderid = 62 then 63 when 
> orderid
> = 63 then 64 when orderid = 64 then 65 when orderid = 65 then 66 when 
> orderid
> = 66 then 67 when orderid = 67 then 68 when orderid = 68 then 69 when 
> orderid
> = 69 then 70 when orderid = 70 then 71 when orderid = 71 then 72 when 
> orderid
> = 72 then 73 when orderid = 73 then 74 when orderid = 74 then 75 when 
> orderid
> = 75 then 76 when orderid = 76 then 77 when orderid = 77 then 78 when 
> orderid
> = 78 then 79 when orderid = 79 then 80 when orderid = 80 then 81 when 
> orderid
> = 81 then 82 when orderid = 82 then 83 when orderid = 83 then 84 when 
> orderid
> = 84 then 85 when orderid = 85 then 86 when orderid = 86 then 87 when 
> orderid
> = 87 then 88 when orderid = 88 then 89 when orderid = 89 then 90 when 
> orderid
> = 90 then 91 when orderid = 91 then 92 when orderid = 92 then 93 when 
> orderid
> = 93 then 94 when orderid = 94 then 95 when orderid = 95 then 96 when 
> orderid
> = 96 then 97 when orderid = 97 then 98 when orderid = 98 then 99 when 
> orderid
> = 99 then 100 when orderid = 100 then 101 when orderid = 101 then 102 
> when orderid
> = 102 then 103 when orderid = 103 then 104 when orderid = 104 then 105 
> when orderid
> = 105 then 106 when orderid = 106 then 107 when orderid = 107 then 108 
> when orderid
> = 108 then 109 when orderid = 109 then 110 when orderid = 110 then 111 
> when orderid
> = 111 then 112 when orderid = 112 then 113 when orderid = 113 then 114 
> when orderid
> = 114 then 115 when orderid = 115 then 116 when orderid = 116 then 117 
> when orderid
> = 117 then 118 when orderid = 118 then 119 when orderid = 119 then 120 
> when orderid
> = 120 then 121 when orderid = 121 then 122 when orderid = 122 then 123 
> when orderid
> = 123 then 124 when orderid = 124 then 125 when orderid = 125 then 126 
> when orderid
> = 126 then 127 when orderid = 127 then 128 when orderid = 128 then 129 
> when orderid
> = 129 then 130 when orderid = 130 then 131 when orderid = 131 then 132 
> when orderid
> = 132 then 133 when orderid = 133 then 134 when orderid = 134 then 135 
> when orderid
> = 135 then 136 when o

Re: [PR] [FLINK-31664] Implement ARRAY_INTERSECT function [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1613069596


##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.functions.scalar;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.table.data.ArrayData;
+import org.apache.flink.table.data.GenericArrayData;
+import org.apache.flink.table.functions.BuiltInFunctionDefinitions;
+import org.apache.flink.table.functions.FunctionContext;
+import org.apache.flink.table.functions.SpecializedFunction;
+import org.apache.flink.table.runtime.util.EqualityAndHashcodeProvider;
+import org.apache.flink.table.runtime.util.ObjectContainer;
+import org.apache.flink.table.types.CollectionDataType;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.util.FlinkRuntimeException;
+
+import javax.annotation.Nullable;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+/** Implementation of {@link BuiltInFunctionDefinitions#ARRAY_INTERSECT}. */
+@Internal
+public class ArrayIntersectFunction extends BuiltInScalarFunction {
+    private final ArrayData.ElementGetter elementGetter;
+    private final EqualityAndHashcodeProvider equalityAndHashcodeProvider;
+
+    public ArrayIntersectFunction(SpecializedFunction.SpecializedContext context) {
+        super(BuiltInFunctionDefinitions.ARRAY_INTERSECT, context);
+        final DataType dataType =
+                ((CollectionDataType) context.getCallContext().getArgumentDataTypes().get(0))
+                        .getElementDataType()
+                        .toInternal();
+        elementGetter = ArrayData.createElementGetter(dataType.toInternal().getLogicalType());
+        this.equalityAndHashcodeProvider = new EqualityAndHashcodeProvider(context, dataType);
+    }
+
+    @Override
+    public void open(FunctionContext context) throws Exception {
+        equalityAndHashcodeProvider.open(context);
+    }
+
+    public @Nullable ArrayData eval(ArrayData arrayOne, ArrayData arrayTwo) {
+        try {
+            if (arrayOne == null || arrayTwo == null) {
+                return null;
+            }
+
+            List<Object> list = new ArrayList<>();
+            Set<ObjectContainer> set = new HashSet<>();
+            for (int pos = 0; pos < arrayTwo.size(); pos++) {
+                final Object element = elementGetter.getElementOrNull(arrayTwo, pos);
+                final ObjectContainer objectContainer = createObjectContainer(element);
+                set.add(objectContainer);
+            }
+            for (int pos = 0; pos < arrayOne.size(); pos++) {
+                final Object element = elementGetter.getElementOrNull(arrayOne, pos);
+                final ObjectContainer objectContainer = createObjectContainer(element);
+                if (set.contains(objectContainer)) {
+                    list.add(element);
+                }
+            }
+            return new GenericArrayData(list.toArray());
+        } catch (Throwable t) {
+            throw new FlinkRuntimeException(t);

Review Comment:
   From the discussions in 
[https://github.com/apache/flink/pull/24773](https://github.com/apache/flink/pull/24773)
 I wonder if we should return null on a runtime exception. For this case, null 
is returned if we cannot do an intersect, so it would make sense to return 
null if there is a runtime error. From your tests this will occur if the types 
do not match in array 1 or array 2; it could be argued that there is no 
intersection in this case, so return null. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35434] Support pass exception in StateExecutor to runtime [flink]

2024-05-24 Thread via GitHub


zoltar9264 commented on code in PR #24833:
URL: https://github.com/apache/flink/pull/24833#discussion_r1613094184


##
flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/ForStWriteBatchOperation.java:
##
@@ -75,7 +74,10 @@ public CompletableFuture process() {
 request.completeStateFuture();
 }
 } catch (Exception e) {
-throw new CompletionException("Error while adding data to ForStDB", e);
+for (ForStDBPutRequest request : batchRequest) {

Review Comment:
   Thanks for the reminder @masteryhx. Firstly, this batch operation should indeed 
fail in this case. Currently, the AsyncFrameworkExceptionHandler will fail the 
job directly, and then the StateExecutor will be destroyed. I don't see the need 
to let the state executor continue executing, so I will fail the executor and 
update the PR.
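
   A rough sketch of that direction (hypothetical: only `completeStateFuture()` is visible in this diff, so `completeStateFutureExceptionally` is an assumed API):

   ```java
   } catch (Exception e) {
       for (ForStDBPutRequest<?, ?> request : batchRequest) {
           // Hypothetical: propagate the failure to every pending state future
           // so callers fail fast instead of seeing one wrapped batch exception.
           request.completeStateFutureExceptionally("Error while adding data to ForStDB", e);
       }
       throw e; // still fail the executor
   }
   ```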



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35326][hive] add lineage integration for Hive connector [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24835:
URL: https://github.com/apache/flink/pull/24835#discussion_r1613096352


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/DefaultLineageDataset.java:
##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.flink.streaming.api.lineage;
+
+import org.apache.flink.annotation.Internal;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/** Default implementation for {@link LineageDataset}. */
+@Internal
+public class DefaultLineageDataset implements LineageDataset {

Review Comment:
   Why are these classes called Default? I see there exists a 
DefaultLineageGraph, which seems a reasonable default.
I wonder whether it would be better to call this HiveLineageDataset and use 
a `Hive` prefix instead of `Default`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34545][cdc-pipeline-connector]Add OceanBase pipeline connector to Flink CDC [flink-cdc]

2024-05-24 Thread via GitHub


yuanoOo commented on PR #3360:
URL: https://github.com/apache/flink-cdc/pull/3360#issuecomment-2128951367

   Please trigger the CI again; I have removed the snapshot dependency. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35301][cdc] fix deadlock when loading driver classes [flink-cdc]

2024-05-24 Thread via GitHub


leonardBang commented on code in PR #3300:
URL: https://github.com/apache/flink-cdc/pull/3300#discussion_r1613103947


##
flink-cdc-connect/flink-cdc-source-connectors/flink-cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/relational/connection/JdbcConnectionPoolFactory.java:
##
@@ -43,6 +45,7 @@ public HikariDataSource createPooledDataSource(JdbcSourceConfig sourceConfig) {
 config.setMinimumIdle(MINIMUM_POOL_SIZE);
 config.setMaximumPoolSize(sourceConfig.getConnectionPoolSize());
 config.setConnectionTimeout(sourceConfig.getConnectTimeout().toMillis());
+DriverManager.getDrivers();

Review Comment:
   Could you add a note for this?
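
   For example, something along these lines (wording is only a suggestion):

   ```java
   // Eagerly initialize java.sql.DriverManager before any connection is created.
   // On JDK 8 and earlier, concurrent Class.forName("com.xx.Driver") calls can
   // deadlock against DriverManager's static initializer (see FLINK-19435), so
   // trigger that initialization up front.
   DriverManager.getDrivers();
   ```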



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35326][hive] add lineage integration for Hive connector [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24835:
URL: https://github.com/apache/flink/pull/24835#discussion_r1613096352


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/DefaultLineageDataset.java:
##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.flink.streaming.api.lineage;
+
+import org.apache.flink.annotation.Internal;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/** Default implementation for {@link LineageDataset}. */
+@Internal
+public class DefaultLineageDataset implements LineageDataset {

Review Comment:
   Why are these classes called Default? I see there exists a 
DefaultLineageGraph, which seems a reasonable default.
I wonder whether it would be better to call this HiveLineageDataset and use 
a `Hive` prefix instead of `Default`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35326][hive] add lineage integration for Hive connector [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24835:
URL: https://github.com/apache/flink/pull/24835#discussion_r1613106837


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/DefaultLineageDataset.java:
##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.flink.streaming.api.lineage;
+
+import org.apache.flink.annotation.Internal;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/** Default implementation for {@link LineageDataset}. */
+@Internal
+public class DefaultLineageDataset implements LineageDataset {

Review Comment:
   We should have unit tests for these new classes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35326][hive] add lineage integration for Hive connector [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24835:
URL: https://github.com/apache/flink/pull/24835#discussion_r1613112277


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/DefaultLineageDataset.java:
##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.flink.streaming.api.lineage;
+
+import org.apache.flink.annotation.Internal;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/** Default implementation for {@link LineageDataset}. */
+@Internal

Review Comment:
   I wonder whether these names should be LineageDatasetImpl instead of Default.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35326][hive] add lineage integration for Hive connector [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24835:
URL: https://github.com/apache/flink/pull/24835#discussion_r1613118293


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/DefaultLineageDataset.java:
##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.flink.streaming.api.lineage;
+
+import org.apache.flink.annotation.Internal;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/** Default implementation for {@link LineageDataset}. */
+@Internal

Review Comment:
   Why is this @Internal? Is it not the case that connectors that do not live 
in this repository are likely to use these implementation classes? I suggest 
using @PublicEvolving, in line with the interface.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35242] Optimize schema evolution & add SE IT cases [flink-cdc]

2024-05-24 Thread via GitHub


yuxiqian commented on PR #3339:
URL: https://github.com/apache/flink-cdc/pull/3339#issuecomment-2128993381

   @PatrickRen @leonardBang PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-35272) Pipeline Transform job supports omitting / renaming calculation column

2024-05-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-35272:
--

Assignee: yux

> Pipeline Transform job supports omitting / renaming calculation column
> --
>
> Key: FLINK-35272
> URL: https://issues.apache.org/jira/browse/FLINK-35272
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: yux
>Assignee: yux
>Priority: Major
>  Labels: pull-request-available
>
> Currently, pipeline transform rules require that all columns used in expression 
> calculation be present in the final projected schema, and that they not be 
> renamed or omitted.
>  
> The reason behind this is that any column not directly present in the projection 
> rules will be filtered out in the PreProjection step, and the PostProjection 
> process then cannot find those absent but indirectly depended-on columns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35272) Pipeline Transform job supports omitting / renaming calculation column

2024-05-24 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849224#comment-17849224
 ] 

Leonard Xu commented on FLINK-35272:


Thanks [~xiqian_yu] for taking this ticket, assigned it to you!

> Pipeline Transform job supports omitting / renaming calculation column
> --
>
> Key: FLINK-35272
> URL: https://issues.apache.org/jira/browse/FLINK-35272
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: yux
>Assignee: yux
>Priority: Major
>  Labels: pull-request-available
>
> Currently, pipeline transform rules require that all columns used in expression 
> calculation be present in the final projected schema, and that they not be 
> renamed or omitted.
>  
> The reason behind this is that any column not directly present in the projection 
> rules will be filtered out in the PreProjection step, and the PostProjection 
> process then cannot find those absent but indirectly depended-on columns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35301) Fix deadlock when loading driver classes

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35301:
---
Affects Version/s: 3.1.0

> Fix deadlock when loading driver classes
> 
>
> Key: FLINK-35301
> URL: https://issues.apache.org/jira/browse/FLINK-35301
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35301) Fix deadlock when loading driver classes

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35301:
---
Description: 
In JDK 8 and earlier version, if multiple threads invoking 
Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
deadlock.

There is a issue which analyzes this problem:

[FLINK-19435] Deadlock when loading different driver classes concurrently using 
Class.forName - ASF JIRA (apache.org)

> Fix deadlock when loading driver classes
> 
>
> Key: FLINK-35301
> URL: https://issues.apache.org/jira/browse/FLINK-35301
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>
> In JDK 8 and earlier version, if multiple threads invoking 
> Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
> deadlock.
> There is a issue which analyzes this problem:
> [FLINK-19435] Deadlock when loading different driver classes concurrently 
> using Class.forName - ASF JIRA (apache.org)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35301) Fix deadlock when loading driver classes

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35301:
---
Description: 
In JDK 8 and earlier version, if multiple threads invoking 
Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
deadlock.

FLINK-19435  analyzes this problem too.

  was:
In JDK 8 and earlier version, if multiple threads invoking 
Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
deadlock.

There is a issue which analyzes this problem:

[FLINK-19435] Deadlock when loading different driver classes concurrently using 
Class.forName - ASF JIRA (apache.org)


> Fix deadlock when loading driver classes
> 
>
> Key: FLINK-35301
> URL: https://issues.apache.org/jira/browse/FLINK-35301
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>
> In JDK 8 and earlier version, if multiple threads invoking 
> Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
> deadlock.
> FLINK-19435  analyzes this problem too.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [enhancement] add param which contain column comments of source table field [flink-cdc]

2024-05-24 Thread via GitHub


lvyanquan commented on PR #3039:
URL: https://github.com/apache/flink-cdc/pull/3039#issuecomment-2129028432

   Hi @dufeng1010, can you help rebase this onto master?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35301][cdc] fix deadlock when loading driver classes [flink-cdc]

2024-05-24 Thread via GitHub


Shawn-Hx commented on code in PR #3300:
URL: https://github.com/apache/flink-cdc/pull/3300#discussion_r1613143714


##
flink-cdc-connect/flink-cdc-source-connectors/flink-cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/relational/connection/JdbcConnectionPoolFactory.java:
##
@@ -43,6 +45,7 @@ public HikariDataSource createPooledDataSource(JdbcSourceConfig sourceConfig) {
 config.setMinimumIdle(MINIMUM_POOL_SIZE);
 config.setMaximumPoolSize(sourceConfig.getConnectionPoolSize());
 config.setConnectionTimeout(sourceConfig.getConnectTimeout().toMillis());
+DriverManager.getDrivers();

Review Comment:
   OK, I have added a description in this PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35301) Fix deadlock when loading driver classes

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35301:
---
Description: 
In JDK 8 and earlier version, if multiple threads invoke 
Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
deadlock.

FLINK-19435  analyzes this problem too.

  was:
In JDK 8 and earlier version, if multiple threads invoking 
Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
deadlock.

FLINK-19435  analyzes this problem too.


> Fix deadlock when loading driver classes
> 
>
> Key: FLINK-35301
> URL: https://issues.apache.org/jira/browse/FLINK-35301
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>
> In JDK 8 and earlier version, if multiple threads invoke 
> Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
> deadlock.
> FLINK-19435  analyzes this problem too.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35301) Fix deadlock when loading driver classes

2024-05-24 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849226#comment-17849226
 ] 

Leonard Xu commented on FLINK-35301:


Thanks [~ShawnHx] for reporting this issue. I saw you've opened a PR to fix it, 
so I assigned this ticket to you.

> Fix deadlock when loading driver classes
> 
>
> Key: FLINK-35301
> URL: https://issues.apache.org/jira/browse/FLINK-35301
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>
> In JDK 8 and earlier version, if multiple threads invoke 
> Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
> deadlock.
> FLINK-19435  analyzes this problem too.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-35301) Fix deadlock when loading driver classes

2024-05-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-35301:
--

Assignee: Xiao Huang

> Fix deadlock when loading driver classes
> 
>
> Key: FLINK-35301
> URL: https://issues.apache.org/jira/browse/FLINK-35301
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: Xiao Huang
>Assignee: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>
> In JDK 8 and earlier version, if multiple threads invoke 
> Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
> deadlock.
> FLINK-19435  analyzes this problem too.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35301) Fix deadlock when loading driver classes

2024-05-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-35301:
---
Fix Version/s: cdc-3.2.0

> Fix deadlock when loading driver classes
> 
>
> Key: FLINK-35301
> URL: https://issues.apache.org/jira/browse/FLINK-35301
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: Xiao Huang
>Assignee: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> In JDK 8 and earlier version, if multiple threads invoke 
> Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
> deadlock.
> FLINK-19435  analyzes this problem too.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35301) Fix deadlock when loading driver classes

2024-05-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-35301:
---
Affects Version/s: cdc-3.1.0
   (was: 3.1.0)

> Fix deadlock when loading driver classes
> 
>
> Key: FLINK-35301
> URL: https://issues.apache.org/jira/browse/FLINK-35301
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Assignee: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> In JDK 8 and earlier version, if multiple threads invoke 
> Class.forName("com.xx.Driver") simultaneously, it may cause jdbc driver 
> deadlock.
> FLINK-19435  analyzes this problem too.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35294) Use source config to check if the filter should be applied in timestamp starting mode

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35294:
---
Affects Version/s: 3.1.0

> Use source config to check if the filter should be applied in timestamp 
> starting mode
> -
>
> Key: FLINK-35294
> URL: https://issues.apache.org/jira/browse/FLINK-35294
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: 3.1.0
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34108][table] Add built-in URL_ENCODE and URL_DECODE function. [flink]

2024-05-24 Thread via GitHub


superdiaodiao commented on code in PR #24773:
URL: https://github.com/apache/flink/pull/24773#discussion_r1611984504


##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/UrlDecodeFunction.java:
##
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.functions.scalar;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.functions.BuiltInFunctionDefinitions;
+import org.apache.flink.table.functions.SpecializedFunction;
+
+import javax.annotation.Nullable;
+
+import java.io.UnsupportedEncodingException;
+import java.net.URLDecoder;
+import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
+
+/** Implementation of {@link BuiltInFunctionDefinitions#URL_DECODE}. */
+@Internal
+public class UrlDecodeFunction extends BuiltInScalarFunction {
+
+public UrlDecodeFunction(SpecializedFunction.SpecializedContext context) {
+super(BuiltInFunctionDefinitions.URL_DECODE, context);
+}
+
+public @Nullable StringData eval(StringData value) {
+final Charset charset = StandardCharsets.UTF_8;
+try {
+return StringData.fromString(URLDecoder.decode(value.toString(), 
charset.name()));
+} catch (UnsupportedEncodingException e) {
+throw new RuntimeException(
+"Failed to decode value: " + value + " with charset: " + 
charset.name(), e);
+} catch (RuntimeException e) {
+return value;
+}

Review Comment:
   How about this:
   add a WARN log before returning null?
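   A sketch of the suggestion for the fallback branch (the `LOG` field is 
assumed to exist):
   
   ```java
   } catch (RuntimeException e) {
       LOG.warn("Failed to URL-decode value '{}'; returning it unchanged.", value, e);
       return value;
   }
   ```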



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35294) Use source config to check if the filter should be applied in timestamp starting mode

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35294:
---
Description: 
Since MySQL does not provide a way to quickly locate a binlog offset 
through a timestamp, the current logic for starting from a timestamp is to 
begin from the earliest binlog offset and then filter out the data before the 
user-specified position.

If the user restarts the job during the filtering process, this filter will 
become ineffective.
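
A minimal sketch of the intended behavior, with hypothetical names; the key 
point is that the filtering decision is derived from the source config, which 
survives a restart, rather than from transient reader state:
{code:java}
// All names are illustrative, not the connector's actual API.
public final class TimestampStartFilter {

    private final boolean startFromTimestamp;   // derived from the source config
    private final long startupTimestampMillis;  // the user-specified position

    public TimestampStartFilter(boolean startFromTimestamp, long startupTimestampMillis) {
        this.startFromTimestamp = startFromTimestamp;
        this.startupTimestampMillis = startupTimestampMillis;
    }

    /** Returns true if the binlog event should be emitted downstream. */
    public boolean accept(long eventTimestampMillis) {
        return !startFromTimestamp || eventTimestampMillis >= startupTimestampMillis;
    }
}
{code}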

> Use source config to check if the filter should be applied in timestamp 
> starting mode
> -
>
> Key: FLINK-35294
> URL: https://issues.apache.org/jira/browse/FLINK-35294
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>
> Since MySQL does not provide a way to quickly locate a binlog offset 
> through a timestamp, the current logic for starting from a timestamp is to 
> begin from the earliest binlog offset and then filter out the data before the 
> user-specified position.
> If the user restarts the job during the filtering process, this filter will 
> become ineffective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34582) release build tools lost the newly added py3.11 packages for mac

2024-05-24 Thread Muhammet Orazov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849229#comment-17849229
 ] 

Muhammet Orazov commented on FLINK-34582:
-

Ohh no. Yes indeed, you are right [~mapohl], thanks for the update (y)

> release build tools lost the newly added py3.11 packages for mac
> 
>
> Key: FLINK-34582
> URL: https://issues.apache.org/jira/browse/FLINK-34582
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.19.0, 1.20.0
>Reporter: lincoln lee
>Assignee: Xingbo Huang
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.19.0, 1.20.0
>
> Attachments: image-2024-03-07-10-39-49-341.png
>
>
> During the 1.19.0-rc1 release, building binaries via 
> tools/releasing/create_binary_release.sh
> lost the two newly added py3.11 packages for mac.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35294) Use source config to check if the filter should be applied in timestamp starting mode

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35294:
---
Affects Version/s: cdc-3.1.0

> Use source config to check if the filter should be applied in timestamp 
> starting mode
> -
>
> Key: FLINK-35294
> URL: https://issues.apache.org/jira/browse/FLINK-35294
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>
> Since MySQL does not provide a way to quickly locate a binlog offset 
> through a timestamp, the current logic for starting from a timestamp is to 
> begin from the earliest binlog offset and then filter out the data before the 
> user-specified position.
> If the user restarts the job during the filtering process, this filter will 
> become ineffective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35294) Use source config to check if the filter should be applied in timestamp starting mode

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35294:
---
Fix Version/s: cdc-3.2.0

> Use source config to check if the filter should be applied in timestamp 
> starting mode
> -
>
> Key: FLINK-35294
> URL: https://issues.apache.org/jira/browse/FLINK-35294
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> Since MySQL does not provide a way to quickly locate a binlog offset 
> through a timestamp, the current logic for starting from a timestamp is to 
> begin from the earliest binlog offset and then filter out the data before the 
> user-specified position.
> If the user restarts the job during the filtering process, this filter will 
> become ineffective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35295) Improve jdbc connection pool initialization failure message

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35295:
---
Description: As described in the ticket title.

> Improve jdbc connection pool initialization failure message
> ---
>
> Key: FLINK-35295
> URL: https://issues.apache.org/jira/browse/FLINK-35295
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Xiao Huang
>Priority: Major
>  Labels: pull-request-available
>
> As described in the ticket title.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35298) Improve metric reporter logic

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35298:
---
Description: 
* In the snapshot phase, set -1 as the value of the currentFetchEventTimeLag metric.
 * Support the currentEmitEventTimeLag metric (see the sketch below).
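
A rough illustration of the reporting logic; the fields used here are 
hypothetical:
{code:java}
// Illustrative only; inSnapshotPhase and the two timestamps are assumed fields.
metricGroup.gauge(
        "currentFetchEventTimeLag",
        () -> inSnapshotPhase ? -1L : System.currentTimeMillis() - lastFetchEventTimeMillis);
metricGroup.gauge(
        "currentEmitEventTimeLag",
        () -> System.currentTimeMillis() - lastEmitEventTimeMillis);
{code}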

> Improve metric reporter logic
> -
>
> Key: FLINK-35298
> URL: https://issues.apache.org/jira/browse/FLINK-35298
> Project: Flink
>  Issue Type: Improvement
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> * In the snapshot phase, set -1 as the value of the currentFetchEventTimeLag metric.
>  * Support the currentEmitEventTimeLag metric.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35129][cdc][postgres] Add checkpoint cycle option for commit offsets [flink-cdc]

2024-05-24 Thread via GitHub


loserwang1024 commented on code in PR #3349:
URL: https://github.com/apache/flink-cdc/pull/3349#discussion_r1613185257


##
flink-cdc-connect/flink-cdc-source-connectors/flink-connector-postgres-cdc/src/main/java/org/apache/flink/cdc/connectors/postgres/source/reader/PostgresSourceReader.java:
##
@@ -104,6 +108,10 @@ public List snapshotState(long 
checkpointId) {
 
 @Override
 public void notifyCheckpointComplete(long checkpointId) throws Exception {
+this.checkpointCount = (this.checkpointCount + 1) % 
this.checkpointCycle;

Review Comment:
   @morazow, great job. I generally agree with your approach. However, I 
currently have a different perspective. Instead of committing at every third 
checkpoint cycle (rolling window), I prefer to commit the offset that lags 
three checkpoints behind the latest successful checkpoint (sliding window).
   
   For a detailed design, we can store successful checkpoint IDs in a min heap 
whose size is three (as determined by the configuration). When a checkpoint 
completes successfully, we push its ID into the heap, take the minimum 
checkpoint ID, and commit it. By doing this, we always have three 
checkpoints whose offsets have not been recycled.
   
   (P.S.: Let's log the heap at each checkpoint, so users can know from which 
checkpoint IDs they can restore.)
   
   @leonardBang , @ruanhang1993 , CC, WDYT?
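   
   A minimal sketch of that sliding-window idea (names are illustrative, not 
the actual reader code):
   
   ```java
   // Keep the last N successful checkpoint IDs in a min heap and commit the
   // smallest one, so the N most recent checkpoints always stay restorable.
   private final PriorityQueue<Long> completedCheckpointIds = new PriorityQueue<>();
   private final int windowSize = 3; // would come from the configuration
   
   @Override
   public void notifyCheckpointComplete(long checkpointId) {
       completedCheckpointIds.add(checkpointId);
       if (completedCheckpointIds.size() > windowSize) {
           long toCommit = completedCheckpointIds.poll(); // minimum ID in the window
           commitOffsetsOf(toCommit); // hypothetical helper that commits the offsets
       }
       LOG.info("Offsets still restorable for checkpoints: {}", completedCheckpointIds);
   }
   ```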



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35129][cdc][postgres] Add checkpoint cycle option for commit offsets [flink-cdc]

2024-05-24 Thread via GitHub


loserwang1024 commented on code in PR #3349:
URL: https://github.com/apache/flink-cdc/pull/3349#discussion_r1613185257


##
flink-cdc-connect/flink-cdc-source-connectors/flink-connector-postgres-cdc/src/main/java/org/apache/flink/cdc/connectors/postgres/source/reader/PostgresSourceReader.java:
##
@@ -104,6 +108,10 @@ public List snapshotState(long 
checkpointId) {
 
 @Override
 public void notifyCheckpointComplete(long checkpointId) throws Exception {
+this.checkpointCount = (this.checkpointCount + 1) % 
this.checkpointCycle;

Review Comment:
   @morazow, great job. I generally agree with your approach. However, I 
currently have a different perspective. Instead of committing at every third 
checkpoint cycle (rolling window), I prefer to commit the offset that lags 
three checkpoints behind the latest successful checkpoint (sliding window).
   
   For a detailed design, we can store successful checkpoint IDs in a min heap 
whose size is three (as determined by the configuration). When a checkpoint 
completes successfully, we push its ID into the heap, take the minimum 
checkpoint ID, and commit it. By doing this, we always have three 
checkpoints whose offsets have not been recycled.
   
   (P.S.: Let's log the heap at each checkpoint, so users can know from which 
checkpoint IDs they can restore.)
   
   @leonardBang , @ruanhang1993 , CC, WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35297) Add validation for option connect.timeout

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35297:
---
Description: The value of the option `connect.timeout` needs to be validated at 
compile time.
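
A sketch of the intended check; the exact constraint shown is illustrative:
{code:java}
// Illustrative only: reject non-positive values when the source is configured.
Duration connectTimeout = config.get(CONNECT_TIMEOUT);
if (connectTimeout.isNegative() || connectTimeout.isZero()) {
    throw new ValidationException(
            "The value of option 'connect.timeout' must be positive, but was "
                    + connectTimeout);
}
{code}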

> Add validation for option connect.timeout
> -
>
> Key: FLINK-35297
> URL: https://issues.apache.org/jira/browse/FLINK-35297
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> The value of the option `connect.timeout` needs to be validated at compile time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35129][cdc][postgres] Add checkpoint cycle option for commit offsets [flink-cdc]

2024-05-24 Thread via GitHub


leonardBang commented on code in PR #3349:
URL: https://github.com/apache/flink-cdc/pull/3349#discussion_r1613201109


##
flink-cdc-connect/flink-cdc-source-connectors/flink-connector-postgres-cdc/src/main/java/org/apache/flink/cdc/connectors/postgres/source/reader/PostgresSourceReader.java:
##
@@ -104,6 +108,10 @@ public List snapshotState(long 
checkpointId) {
 
 @Override
 public void notifyCheckpointComplete(long checkpointId) throws Exception {
+this.checkpointCount = (this.checkpointCount + 1) % 
this.checkpointCycle;

Review Comment:
   +1 for @loserwang1024 ‘s comment



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35300) Improve MySqlStreamingChangeEventSource to skip null events in event deserializer

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35300:
---
Description: As described in the title.

> Improve MySqlStreamingChangeEventSource to skip null events in event 
> deserializer
> -
>
> Key: FLINK-35300
> URL: https://issues.apache.org/jira/browse/FLINK-35300
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> As described in the title.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35298][cdc] improve fetch delay metric reporter logic [flink-cdc]

2024-05-24 Thread via GitHub


leonardBang merged PR #3298:
URL: https://github.com/apache/flink-cdc/pull/3298


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35298) Improve metric reporter logic

2024-05-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-35298:
---
Affects Version/s: cdc-3.1.0

> Improve metric reporter logic
> -
>
> Key: FLINK-35298
> URL: https://issues.apache.org/jira/browse/FLINK-35298
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Assignee: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> * In the snapshot phase, set -1 as the value of the currentFetchEventTimeLag metric.
>  * Support the currentEmitEventTimeLag metric.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35298) Improve metric reporter logic

2024-05-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-35298:
---
Fix Version/s: cdc-3.2.0

> Improve metric reporter logic
> -
>
> Key: FLINK-35298
> URL: https://issues.apache.org/jira/browse/FLINK-35298
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Assignee: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> * In the snapshot phase, set -1 as the value of the currentFetchEventTimeLag metric.
>  * Support the currentEmitEventTimeLag metric.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-35298) Improve metric reporter logic

2024-05-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-35298:
--

Assignee: Xiao Huang

> Improve metric reporter logic
> -
>
> Key: FLINK-35298
> URL: https://issues.apache.org/jira/browse/FLINK-35298
> Project: Flink
>  Issue Type: Improvement
>Reporter: Xiao Huang
>Assignee: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> * In the snapshot phase, set -1 as the value of the currentFetchEventTimeLag metric.
>  * Support the currentEmitEventTimeLag metric.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-35298) Improve metric reporter logic

2024-05-24 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu resolved FLINK-35298.

Resolution: Implemented

via master: 8e8fd304afdd9668247a8869698e0949806cad7b

> Improve metric reporter logic
> -
>
> Key: FLINK-35298
> URL: https://issues.apache.org/jira/browse/FLINK-35298
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Assignee: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> * In the snapshot phase, set -1 as the value of the currentFetchEventTimeLag metric.
>  * Support the currentEmitEventTimeLag metric.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34746] Switching to the Apache CDN for Dockerfile [flink-docker]

2024-05-24 Thread via GitHub


hlteoh37 commented on PR #190:
URL: https://github.com/apache/flink-docker/pull/190#issuecomment-2129138391

   @MartijnVisser can we merge this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35408) Add 30 min tolerance value when validating the time-zone setting

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35408:
---
Description: 
Currently, the MySQL CDC connector retrieves the number of seconds the 
database-configured time zone is offset from UTC by executing the SQL statement 
below, and then compares it with the configured time zone.
{code:java}
SELECT TIME_TO_SEC(TIMEDIFF(NOW(), UTC_TIMESTAMP())) {code}
For some MySQL instances, this time-zone validation is too strict. We can 
add a 30-minute tolerance value.
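
A minimal sketch of the tolerance check, assuming the server offset has been 
obtained with the query above; the names are illustrative:
{code:java}
// Illustrative only: allow up to 30 minutes of drift between the server's
// UTC offset and the offset of the configured zone.
long configuredOffsetSeconds =
        configuredZone.getRules().getOffset(Instant.now()).getTotalSeconds();
long driftSeconds = Math.abs(serverOffsetSeconds - configuredOffsetSeconds);
if (driftSeconds > Duration.ofMinutes(30).getSeconds()) {
    throw new ValidationException(
            "The MySQL server time zone does not match the configured time zone "
                    + configuredZone);
}
{code}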

> Add 30 min tolerance value when validating the time-zone setting
> 
>
> Key: FLINK-35408
> URL: https://issues.apache.org/jira/browse/FLINK-35408
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, the MySQL CDC connector retrieves the number of seconds the 
> database-configured time zone is offset from UTC by executing the SQL statement 
> below, and then compares it with the configured time zone.
> {code:java}
> SELECT TIME_TO_SEC(TIMEDIFF(NOW(), UTC_TIMESTAMP())) {code}
> For some MySQL instances, this time-zone validation is too strict. We can 
> add a 30-minute tolerance value.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34487][ci] Adds Python Wheels nightly GHA workflow [flink]

2024-05-24 Thread via GitHub


morazow commented on code in PR #24426:
URL: https://github.com/apache/flink/pull/24426#discussion_r1613226372


##
.github/workflows/nightly.yml:
##
@@ -94,3 +94,51 @@ jobs:
   s3_bucket: ${{ secrets.IT_CASE_S3_BUCKET }}
   s3_access_key: ${{ secrets.IT_CASE_S3_ACCESS_KEY }}
   s3_secret_key: ${{ secrets.IT_CASE_S3_SECRET_KEY }}
+
+  build_python_wheels:
+name: "Build Python Wheels on ${{ matrix.os }}"

Review Comment:
   Hey @XComp,
   
   I think this is fine. The Azure Pipelines 
(https://github.com/apache/flink/blob/master/tools/azure-pipelines/build-python-wheels.yml)
 also run like that, building each wheel in a separate job.
   
   From [GitHub migration 
doc](https://docs.github.com/en/actions/migrating-to-github-actions/manually-migrating-to-github-actions/migrating-from-azure-pipelines-to-github-actions#migrating-jobs-and-steps):
   
   ```
   - Jobs contain a series of steps that run sequentially.
   - Jobs run on separate virtual machines or in separate containers.
   - Jobs run in parallel by default, but can be configured to run sequentially.
   ```
   
   So it should be fine to build each wheel separately, without the full Flink 
build, similar to the Azure Pipelines.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35408) Add 30 min tolerance value when validating the time-zone setting

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35408:
---
Description: 
Currently, the MySQL CDC connector retrieves the offset in seconds between 
the server's time zone and UTC by executing the SQL statement below, and then 
compares it with the configured time zone.
{code:java}
SELECT TIME_TO_SEC(TIMEDIFF(NOW(), UTC_TIMESTAMP())) {code}
For some MySQL instances, this time-zone validation is too strict. We can 
add a 30-minute tolerance value.

  was:
Currently, the MySQL CDC connector retrieves the number of seconds the 
database-configured time zone is offset from UTC by executing the SQL statement 
below, and then compares it with the configured time zone.
{code:java}
SELECT TIME_TO_SEC(TIMEDIFF(NOW(), UTC_TIMESTAMP())) {code}
For some MySQL instances, this time-zone validation is too strict. We can 
add a 30-minute tolerance value.


> Add 30 min tolerance value when validating the time-zone setting
> 
>
> Key: FLINK-35408
> URL: https://issues.apache.org/jira/browse/FLINK-35408
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, the MySQL CDC connector retrieves the offset in seconds between 
> the server's time zone and UTC by executing the SQL statement below, and then 
> compares it with the configured time zone.
> {code:java}
> SELECT TIME_TO_SEC(TIMEDIFF(NOW(), UTC_TIMESTAMP())) {code}
> For some MySQL instances, this time-zone validation is too strict. We can 
> add a 30-minute tolerance value.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34487][ci] Adds Python Wheels nightly GHA workflow [flink]

2024-05-24 Thread via GitHub


morazow commented on code in PR #24426:
URL: https://github.com/apache/flink/pull/24426#discussion_r1613227700


##
.github/workflows/nightly.yml:
##
@@ -94,3 +94,51 @@ jobs:
   s3_bucket: ${{ secrets.IT_CASE_S3_BUCKET }}
   s3_access_key: ${{ secrets.IT_CASE_S3_ACCESS_KEY }}
   s3_secret_key: ${{ secrets.IT_CASE_S3_SECRET_KEY }}
+
+  build_python_wheels:
+name: "Build Python Wheels on ${{ matrix.os }}"

Review Comment:
   Additionally, I checked the wheel artifacts from the latest Azure Pipelines 
run above; they are more or less the same size as those from the GitHub Actions run.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35409) Request more splits if all splits are filtered from addSplits method

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35409:
---
Description: 
Suppose this scenario: A job is still in the snapshot phase, and the remaining 
uncompleted snapshot splits all belong to a few tables that have been deleted 
by the user.

In such a case, when restarting from a savepoint, these uncompleted snapshot 
splits will not trigger a call to the addSplits method. Moreover, since the 
BinlogSplit has not been sent yet, the job will not start the SplitReader to 
read data.
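
A sketch of the intended reader behavior (names are hypothetical): if every 
restored split is filtered out, explicitly ask the enumerator for more work.
{code:java}
// Illustrative fragment of a source reader, not the actual connector code.
@Override
public void addSplits(List<SourceSplitBase> splits) {
    List<SourceSplitBase> remaining = filterOutSplitsOfDeletedTables(splits);
    if (remaining.isEmpty() && !splits.isEmpty()) {
        // All restored splits were dropped; without this request the reader
        // would idle forever, since the binlog split has not been sent yet.
        context.sendSplitRequest();
    } else {
        super.addSplits(remaining);
    }
}
{code}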

> Request more splits if all splits are filtered from addSplits method
> 
>
> Key: FLINK-35409
> URL: https://issues.apache.org/jira/browse/FLINK-35409
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> Suppose this scenario: A job is still in the snapshot phase, and the 
> remaining uncompleted snapshot splits all belong to a few tables that have 
> been deleted by the user.
> In such a case, when restarting from a savepoint, these uncompleted snapshot 
> splits will not trigger a call to the addSplits method. Moreover, since the 
> BinlogSplit has not been sent yet, the job will not start the SplitReader to 
> read data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35409) Request more splits if all splits are filtered from addSplits method

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35409:
---
Issue Type: Bug  (was: Improvement)

> Request more splits if all splits are filtered from addSplits method
> 
>
> Key: FLINK-35409
> URL: https://issues.apache.org/jira/browse/FLINK-35409
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> Suppose this scenario: A job is still in the snapshot phase, and the 
> remaining uncompleted snapshot splits all belong to a few tables that have 
> been deleted by the user.
> In such a case, when restarting from a savepoint, these uncompleted snapshot 
> splits will not trigger a call to the addSplits method. Moreover, since the 
> BinlogSplit has not been sent yet, the job will not start the SplitReader to 
> read data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35408) Add 30 min tolerance value when validating the time-zone setting

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35408:
---
Fix Version/s: cdc-3.2.0

> Add 30 min tolerance value when validating the time-zone setting
> 
>
> Key: FLINK-35408
> URL: https://issues.apache.org/jira/browse/FLINK-35408
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> Currently, the MySQL CDC connector retrieves the offset in seconds between 
> the server's time zone and UTC by executing the SQL statement below, and then 
> compares it with the configured time zone.
> {code:java}
> SELECT TIME_TO_SEC(TIMEDIFF(NOW(), UTC_TIMESTAMP())) {code}
> For some MySQL instances, this time-zone validation is too strict. We can 
> add a 30-minute tolerance value.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35409) Request more splits if all splits are filtered from addSplits method

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35409:
---
Affects Version/s: cdc-3.1.0

> Request more splits if all splits are filtered from addSplits method
> 
>
> Key: FLINK-35409
> URL: https://issues.apache.org/jira/browse/FLINK-35409
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> Suppose this scenario: A job is still in the snapshot phase, and the 
> remaining uncompleted snapshot splits all belong to a few tables that have 
> been deleted by the user.
> In such a case, when restarting from a savepoint, these uncompleted snapshot 
> splits will not trigger a call to the addSplits method. Moreover, since the 
> BinlogSplit has not been sent yet, the job will not start the SplitReader to 
> read data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35409) Request more splits if all splits are filtered from addSplits method

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35409:
---
Fix Version/s: cdc-3.2.0

> Request more splits if all splits are filtered from addSplits method
> 
>
> Key: FLINK-35409
> URL: https://issues.apache.org/jira/browse/FLINK-35409
> Project: Flink
>  Issue Type: Bug
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: cdc-3.2.0
>
>
> Suppose this scenario: A job is still in the snapshot phase, and the 
> remaining uncompleted snapshot splits all belong to a few tables that have 
> been deleted by the user.
> In such a case, when restarting from a savepoint, these uncompleted snapshot 
> splits will not trigger a call to the addSplits method. Moreover, since the 
> BinlogSplit has not been sent yet, the job will not start the SplitReader to 
> read data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35408) Add 30 min tolerance value when validating the time-zone setting

2024-05-24 Thread Xiao Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Huang updated FLINK-35408:
---
Affects Version/s: cdc-3.1.0

> Add 30 min tolerance value when validating the time-zone setting
> 
>
> Key: FLINK-35408
> URL: https://issues.apache.org/jira/browse/FLINK-35408
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: Xiao Huang
>Priority: Minor
>  Labels: pull-request-available
>
> Currently, the MySQL CDC connector retrieves the offset in seconds between 
> the server's time zone and UTC by executing the SQL statement below, and then 
> compares it with the configured time zone.
> {code:java}
> SELECT TIME_TO_SEC(TIMEDIFF(NOW(), UTC_TIMESTAMP())) {code}
> For some MySQL instances, this time-zone validation is too strict. We can 
> add a 30-minute tolerance value.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34487][ci] Adds Python Wheels nightly GHA workflow [flink]

2024-05-24 Thread via GitHub


morazow commented on code in PR #24426:
URL: https://github.com/apache/flink/pull/24426#discussion_r1613261308


##
.github/workflows/nightly.yml:
##
@@ -94,3 +94,51 @@ jobs:
   s3_bucket: ${{ secrets.IT_CASE_S3_BUCKET }}
   s3_access_key: ${{ secrets.IT_CASE_S3_ACCESS_KEY }}
   s3_secret_key: ${{ secrets.IT_CASE_S3_SECRET_KEY }}
+
+  build_python_wheels:
+name: "Build Python Wheels on ${{ matrix.os }}"
+runs-on: ${{ matrix.os }}
+strategy:
+  fail-fast: false
+  matrix:
+include:
+  - os: ubuntu-latest
+os_name: linux
+python-version: 3.9
+  - os: macos-latest
+os_name: macos
+python-version: 3.9

Review Comment:
   Yes, on macOS only the pyproject.toml is used by cibuildwheel. The tool seems 
to be well documented; maybe we could also use it for the Linux builds.
   
   It uses `cp38-*`, etc. for Python versions, see 
https://cibuildwheel.pypa.io/en/stable/options/#build-skip. For Linux we are 
running bash scripts to build the wheels, but we could let the cibuildwheel 
action do it for both platforms.
   
   We can ask @HuangXingBo for the final review.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35326][hive] add lineage integration for Hive connector [flink]

2024-05-24 Thread via GitHub


davidradl commented on PR #24835:
URL: https://github.com/apache/flink/pull/24835#issuecomment-2129222164

   Shouldn't this PR go in after 
[https://github.com/apache/flink/pull/24618](https://github.com/apache/flink/pull/24618)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34746] Switching to the Apache CDN for Dockerfile [flink-docker]

2024-05-24 Thread via GitHub


MartijnVisser merged PR #190:
URL: https://github.com/apache/flink-docker/pull/190


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-34746) Switching to the Apache CDN for Dockerfile

2024-05-24 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser closed FLINK-34746.
--
Resolution: Fixed

Fixed in apache/flink-docker@master 883600747505c128d97e9d25c9326f0c6f1d31e4

> Switching to the Apache CDN for Dockerfile
> --
>
> Key: FLINK-34746
> URL: https://issues.apache.org/jira/browse/FLINK-34746
> Project: Flink
>  Issue Type: Improvement
>  Components: flink-docker
>Reporter: lincoln lee
>Assignee: Martijn Visser
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.2, 1.20.0, 1.19.1
>
>
> During publishing the official image, we received some comments
> for Switching to the Apache CDN
>  
> See
> https://github.com/docker-library/official-images/pull/16114
> https://github.com/docker-library/official-images/pull/16430
>  
> Reason for switching: [https://apache.org/history/mirror-history.html] (also 
> [https://www.apache.org/dyn/closer.cgi] and [https://www.apache.org/mirrors])



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35406] Use inner serializer when casting RAW type to BINARY or… [flink]

2024-05-24 Thread via GitHub


twalthr commented on code in PR #24818:
URL: https://github.com/apache/flink/pull/24818#discussion_r1613283206


##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CastFunctionMiscITCase.java:
##
@@ -384,4 +410,15 @@ public static class LocalDateTimeToRaw extends 
ScalarFunction {
 return LocalDateTime.parse(str);
 }
 }
+
+public static byte[] serializeLocalDateTime(LocalDateTime localDateTime) {

Review Comment:
   use the existing utility 
`org.apache.flink.util.InstantiationUtil#serializeToByteArray`
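   
   For illustration, the helper would then reduce to a single call (assuming a 
`TypeSerializer<LocalDateTime>` is available in the test):
   
   ```java
   byte[] bytes = InstantiationUtil.serializeToByteArray(serializer, localDateTime);
   ```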



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33211][table] support flink table lineage [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24618:
URL: https://github.com/apache/flink/pull/24618#discussion_r1613290123


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/LineageGraph.java:
##
@@ -20,13 +20,12 @@
 package org.apache.flink.streaming.api.lineage;
 
 import org.apache.flink.annotation.PublicEvolving;
-import org.apache.flink.streaming.api.graph.StreamGraph;
 
 import java.util.List;
 
 /**
- * Job lineage is built according to {@link StreamGraph}, users can get 
sources, sinks and
- * relationships from lineage and manage the relationship between jobs and 
tables.
+ * Job lineage graph that users can get sources, sinks and relationships from 
lineage and manage the

Review Comment:
   > Thanks David for your comments. Yes, the documentation will be added after 
adding the job lineage listener, which is more user facing. It is planned in 
this jira: https://issues.apache.org/jira/browse/FLINK-33212. This PR only 
considers source/sink level lineage. Column-level lineage is not included in 
this work, so internal transformations do not need lineage info for now. Would 
you please elaborate more on "I assume a sink could be a source - so could be 
in both current lists"?
   
   Hi Peter, usually we think of lineage assets as the nodes in the lineage 
(e.g. OpenLineage). So the asset could be a Kafka topic, and that topic could 
be used as a source for some flows and a sink for other flows. I was wondering 
how this fits with lineage at the table level, where there could be a table 
defined as a sink and a table defined as a source on the same Kafka topic. I 
guess when exporting / exposing to OpenLineage, there could be many Flink 
tables referring to the same topic that would end up as one OpenLineage node. 
The natural way for Flink to store the lineage is at the table level, rather 
than at the asset level. So thinking about it, I think this is fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35435) [FLIP-451] Introduce timeout configuration to AsyncSink

2024-05-24 Thread Ahmed Hamdy (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hamdy updated FLINK-35435:

Attachment: Screenshot 2024-05-24 at 11.06.30.png
Screenshot 2024-05-24 at 12.06.20.png

> [FLIP-451] Introduce timeout configuration to AsyncSink
> ---
>
> Key: FLINK-35435
> URL: https://issues.apache.org/jira/browse/FLINK-35435
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Reporter: Ahmed Hamdy
>Priority: Major
> Fix For: 1.20.0
>
> Attachments: Screenshot 2024-05-24 at 11.06.30.png, Screenshot 
> 2024-05-24 at 12.06.20.png
>
>
> Implementation Ticket for:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33892][runtime] Support Job Recovery from JobMaster Failures for Batch Jobs. [flink]

2024-05-24 Thread via GitHub


zhuzhurk commented on code in PR #24771:
URL: https://github.com/apache/flink/pull/24771#discussion_r1612975806


##
flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/BatchJobRecoveryTest.java:
##
@@ -0,0 +1,1217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler.adaptivebatch;
+
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.api.common.eventtime.WatermarkAlignmentParams;
+import org.apache.flink.api.connector.source.Boundedness;
+import org.apache.flink.api.connector.source.mocks.MockSource;
+import org.apache.flink.api.connector.source.mocks.MockSourceSplit;
+import org.apache.flink.api.connector.source.mocks.MockSplitEnumerator;
+import org.apache.flink.configuration.BatchExecutionOptions;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.runtime.clusterframework.types.ResourceID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.DefaultExecutionGraph;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionGraph;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.executiongraph.IntermediateResultPartition;
+import org.apache.flink.runtime.executiongraph.InternalExecutionGraphAccessor;
+import org.apache.flink.runtime.executiongraph.ResultPartitionBytes;
+import 
org.apache.flink.runtime.executiongraph.TestingComponentMainThreadExecutor;
+import 
org.apache.flink.runtime.executiongraph.failover.FixedDelayRestartBackoffTimeStrategy;
+import org.apache.flink.runtime.io.network.partition.JobMasterPartitionTracker;
+import 
org.apache.flink.runtime.io.network.partition.JobMasterPartitionTrackerImpl;
+import 
org.apache.flink.runtime.io.network.partition.PartitionNotFoundException;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.jobgraph.JobVertex;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobgraph.OperatorID;
+import org.apache.flink.runtime.jobmaster.event.ExecutionVertexFinishedEvent;
+import org.apache.flink.runtime.jobmaster.event.FileSystemJobEventStore;
+import org.apache.flink.runtime.jobmaster.event.JobEvent;
+import org.apache.flink.runtime.jobmaster.event.JobEventManager;
+import org.apache.flink.runtime.jobmaster.event.JobEventStore;
+import org.apache.flink.runtime.jobmaster.utils.TestingJobMasterGateway;
+import org.apache.flink.runtime.jobmaster.utils.TestingJobMasterGatewayBuilder;
+import org.apache.flink.runtime.operators.coordination.EventReceivingTasks;
+import 
org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder;
+import 
org.apache.flink.runtime.operators.coordination.RecreateOnResetOperatorCoordinator;
+import 
org.apache.flink.runtime.operators.coordination.TestingOperatorCoordinator;
+import org.apache.flink.runtime.scheduler.DefaultSchedulerBuilder;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingResultPartition;
+import org.apache.flink.runtime.scheduler.strategy.SchedulingTopology;
+import org.apache.flink.runtime.shuffle.DefaultShuffleMetrics;
+import org.apache.flink.runtime.shuffle.JobShuffleContextImpl;
+import org.apache.flink.runtime.shuffle.NettyShuffleDescriptor;
+import org.apache.flink.runtime.shuffle.NettyShuffleMaster;
+import org.apache.flink.runtime.shuffle.PartitionWithMetrics;
+import org.apache.flink.runtime.shuffle.ShuffleDescriptor;
+import org.apache.flink.runtime.shuffle.ShuffleMaster;
+import org.apache.fli

Re: [PR] [FLINK-35415][base] Fix compatibility with Flink 1.19 [flink-cdc]

2024-05-24 Thread via GitHub


yuxiqian commented on PR #3348:
URL: https://github.com/apache/flink-cdc/pull/3348#issuecomment-2129300597

   Pushed another commit to resolve the CI issue. Could @leonardBang please 
re-trigger the CI? Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35435) [FLIP-451] Introduce timeout configuration to AsyncSink

2024-05-24 Thread Ahmed Hamdy (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849270#comment-17849270
 ] 

Ahmed Hamdy commented on FLINK-35435:
-

h1. Non-Functional Backward compatibility 
To ensure that we haven't introduced any regressions for existing implementers, 
we tested {{KinesisStreamsSink}} with the default request timeout vs. no timeout 
on two levels.

h2. Sanity testing

We ran the [example 
job|https://github.com/apache/flink-connector-aws/blob/main/flink-connector-aws/flink-connector-aws-kinesis-streams/src/test/java/org/apache/flink/connector/kinesis/sink/examples/SinkIntoKinesis.java]
 with a checkpoint interval of 10 seconds, set the request timeout to 3 
minutes, and verified that no requests were retried due to timeout during 30 
minutes of job execution.

h2. Performance Benchmark

I benchmarked the Kinesis sink with the default timeout (10 minutes), 
batch size = 20, and default values for in-flight requests.

The results show no difference (except for a small network blip).

h3. Sink With Timeout
 !Screenshot 2024-05-24 at 11.06.30.png! 

h3. Sink With No Timeout
 !Screenshot 2024-05-24 at 12.06.20.png! 

> [FLIP-451] Introduce timeout configuration to AsyncSink
> ---
>
> Key: FLINK-35435
> URL: https://issues.apache.org/jira/browse/FLINK-35435
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Reporter: Ahmed Hamdy
>Priority: Major
> Fix For: 1.20.0
>
> Attachments: Screenshot 2024-05-24 at 11.06.30.png, Screenshot 
> 2024-05-24 at 12.06.20.png
>
>
> Implementation Ticket for:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35445) Update Async Sink documentation for Timeout configuration

2024-05-24 Thread Ahmed Hamdy (Jira)
Ahmed Hamdy created FLINK-35445:
---

 Summary: Update Async Sink documentation for Timeout configuration 
 Key: FLINK-35445
 URL: https://issues.apache.org/jira/browse/FLINK-35445
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common, Documentation
Reporter: Ahmed Hamdy
 Fix For: 1.20.0


Update Documentation for AsyncSink Changes introduced by 
[FLIP-451|https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35445) Update Async Sink documentation for Timeout configuration

2024-05-24 Thread Ahmed Hamdy (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Hamdy updated FLINK-35445:

Parent: FLINK-35435
Issue Type: Sub-task  (was: Improvement)

> Update Async Sink documentation for Timeout configuration 
> --
>
> Key: FLINK-35445
> URL: https://issues.apache.org/jira/browse/FLINK-35445
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Common, Documentation
>Reporter: Ahmed Hamdy
>Priority: Major
> Fix For: 1.20.0
>
>
> Update Documentation for AsyncSink Changes introduced by 
> [FLIP-451|https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35446) FileMergingSnapshotManagerBase throws a NullPointerException

2024-05-24 Thread Ryan Skraba (Jira)
Ryan Skraba created FLINK-35446:
---

 Summary: FileMergingSnapshotManagerBase throws a 
NullPointerException
 Key: FLINK-35446
 URL: https://issues.apache.org/jira/browse/FLINK-35446
 Project: Flink
  Issue Type: Bug
Reporter: Ryan Skraba


* 1.20 Java 11 / Test (module: tests) 
https://github.com/apache/flink/actions/runs/9217608897/job/25360103124#step:10:8641

{{ResumeCheckpointManuallyITCase.testExternalizedIncrementalRocksDBCheckpointsWithLocalRecoveryZookeeper}}
 throws a NullPointerException when it tries to restore state handles: 

{code}
Error: 02:57:52 02:57:52.551 [ERROR] Tests run: 48, Failures: 0, Errors: 1, 
Skipped: 0, Time elapsed: 268.6 s <<< FAILURE! -- in 
org.apache.flink.test.checkpointing.ResumeCheckpointManuallyITCase
Error: 02:57:52 02:57:52.551 [ERROR] 
org.apache.flink.test.checkpointing.ResumeCheckpointManuallyITCase.testExternalizedIncrementalRocksDBCheckpointsWithLocalRecoveryZookeeper[RestoreMode
 = CLAIM] -- Time elapsed: 3.145 s <<< ERROR!
May 24 02:57:52 org.apache.flink.runtime.JobException: Recovery is suppressed 
by NoRestartBackoffTimeStrategy
May 24 02:57:52 at 
org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:219)
May 24 02:57:52 at 
org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.handleFailureAndReport(ExecutionFailureHandler.java:166)
May 24 02:57:52 at 
org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:121)
May 24 02:57:52 at 
org.apache.flink.runtime.scheduler.DefaultScheduler.recordTaskFailure(DefaultScheduler.java:279)
May 24 02:57:52 at 
org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:270)
May 24 02:57:52 at 
org.apache.flink.runtime.scheduler.DefaultScheduler.onTaskFailed(DefaultScheduler.java:263)
May 24 02:57:52 at 
org.apache.flink.runtime.scheduler.SchedulerBase.onTaskExecutionStateUpdate(SchedulerBase.java:788)
May 24 02:57:52 at 
org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:765)
May 24 02:57:52 at 
org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:83)
May 24 02:57:52 at 
org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:496)
May 24 02:57:52 at 
jdk.internal.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
May 24 02:57:52 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
May 24 02:57:52 at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
May 24 02:57:52 at 
org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRpcInvocation$1(PekkoRpcActor.java:318)
May 24 02:57:52 at 
org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
May 24 02:57:52 at 
org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcInvocation(PekkoRpcActor.java:316)
May 24 02:57:52 at 
org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:229)
May 24 02:57:52 at 
org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:88)
May 24 02:57:52 at 
org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:174)
May 24 02:57:52 at 
org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33)
May 24 02:57:52 at 
org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29)
May 24 02:57:52 at 
scala.PartialFunction.applyOrElse(PartialFunction.scala:127)
May 24 02:57:52 at 
scala.PartialFunction.applyOrElse$(PartialFunction.scala:126)
May 24 02:57:52 at 
org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29)
May 24 02:57:52 at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175)
May 24 02:57:52 at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176)
May 24 02:57:52 at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176)
May 24 02:57:52 at 
org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547)
May 24 02:57:52 at 
org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545)
May 24 02:57:52 at 
org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229)
May 24 02:57:52 at 
org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590)
May 24 02:57:52 at 
org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557)
May 24 02:57:52 at 
org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280)
May 24 02:57:52 at 
org.apache.pekko.dispatch.M

[PR] [FLINK-35435] Add timeout Configuration to Async Sink [flink]

2024-05-24 Thread via GitHub


vahmed-hamdy opened a new pull request, #24839:
URL: https://github.com/apache/flink/pull/24839

   
   
   ## What is the purpose of the change
   
   Implementation of 
[FLIP-451](https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API)
 introducing Timeout Configuration to Async Sink.
   
   
   ## Brief change log
   
   - Add `ResultHandler` class to be used by Sink implementers
   - Add `AsyncSinkWriterResultHandler` implementation that supports timeout
   - Add `requestTimeoutMs` and `failOnTimeout` configuration to 
`AsyncSinkWriterConfiguration` and to `AsyncSinkWriterConfigurationBuilder` (see 
the sketch after this list)
   - Add default values to `requestTimeoutMs` and `failOnTimeout` as suggested 
in FLIP-451
   - Add needed unit tests and refactored existing tests
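   
   A sketch of how a sink implementation might opt in, based on the change log 
above (the builder method names follow this PR and may differ in the final API):
   
   ```java
   AsyncSinkWriterConfiguration configuration =
           AsyncSinkWriterConfiguration.builder()
                   .setRequestTimeoutMs(600_000) // 10 minutes, the suggested default
                   .setFailOnTimeout(false)      // retry instead of failing the job
                   .build();
   ```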
   
   ## Verifying this change
   
   
   
   This change added tests and can be verified as follows:
   
   - Added unit tests
   - Performed Sanity testing and benchmarks on Kinesis Implementation as 
described in the [Ticket](https://issues.apache.org/jira/browse/FLINK-35435).
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? JavaDocs + [Follow Up Ticket
   ](https://issues.apache.org/jira/browse/FLINK-35445)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35435) [FLIP-451] Introduce timeout configuration to AsyncSink

2024-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35435:
---
Labels: pull-request-available  (was: )

> [FLIP-451] Introduce timeout configuration to AsyncSink
> ---
>
> Key: FLINK-35435
> URL: https://issues.apache.org/jira/browse/FLINK-35435
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Reporter: Ahmed Hamdy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
> Attachments: Screenshot 2024-05-24 at 11.06.30.png, Screenshot 
> 2024-05-24 at 12.06.20.png
>
>
> Implementation Ticket for:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35435] Add timeout Configuration to Async Sink [flink]

2024-05-24 Thread via GitHub


flinkbot commented on PR #24839:
URL: https://github.com/apache/flink/pull/24839#issuecomment-2129354320

   
   ## CI report:
   
   * 2b91a9c92af7d97d492a9c83d43d7c544f85d355 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35435) [FLIP-451] Introduce timeout configuration to AsyncSink

2024-05-24 Thread Ahmed Hamdy (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849272#comment-17849272
 ] 

Ahmed Hamdy commented on FLINK-35435:
-

[~danny.cranmer] Could you please take a look when you have time?

> [FLIP-451] Introduce timeout configuration to AsyncSink
> ---
>
> Key: FLINK-35435
> URL: https://issues.apache.org/jira/browse/FLINK-35435
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Common
>Reporter: Ahmed Hamdy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
> Attachments: Screenshot 2024-05-24 at 11.06.30.png, Screenshot 
> 2024-05-24 at 12.06.20.png
>
>
> Implementation Ticket for:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35447) Flink CDC documentation file was removed but is still accessible on the website

2024-05-24 Thread Zhongqiang Gong (Jira)
Zhongqiang Gong created FLINK-35447:
---

 Summary: Flink CDC documentation file was removed but is still accessible on the website
 Key: FLINK-35447
 URL: https://issues.apache.org/jira/browse/FLINK-35447
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: Zhongqiang Gong


https://nightlies.apache.org/flink/flink-cdc-docs-master/docs/connectors/overview/
This link should no longer be accessible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35434] Support pass exception in StateExecutor to runtime [flink]

2024-05-24 Thread via GitHub


zoltar9264 commented on code in PR #24833:
URL: https://github.com/apache/flink/pull/24833#discussion_r1613094184


##
flink-state-backends/flink-statebackend-forst/src/main/java/org/apache/flink/state/forst/ForStWriteBatchOperation.java:
##
@@ -75,7 +74,10 @@ public CompletableFuture process() {
 request.completeStateFuture();
 }
 } catch (Exception e) {
-throw new CompletionException("Error while adding data to ForStDB", e);
+for (ForStDBPutRequest request : batchRequest) {

Review Comment:
   Thanks for the reminder, @masteryhx. Firstly, this batch operation should indeed 
fail in this case. Currently, the AsyncFrameworkExceptionHandler fails the job 
directly, and the StateExecutor is then destroyed. I don't see a need to let the 
state executor continue executing, so I will fail the executor and update the PR.
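
   A minimal, self-contained sketch of the per-request failure propagation
   discussed here (`PutRequest` is a stand-in for `ForStDBPutRequest`; the real
   classes and method names live in this PR):

   ```java
   import java.util.List;
   import java.util.concurrent.CompletableFuture;

   // On a batch write failure, complete every pending request's future
   // exceptionally instead of throwing once for the whole batch, so each
   // state future observes the error.
   final class BatchFailureSketch {
       record PutRequest(CompletableFuture<Void> resultFuture) {}

       static void process(List<PutRequest> batch, Runnable writeBatch) {
           try {
               writeBatch.run(); // write the whole batch to the underlying DB
               batch.forEach(r -> r.resultFuture().complete(null));
           } catch (RuntimeException e) {
               batch.forEach(r -> r.resultFuture().completeExceptionally(
                       new RuntimeException("Error while adding data to ForStDB", e)));
           }
       }
   }
   ```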



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-35447][doc-ci] Flink CDC documentation file was removed but is still accessible on the website [flink-cdc]

2024-05-24 Thread via GitHub


GOODBOY008 opened a new pull request, #3362:
URL: https://github.com/apache/flink-cdc/pull/3362

   Solution:
   Deletes files from the destination directory that are not present in the 
source directory.
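
   A minimal sketch of that cleanup under assumed paths (the actual fix lives in
   the flink-cdc docs CI):

   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.stream.Stream;

   // Delete files in the destination directory that have no counterpart in the
   // source directory, so removed doc pages also disappear from the published site.
   final class SyncDeleteSketch {
       static void deleteOrphans(Path source, Path dest) throws IOException {
           try (Stream<Path> files = Files.walk(dest)) {
               files.filter(Files::isRegularFile)
                       .filter(f -> !Files.exists(source.resolve(dest.relativize(f))))
                       .forEach(f -> f.toFile().delete()); // orphaned in dest -> remove
           }
       }
   }
   ```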


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35447][doc-ci] Flink CDC documentation file was removed but is still accessible on the website [flink-cdc]

2024-05-24 Thread via GitHub


GOODBOY008 commented on PR #3362:
URL: https://github.com/apache/flink-cdc/pull/3362#issuecomment-2129373300

   @leonardBang PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35447) Flink CDC documentation file was removed but is still accessible on the website

2024-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35447:
---
Labels: pull-request-available  (was: )

> Flink CDC documentation file was removed but is still accessible on the website
> ---
>
> Key: FLINK-35447
> URL: https://issues.apache.org/jira/browse/FLINK-35447
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Reporter: Zhongqiang Gong
>Priority: Minor
>  Labels: pull-request-available
>
> https://nightlies.apache.org/flink/flink-cdc-docs-master/docs/connectors/overview/
> This link should no longer be accessible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-33892][runtime] Support Job Recovery from JobMaster Failures for Batch Jobs. [flink]

2024-05-24 Thread via GitHub


zhuzhurk commented on code in PR #24771:
URL: https://github.com/apache/flink/pull/24771#discussion_r1612989352


##
flink-tests/src/test/java/org/apache/flink/test/scheduling/JMFailoverITCase.java:
##
 @@ -532,9 +508,9 @@ private JobGraph createJobGraphWithUnsupportedBatchSnapshotOperatorCoordinator(
 return StreamingJobGraphGenerator.createJobGraph(streamGraph);
 }
 
-private static void fillKeepGoing(
-        List<Integer> indices, boolean going, Map<Integer, Boolean> keepGoing) {
-    indices.forEach(index -> keepGoing.put(index, going));
+private static void fillBlockSubTasks(

Review Comment:
   fillBlockSubTasks -> setSubtaskBlocked



##
flink-runtime/src/test/java/org/apache/flink/runtime/scheduler/adaptivebatch/BatchJobRecoveryTest.java:
##
@@ -0,0 +1,1173 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.scheduler.adaptivebatch;
+
+import org.apache.flink.api.common.JobID;
+import org.apache.flink.api.common.eventtime.WatermarkAlignmentParams;
+import org.apache.flink.api.connector.source.Boundedness;
+import org.apache.flink.api.connector.source.mocks.MockSource;
+import org.apache.flink.api.connector.source.mocks.MockSourceSplit;
+import org.apache.flink.api.connector.source.mocks.MockSplitEnumerator;
+import org.apache.flink.configuration.BatchExecutionOptions;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.runtime.clusterframework.types.ResourceID;
+import org.apache.flink.runtime.execution.ExecutionState;
+import org.apache.flink.runtime.executiongraph.DefaultExecutionGraph;
+import org.apache.flink.runtime.executiongraph.Execution;
+import org.apache.flink.runtime.executiongraph.ExecutionAttemptID;
+import org.apache.flink.runtime.executiongraph.ExecutionGraph;
+import org.apache.flink.runtime.executiongraph.ExecutionJobVertex;
+import org.apache.flink.runtime.executiongraph.ExecutionVertex;
+import org.apache.flink.runtime.executiongraph.IntermediateResultPartition;
+import org.apache.flink.runtime.executiongraph.InternalExecutionGraphAccessor;
+import org.apache.flink.runtime.executiongraph.ResultPartitionBytes;
+import org.apache.flink.runtime.executiongraph.TestingComponentMainThreadExecutor;
+import org.apache.flink.runtime.executiongraph.failover.FixedDelayRestartBackoffTimeStrategy;
+import org.apache.flink.runtime.io.network.partition.JobMasterPartitionTracker;
+import org.apache.flink.runtime.io.network.partition.JobMasterPartitionTrackerImpl;
+import org.apache.flink.runtime.io.network.partition.PartitionNotFoundException;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionID;
+import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
+import org.apache.flink.runtime.jobgraph.DistributionPattern;
+import org.apache.flink.runtime.jobgraph.IntermediateResultPartitionID;
+import org.apache.flink.runtime.jobgraph.JobGraph;
+import org.apache.flink.runtime.jobgraph.JobVertex;
+import org.apache.flink.runtime.jobgraph.JobVertexID;
+import org.apache.flink.runtime.jobgraph.OperatorID;
+import org.apache.flink.runtime.jobmaster.event.ExecutionVertexFinishedEvent;
+import org.apache.flink.runtime.jobmaster.event.FileSystemJobEventStore;
+import org.apache.flink.runtime.jobmaster.event.JobEvent;
+import org.apache.flink.runtime.jobmaster.event.JobEventManager;
+import org.apache.flink.runtime.jobmaster.event.JobEventStore;
+import org.apache.flink.runtime.jobmaster.utils.TestingJobMasterGateway;
+import org.apache.flink.runtime.jobmaster.utils.TestingJobMasterGatewayBuilder;
+import org.apache.flink.runtime.operators.coordination.EventReceivingTasks;
+import org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder;
+import org.apache.flink.runtime.operators.coordination.RecreateOnResetOperatorCoordinator;
+import org.apache.flink.runtime.operators.coordination.TestingOperatorCoordinator;
+import org.apache.flink.runtime.scheduler.DefaultSchedulerBuilder;
+import org.apache.flink.runtime.scheduler.SchedulerBase;
+import org.apache.flink.runtime.scheduler.strategy.ExecutionVertexID;
+import org.apache.flink.runtime.scheduler.strateg

Re: [PR] [FLINK-31223][sqlgateway] Introduce getFlinkConfigurationOptions to [flink]

2024-05-24 Thread via GitHub


davidradl commented on code in PR #24741:
URL: https://github.com/apache/flink/pull/24741#discussion_r1613388125


##
flink-table/flink-sql-client/src/test/java/org/apache/flink/table/client/SqlClientTestBase.java:
##
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.client;
+
+import org.apache.flink.core.testutils.CommonTestUtils;
+import org.apache.flink.table.client.cli.TerminalUtils;
+
+import org.jline.terminal.Size;
+import org.jline.terminal.Terminal;
+import org.junit.jupiter.api.AfterEach;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.io.TempDir;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.File;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.nio.file.StandardOpenOption;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.flink.configuration.ConfigConstants.ENV_FLINK_CONF_DIR;
+
+/** Base class for tests of {@link SqlClient}. */
+class SqlClientTestBase {
+    @TempDir private Path tempFolder;
+
+    protected String historyPath;
+
+    protected Map<String, String> originalEnv;
+
+    @BeforeEach
+    void before() throws IOException {
+        originalEnv = System.getenv();
+
+        // prepare conf dir
+        File confFolder = Files.createTempDirectory(tempFolder, "conf").toFile();
+        File confYaml = new File(confFolder, "config.yaml");

Review Comment:
   That was it - thanks for your support, @reswqa.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35129][cdc][postgres] Add checkpoint cycle option for commit offsets [flink-cdc]

2024-05-24 Thread via GitHub


morazow commented on code in PR #3349:
URL: https://github.com/apache/flink-cdc/pull/3349#discussion_r1613394340


##
flink-cdc-connect/flink-cdc-source-connectors/flink-connector-postgres-cdc/src/main/java/org/apache/flink/cdc/connectors/postgres/source/reader/PostgresSourceReader.java:
##
 @@ -104,6 +108,10 @@ public List snapshotState(long checkpointId) {
 
  @Override
  public void notifyCheckpointComplete(long checkpointId) throws Exception {
+this.checkpointCount = (this.checkpointCount + 1) % this.checkpointCycle;

Review Comment:
   Thanks guys for the feedback, I am going to check it 👍 
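
   For context, the modulo counter in the diff above makes the reader act only on
   every N-th completed checkpoint. A minimal sketch of that pattern (the
   `committed` list is a stand-in for the real offset commit):

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Perform the commit action only on every N-th completed checkpoint.
   final class CheckpointCycleSketch {
       private final int checkpointCycle;
       private int checkpointCount = 0;
       private final List<Long> committed = new ArrayList<>();

       CheckpointCycleSketch(int checkpointCycle) {
           this.checkpointCycle = checkpointCycle;
       }

       void notifyCheckpointComplete(long checkpointId) {
           checkpointCount = (checkpointCount + 1) % checkpointCycle;
           if (checkpointCount == 0) {
               committed.add(checkpointId); // commit offsets once per cycle
           }
       }
   }
   ```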



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-28440) EventTimeWindowCheckpointingITCase failed with restore

2024-05-24 Thread Ryan Skraba (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849282#comment-17849282
 ] 

Ryan Skraba commented on FLINK-28440:
-

* 1.19 Hadoop 3.1.3 / Test (module: tests) 
https://github.com/apache/flink/actions/runs/9217608890/job/25360146799#step:10:8157

> EventTimeWindowCheckpointingITCase failed with restore
> --
>
> Key: FLINK-28440
> URL: https://issues.apache.org/jira/browse/FLINK-28440
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.16.0, 1.17.0, 1.18.0, 1.19.0
>Reporter: Huang Xingbo
>Assignee: Yanfei Lei
>Priority: Critical
>  Labels: auto-deprioritized-critical, pull-request-available, 
> stale-assigned, test-stability
> Fix For: 1.20.0
>
> Attachments: image-2023-02-01-00-51-54-506.png, 
> image-2023-02-01-01-10-01-521.png, image-2023-02-01-01-19-12-182.png, 
> image-2023-02-01-16-47-23-756.png, image-2023-02-01-16-57-43-889.png, 
> image-2023-02-02-10-52-56-599.png, image-2023-02-03-10-09-07-586.png, 
> image-2023-02-03-12-03-16-155.png, image-2023-02-03-12-03-56-614.png
>
>
> {code:java}
> Caused by: java.lang.Exception: Exception while creating 
> StreamOperatorStateContext.
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:256)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:268)
>   at 
> org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:722)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:698)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:665)
>   at 
> org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
>   at 
> org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904)
>   at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.FlinkException: Could not restore keyed 
> state backend for WindowOperator_0a448493b4782967b150582570326227_(2/4) from 
> any of the 1 provided restore options.
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:160)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165)
>   ... 11 more
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: 
> /tmp/junit1835099326935900400/junit1113650082510421526/52ee65b7-033f-4429-8ddd-adbe85e27ced
>  (No such file or directory)
>   at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
>   at 
> org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87)
>   at 
> org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69)
>   at 
> org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:96)
>   at 
> org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:75)
>   at 
> org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:92)
>   at 
> org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136)
>   at 
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336)
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168)
>   at 
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>   ... 13 more
> Caused by: java.io.FileNotFoundException: 
> /tmp/junit1835099326935900

[jira] [Commented] (FLINK-35342) MaterializedTableStatementITCase test can check for wrong status

2024-05-24 Thread Ryan Skraba (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849283#comment-17849283
 ] 

Ryan Skraba commented on FLINK-35342:
-

* 1.20 AdaptiveScheduler / Test (module: table) 
https://github.com/apache/flink/actions/runs/9217608897/job/25360076574#step:10:12483

> MaterializedTableStatementITCase test can check for wrong status
> 
>
> Key: FLINK-35342
> URL: https://issues.apache.org/jira/browse/FLINK-35342
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.20.0
>Reporter: Ryan Skraba
>Assignee: Feng Jin
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> * 1.20 AdaptiveScheduler / Test (module: table) 
> https://github.com/apache/flink/actions/runs/9056197319/job/24879135605#step:10:12490
>  
> It looks like 
> {{MaterializedTableStatementITCase.testAlterMaterializedTableSuspendAndResume}}
>  can be flaky, where the expected status is not yet RUNNING:
> {code}
> Error: 03:24:03 03:24:03.902 [ERROR] Tests run: 6, Failures: 1, Errors: 0, 
> Skipped: 0, Time elapsed: 26.78 s <<< FAILURE! -- in 
> org.apache.flink.table.gateway.service.MaterializedTableStatementITCase
> Error: 03:24:03 03:24:03.902 [ERROR] 
> org.apache.flink.table.gateway.service.MaterializedTableStatementITCase.testAlterMaterializedTableSuspendAndResume(Path,
>  RestClusterClient) -- Time elapsed: 3.850 s <<< FAILURE!
> May 13 03:24:03 org.opentest4j.AssertionFailedError: 
> May 13 03:24:03 
> May 13 03:24:03 expected: "RUNNING"
> May 13 03:24:03  but was: "CREATED"
> May 13 03:24:03   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> May 13 03:24:03   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> May 13 03:24:03   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> May 13 03:24:03   at 
> org.apache.flink.table.gateway.service.MaterializedTableStatementITCase.testAlterMaterializedTableSuspendAndResume(MaterializedTableStatementITCase.java:650)
> May 13 03:24:03   at java.lang.reflect.Method.invoke(Method.java:498)
> May 13 03:24:03   at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> May 13 03:24:03   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> May 13 03:24:03   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> May 13 03:24:03   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> May 13 03:24:03   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> May 13 03:24:03 
> May 13 03:24:04 03:24:04.270 [INFO] 
> May 13 03:24:04 03:24:04.270 [INFO] Results:
> May 13 03:24:04 03:24:04.270 [INFO] 
> Error: 03:24:04 03:24:04.270 [ERROR] Failures: 
> Error: 03:24:04 03:24:04.271 [ERROR]   
> MaterializedTableStatementITCase.testAlterMaterializedTableSuspendAndResume:650
>  
> May 13 03:24:04 expected: "RUNNING"
> May 13 03:24:04  but was: "CREATED"
> May 13 03:24:04 03:24:04.271 [INFO] 
> Error: 03:24:04 03:24:04.271 [ERROR] Tests run: 82, Failures: 1, Errors: 0, 
> Skipped: 0
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35446) FileMergingSnapshotManagerBase throws a NullPointerException

2024-05-24 Thread Ryan Skraba (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849284#comment-17849284
 ] 

Ryan Skraba commented on FLINK-35446:
-

* 1.20 Java 11 / Test (module: tests) 
https://github.com/apache/flink/actions/runs/9217608897/job/25360103124#step:10:8641
* 1.20 Default (Java 8) / Test (module: table) 
https://github.com/apache/flink/actions/runs/9219075449/job/25363874486#step:10:11847
 {{PruneAggregateCallITCase.testNoneEmptyGroupKey}}
* 1.20 Default (Java 8) / Test (module: tests) 
https://github.com/apache/flink/actions/runs/9219075449/job/25363874825#step:10:8005

The last one is different from the others: 
{code}
Error: 05:48:38 05:48:38.790 [ERROR] Tests run: 11, Failures: 1, Errors: 0, 
Skipped: 0, Time elapsed: 12.78 s <<< FAILURE! -- in 
org.apache.flink.test.classloading.ClassLoaderITCase
Error: 05:48:38 05:48:38.790 [ERROR] 
org.apache.flink.test.classloading.ClassLoaderITCase.testCheckpointedStreamingClassloaderJobWithCustomClassLoader
 -- Time elapsed: 2.492 s <<< FAILURE!
May 24 05:48:38 org.assertj.core.error.AssertJMultipleFailuresError: 
May 24 05:48:38 
May 24 05:48:38 Multiple Failures (1 failure)
May 24 05:48:38 -- failure 1 --
May 24 05:48:38 [Any cause is instance of class 'class 
org.apache.flink.util.SerializedThrowable' and contains message 
'org.apache.flink.test.classloading.jar.CheckpointedStreamingProgram$SuccessException']
 
May 24 05:48:38 Expecting any element of:
May 24 05:48:38   [org.apache.flink.client.program.ProgramInvocationException: 
The main method caused an error: Job execution failed.
May 24 05:48:38 at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:373)
May 24 05:48:38 at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:223)
May 24 05:48:38 at 
org.apache.flink.test.classloading.ClassLoaderITCase.lambda$testCheckpointedStreamingClassloaderJobWithCustomClassLoader$1(ClassLoaderITCase.java:260)
May 24 05:48:38 ...(54 remaining lines not displayed - this can be 
changed with Assertions.setMaxStackTraceElementsDisplayed),
May 24 05:48:38 org.apache.flink.runtime.client.JobExecutionException: Job 
execution failed.
May 24 05:48:38 at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
May 24 05:48:38 at 
org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141)
May 24 05:48:38 at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
May 24 05:48:38 ...(45 remaining lines not displayed - this can be 
changed with Assertions.setMaxStackTraceElementsDisplayed),
May 24 05:48:38 org.apache.flink.runtime.JobException: Recovery is 
suppressed by FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=1, 
backoffTimeMS=100)
May 24 05:48:38 at 
org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:219)
May 24 05:48:38 at 
org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.handleFailureAndReport(ExecutionFailureHandler.java:166)
May 24 05:48:38 at 
org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:121)
May 24 05:48:38 ...(36 remaining lines not displayed - this can be 
changed with Assertions.setMaxStackTraceElementsDisplayed),
May 24 05:48:38 java.lang.NullPointerException
May 24 05:48:38 at 
org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManagerBase.isManagedByFileMergingManager(FileMergingSnapshotManagerBase.java:733)
May 24 05:48:38 at 
org.apache.flink.runtime.checkpoint.filemerging.FileMergingSnapshotManagerBase.lambda$null$4(FileMergingSnapshotManagerBase.java:687)
May 24 05:48:38 at java.util.HashMap.computeIfAbsent(HashMap.java:1128)
May 24 05:48:38 ...(41 remaining lines not displayed - this can be 
changed with Assertions.setMaxStackTraceElementsDisplayed)]
May 24 05:48:38 to satisfy the given assertions requirements but none did:
May 24 05:48:38 
May 24 05:48:38 org.apache.flink.client.program.ProgramInvocationException: The 
main method caused an error: Job execution failed.
May 24 05:48:38 at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:373)
May 24 05:48:38 at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:223)
May 24 05:48:38 at 
org.apache.flink.test.classloading.ClassLoaderITCase.lambda$testCheckpointedStreamingClassloaderJobWithCustomClassLoader$1(ClassLoaderITCase.java:260)
May 24 05:48:38 ...(54 remaining lines not displayed - this can be 
changed with Assertions.setMaxStackTraceE

[jira] [Commented] (FLINK-35446) FileMergingSnapshotManagerBase throws a NullPointerException

2024-05-24 Thread Ryan Skraba (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849285#comment-17849285
 ] 

Ryan Skraba commented on FLINK-35446:
-

[~lijinzhong] or [~zakelly], do you think this needs a fix similar to FLINK-35382?

> FileMergingSnapshotManagerBase throws a NullPointerException
> 
>
> Key: FLINK-35446
> URL: https://issues.apache.org/jira/browse/FLINK-35446
> Project: Flink
>  Issue Type: Bug
>Reporter: Ryan Skraba
>Priority: Critical
>  Labels: test-stability
>
> * 1.20 Java 11 / Test (module: tests) 
> https://github.com/apache/flink/actions/runs/9217608897/job/25360103124#step:10:8641
> {{ResumeCheckpointManuallyITCase.testExternalizedIncrementalRocksDBCheckpointsWithLocalRecoveryZookeeper}}
>  throws a NullPointerException when it tries to restore state handles: 
> {code}
> Error: 02:57:52 02:57:52.551 [ERROR] Tests run: 48, Failures: 0, Errors: 1, 
> Skipped: 0, Time elapsed: 268.6 s <<< FAILURE! -- in 
> org.apache.flink.test.checkpointing.ResumeCheckpointManuallyITCase
> Error: 02:57:52 02:57:52.551 [ERROR] 
> org.apache.flink.test.checkpointing.ResumeCheckpointManuallyITCase.testExternalizedIncrementalRocksDBCheckpointsWithLocalRecoveryZookeeper[RestoreMode
>  = CLAIM] -- Time elapsed: 3.145 s <<< ERROR!
> May 24 02:57:52 org.apache.flink.runtime.JobException: Recovery is suppressed 
> by NoRestartBackoffTimeStrategy
> May 24 02:57:52   at 
> org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:219)
> May 24 02:57:52   at 
> org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.handleFailureAndReport(ExecutionFailureHandler.java:166)
> May 24 02:57:52   at 
> org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:121)
> May 24 02:57:52   at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.recordTaskFailure(DefaultScheduler.java:279)
> May 24 02:57:52   at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:270)
> May 24 02:57:52   at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.onTaskFailed(DefaultScheduler.java:263)
> May 24 02:57:52   at 
> org.apache.flink.runtime.scheduler.SchedulerBase.onTaskExecutionStateUpdate(SchedulerBase.java:788)
> May 24 02:57:52   at 
> org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:765)
> May 24 02:57:52   at 
> org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:83)
> May 24 02:57:52   at 
> org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:496)
> May 24 02:57:52   at 
> jdk.internal.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
> May 24 02:57:52   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> May 24 02:57:52   at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> May 24 02:57:52   at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.lambda$handleRpcInvocation$1(PekkoRpcActor.java:318)
> May 24 02:57:52   at 
> org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
> May 24 02:57:52   at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcInvocation(PekkoRpcActor.java:316)
> May 24 02:57:52   at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleRpcMessage(PekkoRpcActor.java:229)
> May 24 02:57:52   at 
> org.apache.flink.runtime.rpc.pekko.FencedPekkoRpcActor.handleRpcMessage(FencedPekkoRpcActor.java:88)
> May 24 02:57:52   at 
> org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleMessage(PekkoRpcActor.java:174)
> May 24 02:57:52   at 
> org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33)
> May 24 02:57:52   at 
> org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29)
> May 24 02:57:52   at 
> scala.PartialFunction.applyOrElse(PartialFunction.scala:127)
> May 24 02:57:52   at 
> scala.PartialFunction.applyOrElse$(PartialFunction.scala:126)
> May 24 02:57:52   at 
> org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29)
> May 24 02:57:52   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175)
> May 24 02:57:52   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176)
> May 24 02:57:52   at 
> scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176)
> May 24 02:57:52   at 
> org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547)
> May 24 02:57:52   at 
> org.apache.pekko.actor.Actor.ar

Re: [PR] [FLINK-34487][ci] Adds Python Wheels nightly GHA workflow [flink]

2024-05-24 Thread via GitHub


morazow commented on PR #24426:
URL: https://github.com/apache/flink/pull/24426#issuecomment-2129446733

   Hey @XComp, @HuangXingBo,
   
   What do you think of the latest commit changes for the migration?
   
   This way both the Linux and macOS platforms use a similar build system, and we 
no longer depend on a bash script on Linux. All the wheel build requirements 
(Python versions, architecture, etc.) are defined in `pyproject.toml`, which is 
used for both platforms.
   
   For Linux I had to use `manylinux2014`, since `manylinux1` does not support 
Python 3.10+.
   
   Please let me know what you think.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34746] Switching to the Apache CDN for Dockerfile [flink-docker]

2024-05-24 Thread via GitHub


MartijnVisser commented on PR #190:
URL: https://github.com/apache/flink-docker/pull/190#issuecomment-2129446485

   @hlteoh37 Done, but I'm not sure if this actually needs to be backported to 
the other branches. @lincoln-lil do you know?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33759] [flink-parquet] Add support for nested array with row/map/array type [flink]

2024-05-24 Thread via GitHub


ViktorCosenza commented on PR #24795:
URL: https://github.com/apache/flink/pull/24795#issuecomment-2129462557

   @JingGe Are you able to trigger the CI manually? I think it wasn't triggered 
after the squash because no changes were detected.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34487][ci] Adds Python Wheels nightly GHA workflow [flink]

2024-05-24 Thread via GitHub


morazow commented on PR #24426:
URL: https://github.com/apache/flink/pull/24426#issuecomment-2129494739

   Test run built wheel artifacts: 
https://github.com/morazow/flink/actions/runs/9224143298


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (FLINK-35448) Translate pod templates documentation into Chinese

2024-05-24 Thread Caican Cai (Jira)
Caican Cai created FLINK-35448:
--

 Summary: Translate pod templates documentation into Chinese
 Key: FLINK-35448
 URL: https://issues.apache.org/jira/browse/FLINK-35448
 Project: Flink
  Issue Type: Sub-task
Reporter: Caican Cai


Translate pod templates documentation into Chinese
 
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35448) Translate pod templates documentation into Chinese

2024-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35448:
---
Labels: pull-request-available  (was: )

> Translate pod templates documentation into Chinese
> --
>
> Key: FLINK-35448
> URL: https://issues.apache.org/jira/browse/FLINK-35448
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Caican Cai
>Priority: Minor
>  Labels: pull-request-available
>
> Translate pod templates documentation into Chinese
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [FLINK-35448] Translate pod templates documentation into Chinese [flink-kubernetes-operator]

2024-05-24 Thread via GitHub


caicancai opened a new pull request, #830:
URL: https://github.com/apache/flink-kubernetes-operator/pull/830

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request adds a new feature to periodically create 
and maintain savepoints through the `FlinkDeployment` custom resource.)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *Periodic savepoint trigger is introduced to the custom resource*
 - *The operator checks on reconciliation whether the required time has 
passed*
 - *The JobManager's dispose savepoint API is used to clean up obsolete 
savepoints*
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
(yes / no)
 - Core observer or reconciler logic that is regularly executed: (yes / no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-35052] Reject unsupported versions in the webhook validator [flink-kubernetes-operator]

2024-05-24 Thread via GitHub


afedulov opened a new pull request, #831:
URL: https://github.com/apache/flink-kubernetes-operator/pull/831

   ## What is the purpose of the change
   
   The admission webhook currently does not verify if FlinkDeployment CR 
utilizes Flink versions that are not supported by the Operator. This causes the 
CR to be accepted and the failure to be postponed until the reconciliation 
phase. We should instead fail fast and provide users direct feedback.
   
   ## Brief change log
   
   Adds a Flink version check to the validator
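
   A minimal sketch of such a fail-fast check (the supported-version set and the
   `Optional<String>` error convention here are illustrative assumptions; the real
   check lives in the operator's validator):

   ```java
   import java.util.Optional;
   import java.util.Set;

   // Reject unsupported Flink versions in the admission webhook, before
   // reconciliation ever starts.
   final class FlinkVersionCheckSketch {
       // assumption: the versions this operator release supports
       private static final Set<String> SUPPORTED = Set.of("v1_16", "v1_17", "v1_18", "v1_19");

       /** Returns an error message for an invalid spec, or empty if it is valid. */
       static Optional<String> validateFlinkVersion(String flinkVersion) {
           if (flinkVersion == null || !SUPPORTED.contains(flinkVersion)) {
               return Optional.of("Unsupported Flink version: " + flinkVersion);
           }
           return Optional.empty();
       }
   }
   ```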
   
   ## Verifying this change
   Added a test case to the existing test suite
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changes to the `CustomResourceDescriptors`: 
(yes / **no**)
 - Core observer or reconciler logic that is regularly executed: (yes / 
**no**)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**yes** / no)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-35052) Webhook validator should reject unsupported Flink versions

2024-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35052:
---
Labels: pull-request-available  (was: )

> Webhook validator should reject unsupported Flink versions
> --
>
> Key: FLINK-35052
> URL: https://issues.apache.org/jira/browse/FLINK-35052
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Alexander Fedulov
>Assignee: Alexander Fedulov
>Priority: Major
>  Labels: pull-request-available
>
> The admission webhook currently does not verify if FlinkDeployment CR 
> utilizes Flink versions that are not supported by the Operator. This causes 
> the CR to be accepted and the failure to be postponed until the 
> reconciliation phase. We should instead fail fast and provide users direct 
> feedback.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35424][connectors/elasticsearch] Elasticsearch connector 8 supports SSL context [flink-connector-elasticsearch]

2024-05-24 Thread via GitHub


reta commented on code in PR #104:
URL: 
https://github.com/apache/flink-connector-elasticsearch/pull/104#discussion_r1613488632


##
flink-connector-elasticsearch8/src/test/java/org/apache/flink/connector/elasticsearch/sink/Elasticsearch8TestUtils.java:
##
@@ -0,0 +1,69 @@
+package org.apache.flink.connector.elasticsearch.sink;
+
+import org.apache.http.util.EntityUtils;
+import org.elasticsearch.client.Request;
+import org.elasticsearch.client.Response;
+import org.elasticsearch.client.RestClient;
+import org.testcontainers.utility.DockerImageName;
+
+import java.io.IOException;
+
+import static org.assertj.core.api.Assertions.assertThat;
+
+/** Utility class for Elasticsearch8 tests. */
+public class Elasticsearch8TestUtils {

Review Comment:
   Why do we need this class instead of reusing the existing `ElasticsearchSinkBaseITCase`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


