[jira] [Assigned] (FLINK-32445) BlobStore.closeAndCleanupAllData doesn't do any close action
[ https://issues.apache.org/jira/browse/FLINK-32445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Pohl reassigned FLINK-32445:
-------------------------------------

    Assignee: Jiabao Sun

> BlobStore.closeAndCleanupAllData doesn't do any close action
> -------------------------------------------------------------
>
>                 Key: FLINK-32445
>                 URL: https://issues.apache.org/jira/browse/FLINK-32445
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Runtime / Coordination
>    Affects Versions: 1.18.0
>            Reporter: Matthias Pohl
>            Assignee: Jiabao Sun
>            Priority: Minor
>              Labels: auto-deprioritized-major, pull-request-available, starter
>
> We might want to refactor {{BlobStore.closeAndCleanupAllData}}: it doesn't
> close any resources (and doesn't need to). Therefore, renaming the interface's
> method to {{cleanupAllData}} seems more appropriate.
> This enables us to remove redundant code in
> {{AbstractHaServices.closeAndCleanupAllData}} and {{AbstractHaServices.close}}.
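A minimal sketch of the rename the ticket proposes; the surrounding interface is illustrative, not the full BlobStore API:

{code:java}
// Hypothetical sketch of FLINK-32445: the method only deletes stored data and
// closes no resources, so the "close" part of the old name is dropped.
public interface BlobStore {

    /** Formerly closeAndCleanupAllData(); deletes all blobs kept in this store. */
    void cleanupAllData();
}
{code}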
Re: [PR] [FLINK-32445][runtime] Refactor BlobStoreService's closeAndCleanupAllData to cleanupAllData [flink]
XComp commented on code in PR #23424:
URL: https://github.com/apache/flink/pull/23424#discussion_r1345334927

## flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/AbstractHaServices.java:
@@ -192,32 +192,18 @@ public void closeAndCleanupAllData() throws Exception {
         try {
             internalCleanup();
             deletedHAData = true;
+            blobStoreService.cleanupAllData();
         } catch (Exception t) {
             exception = t;
         }
-        try {
-            if (leaderElectionService != null) {
-                leaderElectionService.close();
-            }
-        } catch (Throwable t) {
-            exception = ExceptionUtils.firstOrSuppressed(t, exception);
+        if (!deletedHAData) {
+            logger.info(
+                    "Cannot delete HA blobs because we failed to delete the pointers in the HA store.");
         }

         try {
-            internalClose();
-        } catch (Throwable t) {
-            exception = ExceptionUtils.firstOrSuppressed(t, exception);
-        }
-
-        try {
-            if (deletedHAData) {
-                blobStoreService.closeAndCleanupAllData();
-            } else {
-                logger.info(
-                        "Cannot delete HA blobs because we failed to delete the pointers in the HA store.");
-                blobStoreService.close();
-            }
+            close();

Review Comment:
   I'm wondering whether we should apply the same pattern to the `HighAvailabilityServices` interface. There we have two methods, `close()` and `closeAndCleanupAllData()`, which are called in [MiniCluster#terminateMiniClusterServices(boolean):1281ff](https://github.com/apache/flink/blob/603181da811edb47c0d573492639a381fbbedc28/flink-runtime/src/main/java/org/apache/flink/runtime/minicluster/MiniCluster.java#L1281) and [ClusterEntrypoint#stopClusterServices(boolean):503ff](https://github.com/apache/flink/blob/603181da811edb47c0d573492639a381fbbedc28/flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/ClusterEntrypoint.java#L503), respectively. Instead, `HighAvailabilityServices` could implement `AutoCloseable`, and `closeAndCleanupAllData()` could be refactored into `cleanupAllData()`. WDYT?
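A rough sketch of the interface shape this suggestion implies — hypothetical, not the actual Flink interface:

```java
// Hypothetical sketch of XComp's suggestion: HighAvailabilityServices extends
// AutoCloseable, and data deletion is separated from resource closing.
public interface HighAvailabilityServices extends AutoCloseable {

    /** Closes leader election/retrieval services etc.; deletes no HA data. */
    @Override
    void close() throws Exception;

    /** Deletes HA data (pointers first, then blobs); closing stays in close(). */
    void cleanupAllData() throws Exception;
}
```

Callers like MiniCluster#terminateMiniClusterServices(boolean) would then call `cleanupAllData()` followed by `close()` instead of choosing between two close variants.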
Re: [PR] [FLINK-33007] Integrate autoscaler config validation into the general validator flow [flink-kubernetes-operator]
gyfora commented on code in PR #682:
URL: https://github.com/apache/flink-kubernetes-operator/pull/682#discussion_r1345402764

## flink-kubernetes-operator-autoscaler/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/validation/AutoScalerValidator.java:
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.kubernetes.operator.autoscaler.validation;
+
+import org.apache.flink.configuration.ConfigOption;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.kubernetes.operator.api.FlinkDeployment;
+import org.apache.flink.kubernetes.operator.api.FlinkSessionJob;
+import org.apache.flink.kubernetes.operator.api.spec.FlinkDeploymentSpec;
+import org.apache.flink.kubernetes.operator.autoscaler.config.AutoScalerOptions;
+import org.apache.flink.kubernetes.operator.utils.FlinkUtils;
+import org.apache.flink.kubernetes.operator.validation.FlinkResourceValidator;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Optional;
+
+/** Validator implementation for the AutoScaler from {@link Configuration}. */
+public class AutoScalerValidator implements FlinkResourceValidator {
+
+    private static final Logger LOG = LoggerFactory.getLogger(FlinkUtils.class);
+
+    @Override
+    public Optional<String> validateSessionJob(
+            FlinkSessionJob sessionJob, Optional<FlinkDeployment> session) {
+        LOG.info("AutoScaler Configurations validator {}", sessionJob);
+        var spec = sessionJob.getSpec();
+        if (spec.getFlinkConfiguration() != null) {
+            var flinkConfiguration = Configuration.fromMap(spec.getFlinkConfiguration());
+            return validateAutoScalerFlinkConfiguration(flinkConfiguration);
+        }
+        return Optional.empty();
+    }
+
+    @Override
+    public Optional<String> validateDeployment(FlinkDeployment deployment) {
+        LOG.info("AutoScaler Configurations validator {}", deployment);
+        FlinkDeploymentSpec spec = deployment.getSpec();
+        if (spec.getFlinkConfiguration() != null) {
+            var flinkConfiguration = Configuration.fromMap(spec.getFlinkConfiguration());
+            return validateAutoScalerFlinkConfiguration(flinkConfiguration);
+        }
+        return Optional.empty();
+    }
+
+    private Optional<String> validateAutoScalerFlinkConfiguration(
+            Configuration flinkConfiguration) {
+        return firstPresent(
+                validateNumber(flinkConfiguration, AutoScalerOptions.MAX_SCALE_DOWN_FACTOR, 0.0d),
+                validateNumber(flinkConfiguration, AutoScalerOptions.MAX_SCALE_UP_FACTOR, 1.0d),
+                validateNumber(flinkConfiguration, AutoScalerOptions.TARGET_UTILIZATION, 0.0d),
+                validateNumber(
+                        flinkConfiguration,
+                        AutoScalerOptions.TARGET_UTILIZATION_BOUNDARY,
+                        0.0d,
+                        1.0d));

Review Comment:
   I think we should not validate the max here.
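A sketch of the lower-bound-only variant this comment suggests; the helper name and signature are assumptions based on the quoted hunk, not the actual operator code:

```java
// Hypothetical helper: validate only a minimum bound, per the review comment
// that the max should not be checked here.
private Optional<String> validateNumber(
        Configuration flinkConfiguration, ConfigOption<Double> option, double min) {
    Double value = flinkConfiguration.get(option);
    if (value != null && value < min) {
        return Optional.of(
                String.format("%s must be at least %s, but was %s", option.key(), min, value));
    }
    return Optional.empty();
}
```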
Re: [PR] [FLINK-33095] Job jar issue should be reported as BAD_REQUEST instead of INTERNAL_SERVER_ERROR. [flink]
XComp commented on code in PR #23428:
URL: https://github.com/apache/flink/pull/23428#discussion_r1345443798

## flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/utils/JarHandlerUtils.java:
@@ -189,7 +189,11 @@ public PackagedProgram toPackagedProgram(Configuration configuration) {
                     .setArguments(programArgs.toArray(new String[0]))
                     .build();
         } catch (final ProgramInvocationException e) {
-            throw new CompletionException(e);
+            throw new CompletionException(

Review Comment:
   I'm not sure whether we can assume here that the `ProgramInvocationException` is caused by something the user provided (which would allow the BAD_REQUEST response). [PackagedProgram#extractContainedLibraries](https://github.com/apache/flink/blob/c08bef1f7913eb1c416c278354fd62b82b172549/flink-clients/src/main/java/org/apache/flink/client/program/PackagedProgram.java#L554) throws a `ProgramInvocationException` for unknown IO errors, for instance, which would actually be an internal issue, wouldn't it?

## flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/AbstractHandler.java:
@@ -242,8 +242,14 @@ private CompletableFuture<Void> handleException(
             return CompletableFuture.completedFuture(null);
         }
         int maxLength = flinkHttpObjectAggregator.maxContentLength() - OTHER_RESP_PAYLOAD_OVERHEAD;
-        if (throwable instanceof RestHandlerException) {
-            RestHandlerException rhe = (RestHandlerException) throwable;
+        if (throwable instanceof RestHandlerException
+                || throwable.getCause() instanceof RestHandlerException) {

Review Comment:
   That's a strange change (checking both the exception and the exception's cause), don't you think? It might be an indication that we should make the change somewhere else. WDYT? :thinking:
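If matching on the cause chain does turn out to be the right call, Flink's `ExceptionUtils.findThrowable` already walks the whole chain instead of only one level; a sketch (the wrapping class is hypothetical):

```java
import org.apache.flink.runtime.rest.handler.RestHandlerException;
import org.apache.flink.util.ExceptionUtils;

import java.util.Optional;

// Sketch: search the whole cause chain once instead of special-casing the
// top-level throwable and its direct cause.
final class RestErrorMatching {

    static Optional<RestHandlerException> findRestHandlerException(Throwable throwable) {
        return ExceptionUtils.findThrowable(throwable, RestHandlerException.class);
    }
}
```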
[jira] [Resolved] (FLINK-30025) Unified the max display column width for SqlClient and Table API in both Streaming and Batch execMode
[ https://issues.apache.org/jira/browse/FLINK-30025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Ge resolved FLINK-30025.
-----------------------------
    Resolution: Fixed

> Unified the max display column width for SqlClient and Table API in both
> Streaming and Batch execMode
> ------------------------------------------------------------------------
>
>                 Key: FLINK-30025
>                 URL: https://issues.apache.org/jira/browse/FLINK-30025
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API
>    Affects Versions: 1.16.0, 1.15.2
>            Reporter: Jing Ge
>            Assignee: Jing Ge
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.18.0
>
> FLIP-279
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-279+Unified+the+max+display+column+width+for+SqlClient+and+Table+APi+in+both+Streaming+and+Batch+execMode
>
> Background info:
> table.execute().print() can only use the default max column width. When
> running the Table API program "table.execute().print();", columns with long
> string values are truncated to 30 chars, e.g. [screenshot of truncated columns omitted]
> I tried setting the max width with:
> tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width", 100);
> It has no effect. How can I set the max width?
> Here is the example code:
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> val tEnv = StreamTableEnvironment.create(env)
> tEnv.getConfig.getConfiguration.setInteger("sql-client.display.max-column-width", 100)
> val orderA = env
>   .fromCollection(Seq(Order(1L, "beer", 3), Order(1L,
>     "diaper-{-}{{-}}.diaper{{-}}{-}{-}.diaper{-}{-}{{-}}.diaper{{-}}{-}-.", 4),
>     Order(3L, "rubber", 2)))
>   .toTable(tEnv)
> orderA.execute().print()
>
> "sql-client.display.max-column-width" seems to only work in the CLI:
> SET 'sql-client.display.max-column-width' = '40';
> While using the Table API, the DEFAULT_MAX_COLUMN_WIDTH in PrintStyle is used
> by default. It should be configurable.
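For reference, the unified option that FLIP-279 introduces is `table.display.max-column-width`; treat the exact key as an assumption here and verify it against your Flink version. A minimal Table API usage sketch:

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class MaxColumnWidthExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Option key from FLIP-279; check the docs of your Flink version.
        tEnv.getConfig().set("table.display.max-column-width", "100");
        tEnv.sqlQuery("SELECT 'a-very-long-string-value-that-would-otherwise-be-truncated'")
                .execute()
                .print();
    }
}
{code}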
Re: [PR] [FLINK-33053][zookeeper] Manually remove the leader watcher after ret… [flink]
XComp commented on code in PR #23415:
URL: https://github.com/apache/flink/pull/23415#discussion_r1345486880

## flink-runtime/src/main/java/org/apache/flink/runtime/leaderretrieval/ZooKeeperLeaderRetrievalDriver.java:
@@ -122,6 +125,19 @@ public void close() throws Exception {
         client.getConnectionStateListenable().removeListener(connectionStateListener);

         cache.close();
+
+        try {
+            if (client.getZookeeperClient().isConnected()
+                    && !connectionInformationPath.contains(RESOURCE_MANAGER_NODE)) {

Review Comment:
   @KarmaGYZ Sorry for jumping in that late. I was on vacation for the past three weeks. I have a question about this change: why do we have to filter the ResourceManager node here? It feels odd to refer to the ResourceManager in the `ZooKeeperLeaderRetrievalDriver`. Knowledge about the ResourceManager component should be out-of-scope for the driver implementation. WDYT?
[jira] [Updated] (FLINK-33113) YARNSessionFIFOITCase.checkForProhibitedLogContents: Interrupted waiting to send RPC request to server
[ https://issues.apache.org/jira/browse/FLINK-33113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Pohl updated FLINK-33113:
----------------------------------
    Description:

This build https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53197&view=logs&j=5cae8624-c7eb-5c51-92d3-4d2dacedd221&t=5acec1b4-945b-59ca-34f8-168928ce5199&l=27683 fails as

{code}
Sep 14 05:52:35 2023-09-14 05:51:31,456 ERROR org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl [] - Exception on heartbeat
Sep 14 05:52:35 java.io.InterruptedIOException: Interrupted waiting to send RPC request to server
Sep 14 05:52:35 java.io.InterruptedIOException: Interrupted waiting to send RPC request to server
Sep 14 05:52:35 	at org.apache.hadoop.ipc.Client.call(Client.java:1461) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.ipc.Client.call(Client.java:1403) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at com.sun.proxy.$Proxy31.allocate(Unknown Source) ~[?:?]
Sep 14 05:52:35 	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77) ~[hadoop-yarn-common-2.10.2.jar:?]
Sep 14 05:52:35 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_292]
Sep 14 05:52:35 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_292]
Sep 14 05:52:35 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_292]
Sep 14 05:52:35 	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_292]
Sep 14 05:52:35 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at com.sun.proxy.$Proxy32.allocate(Unknown Source) ~[?:?]
Sep 14 05:52:35 	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:297) ~[hadoop-yarn-client-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:274) [hadoop-yarn-client-2.10.2.jar:?]
Sep 14 05:52:35 Caused by: java.lang.InterruptedException
Sep 14 05:52:35 	at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_292]
Sep 14 05:52:35 	at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_292]
Sep 14 05:52:35 	at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1177) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.ipc.Client.call(Client.java:1456) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	... 17 more
{code}

was:

This build https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53197&view=logs&j=5cae8624-c7eb-5c51-92d3-4d2dacedd221&t=5acec1b4-945b-59ca-34f8-168928ce5199&l=27683 fails as

{noformat}
Sep 14 05:52:35 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) ~[hadoop-common-2.10.2.jar:?]
Sep 14 05:52:35 	at com.sun.proxy.$Proxy32.allocate(Unknown Source) ~[?:?]
Sep 14 05:52:35 	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:297) ~[hadoop-yarn-client-2.10.2.jar:?]
Sep 14 05:52:35 	at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:274) [hadoop-yarn-client-2.10.2.jar:?]
Sep 14 05:52:35 Caused by: java.lang.InterruptedException
Sep 14 05:52:35 	at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_292]
Sep 14 05:52:35 	at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_292]
Sep 14 05:52:35 	at org.apache.hadoop.ipc.Client$Connection.s
Re: [PR] [FLINK-33179] Throw exception when serialising or deserialising ExecNode with invalid type [flink]
twalthr commented on code in PR #23488:
URL: https://github.com/apache/flink/pull/23488#discussion_r1345505603

## flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeContext.java:
@@ -104,9 +104,16 @@ private ExecNodeContext(@Nullable Integer id, String name, Integer version) {
     @JsonCreator
     public ExecNodeContext(String value) {
         this.id = null;
-        String[] split = value.split("_");
-        this.name = split[0];
-        this.version = Integer.valueOf(split[1]);
+        try {
+            String[] split = value.split("_");
+            this.name = split[0];
+            this.version = Integer.valueOf(split[1]);
+        } catch (Exception e) {
+            throw new TableException(

Review Comment:
   Not sure if this generic catch-all is helpful, esp. instructing users to file a ticket although we already know that we serialized an invalid node. Should we explicitly check for `name == "null"` and tell users that the node is unsupported?

## flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeContext.java:
@@ -167,12 +174,19 @@ public ExecNodeContext withId(int id) {
      */
     @JsonValue
     public String getTypeAsString() {
+        if (name == null || version == null) {
+            throw new TableException(
+                    String.format(
+                            "Can not serialise ExecNode with id: %d. Missing type, this is a bug,"

Review Comment:
   nit: use American English (`Can not serialize`)

## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/serde/ExecNodeGraphJsonSerializerTest.java:
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec.serde;
+
+import org.apache.flink.FlinkVersion;
+import org.apache.flink.api.dag.Transformation;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.table.api.DataTypes;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.planner.delegation.PlannerBase;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNodeBase;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNodeConfig;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNodeContext;
+import org.apache.flink.table.planner.plan.nodes.exec.ExecNodeGraph;
+
+import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectWriter;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.Collections;
+
+import static org.assertj.core.api.Assertions.assertThatThrownBy;
+
+/** Tests for {@link ExecNodeGraphJsonSerializer}. */
+class ExecNodeGraphJsonSerializerTest {
+
+    @Test
+    void testSerialisingUnsupportedNode() {

Review Comment:
   `testSerializing`

## flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/UnsupportedNodesInPlanTest.java:
@@ -0,0 +1,219 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.plan.nodes.exec;
+
+import org.apache.flink.table.api.EnvironmentSettings;
+import org.apache.flink.table.api.PlanReference;
+import org.apache.flink.table.api.TableEnvironment;
+import org.apache.flink.table.planner.utils.TableTestBase;
+
+import org.junit.Test;
+
+import static org
Re: [PR] [hotfix][docs] Update KDA to MSF in vendor solutions docs (1.15) [flink]
dannycranmer merged PR #23487:
URL: https://github.com/apache/flink/pull/23487
[jira] [Commented] (FLINK-33178) Highly parallel apps suffer from bottleneck in NativePRNG
[ https://issues.apache.org/jira/browse/FLINK-33178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771788#comment-17771788 ]

Emre Kartoglu commented on FLINK-33178:
---------------------------------------

[~martijnvisser] Thanks for the comment. You're right to point that out. I just double-checked this: I can see the same code path to the same Netty call in the master branch: https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/PartitionRequestQueue.java#L320C21-L320C28

Unless Netty has done something to improve it, I believe the bottleneck might still be there. Looking at the code https://github.com/netty/netty/blob/d773f37e3422b8bc38429bbde94583173c3b7e4a/handler/src/main/java/io/netty/handler/ssl/SslHandler.java#L809 I can see the same call to SSLEngine. So I believe the same bottleneck might still be present at the current master head. It also depends on the JDK being used, so users could work around the issue by upgrading the JDK, etc.

I am happy to report the issue. But please feel free to close the ticket if you think the issue is best addressed elsewhere, or if you think we should observe it in practice in newer Flink versions. Judging by the code path, I believe we would see the issue in newer versions too.

> Highly parallel apps suffer from bottleneck in NativePRNG
> ----------------------------------------------------------
>
>                 Key: FLINK-33178
>                 URL: https://issues.apache.org/jira/browse/FLINK-33178
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.13.2
>            Reporter: Emre Kartoglu
>            Priority: Major
>
> I observed the below thread dumps that highlighted a potential bottleneck in
> Flink/Netty/JDK. The application (Flink 1.13) from which I took the thread
> dumps had very high parallelism and was distributed on nodes with >150GB
> random access memory.
> It appears that there is a call to "Arrays.copyOfRange" in a synchronized
> block in "sun.security.provider.NativePRNG", which blocks other threads
> waiting for the lock to the same synchronized block. This appears to be a
> problem only with highly parallel applications. I don't know exactly at what
> parallelism it starts becoming a problem, and how much of a bottleneck it
> actually is.
> I was also slightly hesitant about creating a Flink ticket, as the improvement
> could well be made in Netty or even the JDK. But I believe we should have a
> record of the issue in Flink Jira.
> Related: https://bugs.openjdk.org/browse/JDK-8278371
>
> {code:java}
> "Flink Netty Server (6121) Thread 43" #930 daemon prio=5 os_prio=0 cpu=2298176.43ms elapsed=44352.31s allocated=155G defined_classes=0 tid=0x7f0a3397f800 nid=0x519 waiting for monitor entry [0x7efc5d549000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
> 	at sun.security.provider.NativePRNG$RandomIO.implNextBytes(java.base@11.0.18/NativePRNG.java:544)
> 	- waiting to lock <0x7f0b62c2eee8> (a java.lang.Object)
> 	at sun.security.provider.NativePRNG.engineNextBytes(java.base@11.0.18/NativePRNG.java:220)
> 	at java.security.SecureRandom.nextBytes(java.base@11.0.18/SecureRandom.java:751)
> 	at sun.security.ssl.SSLCipher$T11BlockWriteCipherGenerator$BlockWriteCipher.encrypt(java.base@11.0.18/SSLCipher.java:1498)
> 	at sun.security.ssl.OutputRecord.t10Encrypt(java.base@11.0.18/OutputRecord.java:441)
> 	at sun.security.ssl.OutputRecord.encrypt(java.base@11.0.18/OutputRecord.java:345)
> 	at sun.security.ssl.SSLEngineOutputRecord.encode(java.base@11.0.18/SSLEngineOutputRecord.java:287)
> 	at sun.security.ssl.SSLEngineOutputRecord.encode(java.base@11.0.18/SSLEngineOutputRecord.java:189)
> 	at sun.security.ssl.SSLEngineImpl.encode(java.base@11.0.18/SSLEngineImpl.java:285)
> 	at sun.security.ssl.SSLEngineImpl.writeRecord(java.base@11.0.18/SSLEngineImpl.java:231)
> 	at sun.security.ssl.SSLEngineImpl.wrap(java.base@11.0.18/SSLEngineImpl.java:136)
> 	- eliminated <0x7f0b6aab70c8> (a sun.security.ssl.SSLEngineImpl)
> 	at sun.security.ssl.SSLEngineImpl.wrap(java.base@11.0.18/SSLEngineImpl.java:116)
> 	- locked <0x7f0b6aab70c8> (a sun.security.ssl.SSLEngineImpl)
> 	at javax.net.ssl.SSLEngine.wrap(java.base@11.0.18/SSLEngine.java:522)
> 	at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:1071)
> 	at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.wrap(SslHandler.java:843)
> 	at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.wrapAndFlush(SslHandler.java:811)
> 	at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.flush(SslHandler.java:792)
>
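To make the contention concrete, here is a small stand-alone demo (not from the ticket; `NativePRNG` is only available on Unix-like JDKs, and the numbers are indicative only):

{code:java}
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Illustrative micro-benchmark: NativePRNG instances all delegate to one
// process-wide RandomIO singleton, so even per-thread instances contend on the
// same lock (the crux of JDK-8278371). SHA1PRNG instances keep independent
// state, so the per-thread variant shows far less contention.
public class SecureRandomContentionDemo {

    static long run(String algorithm, int threads) throws Exception {
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            workers[i] =
                    new Thread(
                            () -> {
                                try {
                                    // One instance per thread in both cases.
                                    SecureRandom rng = SecureRandom.getInstance(algorithm);
                                    byte[] buf = new byte[32];
                                    for (int n = 0; n < 100_000; n++) {
                                        rng.nextBytes(buf);
                                    }
                                } catch (NoSuchAlgorithmException e) {
                                    throw new RuntimeException(e);
                                }
                            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // NativePRNG: all instances share one lock, so threads serialize.
        System.out.println("NativePRNG: " + run("NativePRNG", 16) + " ms");
        // SHA1PRNG: independent per-instance state, little contention.
        System.out.println("SHA1PRNG:   " + run("SHA1PRNG", 16) + " ms");
    }
}
{code}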
Re: [PR] [hotfix][docs] Update KDA to MSF in vendor solutions docs (1.17) [flink]
dannycranmer merged PR #23486:
URL: https://github.com/apache/flink/pull/23486
Re: [PR] [hotfix][docs] Update KDA to MSF in vendor solutions docs (1.16) [flink]
dannycranmer merged PR #23485:
URL: https://github.com/apache/flink/pull/23485
Re: [PR] [hotfix][docs] Update KDA to MSF in vendor solutions docs (1.18) [flink]
dannycranmer merged PR #23484:
URL: https://github.com/apache/flink/pull/23484
Re: [PR] [hotfix][docs] Update KDA to MSF in vendor solutions docs [flink]
dannycranmer merged PR #23483:
URL: https://github.com/apache/flink/pull/23483
[jira] [Created] (FLINK-33181) Table using `kinesis` connector can not be used for both read & write operations if it's defined with unsupported sink property
Khanh Vu created FLINK-33181:
--------------------------------

             Summary: Table using `kinesis` connector can not be used for both read & write operations if it's defined with unsupported sink property
                 Key: FLINK-33181
                 URL: https://issues.apache.org/jira/browse/FLINK-33181
             Project: Flink
          Issue Type: Bug
          Components: Connectors / Kinesis, Table SQL / Runtime
    Affects Versions: 1.15.4
            Reporter: Khanh Vu

First, I define a table which uses the `kinesis` connector with a property that is unsupported for the sink, e.g. `scan.stream.initpos`:

```
%flink.ssql(type=update)
-- Create input
DROP TABLE IF EXISTS `kds_input`;
CREATE TABLE `kds_input` (
  `some_string` STRING,
  `some_int` BIGINT,
  `time` AS PROCTIME()
) WITH (
  'connector' = 'kinesis',
  'stream' = 'ExampleInputStream',
  'aws.region' = 'us-east-1',
  'scan.stream.initpos' = 'LATEST',
  'format' = 'csv'
);
```

I can read from my table (kds_input) without any issue, but it throws an exception if I try to write to the table:

```
%flink.ssql(type=update)
-- Use to generate data in the input table
DROP TABLE IF EXISTS connector_cve_datagen;
CREATE TABLE connector_cve_datagen(
  `some_string` STRING,
  `some_int` BIGINT
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1',
  'fields.some_string.length' = '2');

INSERT INTO kds_input SELECT some_string, some_int from connector_cve_datagen
```

Exception observed:

```
Caused by: org.apache.flink.table.api.ValidationException: Unsupported options found for 'kinesis'.

Unsupported options:

scan.stream.initpos

Supported options:

aws.region
connector
csv.allow-comments
csv.array-element-delimiter
csv.disable-quote-character
csv.escape-character
csv.field-delimiter
csv.ignore-parse-errors
csv.null-literal
csv.quote-character
format
property-version
sink.batch.max-size
sink.fail-on-error
sink.flush-buffer.size
sink.flush-buffer.timeout
sink.partitioner
sink.partitioner-field-delimiter
sink.producer.collection-max-count (deprecated)
sink.producer.collection-max-size (deprecated)
sink.producer.fail-on-error (deprecated)
sink.producer.record-max-buffered-time (deprecated)
sink.requests.max-buffered
sink.requests.max-inflight
stream
	at org.apache.flink.table.factories.FactoryUtil.validateUnconsumedKeys(FactoryUtil.java:624)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validate(FactoryUtil.java:914)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validate(FactoryUtil.java:978)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validateExcept(FactoryUtil.java:938)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validateExcept(FactoryUtil.java:978)
	at org.apache.flink.connector.kinesis.table.KinesisDynamicTableSinkFactory.createDynamicTableSink(KinesisDynamicTableSinkFactory.java:65)
	at org.apache.flink.table.factories.FactoryUtil.createDynamicTableSink(FactoryUtil.java:259)
	... 36 more
```
[jira] [Updated] (FLINK-33181) Table using `kinesis` connector can not be used for both read & write operations if it's defined with unsupported sink property
[ https://issues.apache.org/jira/browse/FLINK-33181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khanh Vu updated FLINK-33181:
------------------------------
    Description:

First, I define a table which uses the `kinesis` connector with an unsupported property for the sink, e.g. `scan.stream.initpos`:

```
%flink.ssql(type=update)
-- Create input
DROP TABLE IF EXISTS `kds_input`;
CREATE TABLE `kds_input` (
  `some_string` STRING,
  `some_int` BIGINT,
  `time` AS PROCTIME()
) WITH (
  'connector' = 'kinesis',
  'stream' = 'ExampleInputStream',
  'aws.region' = 'us-east-1',
  'scan.stream.initpos' = 'LATEST',
  'format' = 'csv'
);
```

I can read from my table (kds_input) without any issue, but it throws an exception if I try to write to the table:

```
%flink.ssql(type=update)
-- Use to generate data in the input table
DROP TABLE IF EXISTS connector_cve_datagen;
CREATE TABLE connector_cve_datagen(
  `some_string` STRING,
  `some_int` BIGINT
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1',
  'fields.some_string.length' = '2');

INSERT INTO kds_input SELECT some_string, some_int from connector_cve_datagen
```

Exception observed:

```
Caused by: org.apache.flink.table.api.ValidationException: Unsupported options found for 'kinesis'.

Unsupported options:

scan.stream.initpos

Supported options:

aws.region
connector
csv.allow-comments
csv.array-element-delimiter
csv.disable-quote-character
csv.escape-character
csv.field-delimiter
csv.ignore-parse-errors
csv.null-literal
csv.quote-character
format
property-version
sink.batch.max-size
sink.fail-on-error
sink.flush-buffer.size
sink.flush-buffer.timeout
sink.partitioner
sink.partitioner-field-delimiter
sink.producer.collection-max-count (deprecated)
sink.producer.collection-max-size (deprecated)
sink.producer.fail-on-error (deprecated)
sink.producer.record-max-buffered-time (deprecated)
sink.requests.max-buffered
sink.requests.max-inflight
stream
	at org.apache.flink.table.factories.FactoryUtil.validateUnconsumedKeys(FactoryUtil.java:624)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validate(FactoryUtil.java:914)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validate(FactoryUtil.java:978)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validateExcept(FactoryUtil.java:938)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validateExcept(FactoryUtil.java:978)
	at org.apache.flink.connector.kinesis.table.KinesisDynamicTableSinkFactory.createDynamicTableSink(KinesisDynamicTableSinkFactory.java:65)
	at org.apache.flink.table.factories.FactoryUtil.createDynamicTableSink(FactoryUtil.java:259)
	... 36 more
```

was:

First, I define a table which uses the `kinesis` connector with an unsupported property for the sink, e.g. `scan.stream.initpos`:

```
%flink.ssql(type=update)
-- Create input
DROP TABLE IF EXISTS `kds_input`;
CREATE TABLE `kds_input` (
  `some_string` STRING,
  `some_int` BIGINT,
  `time` AS PROCTIME()
) WITH (
  'connector' = 'kinesis',
  'stream' = 'ExampleInputStream',
  'aws.region' = 'us-east-1',
  'scan.stream.initpos' = 'LATEST',
  'format' = 'csv'
);
```

I can read from my table (kds_input) without any issue, but it throws an exception if I try to write to the table:

```
%flink.ssql(type=update)
-- Use to generate data in the input table
DROP TABLE IF EXISTS connector_cve_datagen;
CREATE TABLE connector_cve_datagen(
  `some_string` STRING,
  `some_int` BIGINT
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1',
  'fields.some_string.length' = '2');

INSERT INTO kds_input SELECT some_string, some_int from connector_cve_datagen
```

Exception observed:

```
Caused by: org.apache.flink.table.api.ValidationException: Unsupported options found for 'kinesis'.

Unsupported options:

scan.stream.initpos

Supported options:

aws.region
connector
csv.allow-comments
csv.array-element-delimiter
csv.disable-quote-character
csv.escape-character
csv.field-delimiter
csv.ignore-parse-errors
csv.null-literal
csv.quote-character
format
property-version
sink.batch.max-size
sink.fail-on-error
sink.flush-buffer.size
sink.flush-buffer.timeout
sink.partitioner
sink.partitioner-field-delimiter
sink.producer.collection-max-count (deprecated)
sink.producer.collection-max-size (deprecated)
sink.producer.fail-on-error (deprecated)
sink.producer.record-max-buffered-time (deprecated)
sink.requests.max-buffered
sink.requests.max-inflight
stream
	at org.apache.flink.table.factories.FactoryUtil.validateUnconsumedKeys(FactoryUtil.java:624)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validate(FactoryUtil.java:914)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validate(FactoryUtil.java:978)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validateExcept(FactoryUtil.java:938)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validateExcept(FactoryUtil.java:978)
	at org.apac
Re: [PR] [FLINK-33179] Throw exception when serialising or deserialising ExecNode with invalid type [flink]
dawidwys commented on code in PR #23488:
URL: https://github.com/apache/flink/pull/23488#discussion_r1345519922

## flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeContext.java:
@@ -104,9 +104,16 @@ private ExecNodeContext(@Nullable Integer id, String name, Integer version) {
     @JsonCreator
     public ExecNodeContext(String value) {
         this.id = null;
-        String[] split = value.split("_");
-        this.name = split[0];
-        this.version = Integer.valueOf(split[1]);
+        try {
+            String[] split = value.split("_");
+            this.name = split[0];
+            this.version = Integer.valueOf(split[1]);
+        } catch (Exception e) {
+            throw new TableException(

Review Comment:
   Initially I also thought of just null checking. I went for `try catch` instead of a null check because theoretically there might be a few reasons for an exception:
   1. Error while splitting: we don't have two parts
   2. Error while parsing the Integer: the second part is not a number
   3. Error while parsing the Integer: the part is a null

   I understand 1 & 2 are unlikely atm, but I guess we thought the same about the `null_null` situation.
[jira] [Updated] (FLINK-33181) Table using `kinesis` connector can not be used for both read & write operations if it's defined with unsupported sink property
[ https://issues.apache.org/jira/browse/FLINK-33181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khanh Vu updated FLINK-33181:
------------------------------
    Description:

First, I define a table which uses the `kinesis` connector with an unsupported property for the sink, e.g. `scan.stream.initpos`:

{code:sql}
%flink.ssql(type=update)
-- Create input
DROP TABLE IF EXISTS `kds_input`;
CREATE TABLE `kds_input` (
  `some_string` STRING,
  `some_int` BIGINT,
  `time` AS PROCTIME()
) WITH (
  'connector' = 'kinesis',
  'stream' = 'ExampleInputStream',
  'aws.region' = 'us-east-1',
  'scan.stream.initpos' = 'LATEST',
  'format' = 'csv'
);
{code}

I can read from my table (kds_input) without any issue, but it throws an exception if I try to write to the table:

{code:sql}
%flink.ssql(type=update)
-- Use to generate data in the input table
DROP TABLE IF EXISTS connector_cve_datagen;
CREATE TABLE connector_cve_datagen(
  `some_string` STRING,
  `some_int` BIGINT
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1',
  'fields.some_string.length' = '2');

INSERT INTO kds_input SELECT some_string, some_int from connector_cve_datagen
{code}

Exception observed:

{code:java}
Caused by: org.apache.flink.table.api.ValidationException: Unsupported options found for 'kinesis'.

Unsupported options:

scan.stream.initpos

Supported options:

aws.region
connector
csv.allow-comments
csv.array-element-delimiter
csv.disable-quote-character
csv.escape-character
csv.field-delimiter
csv.ignore-parse-errors
csv.null-literal
csv.quote-character
format
property-version
sink.batch.max-size
sink.fail-on-error
sink.flush-buffer.size
sink.flush-buffer.timeout
sink.partitioner
sink.partitioner-field-delimiter
sink.producer.collection-max-count (deprecated)
sink.producer.collection-max-size (deprecated)
sink.producer.fail-on-error (deprecated)
sink.producer.record-max-buffered-time (deprecated)
sink.requests.max-buffered
sink.requests.max-inflight
stream
	at org.apache.flink.table.factories.FactoryUtil.validateUnconsumedKeys(FactoryUtil.java:624)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validate(FactoryUtil.java:914)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validate(FactoryUtil.java:978)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validateExcept(FactoryUtil.java:938)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validateExcept(FactoryUtil.java:978)
	at org.apache.flink.connector.kinesis.table.KinesisDynamicTableSinkFactory.createDynamicTableSink(KinesisDynamicTableSinkFactory.java:65)
	at org.apache.flink.table.factories.FactoryUtil.createDynamicTableSink(FactoryUtil.java:259)
	... 36 more
{code}

was:

First, I define a table which uses the `kinesis` connector with an unsupported property for the sink, e.g. `scan.stream.initpos`:

```
%flink.ssql(type=update)
-- Create input
DROP TABLE IF EXISTS `kds_input`;
CREATE TABLE `kds_input` (
  `some_string` STRING,
  `some_int` BIGINT,
  `time` AS PROCTIME()
) WITH (
  'connector' = 'kinesis',
  'stream' = 'ExampleInputStream',
  'aws.region' = 'us-east-1',
  'scan.stream.initpos' = 'LATEST',
  'format' = 'csv'
);
```

I can read from my table (kds_input) without any issue, but it throws an exception if I try to write to the table:

```
%flink.ssql(type=update)
-- Use to generate data in the input table
DROP TABLE IF EXISTS connector_cve_datagen;
CREATE TABLE connector_cve_datagen(
  `some_string` STRING,
  `some_int` BIGINT
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1',
  'fields.some_string.length' = '2');

INSERT INTO kds_input SELECT some_string, some_int from connector_cve_datagen
```

Exception observed:

```
Caused by: org.apache.flink.table.api.ValidationException: Unsupported options found for 'kinesis'.

Unsupported options:

scan.stream.initpos

Supported options:

aws.region
connector
csv.allow-comments
csv.array-element-delimiter
csv.disable-quote-character
csv.escape-character
csv.field-delimiter
csv.ignore-parse-errors
csv.null-literal
csv.quote-character
format
property-version
sink.batch.max-size
sink.fail-on-error
sink.flush-buffer.size
sink.flush-buffer.timeout
sink.partitioner
sink.partitioner-field-delimiter
sink.producer.collection-max-count (deprecated)
sink.producer.collection-max-size (deprecated)
sink.producer.fail-on-error (deprecated)
sink.producer.record-max-buffered-time (deprecated)
sink.requests.max-buffered
sink.requests.max-inflight
stream
	at org.apache.flink.table.factories.FactoryUtil.validateUnconsumedKeys(FactoryUtil.java:624)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validate(FactoryUtil.java:914)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validate(FactoryUtil.java:978)
	at org.apache.flink.table.factories.FactoryUtil$FactoryHelper.validateExcept(FactoryUtil.java:938)
	at org.apache.flink.table.factories.FactoryUtil$TableFactoryHelper.validateExcept(FactoryUtil.java:978)
	at org.apache.flink.connector.kinesis.table.KinesisDy
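The stack trace shows the sink factory already uses `validateExcept`; a plausible fix direction (a sketch, not the actual KinesisDynamicTableSinkFactory code) is to also skip source-only `scan.` option prefixes when validating the sink:

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.table.connector.sink.DynamicTableSink;
import org.apache.flink.table.factories.DynamicTableSinkFactory;
import org.apache.flink.table.factories.FactoryUtil;

import java.util.Collections;
import java.util.Set;

// Hypothetical sketch: skip source-only "scan." options during sink-side
// validation so a table declared with scan options stays writable.
public class LenientKinesisSinkFactorySketch implements DynamicTableSinkFactory {

    @Override
    public DynamicTableSink createDynamicTableSink(Context context) {
        FactoryUtil.TableFactoryHelper helper =
                FactoryUtil.createTableFactoryHelper(this, context);
        // "scan." covers options such as scan.stream.initpos that only the source consumes.
        helper.validateExcept("scan.");
        throw new UnsupportedOperationException("sink construction omitted in this sketch");
    }

    @Override
    public String factoryIdentifier() {
        return "kinesis-sketch";
    }

    @Override
    public Set<ConfigOption<?>> requiredOptions() {
        return Collections.emptySet();
    }

    @Override
    public Set<ConfigOption<?>> optionalOptions() {
        return Collections.emptySet();
    }
}
{code}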
[jira] [Commented] (FLINK-30649) Shutting down MiniCluster times out
[ https://issues.apache.org/jira/browse/FLINK-30649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771793#comment-17771793 ]

Matthias Pohl commented on FLINK-30649:
---------------------------------------

Yeah, essentially, it's hard to investigate if it's not reproducible. That's why I was curious whether you have an idea what's going on. It could also be a bug in Minikube that would be resolved by upgrading to a newer version.

> Shutting down MiniCluster times out
> ------------------------------------
>
>                 Key: FLINK-30649
>                 URL: https://issues.apache.org/jira/browse/FLINK-30649
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / Kubernetes, Test Infrastructure
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Priority: Critical
>              Labels: stale-assigned, starter, test-stability
>
> {{Run kubernetes session test (default input)}} failed with a timeout.
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44748&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=b2642e3a-5b86-574d-4c8a-f7e2842bfb14&l=6317
> It appears that there was some issue with shutting down the pods of the
> MiniCluster:
> {code:java}
> 2023-01-12T08:22:13.1388597Z timed out waiting for the condition on pods/flink-native-k8s-session-1-7dc9976688-gq788
> 2023-01-12T08:22:13.1390040Z timed out waiting for the condition on pods/flink-native-k8s-session-1-taskmanager-1-1 {code}
[jira] [Comment Edited] (FLINK-30883) Missing JobID caused the k8s e2e test to fail
[ https://issues.apache.org/jira/browse/FLINK-30883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17683432#comment-17683432 ] Matthias Pohl edited comment on FLINK-30883 at 10/4/23 10:01 AM: - I extracted the available logs into dedicated files. It appears that the jobmanager restarted at least once. The strange thing is that even in {{jobmanager.0.log}} it states that the job was recovered. Source: jobmanager.0.log {code:java} Feb 01 15:03:03 2023-02-01 14:55:28,715 INFO org.apache.flink.client.deployment.application.executors.EmbeddedExecutor [] - Job 68e961ca was recovered successfully. {code} It looks like there was more than one JobManager restart happening with the expected log line "Job \{jobId} is submitted" probably being located in the logs of the missing JobManager run. But I struggle to find evidence for this theory: No additional logs are provided. was (Author: mapohl): I extracted the available logs into dedicated files. It appears that the jobmanager restarted at least once. The strange thing is that even in {{jobmanager.0.log}} it states that the job was recovered. Source: jobmanager.0.log {code} Feb 01 15:03:03 2023-02-01 14:55:28,715 INFO org.apache.flink.client.deployment.application.executors.EmbeddedExecutor [] - Job 68e961ca was recovered successfully. {code} It looks like there was more than one JobManager restart happening with the expected log line "Job {jobId} is submitted" probably being located in the logs of the missing JobManager run. But I struggle to find evidence for this theory: No additional logs are provided. > Missing JobID caused the k8s e2e test to fail > - > > Key: FLINK-30883 > URL: https://issues.apache.org/jira/browse/FLINK-30883 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes, Runtime / Coordination >Affects Versions: 1.17.0 >Reporter: Matthias Pohl >Priority: Major > Labels: auto-deprioritized-critical, test-stability > Attachments: e2e_test_failure.log, > flink-vsts-client-fv-az378-840.log, jobmanager.0.log, jobmanager.1.log, > taskmanager.log > > > We've experienced a test failure in {{Run kubernetes application HA test}} > due to a {{CliArgsException}}: > {code} > Feb 01 15:03:15 org.apache.flink.client.cli.CliArgsException: Missing JobID. > Specify a JobID to cancel a job. > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.cancel(CliFrontend.java:689) > ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1107) > ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189) > ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) > [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189) > [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157) > [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45569&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&s=ae4f8708-9994-57d3-c2d7-b892156e7812&t=b2642e3a-5b86-574d-4c8a-f7e2842bfb14&l=9866 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30883) Missing JobID caused the k8s e2e test to fail
[ https://issues.apache.org/jira/browse/FLINK-30883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771796#comment-17771796 ] Matthias Pohl commented on FLINK-30883: --- Hi [~Fei Feng] , could you provide logs of your scenario? Is it reproducible? > Missing JobID caused the k8s e2e test to fail > - > > Key: FLINK-30883 > URL: https://issues.apache.org/jira/browse/FLINK-30883 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes, Runtime / Coordination >Affects Versions: 1.17.0 >Reporter: Matthias Pohl >Priority: Major > Labels: auto-deprioritized-critical, test-stability > Attachments: e2e_test_failure.log, > flink-vsts-client-fv-az378-840.log, jobmanager.0.log, jobmanager.1.log, > taskmanager.log > > > We've experienced a test failure in {{Run kubernetes application HA test}} > due to a {{CliArgsException}}: > {code} > Feb 01 15:03:15 org.apache.flink.client.cli.CliArgsException: Missing JobID. > Specify a JobID to cancel a job. > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.cancel(CliFrontend.java:689) > ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1107) > ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.lambda$mainInternal$9(CliFrontend.java:1189) > ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) > [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.mainInternal(CliFrontend.java:1189) > [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > Feb 01 15:03:15 at > org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1157) > [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT] > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45569&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&s=ae4f8708-9994-57d3-c2d7-b892156e7812&t=b2642e3a-5b86-574d-4c8a-f7e2842bfb14&l=9866 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33000) SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using a ThreadFactory
[ https://issues.apache.org/jira/browse/FLINK-33000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771803#comment-17771803 ] Matthias Pohl commented on FLINK-33000: --- [~fsk119] are you taking care of the backports as well? > SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using > a ThreadFactory > - > > Key: FLINK-33000 > URL: https://issues.apache.org/jira/browse/FLINK-33000 > Project: Flink > Issue Type: Bug > Components: Table SQL / Gateway, Tests >Affects Versions: 1.16.2, 1.18.0, 1.17.1, 1.19.0 >Reporter: Matthias Pohl >Assignee: Jiabao Sun >Priority: Major > Labels: pull-request-available, starter > > {{SqlGatewayServiceITCase}} uses a {{ExecutorThreadFactory}} for its > asynchronous operations. Instead, one should use {{TestExecutorExtension}} to > ensure proper cleanup of threads. > We might also want to remove the {{AbstractTestBase}} parent class because > that uses JUnit4 whereas {{SqlGatewayServiceITCase}} is already based on > JUnit5 -- This message was sent by Atlassian Jira (v8.20.10#820010)
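A sketch of the JUnit5 pattern the ticket asks for, based on how TestExecutorExtension is commonly used elsewhere in the Flink code base; verify the package and constructor against your checkout, and the test body is illustrative:

{code:java}
import org.apache.flink.testutils.executor.TestExecutorExtension;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import static org.assertj.core.api.Assertions.assertThat;

// Sketch: the extension owns the executor lifecycle and shuts it down after
// all tests, replacing a hand-rolled ExecutorThreadFactory.
class SqlGatewayServiceStyleTest {

    @RegisterExtension
    static final TestExecutorExtension<ExecutorService> EXECUTOR_EXTENSION =
            new TestExecutorExtension<>(Executors::newCachedThreadPool);

    @Test
    void runsAsyncWorkOnManagedExecutor() throws Exception {
        Future<Integer> result = EXECUTOR_EXTENSION.getExecutor().submit(() -> 21 + 21);
        assertThat(result.get()).isEqualTo(42);
    }
}
{code}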
Re: [PR] [FLINK-20625][pubsub,e2e] Add PubSubSource connector using FLIP-27 [flink-connector-gcp-pubsub]
XComp commented on PR #2:
URL: https://github.com/apache/flink-connector-gcp-pubsub/pull/2#issuecomment-1746591605

@dchristle how about you? ;-)
Re: [PR] [FLINK-33179] Throw exception when serialising or deserialising ExecNode with invalid type [flink]
twalthr commented on code in PR #23488:
URL: https://github.com/apache/flink/pull/23488#discussion_r1345578855

## flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeContext.java:
@@ -104,9 +104,16 @@ private ExecNodeContext(@Nullable Integer id, String name, Integer version) {
     @JsonCreator
     public ExecNodeContext(String value) {
         this.id = null;
-        String[] split = value.split("_");
-        this.name = split[0];
-        this.version = Integer.valueOf(split[1]);
+        try {
+            String[] split = value.split("_");
+            this.name = split[0];
+            this.version = Integer.valueOf(split[1]);
+        } catch (Exception e) {
+            throw new TableException(

Review Comment:
   The null_null situation was a bug in the serialization method. During JSON deserialization, so many things can go wrong if users modify the JSON plan themselves. I vote for only implementing what we need for this particular use case, and also adding a comment there on why we need to support plans older than 1.18.1.
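A sketch of the narrower check twalthr argues for — explicit validation of the two failures we actually expect instead of a catch-all; names follow the quoted hunk, and the error text is illustrative:

```java
// Hypothetical sketch: validate the "name_version" type string explicitly and
// report the known "null_null" case as an unsupported node instead of a
// generic "this is a bug, please file a ticket" message.
final class ExecNodeTypeStringSketch {

    static void parse(String value) {
        String[] split = value.split("_");
        if (split.length != 2 || "null".equals(split[0])) {
            throw new IllegalArgumentException(
                    "Unsupported exec node type '" + value + "': the plan contains a node "
                            + "that was serialized without a proper type.");
        }
        String name = split[0];
        int version = Integer.parseInt(split[1]); // still fails loudly on a non-numeric version
        System.out.println("name=" + name + ", version=" + version);
    }

    public static void main(String[] args) {
        parse("stream-exec-calc_1"); // ok
        parse("null_null");          // throws: unsupported node
    }
}
```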
[jira] [Updated] (FLINK-33116) CliClientTest.testCancelExecutionInteractiveMode fails with NPE on AZP
[ https://issues.apache.org/jira/browse/FLINK-33116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Pohl updated FLINK-33116:
----------------------------------
    Description:

This build https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53309&view=logs&j=ce3801ad-3bd5-5f06-d165-34d37e757d90&t=5e4d9387-1dcc-5885-a901-90469b7e6d2f&l=12264 fails as

{noformat}
Sep 18 02:26:15 02:26:15.743 [ERROR] org.apache.flink.table.client.cli.CliClientTest.testCancelExecutionInteractiveMode  Time elapsed: 0.1 s  <<< ERROR!
Sep 18 02:26:15 java.lang.NullPointerException
Sep 18 02:26:15 	at org.apache.flink.table.client.cli.CliClient.closeTerminal(CliClient.java:284)
Sep 18 02:26:15 	at org.apache.flink.table.client.cli.CliClient.close(CliClient.java:108)
Sep 18 02:26:15 	at org.apache.flink.table.client.cli.CliClientTest.testCancelExecutionInteractiveMode(CliClientTest.java:314)
Sep 18 02:26:15 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{noformat}

was:

This build https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53309&view=logs&j=ce3801ad-3bd5-5f06-d165-34d37e757d90&t=5e4d9387-1dcc-5885-a901-90469b7e6d2f&l=12264 fails as

{noformat}
Sep 18 02:26:15 	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
Sep 18 02:26:15 	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
Sep 18 02:26:15 	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
Sep 18 02:26:15 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
Sep 18 02:26:15 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
Sep 18 02:26:15 	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
...
{noformat}

> CliClientTest.testCancelExecutionInteractiveMode fails with NPE on AZP
> -----------------------------------------------------------------------
>
>                 Key: FLINK-33116
>                 URL: https://issues.apache.org/jira/browse/FLINK-33116
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Client
>    Affects Versions: 1.19.0
>            Reporter: Sergey Nuyanzin
>            Priority: Critical
>              Labels: stale-critical, test-stability
>
> This build
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53309&view=logs&j=ce3801ad-3bd5-5f06-d165-34d37e757d90&t=5e4d9387-1dcc-5885-a901-90469b7e6d2f&l=12264
> fails as
> {noformat}
> Sep 18 02:26:15 02:26:15.743 [ERROR] org.apache.flink.table.client.cli.CliClientTest.testCancelExecutionInteractiveMode  Time elapsed: 0.1 s  <<< ERROR!
> Sep 18 02:26:15 java.lang.NullPointerException
> Sep 18 02:26:15 	at org.apache.flink.table.client.cli.CliClient.closeTerminal(CliClient.java:284)
> Sep 18 02:26:15 	at org.apache.flink.table.client.cli.CliClient.close(CliClient.java:108)
> Sep 18 02:26:15 	at org.apache.flink.table.client.cli.CliClientTest.testCancelExecutionInteractiveMode(CliClientTest.java:314)
> Sep 18 02:26:15 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {noformat}
[jira] [Commented] (FLINK-33116) CliClientTest.testCancelExecutionInteractiveMode fails with NPE on AZP
[ https://issues.apache.org/jira/browse/FLINK-33116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771813#comment-17771813 ] Matthias Pohl commented on FLINK-33116: --- Looks like the {{CliClient}} is not thread-safe: The close call fails with a {{NullPointerException}} because {{CliClient.close()}} calls {{closeTerminal()}} if a terminal is set. But the test finishes the {{client.executeInInteractiveMode()}} in a separate thread, which calls {{closeTerminal()}} in the end as well. [~fsk119] can you delegate this issue? > CliClientTest.testCancelExecutionInteractiveMode fails with NPE on AZP > -- > > Key: FLINK-33116 > URL: https://issues.apache.org/jira/browse/FLINK-33116 > Project: Flink > Issue Type: Bug > Components: Table SQL / Client >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: stale-critical, test-stability > > This build > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53309&view=logs&j=ce3801ad-3bd5-5f06-d165-34d37e757d90&t=5e4d9387-1dcc-5885-a901-90469b7e6d2f&l=12264] > fails as > {noformat} > Sep 18 02:26:15 02:26:15.743 [ERROR] > org.apache.flink.table.client.cli.CliClientTest.testCancelExecutionInteractiveMode > Time elapsed: 0.1 s <<< ERROR! > Sep 18 02:26:15 java.lang.NullPointerException > Sep 18 02:26:15 at > org.apache.flink.table.client.cli.CliClient.closeTerminal(CliClient.java:284) > Sep 18 02:26:15 at > org.apache.flink.table.client.cli.CliClient.close(CliClient.java:108) > Sep 18 02:26:15 at > org.apache.flink.table.client.cli.CliClientTest.testCancelExecutionInteractiveMode(CliClientTest.java:314) > Sep 18 02:26:15 at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
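For illustration, one way to make the competing `closeTerminal()` calls safe is to hand the terminal over atomically so that only one thread actually closes it. The sketch below is self-contained and hypothetical, not the fix that was eventually applied: the `Terminal` interface is stubbed (the SQL client uses `org.jline.terminal.Terminal`).

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: whichever thread wins the getAndSet(null) closes the
// terminal; the losing call becomes a no-op instead of dereferencing null.
public final class IdempotentTerminalClose {

    /** Stub standing in for org.jline.terminal.Terminal. */
    interface Terminal extends AutoCloseable {
        @Override
        void close() throws IOException;
    }

    private final AtomicReference<Terminal> terminal = new AtomicReference<>();

    void openTerminal(Terminal t) {
        terminal.set(t);
    }

    // Safe to call from both the test thread and the interactive-mode thread.
    void closeTerminal() {
        Terminal t = terminal.getAndSet(null);
        if (t != null) {
            try {
                t.close();
            } catch (IOException e) {
                // best effort: a failure while closing the terminal is ignored here
            }
        }
    }
}
```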
Re: [PR] [FLINK-32108][test] KubernetesExtension calls assumeThat in @BeforeAll callback which doesn't print the actual failure message [flink]
XComp commented on PR #23472: URL: https://github.com/apache/flink/pull/23472#issuecomment-1746649051 How is this different to the solution before? :thinking: -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-32582) Move TypeSerializerUpgradeTestBase from Kafka connector into flink-connector-common
[ https://issues.apache.org/jira/browse/FLINK-32582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl updated FLINK-32582: -- Description: The externalization of connectors caused problems with Flink's test data generation. The Kafka connector relied on TypeSerializerUpgradeTestBase for some test cases which was fine prior to FLINK-27518 where the test data generation was handled individually. With FLINK-27518 the process was automated in Flink 1.18. For now, the TypeSerializerUpgradeTestBase class was just copied over into the Kafka connector since it was the only connector that would utilize this test base. But we might want to provide a more generalized solution where the test base is provided by {{flink-connector-common}} to offer a generalized approach for any connector. was: The externalization of connectors made caused problems with the Flink's test data generation. The Kafka connector relied on TypeSerializerUpgradeTestBase for some test cases which was fine prior to FLINK-27518 where the test data generation was handled individually. With FLINK-27518 the process was automated in Flink 1.18. For now, the TypeSerializerUpgradeTestBase class was just copied over into the Kafka connector since it was the only connector that would utilize this test base. But we might want to provide a more generalized solution where the test base is provided by {{flink-connector-common}} to offer a generalized approach for any connector. > Move TypeSerializerUpgradeTestBase from Kafka connector into > flink-connector-common > --- > > Key: FLINK-32582 > URL: https://issues.apache.org/jira/browse/FLINK-32582 > Project: Flink > Issue Type: Improvement > Components: Connectors / Common >Reporter: Matthias Pohl >Priority: Minor > Labels: auto-deprioritized-major, pull-request-available > > The externalization of connectors caused problems with Flink's test data > generation. The Kafka connector relied on TypeSerializerUpgradeTestBase for > some test cases which was fine prior to FLINK-27518 where the test data > generation was handled individually. > With FLINK-27518 the process was automated in Flink 1.18. For now, the > TypeSerializerUpgradeTestBase class was just copied over into the Kafka > connector since it was the only connector that would utilize this test base. > But we might want to provide a more generalized solution where the test base > is provided by {{flink-connector-common}} to offer a generalized approach for > any connector. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33176][Connectors/Kinesis] Handle null value in RowDataKinesisDeserializationSchema [flink-connector-aws]
dannycranmer commented on code in PR #102: URL: https://github.com/apache/flink-connector-aws/pull/102#discussion_r1345737928 ## flink-connector-aws/flink-connector-kinesis/src/test/java/org/apache/flink/streaming/connectors/kinesis/table/RowDataKinesisDeserializationSchemaTest.java: ## @@ -0,0 +1,90 @@ +package org.apache.flink.streaming.connectors.kinesis.table; Review Comment: Missing copyright header -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32445][runtime] Refactor BlobStoreService's closeAndCleanupAllData to cleanupAllData [flink]
Jiabao-Sun commented on code in PR #23424: URL: https://github.com/apache/flink/pull/23424#discussion_r1345742478 ## flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/AbstractHaServices.java: ## @@ -192,32 +192,18 @@ public void closeAndCleanupAllData() throws Exception { try { internalCleanup(); deletedHAData = true; +blobStoreService.cleanupAllData(); } catch (Exception t) { exception = t; } -try { -if (leaderElectionService != null) { -leaderElectionService.close(); -} -} catch (Throwable t) { -exception = ExceptionUtils.firstOrSuppressed(t, exception); +if (!deletedHAData) { +logger.info( +"Cannot delete HA blobs because we failed to delete the pointers in the HA store."); } try { -internalClose(); -} catch (Throwable t) { -exception = ExceptionUtils.firstOrSuppressed(t, exception); -} - -try { -if (deletedHAData) { -blobStoreService.closeAndCleanupAllData(); -} else { -logger.info( -"Cannot delete HA blobs because we failed to delete the pointers in the HA store."); -blobStoreService.close(); -} +close(); Review Comment: Thanks @XComp. It makes sense to me. Could you help review it again when you have time? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33176][Connectors/Kinesis] Handle null value in RowDataKinesisDeserializationSchema [flink-connector-aws]
z3d1k commented on code in PR #102: URL: https://github.com/apache/flink-connector-aws/pull/102#discussion_r1345798939 ## flink-connector-aws/flink-connector-kinesis/src/test/java/org/apache/flink/streaming/connectors/kinesis/table/RowDataKinesisDeserializationSchemaTest.java: ## @@ -0,0 +1,90 @@ +package org.apache.flink.streaming.connectors.kinesis.table; Review Comment: Added missing header -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33149] Bump snappy to 1.1.10.4 [flink-statefun]
davidradl commented on code in PR #341: URL: https://github.com/apache/flink-statefun/pull/341#discussion_r1345814020 ## statefun-kafka-io/pom.xml: ## @@ -43,9 +43,7 @@ under the License. ${kafka.version} +org.apache.flink:flink-streaming-java --> Review Comment: It would be good to leave a comment here. As far as I can see, Flink 1.16.2 has snappy-java 1.1.8.3, which is vulnerable - so you want to exclude it here. But Flink 1.17 and above uses snappy-java 1.1.10.4. So this is a point-in-time change, because of your dependency on the back-level Flink. I assume we would want to move to a provided dependency when we depend on Flink 1.17 or above. Have I understood this correctly? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Bump org.xerial.snappy:snappy-java from 1.1.10.1 to 1.1.10.4 in /statefun-flink [flink-statefun]
davidradl commented on PR #340: URL: https://github.com/apache/flink-statefun/pull/340#issuecomment-1746899156 See comment in https://github.com/apache/flink-statefun/pull/341 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Max Retries [flink-statefun]
davidradl commented on PR #337: URL: https://github.com/apache/flink-statefun/pull/337#issuecomment-1746901840 Is there a Jira associated with this PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-31049] [flink-connector-kafka]Add support for Kafka record headers to KafkaSink [flink]
davidradl commented on PR #8: URL: https://github.com/apache/flink/pull/8#issuecomment-1746926177 @tzulitai @MartijnVisser I see https://github.com/apache/flink/pull/22797; this PR is no longer relevant to this repository, so I suggest closing it and porting the change to the [new Flink Kafka connector repo](https://github.com/apache/flink-connector-kafka) if it is still relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-28185][Connector/Kafka] Handle missing timestamps when there a… [flink]
davidradl commented on PR #22416: URL: https://github.com/apache/flink/pull/22416#issuecomment-1746924764 @junjiem see https://github.com/apache/flink/pull/22797; this PR is no longer relevant to this repository, so I suggest closing it and porting the change to the [new Flink Kafka connector repo](https://github.com/apache/flink-connector-kafka) if it is still relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-31262][kafka] Move kafka sql connector fat jar test to SmokeKafkaITCase [flink]
davidradl commented on PR #22060: URL: https://github.com/apache/flink/pull/22060#issuecomment-1746927104 I see https://github.com/apache/flink/pull/22797; this PR is no longer relevant to this repository, so I suggest closing it and porting the change to the [new Flink Kafka connector repo](https://github.com/apache/flink-connector-kafka) if it is still relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-31006][connector/kafka] Fix noMoreNewPartitionSplits is not se… [flink]
davidradl commented on PR #21909: URL: https://github.com/apache/flink/pull/21909#issuecomment-1746928308 @liuyongvs I see https://github.com/apache/flink/pull/22797; this PR is no longer relevant to this repository, so I suggest closing it and porting the change to the [new Flink Kafka connector repo](https://github.com/apache/flink-connector-kafka) if it is still relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-29492] Return Kafka producer to the pool when the Kafka sink is not the end of the chain [flink]
davidradl commented on PR #21226: URL: https://github.com/apache/flink/pull/21226#issuecomment-1746929409 @ruanhang1993 I see https://github.com/apache/flink/pull/22797; this PR is no longer relevant to this repository, so I suggest closing it and porting the change to the [new Flink Kafka connector repo](https://github.com/apache/flink-connector-kafka) if it is still relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-29492) Kafka exactly-once sink causes OutOfMemoryError
[ https://issues.apache.org/jira/browse/FLINK-29492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-29492: --- Labels: pull-request-available (was: ) > Kafka exactly-once sink causes OutOfMemoryError > --- > > Key: FLINK-29492 > URL: https://issues.apache.org/jira/browse/FLINK-29492 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.15.3 >Reporter: Robert Metzger >Assignee: Hang Ruan >Priority: Critical > Labels: pull-request-available > > My Kafka exactly-once sinks are periodically failing with a > {{OutOfMemoryError: Java heap space}}. > This looks very similar to FLINK-28250. But I am running 1.15.2, which > contains a fix for FLINK-28250. > Exception: > {code:java} > java.io.IOException: Could not perform checkpoint 2281 for operator > http_events[3]: Writer (1/1)#1. > at > org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1210) > at > org.apache.flink.streaming.runtime.io.checkpointing.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:147) > at > org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.triggerCheckpoint(SingleCheckpointBarrierHandler.java:287) > at > org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.access$100(SingleCheckpointBarrierHandler.java:64) > at > org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler$ControllerImpl.triggerGlobalCheckpoint(SingleCheckpointBarrierHandler.java:493) > at > org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.triggerGlobalCheckpoint(AbstractAlignedBarrierHandlerState.java:74) > at > org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.barrierReceived(AbstractAlignedBarrierHandlerState.java:66) > at > org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.lambda$processBarrier$2(SingleCheckpointBarrierHandler.java:234) > at > org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.markCheckpointAlignedAndTransformState(SingleCheckpointBarrierHandler.java:262) > at > org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:231) > at > org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:181) > at > org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:159) > at > org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:519) > at > org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:203) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753) > at > org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948) > at > org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) > at 
java.base/java.lang.Thread.run(Unknown Source) > Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Could not > complete snapshot 2281 for operator http_events[3]: Writer (1/1)#1. Failure > reason: Checkpoint was declined. > at > org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:269) > at > org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:173) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:348) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.checkpointStreamOperator(RegularOperatorChain.java:227) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.buildOperatorSnapshotFutures(RegularOperatorChain.java:212) > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.snapshotState(RegularOperatorChain.java:192) > at > org.apache.flink.streaming.runtime.tasks.Sub
Re: [PR] [FLINK-25538][flink-connector-kafka] JUnit5 Migration [flink]
davidradl commented on PR #20991: URL: https://github.com/apache/flink/pull/20991#issuecomment-1746931693 @snuyanzin I see https://github.com/apache/flink/pull/22797; this PR is no longer relevant to this repository, so I suggest closing it and porting the change to the [new Flink Kafka connector repo](https://github.com/apache/flink-connector-kafka) if it is still relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-13226] [connectors / kafka] Fix race condition between transaction commit and produc… [flink]
davidradl commented on PR #9224: URL: https://github.com/apache/flink/pull/9224#issuecomment-1746934218 @pnowojski I see https://github.com/apache/flink/pull/22797; this PR is no longer relevant to this repository, so I suggest closing it and porting the change to the [new Flink Kafka connector repo](https://github.com/apache/flink-connector-kafka) if it is still relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-26614]Upsert Kafka SQL Connector:Support startup mode of timestamp and specific offsets [flink]
davidradl commented on PR #19067: URL: https://github.com/apache/flink/pull/19067#issuecomment-1746937641 @slinkydeveloper I see https://github.com/apache/flink/pull/22797; this PR is no longer relevant to this repository, so I suggest closing it and porting the change to the [new Flink Kafka connector repo](https://github.com/apache/flink-connector-kafka) if it is still relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-31049] [flink-connector-kafka]Add support for Kafka record headers to KafkaSink [flink]
AlexAxeman commented on PR #8: URL: https://github.com/apache/flink/pull/8#issuecomment-1746945765 It was implemented in the new repo in [this PR](https://github.com/apache/flink-connector-kafka/pull/18). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-31049] [flink-connector-kafka]Add support for Kafka record headers to KafkaSink [flink]
AlexAxeman closed pull request #8: [FLINK-31049] [flink-connector-kafka]Add support for Kafka record headers to KafkaSink URL: https://github.com/apache/flink/pull/8 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-13226] [connectors / kafka] Fix race condition between transaction commit and produc… [flink]
flinkbot commented on PR #9224: URL: https://github.com/apache/flink/pull/9224#issuecomment-1746946522 ## CI report: * 838d369f5c230c4debd3eac7a28bd2bc75711a01 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-33182) Allow metadata columns in NduAnalyzer with ChangelogNormalize
[ https://issues.apache.org/jira/browse/FLINK-33182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771868#comment-17771868 ] Timo Walther commented on FLINK-33182: -- [~lincoln.86xy] do you agree with this assumption? > Allow metadata columns in NduAnalyzer with ChangelogNormalize > - > > Key: FLINK-33182 > URL: https://issues.apache.org/jira/browse/FLINK-33182 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Timo Walther >Priority: Major > > Currently, the NduAnalyzer is very strict about metadata columns in updating > sources. However, for upsert sources (like Kafka) that contain an incomplete > changelog, the planner always adds a ChangelogNormalize node. > ChangelogNormalize will make sure that metadata columns can be considered > deterministic. So the NduAnalyzer should be satisfied in this case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33182) Allow metadata columns in NduAnalyzer with ChangelogNormalize
Timo Walther created FLINK-33182: Summary: Allow metadata columns in NduAnalyzer with ChangelogNormalize Key: FLINK-33182 URL: https://issues.apache.org/jira/browse/FLINK-33182 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Reporter: Timo Walther Currently, the NduAnalyzer is very strict about metadata columns in updating sources. However, for upsert sources (like Kafka) that contain an incomplete changelog, the planner always adds a ChangelogNormalize node. ChangelogNormalize will make sure that metadata columns can be considered deterministic. So the NduAnalyzer should be satisfied in this case. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-32445][runtime] Refactor BlobStoreService's closeAndCleanupAllData to cleanupAllData [flink]
XComp commented on code in PR #23424: URL: https://github.com/apache/flink/pull/23424#discussion_r1345865694 ## flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/ClusterEntrypoint.java: ## @@ -498,12 +498,15 @@ protected CompletableFuture stopClusterServices(boolean cleanupHaData) { } if (haServices != null) { -try { -if (cleanupHaData) { -haServices.closeAndCleanupAllData(); -} else { -haServices.close(); +if (cleanupHaData) { Review Comment: Initially, I thought that it would be nicer code afterwards by just removing the if/else. But you're right about having separate try/catch blocks as well (which, unfortunately, makes the code less readable). We could introduce a default implementation {{HighAvailabilityServices#closeWithOptionalClean(boolean)}} that does this optional cleanup while closing the services. That would remove the redundant code from {{MiniCluster}} and {{ClusterEntrypoint}}. ## flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/HighAvailabilityServices.java: ## @@ -212,19 +212,17 @@ default LeaderRetrievalService getClusterRestEndpointLeaderRetriever() { void close() throws Exception; /** - * Closes the high availability services (releasing all resources) and deletes all data stored - * by these services in external stores. + * Deletes all data stored by high availability services in external stores. * * After this method was called, the any job or session that was managed by these high Review Comment: ```suggestion * After this method was called, any job or session that was managed by these high ``` nit ## flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/HighAvailabilityServices.java: ## @@ -212,19 +212,17 @@ default LeaderRetrievalService getClusterRestEndpointLeaderRetriever() { void close() throws Exception; Review Comment: Keep the method to keep the JavaDoc. But we could make {{HighAvailabilityServices}} extend {{AutoCloseable}}. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
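For illustration, one possible shape of the proposed default method, assuming the interface extends `AutoCloseable` as suggested above. This is a sketch, not the merged code: the real implementation would likely use Flink's `ExceptionUtils.firstOrSuppressed`, which is inlined here so the snippet stands alone.

```java
// Hypothetical sketch of the proposed HighAvailabilityServices refactoring:
// cleanupAllData() runs first (if requested), close() always runs, and all
// failures are reported together after both steps were attempted.
interface HighAvailabilityServicesSketch extends AutoCloseable {

    void cleanupAllData() throws Exception;

    @Override
    void close() throws Exception;

    default void closeWithOptionalClean(boolean cleanupData) throws Exception {
        Exception exception = null;
        if (cleanupData) {
            try {
                cleanupAllData();
            } catch (Exception e) {
                exception = e;
            }
        }
        try {
            close();
        } catch (Exception e) {
            if (exception == null) {
                exception = e;
            } else {
                exception.addSuppressed(e);
            }
        }
        if (exception != null) {
            throw exception;
        }
    }
}
```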
[jira] [Created] (FLINK-33183) Enable metadata columns in NduAnalyzer with retract if non-virtual
Timo Walther created FLINK-33183: Summary: Enable metadata columns in NduAnalyzer with retract if non-virtual Key: FLINK-33183 URL: https://issues.apache.org/jira/browse/FLINK-33183 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Reporter: Timo Walther Currently, the NduAnalyzer is very strict about metadata columns in updating sources. Compared to append and upsert sources (see also FLINK-33182), retract sources are tricky. And the analyzer is actually correct. However, for retract sources we should expose more functionality to the user and add a warning to the documentation that retract mode could potentially cause NDU problems if not enough attention is paid. We should only throw an error on virtual metadata columns. Persisted metadata columns can be considered “safe“. When a metadata column is persisted, we can assume that an upstream Flink job fills its content thus likely also fills its correct retraction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33183) Enable metadata columns in NduAnalyzer with retract if non-virtual
[ https://issues.apache.org/jira/browse/FLINK-33183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771876#comment-17771876 ] Timo Walther commented on FLINK-33183: -- [~lincoln.86xy] what is your opinion here? This might be more controversial than FLINK-33182. But I think that whether issues can arise actually depends on VIRTUAL vs non-VIRTUAL. > Enable metadata columns in NduAnalyzer with retract if non-virtual > -- > > Key: FLINK-33183 > URL: https://issues.apache.org/jira/browse/FLINK-33183 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Timo Walther >Priority: Major > > Currently, the NduAnalyzer is very strict about metadata columns in updating > sources. Compared to append and upsert sources (see also FLINK-33182), > retract sources are tricky. And the analyzer is actually correct. > However, for retract sources we should expose more functionality to the user > and add a warning to the documentation that retract mode could potentially > cause NDU problems if not enough attention is paid. We should only throw an > error on virtual metadata columns. Persisted metadata columns can be > considered “safe“. When a metadata column is persisted, we can assume that an > upstream Flink job fills its content thus likely also fills its correct > retraction. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-30831) Improving e2e test output in case errors/exceptions are found
[ https://issues.apache.org/jira/browse/FLINK-30831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias Pohl reassigned FLINK-30831: - Assignee: Flaviu Cicio > Improving e2e test output in case errors/exceptions are found > - > > Key: FLINK-30831 > URL: https://issues.apache.org/jira/browse/FLINK-30831 > Project: Flink > Issue Type: Improvement > Components: Test Infrastructure >Affects Versions: 1.17.0, 1.15.3, 1.16.1 >Reporter: Matthias Pohl >Assignee: Flaviu Cicio >Priority: Minor > Labels: auto-deprioritized-major, starter > > Some e2e tests parse the Flink logs for exceptions using {{grep}}. We then > print the first 500 lines of each log file in case an exception that > shouldn't be ignored is found (see > [internal_check_logs_for_exceptions|https://github.com/apache/flink/blob/c9e87fe410c42f7e7c19c81456d4212a58564f5e/flink-end-to-end-tests/test-scripts/common.sh#L449] > or > [check_logs_for_errors|https://github.com/apache/flink/blob/c9e87fe410c42f7e7c19c81456d4212a58564f5e/flink-end-to-end-tests/test-scripts/common.sh#L387]). > Instead, we could use {{grep -C200}} to actually print the context of the > exception. > This would help especially in those situations where the exception doesn't > appear in the first 500 lines. > This issue does not necessarily only include the two aforementioned code > locations. One should check the scripts for other code with a similar issue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-30831) Improving e2e test output in case errors/exceptions are found
[ https://issues.apache.org/jira/browse/FLINK-30831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771882#comment-17771882 ] Matthias Pohl commented on FLINK-30831: --- Thanks for volunteering. I assigned the issue to you. > Improving e2e test output in case errors/exceptions are found > - > > Key: FLINK-30831 > URL: https://issues.apache.org/jira/browse/FLINK-30831 > Project: Flink > Issue Type: Improvement > Components: Test Infrastructure >Affects Versions: 1.17.0, 1.15.3, 1.16.1 >Reporter: Matthias Pohl >Assignee: Flaviu Cicio >Priority: Minor > Labels: auto-deprioritized-major, starter > > Some e2e tests parse the Flink logs for exceptions using {{grep}}. We then > print the first 500 lines of each log file in case an exception that > shouldn't be ignored is found (see > [internal_check_logs_for_exceptions|https://github.com/apache/flink/blob/c9e87fe410c42f7e7c19c81456d4212a58564f5e/flink-end-to-end-tests/test-scripts/common.sh#L449] > or > [check_logs_for_errors|https://github.com/apache/flink/blob/c9e87fe410c42f7e7c19c81456d4212a58564f5e/flink-end-to-end-tests/test-scripts/common.sh#L387]). > Instead, we could use {{grep -C200}} to actually print the context of the > exception. > This would help especially in those situations where the exception doesn't > appear in the first 500 lines. > This issue does not necessarily only include the two aforementioned code > locations. One should check the scripts for other code with a similar issue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-32380) Support Java records
[ https://issues.apache.org/jira/browse/FLINK-32380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora reassigned FLINK-32380: -- Assignee: Gyula Fora > Support Java records > > > Key: FLINK-32380 > URL: https://issues.apache.org/jira/browse/FLINK-32380 > Project: Flink > Issue Type: Improvement > Components: API / Type Serialization System >Reporter: Chesnay Schepler >Assignee: Gyula Fora >Priority: Major > > Reportedly Java records are not supported, because they are neither detected > by our Pojo serializer nor supported by Kryo 2.x -- This message was sent by Atlassian Jira (v8.20.10#820010)
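For context, a minimal example of the kind of type this issue is about. A record has no no-arg constructor, its fields are final, and its accessors are `id()`/`name()` rather than getters/setters, so it fails Flink's POJO rules, and type extraction falls back to generic serialization, which (per the issue) Kryo 2.x cannot handle either.

```java
// Minimal example of a type affected by FLINK-32380: valid Java, but not a POJO
// by Flink's pre-1.18 rules, so it falls through to the generic (Kryo) path.
public record Event(long id, String name) {}
```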
Re: [PR] [FLINK-32445][runtime] Refactor BlobStoreService's closeAndCleanupAllData to cleanupAllData [flink]
Jiabao-Sun commented on code in PR #23424: URL: https://github.com/apache/flink/pull/23424#discussion_r1346010506 ## flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/ClusterEntrypoint.java: ## @@ -498,12 +498,15 @@ protected CompletableFuture stopClusterServices(boolean cleanupHaData) { } if (haServices != null) { -try { -if (cleanupHaData) { -haServices.closeAndCleanupAllData(); -} else { -haServices.close(); +if (cleanupHaData) { Review Comment: Thanks @XComp. Sounds great to me. Could you help take a look again? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-30859] Remove all Kafka connector code from main repo [flink]
davidradl commented on PR #22797: URL: https://github.com/apache/flink/pull/22797#issuecomment-1747160496 @zentol @tzulitai As the Kafka connector code has now been removed from the repo in the change that has been merged in [#8](https://github.com/apache/flink/commit/149a5e34c1ed8d8943c901a98c65c70693915811), this and the other referenced Kafka-connector-oriented PRs need to be moved to the new Flink Kafka connector repository. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Update filesystem.md spell error : continously to continuously [flink]
XComp commented on PR #23422: URL: https://github.com/apache/flink/pull/23422#issuecomment-1747197573 I see that you provided another PR with similar changes (#23431). Please move all the changes into a single PR. I'm gonna close the other PR without merging it. We should do the typo changes covering the same issue in a single hotfix commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] spell error : continous to continuous [flink]
XComp commented on PR #23431: URL: https://github.com/apache/flink/pull/23431#issuecomment-1747198710 Closing this PR in favor of #23422. Please collect all the typo fixes in there. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] spell error : continous to continuous [flink]
XComp closed pull request #23431: spell error : continous to continuous URL: https://github.com/apache/flink/pull/23431 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-33176][Connectors/Kinesis] Handle null value in RowDataKinesisDeserializationSchema [flink-connector-aws]
dannycranmer merged PR #102: URL: https://github.com/apache/flink-connector-aws/pull/102 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-33176) Kinesis source throws NullPointerException in Table API on ignored parsing errors
[ https://issues.apache.org/jira/browse/FLINK-33176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer reassigned FLINK-33176: - Assignee: Aleksandr Pilipenko > Kinesis source throws NullPointerException in Table API on ignored parsing > errors > - > > Key: FLINK-33176 > URL: https://issues.apache.org/jira/browse/FLINK-33176 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.15.4, aws-connector-4.1.0 >Reporter: Aleksandr Pilipenko >Assignee: Aleksandr Pilipenko >Priority: Major > Labels: pull-request-available > > Using following example table: > > {code:java} > CREATE TABLE source ( > text STRING, > `arrival_time` TIMESTAMP(3) METADATA FROM 'timestamp' VIRTUAL > ) WITH ( > 'connector' = 'kinesis', > 'stream' = 'test', > 'aws.region' = 'us-east-1', > 'json.ignore-parse-errors' = 'true', > 'format' = 'json' > ) {code} > Connector throws NullPointerException when source consumes malformed json > message: > {code:java} > java.lang.NullPointerException > at > org.apache.flink.streaming.connectors.kinesis.table.RowDataKinesisDeserializationSchema.deserialize(RowDataKinesisDeserializationSchema.java:137) > at > org.apache.flink.streaming.connectors.kinesis.table.RowDataKinesisDeserializationSchema.deserialize(RowDataKinesisDeserializationSchema.java:44) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.deserializeRecordForCollectionAndUpdateState(ShardConsumer.java:202) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.lambda$run$0(ShardConsumer.java:126) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.run(PollingRecordPublisher.java:118) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.run(PollingRecordPublisher.java:102) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:114) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at > java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
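For illustration, the essence of the fix is a null guard: with `json.ignore-parse-errors` set, the wrapped format deserializer returns `null` for a malformed record, and the table wrapper has to drop the record instead of dereferencing the result. The sketch below is heavily simplified and hypothetical — the class and method shape are assumptions; only `DeserializationSchema`, `RowData`, and `Collector` are actual Flink APIs.

```java
import java.io.IOException;

import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.table.data.RowData;
import org.apache.flink.util.Collector;

// Hypothetical, stripped-down stand-in for the deserialization wrapper: it only
// shows the null check; the metadata handling of the real class is omitted.
final class NullSafeDeserializeSketch {

    private final DeserializationSchema<RowData> physicalDeserializer;

    NullSafeDeserializeSketch(DeserializationSchema<RowData> physicalDeserializer) {
        this.physicalDeserializer = physicalDeserializer;
    }

    void deserialize(byte[] record, Collector<RowData> out) throws IOException {
        RowData physicalRow = physicalDeserializer.deserialize(record);
        if (physicalRow == null) {
            // The format swallowed a parse error (json.ignore-parse-errors=true);
            // skip the record instead of failing with a NullPointerException.
            return;
        }
        // ... append metadata columns and emit (omitted) ...
        out.collect(physicalRow);
    }
}
```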
[jira] [Resolved] (FLINK-33176) Kinesis source throws NullPointerException in Table API on ignored parsing errors
[ https://issues.apache.org/jira/browse/FLINK-33176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer resolved FLINK-33176. --- Resolution: Fixed > Kinesis source throws NullPointerException in Table API on ignored parsing > errors > - > > Key: FLINK-33176 > URL: https://issues.apache.org/jira/browse/FLINK-33176 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.15.4, aws-connector-4.1.0 >Reporter: Aleksandr Pilipenko >Assignee: Aleksandr Pilipenko >Priority: Major > Labels: pull-request-available > Fix For: aws-connector-4.2.0 > > > Using following example table: > > {code:java} > CREATE TABLE source ( > text STRING, > `arrival_time` TIMESTAMP(3) METADATA FROM 'timestamp' VIRTUAL > ) WITH ( > 'connector' = 'kinesis', > 'stream' = 'test', > 'aws.region' = 'us-east-1', > 'json.ignore-parse-errors' = 'true', > 'format' = 'json' > ) {code} > Connector throws NullPointerException when source consumes malformed json > message: > {code:java} > java.lang.NullPointerException > at > org.apache.flink.streaming.connectors.kinesis.table.RowDataKinesisDeserializationSchema.deserialize(RowDataKinesisDeserializationSchema.java:137) > at > org.apache.flink.streaming.connectors.kinesis.table.RowDataKinesisDeserializationSchema.deserialize(RowDataKinesisDeserializationSchema.java:44) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.deserializeRecordForCollectionAndUpdateState(ShardConsumer.java:202) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.lambda$run$0(ShardConsumer.java:126) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.run(PollingRecordPublisher.java:118) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.run(PollingRecordPublisher.java:102) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:114) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at > java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33176) Kinesis source throws NullPointerException in Table API on ignored parsing errors
[ https://issues.apache.org/jira/browse/FLINK-33176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771905#comment-17771905 ] Danny Cranmer commented on FLINK-33176: --- merged commit [{{5105d48}}|https://github.com/apache/flink-connector-aws/commit/5105d48bf7334503b314d188462c3542f438f1d6] into apache:main > Kinesis source throws NullPointerException in Table API on ignored parsing > errors > - > > Key: FLINK-33176 > URL: https://issues.apache.org/jira/browse/FLINK-33176 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.15.4, aws-connector-4.1.0 >Reporter: Aleksandr Pilipenko >Assignee: Aleksandr Pilipenko >Priority: Major > Labels: pull-request-available > Fix For: aws-connector-4.2.0 > > > Using following example table: > > {code:java} > CREATE TABLE source ( > text STRING, > `arrival_time` TIMESTAMP(3) METADATA FROM 'timestamp' VIRTUAL > ) WITH ( > 'connector' = 'kinesis', > 'stream' = 'test', > 'aws.region' = 'us-east-1', > 'json.ignore-parse-errors' = 'true', > 'format' = 'json' > ) {code} > Connector throws NullPointerException when source consumes malformed json > message: > {code:java} > java.lang.NullPointerException > at > org.apache.flink.streaming.connectors.kinesis.table.RowDataKinesisDeserializationSchema.deserialize(RowDataKinesisDeserializationSchema.java:137) > at > org.apache.flink.streaming.connectors.kinesis.table.RowDataKinesisDeserializationSchema.deserialize(RowDataKinesisDeserializationSchema.java:44) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.deserializeRecordForCollectionAndUpdateState(ShardConsumer.java:202) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.lambda$run$0(ShardConsumer.java:126) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.run(PollingRecordPublisher.java:118) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.run(PollingRecordPublisher.java:102) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:114) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at > java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33176) Kinesis source throws NullPointerException in Table API on ignored parsing errors
[ https://issues.apache.org/jira/browse/FLINK-33176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Cranmer updated FLINK-33176: -- Fix Version/s: aws-connector-4.2.0 > Kinesis source throws NullPointerException in Table API on ignored parsing > errors > - > > Key: FLINK-33176 > URL: https://issues.apache.org/jira/browse/FLINK-33176 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.15.4, aws-connector-4.1.0 >Reporter: Aleksandr Pilipenko >Assignee: Aleksandr Pilipenko >Priority: Major > Labels: pull-request-available > Fix For: aws-connector-4.2.0 > > > Using following example table: > > {code:java} > CREATE TABLE source ( > text STRING, > `arrival_time` TIMESTAMP(3) METADATA FROM 'timestamp' VIRTUAL > ) WITH ( > 'connector' = 'kinesis', > 'stream' = 'test', > 'aws.region' = 'us-east-1', > 'json.ignore-parse-errors' = 'true', > 'format' = 'json' > ) {code} > Connector throws NullPointerException when source consumes malformed json > message: > {code:java} > java.lang.NullPointerException > at > org.apache.flink.streaming.connectors.kinesis.table.RowDataKinesisDeserializationSchema.deserialize(RowDataKinesisDeserializationSchema.java:137) > at > org.apache.flink.streaming.connectors.kinesis.table.RowDataKinesisDeserializationSchema.deserialize(RowDataKinesisDeserializationSchema.java:44) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.deserializeRecordForCollectionAndUpdateState(ShardConsumer.java:202) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.lambda$run$0(ShardConsumer.java:126) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.run(PollingRecordPublisher.java:118) > at > org.apache.flink.streaming.connectors.kinesis.internals.publisher.polling.PollingRecordPublisher.run(PollingRecordPublisher.java:102) > at > org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:114) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at > java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] Update with.md defintion spell error ,fix with definition [flink]
XComp closed pull request #23432: Update with.md defintion spell error ,fix with definition URL: https://github.com/apache/flink/pull/23432 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Update with.md defintion spell error ,fix with definition [flink]
XComp commented on PR #23432: URL: https://github.com/apache/flink/pull/23432#issuecomment-1747214634 Thanks for your contribution, @yanfangli85, and for looking into this, @davidradl. David is right: We should do documentation changes in both the English and Chinese versions at the same time to keep them on par. I'm gonna close this PR in favor of your first (as far as I can see?) PR #23422 to reduce the clutter. Feel free to put the changes in individual hotfix commits within PR #23422. That helps keep the discussion in a single location. Feel free to ask if you need help with creating additional commits in the existing PR (if that's the reason why you created multiple PRs). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Update match_recognize.md , preterite of seek fix ,change seeked to sought [flink]
XComp commented on PR #23433: URL: https://github.com/apache/flink/pull/23433#issuecomment-1747218067 Thanks for your contribution, @yanfangli85. As pointed out in other PRs: We should do documentation changes in both the English and Chinese versions at the same time to keep them on par. I'm gonna close this PR in favor of your first (as far as I can see?) PR #23422 to reduce the clutter. Feel free to put the changes in individual hotfix commits within PR #23422. That helps keep the discussion in a single location. Feel free to ask if you need help with creating additional commits in the existing PR (if that's the reason why you created multiple PRs). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Update match_recognize.md , preterite of seek fix ,change seeked to sought [flink]
XComp closed pull request #23433: Update match_recognize.md , preterite of seek fix ,change seeked to sought URL: https://github.com/apache/flink/pull/23433 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Update create.md , ifx spell error, change statment to statement [flink]
XComp closed pull request #23434: Update create.md , ifx spell error, change statment to statement URL: https://github.com/apache/flink/pull/23434 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Update create.md , ifx spell error, change statment to statement [flink]
XComp commented on PR #23434: URL: https://github.com/apache/flink/pull/23434#issuecomment-174778 Thanks for your contribution, @yanfangli85, and for looking into this, @davidradl. David is right: We should do documentation changes in both the English and Chinese versions at the same time to keep them on par. I'm gonna close this PR in favor of your first (as far as I can see?) PR #23422 to reduce the clutter. Feel free to put the changes in individual hotfix commits within PR #23422. That helps keep the discussion in a single location. Creating individual PRs for each typo change seems to be a bit too much overhead. Feel free to ask if you need help with creating additional commits in the existing PR (if that's the reason why you created multiple PRs). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] Update local_installation.md [flink]
snuyanzin commented on code in PR #23240: URL: https://github.com/apache/flink/pull/23240#discussion_r1346137247 ## docs/content/docs/try-flink/local_installation.md: ## @@ -46,6 +46,8 @@ to have __Java 11__ installed. To check the Java version installed, type in your $ java -version ``` +**Notes**: JDK 17 and above is not supported yet!!! + Review Comment: In fact, it is supported starting from 1.18.x. Also, this doc already mentions above that Java 11 should be installed. Could you please give more insight into why we need it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] fix(sec): upgrade com.google.guava:guava to 32.0.0-jre [flink]
snuyanzin commented on PR #23150: URL: https://github.com/apache/flink/pull/23150#issuecomment-1747230400 @ChengDaqi2023 the build is failing, could you have a look please? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32445][runtime] Refactor BlobStoreService's closeAndCleanupAllData to cleanupAllData [flink]
XComp commented on code in PR #23424: URL: https://github.com/apache/flink/pull/23424#discussion_r1346146391 ## flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/HighAvailabilityServices.java: ## @@ -212,19 +213,53 @@ default LeaderRetrievalService getClusterRestEndpointLeaderRetriever() { void close() throws Exception; /** - * Closes the high availability services (releasing all resources) and deletes all data stored - * by these services in external stores. + * Deletes all data stored by high availability services in external stores. * - * After this method was called, the any job or session that was managed by these high + * After this method was called, any job or session that was managed by these high * availability services will be unrecoverable. * * If an exception occurs during cleanup, this method will attempt to continue the cleanup * and report exceptions only after all cleanup steps have been attempted. * - * @throws Exception Thrown, if an exception occurred while closing these services or cleaning - * up data stored by them. + * @throws Exception Thrown, if an exception occurred while cleaning up data stored by them. */ -void closeAndCleanupAllData() throws Exception; +void cleanupAllData() throws Exception; + +/** + * Closes the high availability services (releasing all resources) and optionally deletes all + * data stored by these services in external stores. + * + * After this method was called, any job or session that was managed by these high + * availability services will be unrecoverable. + * + * If an exception occurs during cleanup or close, this method will attempt to continue the + * cleanup or close and report exceptions only after all cleanup and close steps have been + * attempted. + * + * @param cleanupData Whether cleanup of data stored by the high availability services should be + * attempted. + * @throws Exception Thrown, if an exception occurred while cleaning up data stored by the high + * availability services. Review Comment: ```suggestion * Calls {@link #cleanupAllData()} (if {@code true} is passed as a parameter) before calling {@link #close()} on this instance. Any error that appeared during the cleanup will be propagated after calling {@code close()}. ``` nit: I'm not a big fan of duplicating documentation because it makes maintaining the documentation harder. What do you think of the proposal above? ## flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/HighAvailabilityServices.java: ## @@ -212,19 +213,53 @@ default LeaderRetrievalService getClusterRestEndpointLeaderRetriever() { void close() throws Exception; /** - * Closes the high availability services (releasing all resources) and deletes all data stored - * by these services in external stores. + * Deletes all data stored by high availability services in external stores. * - * After this method was called, the any job or session that was managed by these high + * After this method was called, any job or session that was managed by these high * availability services will be unrecoverable. * * If an exception occurs during cleanup, this method will attempt to continue the cleanup * and report exceptions only after all cleanup steps have been attempted. * - * @throws Exception Thrown, if an exception occurred while closing these services or cleaning - * up data stored by them. + * @throws Exception Thrown, if an exception occurred while cleaning up data stored by them. 
Review Comment: ```suggestion * @throws Exception if an error occurred while cleaning up data stored by them. ``` nit: just to fix using "throw" twice ## flink-runtime/src/test/java/org/apache/flink/runtime/highavailability/AbstractHaServicesTest.java: ## @@ -97,7 +98,8 @@ void testCloseAndCleanupAllDataDoesNotDeleteBlobsIfCleaningUpHADataFails() throw }, ignored -> {}); - assertThatThrownBy(haServices::closeAndCleanupAllData).isInstanceOf(FlinkException.class); + assertThatThrownBy(haServices::cleanupAllData).isInstanceOf(FlinkException.class); +haServices.close(); Review Comment: ```suggestion assertThatThrownBy(() -> haServices.closeWithOptionalCleanup(true)).isInstanceOf(FlinkException.class); ``` This improves the test coverage (i.e. it would check the newly added default implementation). ## flink-runtime/src/test/java/org/apache/flink/runtime/highavailability/AbstractHaServicesTest.java: ## @@ -66,13 +66,14 @@ void testCloseAndCleanupAllDataDeletesBlobsAfterCleaningUpHAData() throws Except () -> clos
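[Editor's note] For readers following the thread: below is a minimal sketch of the interface shape discussed in the review above, i.e. `cleanupAllData()` plus a `closeWithOptionalCleanup(boolean)` default method on an `AutoCloseable` service. The method names come from the review comments; the default-method body is an assumption about how the two operations could compose, not the code from PR #23424.

```java
/** Editor's sketch of the proposed API shape; not the actual Flink interface. */
public interface HaServicesSketch extends AutoCloseable {

    /** Releases all resources held by the high availability services. */
    @Override
    void close() throws Exception;

    /** Deletes all data stored by the high availability services in external stores. */
    void cleanupAllData() throws Exception;

    /**
     * Optionally cleans up all data before closing. A cleanup error is propagated
     * only after close() has been attempted, so resources are released either way.
     */
    default void closeWithOptionalCleanup(boolean cleanupData) throws Exception {
        Exception exception = null;
        if (cleanupData) {
            try {
                cleanupAllData();
            } catch (Exception e) {
                exception = e;
            }
        }
        try {
            close();
        } catch (Exception e) {
            if (exception == null) {
                exception = e;
            } else {
                exception.addSuppressed(e);
            }
        }
        if (exception != null) {
            throw exception;
        }
    }
}
```

With this shape, callers such as MiniCluster or ClusterEntrypoint would only decide whether to pass `true` for cleanup, instead of choosing between two separate lifecycle methods.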
Re: [PR] [FLINK-33020] OpensearchSinkTest.testAtLeastOnceSink timed out [flink-connector-opensearch]
reta commented on PR #32: URL: https://github.com/apache/flink-connector-opensearch/pull/32#issuecomment-1747262996 Blocked by https://github.com/opensearch-project/OpenSearch/pull/9286/files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-33020) OpensearchSinkTest.testAtLeastOnceSink timed out
[ https://issues.apache.org/jira/browse/FLINK-33020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-33020: --- Labels: pull-request-available (was: ) > OpensearchSinkTest.testAtLeastOnceSink timed out > > > Key: FLINK-33020 > URL: https://issues.apache.org/jira/browse/FLINK-33020 > Project: Flink > Issue Type: Bug > Components: Connectors / Opensearch >Affects Versions: opensearch-1.0.2 >Reporter: Martijn Visser >Priority: Blocker > Labels: pull-request-available > > https://github.com/apache/flink-connector-opensearch/actions/runs/6061205003/job/16446139552#step:13:1029 > {code:java} > Error: Tests run: 9, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.837 > s <<< FAILURE! - in > org.apache.flink.streaming.connectors.opensearch.OpensearchSinkTest > Error: > org.apache.flink.streaming.connectors.opensearch.OpensearchSinkTest.testAtLeastOnceSink > Time elapsed: 5.022 s <<< ERROR! > java.util.concurrent.TimeoutException: testAtLeastOnceSink() timed out after > 5 seconds > at > org.junit.jupiter.engine.extension.TimeoutInvocation.createTimeoutException(TimeoutInvocation.java:70) > at > org.junit.jupiter.engine.extension.TimeoutInvocation.proceed(TimeoutInvocation.java:59) > at > org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149) > at > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140) > at > org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84) > at > org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115) > at > org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) > at > org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37) > at > org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104) > at > org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98) > at > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210) > at > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135) > at > org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141) > at > org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) > at > 
org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1541) > at > org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155) > at > org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) > at > org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141) > at > org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) > at > org.j
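[Editor's note] The `TimeoutInvocation`/`TimeoutExtension` frames in the trace above come from JUnit 5's declarative timeout support. A minimal sketch of the pattern that produces this failure mode follows; the class and method names are illustrative, not the actual OpensearchSinkTest code.

```java
import java.util.concurrent.TimeUnit;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.Timeout;

class TimeoutSketchTest {

    // JUnit's TimeoutExtension watches the test and, once the budget is
    // exceeded, interrupts it and reports java.util.concurrent.TimeoutException
    // ("... timed out after 5 seconds"), exactly as in the log above.
    @Test
    @Timeout(value = 5, unit = TimeUnit.SECONDS)
    void slowOperation() throws Exception {
        Thread.sleep(10_000); // deliberately exceeds the budget to trigger the failure
    }
}
```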
Re: [PR] [FLINK-33020] OpensearchSinkTest.testAtLeastOnceSink timed out [flink-connector-opensearch]
snuyanzin commented on PR #32: URL: https://github.com/apache/flink-connector-opensearch/pull/32#issuecomment-1747270859 Jira issue: https://issues.apache.org/jira/browse/FLINK-33175 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process
[ https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771919#comment-17771919 ] Sergey Nuyanzin commented on FLINK-18356: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53549&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=11712 > flink-table-planner Exit code 137 returned from process > --- > > Key: FLINK-18356 > URL: https://issues.apache.org/jira/browse/FLINK-18356 > Project: Flink > Issue Type: Bug > Components: Build System / Azure Pipelines, Tests >Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, > 1.19.0 >Reporter: Piotr Nowojski >Assignee: Yunhong Zheng >Priority: Critical > Labels: pull-request-available, test-stability > Attachments: 1234.jpg, app-profiling_4.gif, > image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, > image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, > image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, > image-2023-07-11-19-41-37-105.png > > > {noformat} > = test session starts > == > platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1 > cachedir: .tox/py37-cython/.pytest_cache > rootdir: /__w/3/s/flink-python > collected 568 items > pyflink/common/tests/test_configuration.py ..[ > 1%] > pyflink/common/tests/test_execution_config.py ...[ > 5%] > pyflink/dataset/tests/test_execution_environment.py . > ##[error]Exit code 137 returned from process: file name '/bin/docker', > arguments 'exec -i -u 1002 > 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb > /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'. > Finishing: Test - python > {noformat} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3 -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [FLINK-33020] OpensearchSinkTest.testAtLeastOnceSink timed out [flink-connector-opensearch]
reta commented on PR #32: URL: https://github.com/apache/flink-connector-opensearch/pull/32#issuecomment-1747279406 > jira issue https://issues.apache.org/jira/browse/FLINK-33175 Wrong copy / paste on my side 🗡️ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [FLINK-32445][runtime] Refactor BlobStoreService's closeAndCleanupAllData to cleanupAllData [flink]
Jiabao-Sun commented on PR #23424: URL: https://github.com/apache/flink/pull/23424#issuecomment-1747284639 Thanks, @XComp, for the thorough review. PTAL again. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map
Sergey Nuyanzin created FLINK-33184: --- Summary: HybridShuffleITCase fails with exception in resource cleanup of task Map Key: FLINK-33184 URL: https://issues.apache.org/jira/browse/FLINK-33184 Project: Flink Issue Type: Bug Components: Runtime / Network Affects Versions: 1.19.0 Reporter: Sergey Nuyanzin This build fails https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 {noformat} Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task [] - FATAL - exception in resource cleanup of task Map (5/10)#0 (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) . java.lang.IllegalStateException: Leaking buffers. at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.releaseResources(Task.java:990) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:838) [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292] 01:17:22,375 [flink-pekko.actor.default-dispatcher-5] INFO org.apache.flink.runtime.taskmanager.Task[] - Task Sink: Unnamed (3/10)#0 is already in state CANCELING 01:17:22,375 [Map (5/10)#0] ERROR org.apache.flink.runtime.taskexecutor.TaskExecutor [] - FATAL - exception in resource cleanup of task Map (5/10)#0 (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) . java.lang.IllegalStateException: Leaking buffers. 
at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.releaseResources(Task.java:990) [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:838) [flink-runtime
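[Editor's note] The "Leaking buffers." message in the trace above originates from a `Preconditions.checkState` guard in `TieredStorageMemoryManagerImpl.release` (line 236 in the stack trace). The sketch below illustrates only the guard pattern; the counter field and the accounting logic are assumptions, not the actual Flink implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.flink.util.Preconditions;

class BufferAccountingSketch {

    // Hypothetical counter: incremented when a buffer is requested from the
    // pool, decremented when it is recycled.
    private final AtomicInteger numRequestedBuffers = new AtomicInteger(0);

    void release() {
        // Fail fast if any requested buffer was never returned; this is the
        // IllegalStateException("Leaking buffers.") seen in the task cleanup log.
        Preconditions.checkState(numRequestedBuffers.get() == 0, "Leaking buffers.");
        // ... release the pooled memory segments ...
    }
}
```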
[jira] [Commented] (FLINK-33000) SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using a ThreadFactory
[ https://issues.apache.org/jira/browse/FLINK-33000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771922#comment-17771922 ] Jiabao Sun commented on FLINK-33000: Thanks [~mapohl] and [~fsk119]. The PRs for releases 1.18/1.17/1.16 are ready now. Please help review them when you have time. > SqlGatewayServiceITCase should utilize TestExecutorExtension instead of using > a ThreadFactory > - > > Key: FLINK-33000 > URL: https://issues.apache.org/jira/browse/FLINK-33000 > Project: Flink > Issue Type: Bug > Components: Table SQL / Gateway, Tests >Affects Versions: 1.16.2, 1.18.0, 1.17.1, 1.19.0 >Reporter: Matthias Pohl >Assignee: Jiabao Sun >Priority: Major > Labels: pull-request-available, starter > > {{SqlGatewayServiceITCase}} uses an {{ExecutorThreadFactory}} for its > asynchronous operations. Instead, one should use {{TestExecutorExtension}} to > ensure proper cleanup of threads. > We might also want to remove the {{AbstractTestBase}} parent class because > that uses JUnit4 whereas {{SqlGatewayServiceITCase}} is already based on > JUnit5 -- This message was sent by Atlassian Jira (v8.20.10#820010)
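[Editor's note] For context on the refactoring the ticket asks for, here is a minimal sketch of replacing a hand-rolled thread factory with Flink's `TestExecutorExtension` so JUnit 5 tears the executor down automatically. It assumes the extension API from flink-test-utils-junit (a constructor taking an executor supplier and a `getExecutor()` accessor); the test body is illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.flink.testutils.executor.TestExecutorExtension;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;

class ManagedExecutorSketchTest {

    // The extension creates the executor before the tests run and shuts it
    // down afterwards, so no worker threads outlive the test class.
    @RegisterExtension
    static final TestExecutorExtension<ExecutorService> EXECUTOR_EXTENSION =
            new TestExecutorExtension<>(Executors::newCachedThreadPool);

    @Test
    void runsWorkOnManagedExecutor() throws Exception {
        EXECUTOR_EXTENSION.getExecutor().submit(() -> {}).get();
    }
}
```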
[jira] [Commented] (FLINK-30719) flink-runtime-web failed due to a corrupted nodejs dependency
[ https://issues.apache.org/jira/browse/FLINK-30719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771924#comment-17771924 ] Sergey Nuyanzin commented on FLINK-30719: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53545&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=9332 > flink-runtime-web failed due to a corrupted nodejs dependency > - > > Key: FLINK-30719 > URL: https://issues.apache.org/jira/browse/FLINK-30719 > Project: Flink > Issue Type: Bug > Components: Runtime / Web Frontend, Test Infrastructure, Tests >Affects Versions: 1.16.0, 1.17.0, 1.18.0 >Reporter: Matthias Pohl >Assignee: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44954&view=logs&j=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5&t=54421a62-0c80-5aad-3319-094ff69180bb&l=12550 > The build failed due to a corrupted nodejs dependency: > {code} > [ERROR] The archive file > /__w/1/.m2/repository/com/github/eirslett/node/16.13.2/node-16.13.2-linux-x64.tar.gz > is corrupted and will be deleted. Please try the build again. > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map
[ https://issues.apache.org/jira/browse/FLINK-33184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771925#comment-17771925 ] Sergey Nuyanzin commented on FLINK-33184: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53535&view=logs&j=a596f69e-60d2-5a4b-7d39-dc69e4cdaed3&t=712ade8c-ca16-5b76-3acd-14df33bc1cb1&l=8823 > HybridShuffleITCase fails with exception in resource cleanup of task Map > > > Key: FLINK-33184 > URL: https://issues.apache.org/jira/browse/FLINK-33184 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build fails > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 > {noformat} > Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task >[] - FATAL - exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. > at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.releaseResources(Task.java:990) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:838) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292] > 01:17:22,375 [flink-pekko.actor.default-dispatcher-5] INFO > org.apache.flink.runtime.taskmanager.Task[] - Task Sink: > Unnamed (3/10)#0 is already in state CANCELING > 01:17:22,375 [Map (5/10)#0] ERROR > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - FATAL - > exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionMana
[jira] [Commented] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map
[ https://issues.apache.org/jira/browse/FLINK-33184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771926#comment-17771926 ] Sergey Nuyanzin commented on FLINK-33184: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53530&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=9039 > HybridShuffleITCase fails with exception in resource cleanup of task Map > > > Key: FLINK-33184 > URL: https://issues.apache.org/jira/browse/FLINK-33184 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build fails > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 > {noformat} > Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task >[] - FATAL - exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. > at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.releaseResources(Task.java:990) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:838) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292] > 01:17:22,375 [flink-pekko.actor.default-dispatcher-5] INFO > org.apache.flink.runtime.taskmanager.Task[] - Task Sink: > Unnamed (3/10)#0 is already in state CANCELING > 01:17:22,375 [Map (5/10)#0] ERROR > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - FATAL - > exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionMana
[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process
[ https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771929#comment-17771929 ] Sergey Nuyanzin commented on FLINK-18356: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53530&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=11718 > flink-table-planner Exit code 137 returned from process > --- > > Key: FLINK-18356 > URL: https://issues.apache.org/jira/browse/FLINK-18356 > Project: Flink > Issue Type: Bug > Components: Build System / Azure Pipelines, Tests >Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, > 1.19.0 >Reporter: Piotr Nowojski >Assignee: Yunhong Zheng >Priority: Critical > Labels: pull-request-available, test-stability > Attachments: 1234.jpg, app-profiling_4.gif, > image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, > image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, > image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, > image-2023-07-11-19-41-37-105.png > > > {noformat} > = test session starts > == > platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1 > cachedir: .tox/py37-cython/.pytest_cache > rootdir: /__w/3/s/flink-python > collected 568 items > pyflink/common/tests/test_configuration.py ..[ > 1%] > pyflink/common/tests/test_execution_config.py ...[ > 5%] > pyflink/dataset/tests/test_execution_environment.py . > ##[error]Exit code 137 returned from process: file name '/bin/docker', > arguments 'exec -i -u 1002 > 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb > /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'. > Finishing: Test - python > {noformat} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map
[ https://issues.apache.org/jira/browse/FLINK-33184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771930#comment-17771930 ] Sergey Nuyanzin commented on FLINK-33184: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53525&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=8664 > HybridShuffleITCase fails with exception in resource cleanup of task Map > > > Key: FLINK-33184 > URL: https://issues.apache.org/jira/browse/FLINK-33184 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build fails > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 > {noformat} > Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task >[] - FATAL - exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. > at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.releaseResources(Task.java:990) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:838) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292] > 01:17:22,375 [flink-pekko.actor.default-dispatcher-5] INFO > org.apache.flink.runtime.taskmanager.Task[] - Task Sink: > Unnamed (3/10)#0 is already in state CANCELING > 01:17:22,375 [Map (5/10)#0] ERROR > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - FATAL - > exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionMana
[jira] [Created] (FLINK-33185) HybridShuffleITCase fails with TimeoutException: Pending slot request timed out in slot pool.
Sergey Nuyanzin created FLINK-33185: --- Summary: HybridShuffleITCase fails with TimeoutException: Pending slot request timed out in slot pool. Key: FLINK-33185 URL: https://issues.apache.org/jira/browse/FLINK-33185 Project: Flink Issue Type: Bug Components: Runtime / Network Affects Versions: 1.19.0 Reporter: Sergey Nuyanzin This build https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53519&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=8641 fails as {noformat} Sep 29 05:13:54 Caused by: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: Pending slot request with SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released. Sep 29 05:13:54 at org.apache.flink.runtime.scheduler.DefaultExecutionDeployer.lambda$assignResource$4(DefaultExecutionDeployer.java:226) Sep 29 05:13:54 ... 36 more Sep 29 05:13:54 Caused by: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: Pending slot request with SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released. Sep 29 05:13:54 at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) Sep 29 05:13:54 at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) Sep 29 05:13:54 at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607) Sep 29 05:13:54 at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) Sep 29 05:13:54 ... 34 more Sep 29 05:13:54 Caused by: org.apache.flink.util.FlinkException: org.apache.flink.util.FlinkException: Pending slot request with SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released. Sep 29 05:13:54 at org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.releaseSlot(DeclarativeSlotPoolBridge.java:373) Sep 29 05:13:54 ... 30 more Sep 29 05:13:54 Caused by: java.util.concurrent.TimeoutException: java.util.concurrent.TimeoutException: Pending slot request timed out in slot pool. Sep 29 05:13:54 ... 30 more {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
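[Editor's note] For anyone reproducing this locally: the "Pending slot request timed out in slot pool" failure is governed by the slot pool's request timeout, which a MiniCluster-based test can raise via configuration. The snippet assumes the standard `slot.request.timeout` key (milliseconds, five minutes by default); it is an illustration for debugging, not a proposed fix for the flaky test.

```java
import org.apache.flink.configuration.Configuration;

class SlotTimeoutConfigSketch {

    static Configuration withLongerSlotRequestTimeout() {
        Configuration config = new Configuration();
        // Raising the slot-request timeout avoids spurious timeouts on slow CI
        // machines while debugging, at the cost of slower failure detection.
        config.setString("slot.request.timeout", "600000"); // 10 minutes, in ms
        return config;
    }
}
```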
[jira] [Commented] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map
[ https://issues.apache.org/jira/browse/FLINK-33184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771938#comment-17771938 ] Sergey Nuyanzin commented on FLINK-33184: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53519&view=logs&j=2c3cbe13-dee0-5837-cf47-3053da9a8a78&t=b78d9d30-509a-5cea-1fef-db7abaa325ae > HybridShuffleITCase fails with exception in resource cleanup of task Map > > > Key: FLINK-33184 > URL: https://issues.apache.org/jira/browse/FLINK-33184 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build fails > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 > {noformat} > Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task >[] - FATAL - exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. > at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.releaseResources(Task.java:990) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:838) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292] > 01:17:22,375 [flink-pekko.actor.default-dispatcher-5] INFO > org.apache.flink.runtime.taskmanager.Task[] - Task Sink: > Unnamed (3/10)#0 is already in state CANCELING > 01:17:22,375 [Map (5/10)#0] ERROR > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - FATAL - > exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.rel
[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process
[ https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771939#comment-17771939 ] Sergey Nuyanzin commented on FLINK-18356: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53512&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be > flink-table-planner Exit code 137 returned from process > --- > > Key: FLINK-18356 > URL: https://issues.apache.org/jira/browse/FLINK-18356 > Project: Flink > Issue Type: Bug > Components: Build System / Azure Pipelines, Tests >Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, > 1.19.0 >Reporter: Piotr Nowojski >Assignee: Yunhong Zheng >Priority: Critical > Labels: pull-request-available, test-stability > Attachments: 1234.jpg, app-profiling_4.gif, > image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, > image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, > image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, > image-2023-07-11-19-41-37-105.png > > > {noformat} > = test session starts > == > platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1 > cachedir: .tox/py37-cython/.pytest_cache > rootdir: /__w/3/s/flink-python > collected 568 items > pyflink/common/tests/test_configuration.py ..[ > 1%] > pyflink/common/tests/test_execution_config.py ...[ > 5%] > pyflink/dataset/tests/test_execution_environment.py . > ##[error]Exit code 137 returned from process: file name '/bin/docker', > arguments 'exec -i -u 1002 > 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb > /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'. > Finishing: Test - python > {noformat} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map
[ https://issues.apache.org/jira/browse/FLINK-33184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771941#comment-17771941 ] Sergey Nuyanzin commented on FLINK-33184: - https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53509&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=8993 > HybridShuffleITCase fails with exception in resource cleanup of task Map > > > Key: FLINK-33184 > URL: https://issues.apache.org/jira/browse/FLINK-33184 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build fails > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 > {noformat} > Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task >[] - FATAL - exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. > at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.releaseResources(Task.java:990) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:838) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292] > 01:17:22,375 [flink-pekko.actor.default-dispatcher-5] INFO > org.apache.flink.runtime.taskmanager.Task[] - Task Sink: > Unnamed (3/10)#0 is already in state CANCELING > 01:17:22,375 [Map (5/10)#0] ERROR > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - FATAL - > exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionMana
[jira] [Updated] (FLINK-33184) HybridShuffleITCase fails with exception in resource cleanup of task Map on AZP
[ https://issues.apache.org/jira/browse/FLINK-33184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Nuyanzin updated FLINK-33184: Summary: HybridShuffleITCase fails with exception in resource cleanup of task Map on AZP (was: HybridShuffleITCase fails with exception in resource cleanup of task Map) > HybridShuffleITCase fails with exception in resource cleanup of task Map on > AZP > --- > > Key: FLINK-33184 > URL: https://issues.apache.org/jira/browse/FLINK-33184 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build fails > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53548&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8710 > {noformat} > Map (5/10)#0] ERROR org.apache.flink.runtime.taskmanager.Task >[] - FATAL - exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. > at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionManager.java:88) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.fail(ResultPartition.java:284) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.failAllResultPartitions(Task.java:1004) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.taskmanager.Task.releaseResources(Task.java:990) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:838) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) > [flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292] > 01:17:22,375 [flink-pekko.actor.default-dispatcher-5] INFO > org.apache.flink.runtime.taskmanager.Task[] - Task Sink: > Unnamed (3/10)#0 is already in state CANCELING > 01:17:22,375 [Map (5/10)#0] ERROR > org.apache.flink.runtime.taskexecutor.TaskExecutor [] - FATAL - > exception in resource cleanup of task Map (5/10)#0 > (159f887fbd200ea7cfa4aaeb1127c4ab_0a448493b4782967b150582570326227_4_0) > . > java.lang.IllegalStateException: Leaking buffers. 
> at > org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > ~[flink-core-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageMemoryManagerImpl.release(TieredStorageMemoryManagerImpl.java:236) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at java.util.ArrayList.forEach(ArrayList.java:1259) ~[?:1.8.0_292] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.storage.TieredStorageResourceRegistry.clearResourceFor(TieredStorageResourceRegistry.java:59) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.hybrid.tiered.shuffle.TieredResultPartition.releaseInternal(TieredResultPartition.java:195) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:262) > ~[flink-runtime-1.19-SNAPSHOT.jar:1.19-SNAPSHOT] > at > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartition(ResultPartitionM
[jira] [Updated] (FLINK-33185) HybridShuffleITCase fails with TimeoutException: Pending slot request timed out in slot pool on AZP
[ https://issues.apache.org/jira/browse/FLINK-33185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Nuyanzin updated FLINK-33185: Summary: HybridShuffleITCase fails with TimeoutException: Pending slot request timed out in slot pool on AZP (was: HybridShuffleITCase fails with TimeoutException: Pending slot request timed out in slot pool.) > HybridShuffleITCase fails with TimeoutException: Pending slot request timed > out in slot pool on AZP > --- > > Key: FLINK-33185 > URL: https://issues.apache.org/jira/browse/FLINK-33185 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Sergey Nuyanzin >Priority: Critical > Labels: test-stability > > This build > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53519&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=8641 > fails as > {noformat} > Sep 29 05:13:54 Caused by: java.util.concurrent.CompletionException: > java.util.concurrent.CompletionException: > java.util.concurrent.CompletionException: > org.apache.flink.util.FlinkException: Pending slot request with > SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released. > Sep 29 05:13:54 at > org.apache.flink.runtime.scheduler.DefaultExecutionDeployer.lambda$assignResource$4(DefaultExecutionDeployer.java:226) > Sep 29 05:13:54 ... 36 more > Sep 29 05:13:54 Caused by: java.util.concurrent.CompletionException: > java.util.concurrent.CompletionException: > org.apache.flink.util.FlinkException: Pending slot request with > SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released. > Sep 29 05:13:54 at > java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) > Sep 29 05:13:54 at > java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) > Sep 29 05:13:54 at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607) > Sep 29 05:13:54 at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) > Sep 29 05:13:54 ... 34 more > Sep 29 05:13:54 Caused by: org.apache.flink.util.FlinkException: > org.apache.flink.util.FlinkException: Pending slot request with > SlotRequestId{b6e57c09274f4edc50697300bc8859a8} has been released. > Sep 29 05:13:54 at > org.apache.flink.runtime.jobmaster.slotpool.DeclarativeSlotPoolBridge.releaseSlot(DeclarativeSlotPoolBridge.java:373) > Sep 29 05:13:54 ... 30 more > Sep 29 05:13:54 Caused by: java.util.concurrent.TimeoutException: > java.util.concurrent.TimeoutException: Pending slot request timed out in slot > pool. > Sep 29 05:13:54 ... 30 more > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-33186) CheckpointAfterAllTasksFinishedITCase.testRestoreAfterSomeTasksFinished fails on AZP
Sergey Nuyanzin created FLINK-33186:
---------------------------------------

             Summary: CheckpointAfterAllTasksFinishedITCase.testRestoreAfterSomeTasksFinished fails on AZP
                 Key: FLINK-33186
                 URL: https://issues.apache.org/jira/browse/FLINK-33186
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.19.0
            Reporter: Sergey Nuyanzin

This build https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53509&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8762 fails as
{noformat}
Sep 28 01:23:43 Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Task local checkpoint failure.
Sep 28 01:23:43 	at org.apache.flink.runtime.checkpoint.PendingCheckpoint.abort(PendingCheckpoint.java:550)
Sep 28 01:23:43 	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:2248)
Sep 28 01:23:43 	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:2235)
Sep 28 01:23:43 	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.lambda$null$9(CheckpointCoordinator.java:817)
Sep 28 01:23:43 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
Sep 28 01:23:43 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
Sep 28 01:23:43 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
Sep 28 01:23:43 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
Sep 28 01:23:43 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Sep 28 01:23:43 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Sep 28 01:23:43 	at java.lang.Thread.run(Thread.java:748)
{noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
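The failure above is the CheckpointCoordinator aborting a pending checkpoint with "Task local checkpoint failure". By default a single failed checkpoint fails the job; a common mitigation for transient declines is to tolerate a small number of failures. A minimal sketch using the standard {{CheckpointConfig}} API (whether this is appropriate for this particular ITCase is a separate question):

{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TolerantCheckpointing {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(1000L);
        // Allow up to 3 checkpoint failures before failing the job.
        env.getCheckpointConfig().setTolerableCheckpointFailureNumber(3);
    }
}
{code}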
[jira] [Commented] (FLINK-33186) CheckpointAfterAllTasksFinishedITCase.testRestoreAfterSomeTasksFinished fails on AZP
[ https://issues.apache.org/jira/browse/FLINK-33186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771942#comment-17771942 ]

Sergey Nuyanzin commented on FLINK-33186:
-----------------------------------------

[~Jiang Xin], [~lindong] since it is very similar to FLINK-32996, could you please have a look?

> CheckpointAfterAllTasksFinishedITCase.testRestoreAfterSomeTasksFinished fails on AZP
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-33186
>                 URL: https://issues.apache.org/jira/browse/FLINK-33186
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.19.0
>            Reporter: Sergey Nuyanzin
>            Priority: Critical
>              Labels: test-stability
>
> This build
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53509&view=logs&j=baf26b34-3c6a-54e8-f93f-cf269b32f802&t=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9&l=8762
> fails as
> {noformat}
> Sep 28 01:23:43 Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Task local checkpoint failure.
> Sep 28 01:23:43 	at org.apache.flink.runtime.checkpoint.PendingCheckpoint.abort(PendingCheckpoint.java:550)
> Sep 28 01:23:43 	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:2248)
> Sep 28 01:23:43 	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:2235)
> Sep 28 01:23:43 	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.lambda$null$9(CheckpointCoordinator.java:817)
> Sep 28 01:23:43 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> Sep 28 01:23:43 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> Sep 28 01:23:43 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> Sep 28 01:23:43 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> Sep 28 01:23:43 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> Sep 28 01:23:43 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> Sep 28 01:23:43 	at java.lang.Thread.run(Thread.java:748)
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (FLINK-31339) PlannerScalaFreeITCase.testImperativeUdaf
[ https://issues.apache.org/jira/browse/FLINK-31339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771943#comment-17771943 ]

Sergey Nuyanzin commented on FLINK-31339:
-----------------------------------------

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53504&view=logs&j=af184cdd-c6d8-5084-0b69-7e9c67b35f7a&t=0f3adb59-eefa-51c6-2858-3654d9e0749d&l=15329

> PlannerScalaFreeITCase.testImperativeUdaf
> -----------------------------------------
>
>                 Key: FLINK-31339
>                 URL: https://issues.apache.org/jira/browse/FLINK-31339
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Priority: Major
>              Labels: auto-deprioritized-critical, test-stability
>
> {{PlannerScalaFreeITCase.testImperativeUdaf}} failed:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46812&view=logs&j=87489130-75dc-54e4-1f45-80c30aa367a3&t=73da6d75-f30d-5d5a-acbe-487a9dcff678&l=15012
> {code}
> Mar 05 05:55:50 [ERROR] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 62.028 s <<< FAILURE! - in org.apache.flink.table.sql.codegen.PlannerScalaFreeITCase
> Mar 05 05:55:50 [ERROR] PlannerScalaFreeITCase.testImperativeUdaf  Time elapsed: 40.924 s  <<< FAILURE!
> Mar 05 05:55:50 org.opentest4j.AssertionFailedError: Did not get expected results before timeout, actual result: [{"before":null,"after":{"user_name":"Bob","order_cnt":1},"op":"c"}, {"before":null,"after":{"user_name":"Alice","order_cnt":1},"op":"c"}, {"before":{"user_name":"Bob","order_cnt":1},"after":null,"op":"d"}, {"before":null,"after":{"user_name":"Bob","order_cnt":2},"op":"c"}]. ==> expected: <true> but was: <false>
> Mar 05 05:55:50 	at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
> Mar 05 05:55:50 	at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
> Mar 05 05:55:50 	at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
> Mar 05 05:55:50 	at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
> Mar 05 05:55:50 	at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:211)
> Mar 05 05:55:50 	at org.apache.flink.table.sql.codegen.SqlITCaseBase.checkJsonResultFile(SqlITCaseBase.java:168)
> Mar 05 05:55:50 	at org.apache.flink.table.sql.codegen.SqlITCaseBase.runAndCheckSQL(SqlITCaseBase.java:111)
> Mar 05 05:55:50 	at org.apache.flink.table.sql.codegen.PlannerScalaFreeITCase.testImperativeUdaf(PlannerScalaFreeITCase.java:43)
> [...]
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
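The assertion in {{SqlITCaseBase.checkJsonResultFile}} polls the job's result until it matches the expected output or a timeout elapses, then fails with "Did not get expected results before timeout". A generic sketch of that poll-until-timeout pattern (names and timing are illustrative, not the actual SqlITCaseBase code):

{code:java}
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper illustrating the assertion pattern behind the failure;
// not the actual SqlITCaseBase implementation.
final class PollAssert {
    static void assertEventually(Supplier<Boolean> condition, Duration timeout)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() < deadline) {
            if (condition.get()) {
                return; // expected result observed before the deadline
            }
            Thread.sleep(500L); // poll interval
        }
        throw new AssertionError("Did not get expected results before timeout");
    }
}
{code}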
[jira] [Commented] (FLINK-18356) flink-table-planner Exit code 137 returned from process
[ https://issues.apache.org/jira/browse/FLINK-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771945#comment-17771945 ]

Sergey Nuyanzin commented on FLINK-18356:
-----------------------------------------

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53486&view=logs&j=a9db68b9-a7e0-54b6-0f98-010e0aff39e2&t=cdd32e0b-6047-565b-c58f-14054472f1be&l=1000

> flink-table-planner Exit code 137 returned from process
> --------------------------------------------------------
>
>                 Key: FLINK-18356
>                 URL: https://issues.apache.org/jira/browse/FLINK-18356
>             Project: Flink
>          Issue Type: Bug
>          Components: Build System / Azure Pipelines, Tests
>    Affects Versions: 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 1.19.0
>            Reporter: Piotr Nowojski
>            Assignee: Yunhong Zheng
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>         Attachments: 1234.jpg, app-profiling_4.gif, image-2023-01-11-22-21-57-784.png, image-2023-01-11-22-22-32-124.png, image-2023-02-16-20-18-09-431.png, image-2023-07-11-19-28-52-851.png, image-2023-07-11-19-35-54-530.png, image-2023-07-11-19-41-18-626.png, image-2023-07-11-19-41-37-105.png
>
>
> {noformat}
> = test session starts ==
> platform linux -- Python 3.7.3, pytest-5.4.3, py-1.8.2, pluggy-0.13.1
> cachedir: .tox/py37-cython/.pytest_cache
> rootdir: /__w/3/s/flink-python
> collected 568 items
>
> pyflink/common/tests/test_configuration.py ..[ 1%]
> pyflink/common/tests/test_execution_config.py ...[ 5%]
> pyflink/dataset/tests/test_execution_environment.py .
> ##[error]Exit code 137 returned from process: file name '/bin/docker', arguments 'exec -i -u 1002 97fc4e22522d2ced1f4d23096b8929045d083dd0a99a4233a8b20d0489e9bddb /__a/externals/node/bin/node /__w/_temp/containerHandlerInvoker.js'.
> Finishing: Test - python
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3729&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=8d78fe4f-d658-5c70-12f8-4921589024c3

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
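Exit code 137 is not a Java or pytest error code: shells and CI runners report a process killed by a signal as 128 plus the signal number, so 137 = 128 + 9 (SIGKILL), which on build agents almost always means the Linux OOM killer or the container runtime killed the process for exceeding its memory limit. A trivial decoding sketch:

{code:java}
public class ExitCodeDecoder {
    public static void main(String[] args) {
        int exitCode = 137;
        if (exitCode > 128) {
            // 137 - 128 = 9, i.e. SIGKILL, typically the OOM killer in a
            // memory-constrained CI container.
            System.out.println("Process killed by signal " + (exitCode - 128));
        }
    }
}
{code}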
[jira] [Commented] (FLINK-32604) PyFlink end-to-end fails with kafka-server-stop.sh: No such file or directory
[ https://issues.apache.org/jira/browse/FLINK-32604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771947#comment-17771947 ]

Sergey Nuyanzin commented on FLINK-32604:
-----------------------------------------

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=53434&view=logs&j=87489130-75dc-54e4-1f45-80c30aa367a3&t=efbee0b1-38ac-597d-6466-1ea8fc908c50&l=8897

> PyFlink end-to-end fails with kafka-server-stop.sh: No such file or directory
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-32604
>                 URL: https://issues.apache.org/jira/browse/FLINK-32604
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Python
>    Affects Versions: 1.18.0
>            Reporter: Sergey Nuyanzin
>            Priority: Major
>              Labels: auto-deprioritized-critical, test-stability
>
> This build
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51253&view=logs&j=af184cdd-c6d8-5084-0b69-7e9c67b35f7a&t=0f3adb59-eefa-51c6-2858-3654d9e0749d&l=7883
> fails as
> {noformat}
> /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/kafka-common.sh: line 117: /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-27379214502/kafka_2.12-3.2.3/bin/kafka-server-stop.sh: No such file or directory
> /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/kafka-common.sh: line 121: /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-27379214502/kafka_2.12-3.2.3/bin/zookeeper-server-stop.sh: No such file or directory
> Jul 13 19:43:07 [FAIL] Test script contains errors.
> Jul 13 19:43:07 Checking of logs skipped.
> Jul 13 19:43:07
> Jul 13 19:43:07 [FAIL] 'PyFlink end-to-end test' failed after 0 minutes and 40 seconds! Test exited with exit code 1
> Jul 13 19:43:07
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (FLINK-28440) EventTimeWindowCheckpointingITCase failed with restore
[ https://issues.apache.org/jira/browse/FLINK-28440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Nuyanzin updated FLINK-28440:
------------------------------------
    Affects Version/s: 1.19.0

> EventTimeWindowCheckpointingITCase failed with restore
> ------------------------------------------------------
>
>                 Key: FLINK-28440
>                 URL: https://issues.apache.org/jira/browse/FLINK-28440
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Runtime / State Backends
>    Affects Versions: 1.16.0, 1.17.0, 1.18.0, 1.19.0
>            Reporter: Huang Xingbo
>            Assignee: Yanfei Lei
>            Priority: Critical
>              Labels: auto-deprioritized-critical, pull-request-available, stale-assigned, test-stability
>             Fix For: 1.18.0
>
>         Attachments: image-2023-02-01-00-51-54-506.png, image-2023-02-01-01-10-01-521.png, image-2023-02-01-01-19-12-182.png, image-2023-02-01-16-47-23-756.png, image-2023-02-01-16-57-43-889.png, image-2023-02-02-10-52-56-599.png, image-2023-02-03-10-09-07-586.png, image-2023-02-03-12-03-16-155.png, image-2023-02-03-12-03-56-614.png
>
>
> {code:java}
> Caused by: java.lang.Exception: Exception while creating StreamOperatorStateContext.
> 	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:256)
> 	at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:268)
> 	at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:722)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:698)
> 	at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:665)
> 	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
> 	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904)
> 	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
> 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for WindowOperator_0a448493b4782967b150582570326227_(2/4) from any of the 1 provided restore options.
> 	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:160)
> 	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353)
> 	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165)
> 	... 11 more
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/junit1835099326935900400/junit1113650082510421526/52ee65b7-033f-4429-8ddd-adbe85e27ced (No such file or directory)
> 	at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
> 	at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87)
> 	at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69)
> 	at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:96)
> 	at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:75)
> 	at org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:92)
> 	at org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136)
> 	at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336)
> 	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168)
> 	at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
> 	... 13 more
> Caused by: java.io.FileNotFoundException: /tmp/junit1835099326935900400/junit1113650082510421526/52ee65b7-033f-4429-8ddd-adbe85e27ced (No such file or directory)
> 	at java.io.FileInputStream.open0(Native Meth
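The restore above fails inside {{ChangelogStateBackend}}, which wraps the configured state backend when the state changelog is enabled and replays changelog handles on recovery; the FileNotFoundException suggests a local handle file was removed before the restore read it. A hedged sketch of the configuration that puts a job on this code path (the option names are the standard changelog options; the values are examples only):

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.StateChangelogOptions;

public class ChangelogBackendExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // "state.backend.changelog.enabled": wrap the keyed backend in
        // ChangelogStateBackend (the restore path in the trace above).
        conf.set(StateChangelogOptions.ENABLE_STATE_CHANGE_LOG, true);
        // "state.backend.changelog.storage": where state changes are written.
        conf.set(StateChangelogOptions.STATE_CHANGE_LOG_STORAGE, "filesystem");
    }
}
{code}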