[jira] [Created] (FLINK-27082) Disable tests relying on non-writable directories
Chesnay Schepler created FLINK-27082: Summary: Disable tests relying on non-writable directories Key: FLINK-27082 URL: https://issues.apache.org/jira/browse/FLINK-27082 Project: Flink Issue Type: Sub-task Components: Tests Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.16.0, 1.14.5, 1.15.1 We have a number of tests that rely on {{File#setWritable}} to produce errors. These currently fail on GHA because we're running the tests as root, who can delete files anyway. Switching to a non-root user is a follow-up. -- This message was sent by Atlassian Jira (v8.20.1#820001)
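For context, a minimal sketch (not one of the affected Flink tests) of the pattern these tests rely on, assuming nothing beyond the standard java.io/java.nio APIs: File#setWritable(false) is expected to make subsequent writes fail, which no longer holds when the process runs as root.
{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class NonWritableDirSketch {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("non-writable");
        File asFile = dir.toFile();

        // For a regular user this makes the directory read-only, and a test can
        // assert that writing into it throws. Running as root, the permission
        // bits are ignored, the write succeeds, and the expected error never occurs.
        boolean changed = asFile.setWritable(false);
        System.out.println("setWritable(false) returned: " + changed);

        try {
            Files.createFile(dir.resolve("child.txt"));
            System.out.println("write succeeded (e.g. when running as root)");
        } catch (IOException e) {
            System.out.println("write rejected as the tests expect: " + e.getMessage());
        }
    }
}
{code}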
[jira] [Created] (FLINK-27083) SavepointITCase.testStopWithSavepointFailingAfterSnapshotCreation failed on azure
Yun Gao created FLINK-27083: --- Summary: SavepointITCase.testStopWithSavepointFailingAfterSnapshotCreation failed on azure Key: FLINK-27083 URL: https://issues.apache.org/jira/browse/FLINK-27083 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.16.0 Environment: Reporter: Yun Gao {code:java} Apr 05 04:09:45 [ERROR] Tests run: 20, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 23.89 s <<< FAILURE! - in org.apache.flink.test.checkpointing.SavepointITCase Apr 05 04:09:45 [ERROR] org.apache.flink.test.checkpointing.SavepointITCase.testStopWithSavepointFailingAfterSnapshotCreation Time elapsed: 0.139 s <<< ERROR! Apr 05 04:09:45 java.util.concurrent.ExecutionException: org.apache.flink.runtime.scheduler.stopwithsavepoint.StopWithSavepointStoppingException: A savepoint has been created at: file:/tmp/junit4406725947107631997/junit5730835490671962066/savepoint-902c72-01f9662051f2, but the corresponding job 902c72291e9d3f8c4905e6730763116a failed during stopping. The savepoint is consistent, but might have uncommitted transactions. If you want to commit the transaction please restart a job from this savepoint. Apr 05 04:09:45 at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) Apr 05 04:09:45 at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) Apr 05 04:09:45 at org.apache.flink.test.checkpointing.SavepointITCase.testStopWithFailingSourceInOnePipeline(SavepointITCase.java:1202) Apr 05 04:09:45 at org.apache.flink.test.checkpointing.SavepointITCase.testStopWithSavepointFailingAfterSnapshotCreation(SavepointITCase.java:1013) Apr 05 04:09:45 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) Apr 05 04:09:45 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) Apr 05 04:09:45 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) Apr 05 04:09:45 at java.lang.reflect.Method.invoke(Method.java:498) Apr 05 04:09:45 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) Apr 05 04:09:45 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) Apr 05 04:09:45 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) Apr 05 04:09:45 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) Apr 05 04:09:45 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) Apr 05 04:09:45 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) Apr 05 04:09:45 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) Apr 05 04:09:45 at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) Apr 05 04:09:45 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) Apr 05 04:09:45 at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) Apr 05 04:09:45 at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) Apr 05 04:09:45 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) Apr 05 04:09:45 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) Apr 05 04:09:45 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) Apr 05 04:09:45 at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) Apr 05 04:09:45 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) Apr 05 04:09:45 at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) Apr 05 04:09:45 at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) Apr 05 04:09:45 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) Apr 05 04:09:45 at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) Apr 05 04:09:45 at org.junit.runners.ParentRunner.run(ParentRunner.java:413) Apr 05 04:09:45 at org.junit.runner.JUnitCore.run(JUnitCore.java:137) Apr 05 04:09:45 at org.junit.runner.JUnitCore.run(JUnitCore.java:115) Apr 05 04:09:45 at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42) Apr 05 04:09:45 at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80) Apr 05 04:09:45 at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72) Apr 05 04:09:45 at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107) {code} https://dev.azure.com/apache-flink/apache-flink/_build/results?buil
[jira] [Created] (FLINK-27084) Perround mode recreating operator fails
Yunfeng Zhou created FLINK-27084: Summary: Perround mode recreating operator fails Key: FLINK-27084 URL: https://issues.apache.org/jira/browse/FLINK-27084 Project: Flink Issue Type: Bug Components: Library / Machine Learning Affects Versions: ml-2.0.0 Environment: Flink 1.14.0, Flink ML 2.0.0 Reporter: Yunfeng Zhou When I was trying to submit a job containing Flink ML KMeans operator to a Flink cluster, the following exception is thrown out. {code:java} The program finished with the following exception:org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 8ba9d3173d1c83eb4803298f81349aea) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812) at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246) at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054) at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132) at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132) Caused by: java.util.concurrent.ExecutionException: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 8ba9d3173d1c83eb4803298f81349aea) at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) at org.apache.flink.client.program.StreamContextEnvironment.getJobExecutionResult(StreamContextEnvironment.java:123) at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:80) at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1898) at org.apache.flink.ml.benchmark.BenchmarkUtils.runBenchmark(BenchmarkUtils.java:127) at org.apache.flink.ml.benchmark.BenchmarkUtils.runBenchmark(BenchmarkUtils.java:84) at org.apache.flink.ml.benchmark.Benchmark.main(Benchmark.java:50) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) ... 
8 more Caused by: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: 8ba9d3173d1c83eb4803298f81349aea) at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:125) at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) at org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:403) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) at org.apache.flink.client.program.rest.RestClusterClient.lambda$pollResourceAsync$26(RestClusterClient.java:698) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) at org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:403) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:575) at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:943) at java.util.concurrent.CompletableFuture$Com
[jira] [Created] (FLINK-27085) Introduce snapshot.num-retained.min option
Caizhi Weng created FLINK-27085: --- Summary: Introduce snapshot.num-retained.min option Key: FLINK-27085 URL: https://issues.apache.org/jira/browse/FLINK-27085 Project: Flink Issue Type: Sub-task Components: Table Store Affects Versions: 0.1.0 Reporter: Caizhi Weng Fix For: 0.1.0 Currently we retain at least 1 snapshot when expiring. However, consider the following scenario: a user wrote a snapshot one day ago. Today they are reading this snapshot while also writing more records. If a new snapshot is created before the read finishes, the old snapshot from one day ago will be removed because it exceeds the maximum retention time, which causes the read to fail. We should introduce {{snapshot.num-retained.min}} to retain at least a minimum number of snapshots and avoid this problem. -- This message was sent by Atlassian Jira (v8.20.1#820001)
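A minimal sketch of how such an option might be set once it exists; the key name comes from this ticket, but since the option is only being proposed, the exact key, value type, and default below are assumptions rather than an existing Table Store API.
{code:java}
import org.apache.flink.configuration.Configuration;

public class SnapshotRetentionConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Hypothetical: keep at least 10 snapshots even when they exceed the
        // maximum retention time, so a long-running read of an old snapshot
        // is not broken by snapshot expiration.
        conf.setString("snapshot.num-retained.min", "10");
        System.out.println(conf);
    }
}
{code}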
[jira] [Created] (FLINK-27086) Add a QA about how to handle exception when use hive parser in hive dialect document
Jing Zhang created FLINK-27086: -- Summary: Add a QA about how to handle exception when use hive parser in hive dialect document Key: FLINK-27086 URL: https://issues.apache.org/jira/browse/FLINK-27086 Project: Flink Issue Type: Technical Debt Components: Documentation Affects Versions: 1.15.0 Reporter: Jing Zhang Fix For: 1.15.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27087) Configuration "taskmanager.network.detailed-metrics" in metrics document was mistaken as "taskmanager.net.detailed-metrics"
Smile created FLINK-27087: - Summary: Configuration "taskmanager.network.detailed-metrics" in metrics document was mistaken as "taskmanager.net.detailed-metrics" Key: FLINK-27087 URL: https://issues.apache.org/jira/browse/FLINK-27087 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.16.0 Reporter: Smile The [docs for config|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#taskmanager-network-detailed-metrics] have the correct key, "taskmanager.network.detailed-metrics", but where it is mentioned in the [docs for metrics|https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/#default-shuffle-service] it was mistaken as "taskmanager.net.detailed-metrics". Compared with the [code in flink-core|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/NettyShuffleEnvironmentOptions.java#L104], "network" is the right one. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27088) The example of using StringDeserializer for deserializing Kafka message value as string has an error
Penguin created FLINK-27088: --- Summary: The example of using StringDeserializer for deserializing Kafka message value as string has an error Key: FLINK-27088 URL: https://issues.apache.org/jira/browse/FLINK-27088 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.14.4 Reporter: Penguin Attachments: StringDeserializer.png The example of using StringDeserializer to deserialize a Kafka message value as a string has an error in Flink's 1.14 release documentation: the parameter of the example code "KafkaRecordDeserializationSchema.valueOnly()" should be "StringDeserializer.class" rather than "StringSerializer.class". The document link is as follows: [link title|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/kafka/#deserializer] -- This message was sent by Atlassian Jira (v8.20.1#820001)
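For reference, a sketch of the corrected snippet, assuming the surrounding KafkaSource builder calls match the linked documentation page; the key point is that valueOnly(...) takes a Kafka Deserializer class (StringDeserializer), not a Serializer class. The broker address, topic, and group id below are placeholders.
{code:java}
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.kafka.common.serialization.StringDeserializer;

public class KafkaValueOnlySketch {
    public static KafkaSource<String> buildSource() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")   // placeholder
                .setTopics("input-topic")                // placeholder
                .setGroupId("my-group")                  // placeholder
                // StringDeserializer.class, not StringSerializer.class
                .setDeserializer(
                        KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
                .build();
    }
}
{code}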
[jira] [Created] (FLINK-27089) Calling TRY_CAST with an invalid value throws IndexOutOfBounds Exception.
Caizhi Weng created FLINK-27089: --- Summary: Calling TRY_CAST with an invalid value throws IndexOutOfBounds Exception. Key: FLINK-27089 URL: https://issues.apache.org/jira/browse/FLINK-27089 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.15.0 Reporter: Caizhi Weng Add the following test to {org.apache.flink.table.planner.runtime.batch.sql.CalcITCase} to reproduce this issue. {code:scala} @Test def myTest(): Unit = { checkResult("SELECT TRY_CAST('invalid' AS INT)", Seq(row(null))) } {code} The exception stack is {code} java.lang.IndexOutOfBoundsException: index (1) must be less than size (1) at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1345) at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1327) at com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:43) at org.apache.calcite.rex.RexCallBinding.getOperandType(RexCallBinding.java:136) at org.apache.calcite.sql.fun.SqlCastFunction.getMonotonicity(SqlCastFunction.java:205) at org.apache.flink.table.planner.functions.sql.BuiltInSqlFunction.getMonotonicity(BuiltInSqlFunction.java:141) at org.apache.calcite.rel.metadata.RelMdCollation.project(RelMdCollation.java:291) at org.apache.calcite.rel.logical.LogicalProject.lambda$create$0(LogicalProject.java:122) at org.apache.calcite.plan.RelTraitSet.replaceIfs(RelTraitSet.java:242) at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:121) at org.apache.calcite.rel.logical.LogicalProject.create(LogicalProject.java:111) at org.apache.calcite.rel.core.RelFactories$ProjectFactoryImpl.createProject(RelFactories.java:177) at org.apache.calcite.tools.RelBuilder.project_(RelBuilder.java:1516) at org.apache.calcite.tools.RelBuilder.project(RelBuilder.java:1311) at org.apache.calcite.tools.RelBuilder.projectNamed(RelBuilder.java:1565) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:4222) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:687) at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:644) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3438) at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:570) at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$rel(FlinkPlannerImpl.scala:198) at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:190) at org.apache.flink.table.planner.operations.SqlToOperationConverter.toQueryOperation(SqlToOperationConverter.java:1240) at org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlQuery(SqlToOperationConverter.java:1188) at org.apache.flink.table.planner.operations.SqlToOperationConverter.convertValidatedSqlNode(SqlToOperationConverter.java:345) at org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:238) at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:105) at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:675) at org.apache.flink.table.planner.runtime.utils.BatchTestBase.parseQuery(BatchTestBase.scala:297) at org.apache.flink.table.planner.runtime.utils.BatchTestBase.check(BatchTestBase.scala:139) at org.apache.flink.table.planner.runtime.utils.BatchTestBase.checkResult(BatchTestBase.scala:106) at 
org.apache.flink.table.planner.runtime.batch.sql.CalcITCase.myTest(CalcITCase.scala:75) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.ExpectedException$ExpectedExce
[jira] [Created] (FLINK-27090) Add Estimator and Transformer for FTRL
Dong Lin created FLINK-27090: Summary: Add Estimator and Transformer for FTRL Key: FLINK-27090 URL: https://issues.apache.org/jira/browse/FLINK-27090 Project: Flink Issue Type: Bug Reporter: Dong Lin Fix For: ml-2.1.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27091) Add Estimator and Transformer for SVM
Dong Lin created FLINK-27091: Summary: Add Estimator and Transformer for SVM Key: FLINK-27091 URL: https://issues.apache.org/jira/browse/FLINK-27091 Project: Flink Issue Type: Bug Components: Library / Machine Learning Reporter: Dong Lin Fix For: ml-2.1.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27092) Add Estimator and Transformer for Bucketizer
Dong Lin created FLINK-27092: Summary: Add Estimator and Transformer for Bucketizer Key: FLINK-27092 URL: https://issues.apache.org/jira/browse/FLINK-27092 Project: Flink Issue Type: New Feature Components: Library / Machine Learning Reporter: Dong Lin Fix For: ml-2.1.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27093) Add Estimator and Transformer for LinearReg
Dong Lin created FLINK-27093: Summary: Add Estimator and Transformer for LinearReg Key: FLINK-27093 URL: https://issues.apache.org/jira/browse/FLINK-27093 Project: Flink Issue Type: New Feature Components: Library / Machine Learning Reporter: Dong Lin Fix For: ml-2.1.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: Creating flink-kubernetes-operator project on Dockerhub
I think creating an official image is not necessary. Per the Docker Hub statement [1], "Docker Official Images are an intellectual property of Docker". So technically it's not part of the Apache Flink release, and anyone who wants to can open a PR to create such an image. Thank you~ Xintong Song [1] https://docs.docker.com/docker-hub/official_images/ On Wed, Apr 6, 2022 at 11:47 AM Yang Wang wrote: > It seems that you have already got this done :) > > One more question, do we need to create an official docker image repo for > flink-kubernetes-operator[1]? Then users could pull the image directly via > "docker pull flink-kubernetes-operator". > The drawback is that we always need to create a PR[2] and wait for their review and > merging. Personally, I think this is unnecessary. > We already host the images via " > ghcr.io/apache/flink-kubernetes-operator:0.1.0" and > "apache/flink-kubernetes-operator:0.1.0". > > [1]. https://docs.docker.com/docker-hub/official_images/ > [2]. https://github.com/docker-library/official-images/pull/10980/files > > Best, > Yang > > Gyula Fóra wrote on Sun, Apr 3, 2022 at 00:49: > > > Hi Devs, > > > > Does anyone know what is the process for creating a new Docker Hub project > > under apache? > > > > I would like to create *apache/flink-kubernetes-operator* and get push > > access to it. > > > > Thank you! > > Gyula > > >
[jira] [Created] (FLINK-27094) Use Boring Cyborg to link issues back to Jira
Martijn Visser created FLINK-27094: -- Summary: Use Boring Cyborg to link issues back to Jira Key: FLINK-27094 URL: https://issues.apache.org/jira/browse/FLINK-27094 Project: Flink Issue Type: Sub-task Components: Build System, Connectors / ElasticSearch Reporter: Martijn Visser Assignee: Martijn Visser -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27095) Allow FlinkDeployment Mode Switching
Usamah Jassat created FLINK-27095: - Summary: Allow FlinkDeployment Mode Switching Key: FLINK-27095 URL: https://issues.apache.org/jira/browse/FLINK-27095 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Reporter: Usamah Jassat Fix For: kubernetes-operator-1.0.0 Currently the operator doesn't allow switching the mode of an existing FlinkDeployment between Application and Session. I don't see why this shouldn't be allowed; if the mode is switched, we should tear down the cluster and spin up a new cluster running in the new mode. It will add complexity to the connector but seems like a valid use case. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27096) Benchmark all algorithms included in Flink ML 2.1 release and achieve performance parity with Spark
Dong Lin created FLINK-27096: Summary: Benchmark all algorithms included in Flink ML 2.1 release and achieve performance parity with Spark Key: FLINK-27096 URL: https://issues.apache.org/jira/browse/FLINK-27096 Project: Flink Issue Type: New Feature Components: Library / Machine Learning Reporter: Dong Lin Fix For: ml-2.1.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: Wrong format when passing arguments with space
Hi Kevin, I have not reproduced this problem. What is the impact of this problem? Is the parameter not received correctly in the user's main method? Could you provide a screenshot of the JobManager configuration on the UI? > On Apr 2, 2022 at 10:23 AM, Kevin Lee wrote: > > It's a typo > > I run this demo on a YARN cluster > > > The mode is *"run-application"* instead of "run" > > ./bin/flink run-application -c com.lmk.QuotaTest --rate 10 --time > "2022-03-28 11:53:21" > > > The JM log shows program-args as follows: > > > $internal.application.program-args, --rate;10;--time;'"2022-03-30 18:46:56"' > > > > Sorry for the mistake > > > > > 姜鑫 wrote on Fri, Apr 1, 2022 at 11:22: > >> Hi Kevin, >> >> I noticed that the two quotes in your time string look different. Please >> confirm whether it is a typo or not. >> >> Best, >> Xin >> >> >>> On Mar 28, 2022 at 11:58 AM, Kevin Lee wrote: >>> >>> Flink version: 1.13 >>> >>> Bug: >>> When I pass an argument with a space using single quotes, >>> the main function gets this argument with double quotes >>> >>> example: >>> ./bin/flink run -c com.lmk.QuotaTest --rate 10 --time ''2022-03-28 >> 11:53:21" >>> >>> The main function gets the parameters: >>> >>> 1-rate >>> 2---10 >>> 3-time >>> 4---"2022-03-28 11:53:21" >>> >>> >>> I think the flink shell should remove the double quotes in "2022-03-28 >> 11:53:21" >>> >>> >>> Hope to get your reply asap >> >>
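Not part of the original thread, but a tiny helper like the following (the class name is made up for illustration) can be used to see exactly what reaches the user's main method, quotes included, when debugging how the CLI passes arguments:
{code:java}
public class ArgEcho {
    public static void main(String[] args) {
        // Print each argument in brackets so stray quotes and spaces are visible.
        for (int i = 0; i < args.length; i++) {
            System.out.println(i + " -> [" + args[i] + "]");
        }
    }
}
{code}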
[jira] [Created] (FLINK-27097) Document custom validator implementations
Nicholas Jiang created FLINK-27097: -- Summary: Document custom validator implementations Key: FLINK-27097 URL: https://issues.apache.org/jira/browse/FLINK-27097 Project: Flink Issue Type: Sub-task Reporter: Nicholas Jiang -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27098) The session job controller should create FlinkDeployment informer event source with namespace definition
Xin Hao created FLINK-27098: --- Summary: The session job controller should create FlinkDeployment informer event source with namespace definition Key: FLINK-27098 URL: https://issues.apache.org/jira/browse/FLINK-27098 Project: Flink Issue Type: Bug Components: Kubernetes Operator Reporter: Xin Hao The below error will occur if we deploy the operator with a namespaced scope and submit a session job. {code:java} [WARN ] Error starting org.apache.flink.kubernetes.operator.controller.FlinkSessionJobController$1@96a75da io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://10.16.0.1/apis/flink.apache.org/v1alpha1/namespaces/flink-operator/flinkdeployments. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. flinkdeployments.flink.apache.org is forbidden: User "system:serviceaccount:flink-operator:flink-operator" cannot list resource "flinkdeployments" in API group "flink.apache.org" in the namespace "flink-operator". {code} This error comes from the creation of the FlinkDeployment informer event source. I just submitted a PR to fix this, https://github.com/apache/flink-kubernetes-operator/pull/157 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27099) RestClient fills /tmp directory
Mexicapita created FLINK-27099: -- Summary: RestClient fills /tmp directory Key: FLINK-27099 URL: https://issues.apache.org/jira/browse/FLINK-27099 Project: Flink Issue Type: Bug Components: Client / Job Submission, Runtime / REST Affects Versions: 1.14.4, 1.12.1, 1.7.2 Reporter: Mexicapita I have some problems with my Apache Flink streaming system. A few days ago we detected that Flink was using too much space on the hard disk, and exploring it, we found too many (around a thousand) `.jar` files in the `/tmp` directory of the JobManager container. The .jar files always have a similar name, but with different numbers: - 1695477181_3183563526079495551lib_org.eclipse.paho.client.mqttv3-1.1.0.jar - 1798263280_2346102789822957064lib_com.google.protobuf_2.6.0.jar The .jar files were created when I used the 'Submit New Job' menu. In this menu, Flink starts polling `GET http://myflink:myport/jars/` (once per second). For each request, a couple of .jar files are created. I can't find any information about this and I'm not sure if it's a bug or something else. Any help, please? My installations: - Flink 1.12.1 running with Docker Swarm - Flink 1.7 running with Docker Compose - Flink 1.14.4 on Java 11 running with docker-compose The three installations have the same problem. Thanks! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27100) Support parquet format in FileStore
Yubin Li created FLINK-27100: Summary: Support parquet format in FileStore Key: FLINK-27100 URL: https://issues.apache.org/jira/browse/FLINK-27100 Project: Flink Issue Type: New Feature Components: Table Store Reporter: Yubin Li Apache Parquet is a very popular columnar file format, used in many data analysis engines like Hive/Impala/Spark/Flink. We could use simple command-line tools like parquet-tools to view metadata and data easily instead of writing complex Java code. Right now flink-table-store only supports ORC, but there is massive business data stored in Parquet format, and developers/analysts are very familiar with it, so making Parquet usable would be a good addition. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27101) Periodically break the chain of incremental checkpoint
Steven Zhen Wu created FLINK-27101: -- Summary: Periodically break the chain of incremental checkpoint Key: FLINK-27101 URL: https://issues.apache.org/jira/browse/FLINK-27101 Project: Flink Issue Type: New Feature Components: Runtime / Checkpointing Reporter: Steven Zhen Wu
Incremental checkpointing is almost a must for large-state jobs. It greatly reduces the bytes uploaded to DFS per checkpoint. However, there are a few implications of incremental checkpointing that are problematic for production operations. S3 is used as the example DFS in the rest of this description.
1. Because there is no way to deterministically know how far back an incremental checkpoint can refer to files uploaded to S3, it is very difficult to set an S3 bucket/object TTL. In one application, we have observed a Flink checkpoint referring to files uploaded over 6 months ago. S3 TTL can corrupt the Flink checkpoints. S3 TTL is important for a few reasons:
- Purging orphaned files (like external checkpoints from previous deployments) to keep the storage cost in check. This problem can be addressed by implementing proper garbage collection (similar to the JVM's), traversing the retained checkpoints from all jobs and following the file references, but that is an expensive solution from an engineering-cost perspective.
- Security and privacy. E.g., there may be a requirement that Flink state can't keep data for more than some duration threshold (hours/days/weeks). The application is expected to purge keys to satisfy the requirement. However, with incremental checkpoints and how deletion works in RocksDB, it is hard to use an S3 TTL to purge S3 files: even though those old S3 files don't contain live keys, they may still be referenced by retained Flink checkpoints.
2. Occasionally, corrupted checkpoint files (on S3) are observed, and as a result restoring from the checkpoint fails. With incremental checkpoints, it usually doesn't help to try older checkpoints, because they may refer to the same corrupted file. It is unclear whether the corruption happened before or during the S3 upload. This risk can be mitigated with periodic savepoints.
It all boils down to a periodic full snapshot (checkpoint or savepoint) to deterministically break the chain of incremental checkpoints. Searching the Jira history, the behavior that FLINK-23949 [1] tries to fix is actually close to what we would need here. There are a few options:
1. Periodically trigger savepoints (via the control plane); see the sketch after this description. This is actually not a bad practice and might be appealing to some people. The problem is that it requires a job deployment to break the chain of incremental checkpoints, and periodic job deployments may sound hacky. If we make the behavior of a full checkpoint after a savepoint (fixed in FLINK-23949) configurable, it might be an acceptable compromise. The benefit is that no job deployment is required after savepoints.
2. Build the feature into Flink incremental checkpointing. Periodically (with some cron-style config) trigger a full checkpoint to break the incremental chain. If the full checkpoint fails (for whatever reason), the following checkpoints should attempt a full checkpoint as well until one full checkpoint completes successfully.
3. For the security/privacy requirement, the main thing is to apply compaction on the deleted keys. That could probably avoid references to the old files. Is there a RocksDB compaction that can achieve full compaction, removing old delete markers? Recent delete markers are fine.
[1] https://issues.apache.org/jira/browse/FLINK-23949 -- This message was sent by Atlassian Jira (v8.20.1#820001)
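Purely as an illustration of option 1, a hypothetical control-plane loop that periodically triggers a savepoint to break the incremental chain; triggerSavepoint() is a placeholder for whatever mechanism the control plane already uses (the REST savepoint endpoint, the flink CLI, etc.), not an existing Flink API.
{code:java}
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicFullSnapshotScheduler {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void start(String jobId, Duration interval) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                triggerSavepoint(jobId);
            } catch (Exception e) {
                // If a full snapshot fails, the next tick simply tries again,
                // mirroring option 2's "retry until one full snapshot succeeds".
                System.err.println("Savepoint for " + jobId + " failed, will retry: " + e);
            }
        }, interval.toMillis(), interval.toMillis(), TimeUnit.MILLISECONDS);
    }

    // Placeholder: call the control plane's existing savepoint mechanism here.
    private void triggerSavepoint(String jobId) {
    }
}
{code}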
[jira] [Created] (FLINK-27102) Add BinaryClassificationEvaluator
Dong Lin created FLINK-27102: Summary: Add BinaryClassificationEvaluator Key: FLINK-27102 URL: https://issues.apache.org/jira/browse/FLINK-27102 Project: Flink Issue Type: New Feature Components: Library / Machine Learning Reporter: Dong Lin Fix For: ml-2.1.0 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27103) Don't store redundant primary key fields
Jingsong Lee created FLINK-27103: Summary: Don't store redundant primary key fields Key: FLINK-27103 URL: https://issues.apache.org/jira/browse/FLINK-27103 Project: Flink Issue Type: Sub-task Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.1.0 We are currently storing the primary key redundantly in the file; we can directly use the primary key fields among the original fields to avoid the redundant storage. -- This message was sent by Atlassian Jira (v8.20.1#820001)
Re: pyflink on Python 3.9 dev guide
Hi Luan, Your error is because the version of the Python interpreter used by your client and the cluster side is inconsistent. The default protocol version used by pickle before Python 3.8 is 4, and from Python 3.8 it is 5. If the two versions do not match, an error will be reported, as shown in your error message at `return cloudpickle.loads(payload)`: `TypeError: an integer is required (got type bytes)`. I have previously created JIRA https://issues.apache.org/jira/browse/FLINK-22517 to illustrate this issue. At present, if you want to solve this problem, the best way is to keep the version of the Python interpreter on the client side and the cluster side the same. Best, Xingbo Luan Cooper wrote on Wed, Apr 6, 2022 at 14:26: > actually I had built/compiled > - pyarrow==2.0.0 (test skipped) > - apache-beam==2.27.0 (test skipped) > on Python 3.9, and tested with example Python jobs (bin/flink run > -pyclientexec python3.7 -pyexec python3.9 -py > examples/python/table/word_count.py) > but got the following exceptions > > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: Error received from SDK harness for instruction > 1: Traceback (most recent call last): > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 289, in _execute > response = task() > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 362, in > lambda: self.create_worker().do_instruction(request), request) > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 606, in do_instruction > return getattr(self, request_type)( > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 637, in process_bundle > bundle_processor = self.bundle_processor_cache.get( > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 463, in get > processor = bundle_processor.BundleProcessor( > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 868, in __init__ > self.ops = self.create_execution_tree(self.process_bundle_descriptor) > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 921, in create_execution_tree > return collections.OrderedDict([( > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 924, in > get_operation(transform_id))) for transform_id in sorted( > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 812, in wrapper > result = cache[args] = func(*args) > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 903, in get_operation > transform_consumers = { > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 904, in > tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]] > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 904, in > tag: [get_operation(op) for op in pcoll_consumers[pcoll_id]] > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 812, in wrapper > result = cache[args] = func(*args) > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 908, in get_operation > return
transform_factory.create_operation( > File > > "/usr/local/lib/python3.9/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 1198, in create_operation > return creator(self, transform_id, transform_proto, payload, consumers) > File > > "/usr/local/lib/python3.9/dist-packages/pyflink/fn_execution/beam/beam_operations.py", > line 55, in create_table_function > return _create_user_defined_function_operation( > File > > "/usr/local/lib/python3.9/dist-packages/pyflink/fn_execution/beam/beam_operations.py", > line 199, in _create_user_defined_function_operation > return beam_operation_cls( > File "pyflink/fn_execution/beam/beam_operations_fast.pyx", line 198, in > > pyflink.fn_execution.beam.beam_operations_fast.StatelessFunctionOperation.__init__ > File "pyflink/fn_execution/beam/beam_operations_fast.pyx", line 129, in > pyflink.fn_execution.beam.beam_operations_fast.FunctionOperation.__init__ > File "pyflink/fn_execution/beam/beam_operations_fast.pyx", line 202, in > > pyflink.fn_execution.beam.beam_operations_fast.StatelessFunctionOperation.generate_operation > File > > "/usr/local/lib/python3.9/dist-packages/pyflink/fn_execution/table/operations.py", > line 124, in __init__ > super(TableFunctionOperation, self).__init__(spec) > File > > "/usr/local/lib/python3.9/dist-packages/pyflink/fn_execution/table/operatio
[jira] [Created] (FLINK-27104) ClassNotFound when submit jobs to session cluster with external jars
Sitan Pang created FLINK-27104: -- Summary: ClassNotFound when submit jobs to session cluster with external jars Key: FLINK-27104 URL: https://issues.apache.org/jira/browse/FLINK-27104 Project: Flink Issue Type: Improvement Components: Client / Job Submission Affects Versions: 1.11.0 Reporter: Sitan Pang I'm trying to submit SQL jobs to a session cluster on Kubernetes which need external local UDF jars. I hit a ClassNotFoundException for the reasons below: * 'pipeline.jars' will be overwritten by the '-j' option, which only accepts one jar. * 'pipeline.classpaths' will not be uploaded, so local files cannot be found on the TM. * 'pipeline.cached-files' will not be added to the classpath. In Task#createUserCodeClassloader and ClientUtils#buildUserCodeClassLoader, only 'pipeline.jars' and 'pipeline.classpaths' are added to the user class loader. Only combining the external jars into one solves this today, which means we need to re-create a new jar every time. Is there a better way to submit with external jars? Or could 'pipeline.jars' be made directly usable by users? -- This message was sent by Atlassian Jira (v8.20.1#820001)
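For reference, a sketch of the two options the ticket refers to as they can be set programmatically (the jar paths are placeholders); per the ticket, only the 'pipeline.jars' entries are shipped to the cluster, which is why entries that exist only on the client's local disk end up missing on the TaskManagers.
{code:java}
import java.util.Arrays;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.PipelineOptions;

public class PipelineJarsSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Uploaded with the job and added to the user code class loader.
        conf.set(PipelineOptions.JARS, Arrays.asList("file:///opt/udfs/udf-a.jar"));
        // Added to the class path on the client, but (per this ticket) not
        // uploaded, so a purely local path here is not visible on the TM.
        conf.set(PipelineOptions.CLASSPATHS, Arrays.asList("file:///opt/udfs/udf-b.jar"));
        System.out.println(conf);
    }
}
{code}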
[jira] [Created] (FLINK-27105) wrong metric type
Feifan Wang created FLINK-27105: --- Summary: wrong metric type Key: FLINK-27105 URL: https://issues.apache.org/jira/browse/FLINK-27105 Project: Flink Issue Type: Bug Reporter: Feifan Wang -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27106) Flink command line support submit one job with multi jars
FanJia created FLINK-27106: -- Summary: Flink command line support submit one job with multi jars Key: FLINK-27106 URL: https://issues.apache.org/jira/browse/FLINK-27106 Project: Flink Issue Type: New Feature Components: Command Line Client Affects Versions: 1.14.4 Reporter: FanJia The command line should be able to accept multiple local jar packages when submitting a job, similar to `spark-submit --jars jar1,jar2,jar3`, rather than requiring a file system that all nodes can access. This would let users control their dependencies precisely. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (FLINK-27107) Typo in Task
dizhou cao created FLINK-27107: -- Summary: Typo in Task Key: FLINK-27107 URL: https://issues.apache.org/jira/browse/FLINK-27107 Project: Flink Issue Type: Improvement Components: Runtime / Task Affects Versions: 1.15.0 Reporter: dizhou cao Two small typos in Task: * TaskCancelerWatchDog/TaskInterrupter field: executerThread -> executorThread * TaskCanceler field: executer -> executor -- This message was sent by Atlassian Jira (v8.20.1#820001)