[jira] [Created] (FLINK-17318) The comment is not right in `org.apache.flink.table.planner.delegation.PlannerBase`
Hequn Cheng created FLINK-17318: --- Summary: The comment is not right in `org.apache.flink.table.planner.delegation.PlannerBase` Key: FLINK-17318 URL: https://issues.apache.org/jira/browse/FLINK-17318 Project: Flink Issue Type: Improvement Components: Table SQL / Planner Affects Versions: 1.10.0 Reporter: Hequn Cheng The comment in `org.apache.flink.table.planner.delegation.PlannerBase` should describe it as an implementation for the Blink planner instead of the legacy Flink planner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [VOTE] Release 1.9.3, release candidate #1
+1 (non-binding) - checked release notes - checked signatures - built from source - submitted an example job on yarn cluster - WebUI and logs look good Thanks, Zhu Zhu Yu Li 于2020年4月22日周三 下午2:55写道: > +1 (non-binding) > > - Checked release notes: OK > - Checked sums and signatures: OK > - Source release > - contains no binaries: OK > - contains no 1.9-SNAPSHOT references: OK > - build from source: OK (8u101) > - mvn clean verify: OK (8u101) > - Binary release > - no examples appear to be missing > - started a cluster, WebUI reachable, several streaming and batch > examples ran successfully (11.0.4) > - Repository appears to contain all expected artifacts > - Website PR looks good > > Best Regards, > Yu > > > On Sat, 18 Apr 2020 at 10:38, Dian Fu wrote: > > > Hi everyone, > > > > Please review and vote on the release candidate #1 for the version 1.9.3, > > as follows: > > [ ] +1, Approve the release > > [ ] -1, Do not approve the release (please provide specific comments) > > > > > > The complete staging area is available for your review, which includes: > > * JIRA release notes [1], > > * the official Apache source release and binary convenience releases to > be > > deployed to dist.apache.org [2], which are signed with the key with > > fingerprint 6B6291A8502BA8F0913AE04DDEB95B05BF075300 [3], > > * all artifacts to be deployed to the Maven Central Repository [4], > > * source code tag "release-1.9.3-rc1" [5], > > * website pull request listing the new release and adding announcement > blog > > post [6]. > > > > The vote will be open for at least 72 hours. It is adopted by majority > > approval, with at least 3 PMC affirmative votes. > > > > Thanks, > > Dian > > > > [1] > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346867 > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.3-rc1/ > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > > [4] > > https://repository.apache.org/content/repositories/orgapacheflink-1353/ > > [5] > > > https://github.com/apache/flink/commit/6d23b2c38c7a8fd8855063238c744923e1985a63 > > [6] https://github.com/apache/flink-web/pull/329 >
Re: [DISCUSS] Adding support for Hadoop 3 and removing flink-shaded-hadoop
Hi Robert, I think it would be a helpful simplification of Flink's build setup if we can get rid of flink-shaded-hadoop. Moreover relying only on the vanilla Hadoop dependencies for the modules which interact with Hadoop/Yarn sounds like a good idea to me. Adding support for Hadoop 3 would also be nice. I'm not sure, though, how Hadoop's API's have changed between 2 and 3. It might be necessary to introduce some bridges in order to make it work. Cheers, Till On Tue, Apr 21, 2020 at 4:37 PM Robert Metzger wrote: > Hi all, > > for the upcoming 1.11 release, I started looking into adding support for > Hadoop 3[1] for Flink. I have explored a little bit already into adding a > shaded hadoop 3 into “flink-shaded”, and some mechanisms for switching > between Hadoop 2 and 3 dependencies in the Flink build. > > However, Chesnay made me aware that we could also go a different route: We > let Flink depend on vanilla Hadoop dependencies and stop providing shaded > fat jars for Hadoop through “flink-shaded”. > > Why? > - Maintaining properly shaded Hadoop fat jars is a lot of work (we have > insufficient test coverage for all kinds of Hadoop features) > - For Hadoop 2, there are already some known and unresolved issues with our > shaded jars that we didn’t manage to fix > > Users will have to use Flink with Hadoop by relying on vanilla or > vendor-provided Hadoop dependencies. > > What do you think? > > Best, > Robert > > [1] https://issues.apache.org/jira/browse/FLINK-11086 >
Re: [DISCUSS] Adding support for Hadoop 3 and removing flink-shaded-hadoop
+1 to getting rid of flink-shaded-hadoop. But we need to document how people can now get a Flink dist that works with Hadoop. Currently, when you download the single shaded jar you immediately get support for submitting to YARN via bin/flink run. Aljoscha On 22.04.20 09:08, Till Rohrmann wrote: Hi Robert, I think it would be a helpful simplification of Flink's build setup if we can get rid of flink-shaded-hadoop. Moreover relying only on the vanilla Hadoop dependencies for the modules which interact with Hadoop/Yarn sounds like a good idea to me. Adding support for Hadoop 3 would also be nice. I'm not sure, though, how Hadoop's API's have changed between 2 and 3. It might be necessary to introduce some bridges in order to make it work. Cheers, Till On Tue, Apr 21, 2020 at 4:37 PM Robert Metzger wrote: Hi all, for the upcoming 1.11 release, I started looking into adding support for Hadoop 3[1] for Flink. I have explored a little bit already into adding a shaded hadoop 3 into “flink-shaded”, and some mechanisms for switching between Hadoop 2 and 3 dependencies in the Flink build. However, Chesnay made me aware that we could also go a different route: We let Flink depend on vanilla Hadoop dependencies and stop providing shaded fat jars for Hadoop through “flink-shaded”. Why? - Maintaining properly shaded Hadoop fat jars is a lot of work (we have insufficient test coverage for all kinds of Hadoop features) - For Hadoop 2, there are already some known and unresolved issues with our shaded jars that we didn’t manage to fix Users will have to use Flink with Hadoop by relying on vanilla or vendor-provided Hadoop dependencies. What do you think? Best, Robert [1] https://issues.apache.org/jira/browse/FLINK-11086
Re: [VOTE] Release 1.9.3, release candidate #1
+1 (non-binding) - Verified signature - Built from source (Java8) - Run custom jobs on Kubernetes Regards, Fabian > On 18. Apr 2020, at 04:37, Dian Fu wrote: > > Hi everyone, > > Please review and vote on the release candidate #1 for the version 1.9.3, > as follows: > [ ] +1, Approve the release > [ ] -1, Do not approve the release (please provide specific comments) > > > The complete staging area is available for your review, which includes: > * JIRA release notes [1], > * the official Apache source release and binary convenience releases to be > deployed to dist.apache.org [2], which are signed with the key with > fingerprint 6B6291A8502BA8F0913AE04DDEB95B05BF075300 [3], > * all artifacts to be deployed to the Maven Central Repository [4], > * source code tag "release-1.9.3-rc1" [5], > * website pull request listing the new release and adding announcement blog > post [6]. > > The vote will be open for at least 72 hours. It is adopted by majority > approval, with at least 3 PMC affirmative votes. > > Thanks, > Dian > > [1] > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346867 > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.3-rc1/ > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > [4] https://repository.apache.org/content/repositories/orgapacheflink-1353/ > [5] > https://github.com/apache/flink/commit/6d23b2c38c7a8fd8855063238c744923e1985a63 > [6] https://github.com/apache/flink-web/pull/329
[jira] [Created] (FLINK-17319) Fix broken link to walkthrough guide
Patrick Wiener created FLINK-17319: -- Summary: Fix broken link to walkthrough guide Key: FLINK-17319 URL: https://issues.apache.org/jira/browse/FLINK-17319 Project: Flink Issue Type: Bug Components: Documentation, Stateful Functions Affects Versions: statefun-2.0.0 Reporter: Patrick Wiener Currently, there is a broken link in index.md to the walkthrough guide, referencing walkthrough.html, which does not exist. See [https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.0/] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17320) Java8 lambda expression cannot be serialized.
Jiayi Liao created FLINK-17320: -- Summary: Java8 lambda expression cannot be serialized. Key: FLINK-17320 URL: https://issues.apache.org/jira/browse/FLINK-17320 Project: Flink Issue Type: Bug Components: API / Type Serialization System, Table SQL / Runtime Affects Versions: 1.9.0 Reporter: Jiayi Liao Code to reproduce: {code:java}
@Test
public void test() throws IOException {
    PriorityQueue<String> pq = new PriorityQueue<>((o1, o2) -> o1.length() - o2.length() - 1);
    pq.add("1234135");
    pq.add("12323424135");

    KryoSerializer<PriorityQueue> kryoSerializer = new KryoSerializer<>(PriorityQueue.class, new ExecutionConfig());
    kryoSerializer.serialize(pq, new DataOutputSerializer(10240));
}
{code} And the following NPE is thrown: {code:java} Caused by: java.lang.NullPointerException at com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:80) at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:488) at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:593) at org.apache.flink.runtime.types.PriorityQueueSerializer.write(PriorityQueueSerializer.java:67) at org.apache.flink.runtime.types.PriorityQueueSerializer.write(PriorityQueueSerializer.java:40) at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:599) at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.serialize(KryoSerializer.java:307) at org.apache.flink.util.InstantiationUtil.serializeToByteArray(InstantiationUtil.java:526) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
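One possible workaround, sketched here under the assumption that the lambda comparator is what Kryo fails to resolve, is to replace the lambda with a named, serializable Comparator class so that DefaultClassResolver has a concrete class to write; this is a sketch, not part of the report:
{code:java}
import java.io.Serializable;
import java.util.Comparator;
import java.util.PriorityQueue;

import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer;
import org.apache.flink.core.memory.DataOutputSerializer;

public class LambdaComparatorWorkaround {

    // A named class gives Kryo a resolvable class, unlike a synthetic lambda class.
    // The comparator logic mirrors the one from the report.
    public static class LengthComparator implements Comparator<String>, Serializable {
        @Override
        public int compare(String o1, String o2) {
            return o1.length() - o2.length() - 1;
        }
    }

    public static void main(String[] args) throws Exception {
        PriorityQueue<String> pq = new PriorityQueue<>(new LengthComparator());
        pq.add("1234135");
        pq.add("12323424135");

        KryoSerializer<PriorityQueue> kryoSerializer =
                new KryoSerializer<>(PriorityQueue.class, new ExecutionConfig());
        kryoSerializer.serialize(pq, new DataOutputSerializer(10240));
    }
}
{code}
Whether this fully avoids the issue depends on how the enclosed comparator is serialized, but it sidesteps the synthetic lambda class that the resolver cannot handle.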
[jira] [Created] (FLINK-17321) Support for casting collection types.
Dawid Wysakowicz created FLINK-17321: Summary: Support for casting collection types. Key: FLINK-17321 URL: https://issues.apache.org/jira/browse/FLINK-17321 Project: Flink Issue Type: Sub-task Components: Table SQL / Planner Reporter: Dawid Wysakowicz Casts of collection types are not supported yet. E.g. query: {{"SELECT cast (a as ARRAY) FROM (VALUES (array[3, 2, 1])) AS T(a)"}} fails with: {code} org.apache.flink.table.planner.codegen.CodeGenException: Unsupported cast from 'ARRAY NOT NULL' to 'ARRAY NOT NULL'. at org.apache.flink.table.planner.codegen.calls.ScalarOperatorGens$.generateCast(ScalarOperatorGens.scala:1284) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateCallExpression(ExprCodeGenerator.scala:691) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:486) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.visitCall(ExprCodeGenerator.scala:52) at org.apache.calcite.rex.RexCall.accept(RexCall.java:288) at org.apache.flink.table.planner.codegen.ExprCodeGenerator.generateExpression(ExprCodeGenerator.scala:132) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$$anonfun$5.apply(CalcCodeGenerator.scala:152) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$$anonfun$5.apply(CalcCodeGenerator.scala:152) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.produceProjectionCode$1(CalcCodeGenerator.scala:152) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.generateProcessCode(CalcCodeGenerator.scala:179) at org.apache.flink.table.planner.codegen.CalcCodeGenerator$.generateCalcOperator(CalcCodeGenerator.scala:49) at org.apache.flink.table.planner.plan.nodes.physical.batch.BatchExecCalc.translateToPlanInternal(BatchExecCalc.scala:62) at org.apache.flink.table.planner.plan.nodes.physical.batch.BatchExecCalc.translateToPlanInternal(BatchExecCalc.scala:38) at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:58) at org.apache.flink.table.planner.plan.nodes.physical.batch.BatchExecCalcBase.translateToPlan(BatchExecCalcBase.scala:42) at org.apache.flink.table.planner.plan.nodes.physical.batch.BatchExecSink.translateToTransformation(BatchExecSink.scala:131) at org.apache.flink.table.planner.plan.nodes.physical.batch.BatchExecSink.translateToPlanInternal(BatchExecSink.scala:97) at org.apache.flink.table.planner.plan.nodes.physical.batch.BatchExecSink.translateToPlanInternal(BatchExecSink.scala:49) at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:58) at org.apache.flink.table.planner.plan.nodes.physical.batch.BatchExecSink.translateToPlan(BatchExecSink.scala:49) at org.apache.flink.table.planner.delegation.BatchPlanner$$anonfun$translateToPlan$1.apply(BatchPlanner.scala:72) at org.apache.flink.table.planner.delegation.BatchPlanner$$anonfun$translateToPlan$1.apply(BatchPlanner.scala:71) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.delegation.BatchPlanner.translateToPlan(BatchPlanner.scala:71) at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:153) ... {code} Similar behaviour can be observed for MULTISET, MAP, ROW -- This message was sent by Atlassian Jira (v8.3.4#803005)
[DISCUSS] FLIP-36 - Support Interactive Programming in Flink Table API
Hi folks, I'd like to start the discussion about FLIP-36: Support Interactive Programming in Flink Table API https://cwiki.apache.org/confluence/display/FLINK/FLIP-36%3A+Support+Interactive+Programming+in+Flink The FLIP proposes to add support for interactive programming in the Flink Table API. Specifically, it lets users cache intermediate results (tables) and use them in later jobs. Even though the FLIP has been discussed in the past [1], it hasn't formally passed a vote yet, and some of the design and implementation details have to change to incorporate the cluster partitions proposed in FLIP-67 [2]. Looking forward to your feedback. Thanks, Xuannan [1] https://lists.apache.org/thread.html/b372fd7b962b9f37e4dace3bc8828f6e2a2b855e56984e58bc4a413f@%3Cdev.flink.apache.org%3E [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-67%3A+Cluster+partitions+lifecycle
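To make the proposal more concrete, here is a rough sketch of how the caching could look from a user's perspective. Note that cache() is only the method proposed in FLIP-36; it does not exist in any released Flink version, and names and semantics may still change:
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

// Illustrative sketch only: cache() is the API proposed in FLIP-36, not an existing method.
TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.newInstance().build());

Table orders = tEnv.from("Orders");  // assumes a table named "Orders" is registered
Table aggregated = orders.groupBy("user").select("user, amount.sum as total");

// Proposed: materialize the intermediate result (e.g. as a cluster partition, FLIP-67)
// so that later jobs in the same session can consume it without recomputation.
Table cached = aggregated.cache();

// Subsequent jobs would read the cached result directly instead of recomputing the aggregation.
Table bigSpenders = cached.filter("total > 100");
Table smallSpenders = cached.filter("total < 10");
{code}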
Re: [DISCUSS] Adding support for Hadoop 3 and removing flink-shaded-hadoop
I agree with Aljoscha. Otherwise I can see a lot of tickets getting created saying the application is not running on YARN. Cheers, Sivaprasanna On Wed, Apr 22, 2020 at 1:00 PM Aljoscha Krettek wrote: > +1 to getting rid of flink-shaded-hadoop. But we need to document how > people can now get a Flink dist that works with Hadoop. Currently, when > you download the single shaded jar you immediately get support for > submitting to YARN via bin/flink run. > > Aljoscha > > > On 22.04.20 09:08, Till Rohrmann wrote: > > Hi Robert, > > > > I think it would be a helpful simplification of Flink's build setup if we > > can get rid of flink-shaded-hadoop. Moreover relying only on the vanilla > > Hadoop dependencies for the modules which interact with Hadoop/Yarn > sounds > > like a good idea to me. > > > > Adding support for Hadoop 3 would also be nice. I'm not sure, though, how > > Hadoop's API's have changed between 2 and 3. It might be necessary to > > introduce some bridges in order to make it work. > > > > Cheers, > > Till > > > > On Tue, Apr 21, 2020 at 4:37 PM Robert Metzger > wrote: > > > >> Hi all, > >> > >> for the upcoming 1.11 release, I started looking into adding support for > >> Hadoop 3[1] for Flink. I have explored a little bit already into adding > a > >> shaded hadoop 3 into “flink-shaded”, and some mechanisms for switching > >> between Hadoop 2 and 3 dependencies in the Flink build. > >> > >> However, Chesnay made me aware that we could also go a different route: > We > >> let Flink depend on vanilla Hadoop dependencies and stop providing > shaded > >> fat jars for Hadoop through “flink-shaded”. > >> > >> Why? > >> - Maintaining properly shaded Hadoop fat jars is a lot of work (we have > >> insufficient test coverage for all kinds of Hadoop features) > >> - For Hadoop 2, there are already some known and unresolved issues with > our > >> shaded jars that we didn’t manage to fix > >> > >> Users will have to use Flink with Hadoop by relying on vanilla or > >> vendor-provided Hadoop dependencies. > >> > >> What do you think? > >> > >> Best, > >> Robert > >> > >> [1] https://issues.apache.org/jira/browse/FLINK-11086 > >> > > > >
[jira] [Created] (FLINK-17322) Enable latency tracker would corrupt the broadcast state
Yun Tang created FLINK-17322: Summary: Enable latency tracker would corrupt the broadcast state Key: FLINK-17322 URL: https://issues.apache.org/jira/browse/FLINK-17322 Project: Flink Issue Type: Bug Components: Runtime / Network Reporter: Yun Tang This bug was reported on the user mailing list: [http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Latency-tracking-together-with-broadcast-state-can-cause-job-failure-td34013.html] Executing {{BroadcastStateIT#broadcastStateWorksWithLatencyTracking}} easily reproduces this problem. From the current information, the broadcast element is corrupted once we enable {{env.getConfig().setLatencyTrackingInterval(2000)}}. The exception stack trace is as follows (based on the current master branch): {code:java} Caused by: java.io.IOException: Corrupt stream, found tag: 84 at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:217) ~[classes/:?] at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:46) ~[classes/:?] at org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55) ~[classes/:?] at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:157) ~[classes/:?] at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:123) ~[classes/:?] at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:181) ~[classes/:?] at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:332) ~[classes/:?] at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:206) ~[classes/:?] at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:196) ~[classes/:?] at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:505) ~[classes/:?] at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:485) ~[classes/:?] at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:720) ~[classes/:?] at org.apache.flink.runtime.taskmanager.Task.run(Task.java:544) ~[classes/:?] at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_144] {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
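For context, a minimal sketch of the kind of job that combines the two features involved, latency tracking plus broadcast state, is shown below (an assumed shape for illustration, not the actual IT case from the report):
{code:java}
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class LatencyTrackingBroadcastSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.getConfig().setLatencyTrackingInterval(2000); // setting quoted in the report

        final MapStateDescriptor<String, String> rulesDescriptor = new MapStateDescriptor<>(
                "rules", BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);

        DataStream<String> events = env.fromElements("a", "b", "c");
        BroadcastStream<String> rules = env.fromElements("rule-1", "rule-2").broadcast(rulesDescriptor);

        // The corrupt-stream error is observed on the broadcast input of the connected operator.
        events.connect(rules)
                .process(new BroadcastProcessFunction<String, String, String>() {
                    @Override
                    public void processElement(String value, ReadOnlyContext ctx, Collector<String> out) throws Exception {
                        out.collect(value);
                    }

                    @Override
                    public void processBroadcastElement(String rule, Context ctx, Collector<String> out) throws Exception {
                        ctx.getBroadcastState(rulesDescriptor).put(rule, rule);
                    }
                })
                .print();

        env.execute("latency-tracking-with-broadcast-state");
    }
}
{code}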
[jira] [Created] (FLINK-17323) ChannelStateReader rejects requests about unknown channels (Unaligned checkpoints)
Roman Khachatryan created FLINK-17323: - Summary: ChannelStateReader rejects requests about unknown channels (Unaligned checkpoints) Key: FLINK-17323 URL: https://issues.apache.org/jira/browse/FLINK-17323 Project: Flink Issue Type: Bug Components: Runtime / Task Affects Versions: 1.11.0 Reporter: Roman Khachatryan Assignee: Roman Khachatryan Fix For: 1.11.0 ChannelStateReader expects requests only for channels or subpartitions that have state. In the case of upscaling or starting from scratch, this behavior is incorrect. It should return NO_MORE_DATA. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17324) Move the image we use to generate the flink-docker image into flink-docker
Ismaël Mejía created FLINK-17324: Summary: Move the image we use to generate the flink-docker image into flink-docker Key: FLINK-17324 URL: https://issues.apache.org/jira/browse/FLINK-17324 Project: Flink Issue Type: Improvement Components: Release System / Docker Reporter: Ismaël Mejía Before the official Docker image was repatriated into Apache Flink, we used a Docker image that contained the scripts to generate the release: {{docker run --rm --volume ~/projects/docker-flink:/build plucas/docker-flink-build /build/generate-stackbrew-library.sh > ~/projects/official-images/library/flink}} Notice that this Docker image tool 'plucas/docker-flink-build' is not part of upstream Flink, so we need to move it into some sort of tools section in the flink-docker repo or document an alternative to it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [VOTE] Release 1.9.3, release candidate #1
+1 (non-binding) - verified the checksum and signature - checked the release note - checked that there are no new dependencies introduced since 1.9.2 which may affect the license (only bump kafka from 0.10.2.1 to 0.10.2.2 which doesn't affect the license) - checked that there are no missing artifacts in staging area - checked that the pyflink package could be pip installed Regards, Dian On Wed, Apr 22, 2020 at 3:35 PM Fabian Paul wrote: > +1 (non-binding) > > - Verified signature > - Built from source (Java8) > - Run custom jobs on Kubernetes > > Regards, > Fabian > > > On 18. Apr 2020, at 04:37, Dian Fu wrote: > > > > Hi everyone, > > > > Please review and vote on the release candidate #1 for the version 1.9.3, > > as follows: > > [ ] +1, Approve the release > > [ ] -1, Do not approve the release (please provide specific comments) > > > > > > The complete staging area is available for your review, which includes: > > * JIRA release notes [1], > > * the official Apache source release and binary convenience releases to > be > > deployed to dist.apache.org [2], which are signed with the key with > > fingerprint 6B6291A8502BA8F0913AE04DDEB95B05BF075300 [3], > > * all artifacts to be deployed to the Maven Central Repository [4], > > * source code tag "release-1.9.3-rc1" [5], > > * website pull request listing the new release and adding announcement > blog > > post [6]. > > > > The vote will be open for at least 72 hours. It is adopted by majority > approval, with at least 3 PMC affirmative votes. > > > > Thanks, > > Dian > > > > [1] > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346867 > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.3-rc1/ > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS > > [4] > https://repository.apache.org/content/repositories/orgapacheflink-1353/ > > [5] > https://github.com/apache/flink/commit/6d23b2c38c7a8fd8855063238c744923e1985a63 > > [6] https://github.com/apache/flink-web/pull/329 > >
[jira] [Created] (FLINK-17325) Integrate orc to file system connector
Jingsong Lee created FLINK-17325: Summary: Integrate orc to file system connector Key: FLINK-17325 URL: https://issues.apache.org/jira/browse/FLINK-17325 Project: Flink Issue Type: Sub-task Components: Connectors / FileSystem, Connectors / ORC Reporter: Jingsong Lee Assignee: Jingsong Lee Fix For: 1.11.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17326) flink sql gateway should support persisting meta information such as SessionContext in order to recover
jackylau created FLINK-17326: Summary: flink sql gateway should support persisting meta information such as SessionContext in order to recover Key: FLINK-17326 URL: https://issues.apache.org/jira/browse/FLINK-17326 Project: Flink Issue Type: Bug Components: Client / Job Submission Affects Versions: 1.10.0 Reporter: jackylau Fix For: 1.11.0 The Flink SQL gateway should support persisting meta information, such as the SessionContext, in order to recover. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] Adding support for Hadoop 3 and removing flink-shaded-hadoop
Thanks Robert for starting this significant discussion. Since hadoop3 has been released for long time and many companies have already put it in production. No matter you are using flink-shaded-hadoop2 or not, currently Flink could already run in yarn3(not sure about HDFS). Since the yarn api is always backward compatible. The difference is we could not benefit from the new features because we are using hadoop-2.4 as compile dependency. So then we need to use reflector for new features(node label, tags, etc.). All in all, i am in in favour of dropping the flink-shaded-hadoop. Just have some questions. 1. Do we still support "-include-hadoop" profile? If yes, what we will get in the lib dir? 2. I am not sure whether dropping the flink-shaded-hadoop will take some class conflicts problems. If we use "export HADOOP_CLASSPATH=`hadoop classpath`" for the hadoop env setup, then many jars will be appended to the Flink client classpath. 3. The compile hadoop version is still 2.4.1. Right? Best, Yang Sivaprasanna 于2020年4月22日周三 下午4:18写道: > I agree with Aljoscha. Otherwise I can see a lot of tickets getting created > saying the application is not running on YARN. > > Cheers, > Sivaprasanna > > On Wed, Apr 22, 2020 at 1:00 PM Aljoscha Krettek > wrote: > > > +1 to getting rid of flink-shaded-hadoop. But we need to document how > > people can now get a Flink dist that works with Hadoop. Currently, when > > you download the single shaded jar you immediately get support for > > submitting to YARN via bin/flink run. > > > > Aljoscha > > > > > > On 22.04.20 09:08, Till Rohrmann wrote: > > > Hi Robert, > > > > > > I think it would be a helpful simplification of Flink's build setup if > we > > > can get rid of flink-shaded-hadoop. Moreover relying only on the > vanilla > > > Hadoop dependencies for the modules which interact with Hadoop/Yarn > > sounds > > > like a good idea to me. > > > > > > Adding support for Hadoop 3 would also be nice. I'm not sure, though, > how > > > Hadoop's API's have changed between 2 and 3. It might be necessary to > > > introduce some bridges in order to make it work. > > > > > > Cheers, > > > Till > > > > > > On Tue, Apr 21, 2020 at 4:37 PM Robert Metzger > > wrote: > > > > > >> Hi all, > > >> > > >> for the upcoming 1.11 release, I started looking into adding support > for > > >> Hadoop 3[1] for Flink. I have explored a little bit already into > adding > > a > > >> shaded hadoop 3 into “flink-shaded”, and some mechanisms for switching > > >> between Hadoop 2 and 3 dependencies in the Flink build. > > >> > > >> However, Chesnay made me aware that we could also go a different > route: > > We > > >> let Flink depend on vanilla Hadoop dependencies and stop providing > > shaded > > >> fat jars for Hadoop through “flink-shaded”. > > >> > > >> Why? > > >> - Maintaining properly shaded Hadoop fat jars is a lot of work (we > have > > >> insufficient test coverage for all kinds of Hadoop features) > > >> - For Hadoop 2, there are already some known and unresolved issues > with > > our > > >> shaded jars that we didn’t manage to fix > > >> > > >> Users will have to use Flink with Hadoop by relying on vanilla or > > >> vendor-provided Hadoop dependencies. > > >> > > >> What do you think? > > >> > > >> Best, > > >> Robert > > >> > > >> [1] https://issues.apache.org/jira/browse/FLINK-11086 > > >> > > > > > > > >
Re: [DISCUSS] Adding support for Hadoop 3 and removing flink-shaded-hadoop
1) Likely not, as this again introduces a hard-dependency on flink-shaded-hadoop. 2) Indeed; this will be something the user/cloud providers have to deal with now. 3) Yes. As a small note, we can still keep the hadoop-2 version of flink-shaded around for existing users. What I suggested was to just not release hadoop-3 versions. On 22/04/2020 14:19, Yang Wang wrote: Thanks Robert for starting this significant discussion. Since hadoop3 has been released for long time and many companies have already put it in production. No matter you are using flink-shaded-hadoop2 or not, currently Flink could already run in yarn3(not sure about HDFS). Since the yarn api is always backward compatible. The difference is we could not benefit from the new features because we are using hadoop-2.4 as compile dependency. So then we need to use reflector for new features(node label, tags, etc.). All in all, i am in in favour of dropping the flink-shaded-hadoop. Just have some questions. 1. Do we still support "-include-hadoop" profile? If yes, what we will get in the lib dir? 2. I am not sure whether dropping the flink-shaded-hadoop will take some class conflicts problems. If we use "export HADOOP_CLASSPATH=`hadoop classpath`" for the hadoop env setup, then many jars will be appended to the Flink client classpath. 3. The compile hadoop version is still 2.4.1. Right? Best, Yang Sivaprasanna 于2020年4月22日周三 下午4:18写道: I agree with Aljoscha. Otherwise I can see a lot of tickets getting created saying the application is not running on YARN. Cheers, Sivaprasanna On Wed, Apr 22, 2020 at 1:00 PM Aljoscha Krettek wrote: +1 to getting rid of flink-shaded-hadoop. But we need to document how people can now get a Flink dist that works with Hadoop. Currently, when you download the single shaded jar you immediately get support for submitting to YARN via bin/flink run. Aljoscha On 22.04.20 09:08, Till Rohrmann wrote: Hi Robert, I think it would be a helpful simplification of Flink's build setup if we can get rid of flink-shaded-hadoop. Moreover relying only on the vanilla Hadoop dependencies for the modules which interact with Hadoop/Yarn sounds like a good idea to me. Adding support for Hadoop 3 would also be nice. I'm not sure, though, how Hadoop's API's have changed between 2 and 3. It might be necessary to introduce some bridges in order to make it work. Cheers, Till On Tue, Apr 21, 2020 at 4:37 PM Robert Metzger wrote: Hi all, for the upcoming 1.11 release, I started looking into adding support for Hadoop 3[1] for Flink. I have explored a little bit already into adding a shaded hadoop 3 into “flink-shaded”, and some mechanisms for switching between Hadoop 2 and 3 dependencies in the Flink build. However, Chesnay made me aware that we could also go a different route: We let Flink depend on vanilla Hadoop dependencies and stop providing shaded fat jars for Hadoop through “flink-shaded”. Why? - Maintaining properly shaded Hadoop fat jars is a lot of work (we have insufficient test coverage for all kinds of Hadoop features) - For Hadoop 2, there are already some known and unresolved issues with our shaded jars that we didn’t manage to fix Users will have to use Flink with Hadoop by relying on vanilla or vendor-provided Hadoop dependencies. What do you think? Best, Robert [1] https://issues.apache.org/jira/browse/FLINK-11086
[DISCUSS] Intermediary releases of the flink-docker images
Hello, I wanted to discuss about a subject that was shortly mentioned during the docker unification thread (and in some other PR) that is the release of docker images independent of major Flink releases. In the past when the docker images were maintained outside of the Apache repository we usually did intermediary releases to fix issues or to add new functionalities that were independent of the Flink releases and specific of the docker images. There are two major cases when this happened: 1. If the upstream official docker images maintainers requested us to do the changes or there was some breakage because some upstream change (this is rare but has happened in the past). 2. If we wanted to fix issues or add new functionality to the images that was independent of the full Flink release. We have been working on having Java 11 based images available and this is an example of (2), where we want to publish these images based on the already released 1.10.0 version. So I would like to know your opinion on how should we proceed in the future. Ideally we should wait for the major release, but the reality is that (1) can happen and (2) can benefit end users. So what should we do? Can we do these updates without a formal release as we did before, or does it make sense to follow a release process with the corresponding vote for the docker images? or are there other alternatives? Regards, Ismaël
Re: [DISCUSS] Intermediary releases of the flink-docker images
I am confused about why upstream Docker changes can seriously influence the processes inside; maybe the JVM threads? In my opinion, Docker is just a child namespace of the OS. xiaoxingstack Email: xiaoxingst...@gmail.com On 04/22/2020 20:55, Ismaël Mejía wrote: Hello, I wanted to discuss about a subject that was shortly mentioned during the docker unification thread (and in some other PR) that is the release of docker images independent of major Flink releases. In the past when the docker images were maintained outside of the Apache repository we usually did intermediary releases to fix issues or to add new functionalities that were independent of the Flink releases and specific of the docker images. There are two major cases when this happened: 1. If the upstream official docker images maintainers requested us to do the changes or there was some breakage because some upstream change (this is rare but has happened in the past). 2. If we wanted to fix issues or add new functionality to the images that was independent of the full Flink release. We have been working on having Java 11 based images available and this is an example of (2), where we want to publish these images based on the already released 1.10.0 version. So I would like to know your opinion on how should we proceed in the future. Ideally we should wait for the major release, but the reality is that (1) can happen and (2) can benefit end users. So what should we do? Can we do these updates without a formal release as we did before, or does it make sense to follow a release process with the corresponding vote for the docker images? or are there other alternatives? Regards, Ismaël
Re: [DISCUSS] Intermediary releases of the flink-docker images
Maybe we could try something like foreachPartition inside foreachRDD, which would avoid gathering everything on the driver and taking too much extra overhead. xiaoxingstack Email: xiaoxingst...@gmail.com On 04/22/2020 20:55, Ismaël Mejía wrote: Hello, I wanted to discuss about a subject that was shortly mentioned during the docker unification thread (and in some other PR) that is the release of docker images independent of major Flink releases. In the past when the docker images were maintained outside of the Apache repository we usually did intermediary releases to fix issues or to add new functionalities that were independent of the Flink releases and specific of the docker images. There are two major cases when this happened: 1. If the upstream official docker images maintainers requested us to do the changes or there was some breakage because some upstream change (this is rare but has happened in the past). 2. If we wanted to fix issues or add new functionality to the images that was independent of the full Flink release. We have been working on having Java 11 based images available and this is an example of (2), where we want to publish these images based on the already released 1.10.0 version. So I would like to know your opinion on how should we proceed in the future. Ideally we should wait for the major release, but the reality is that (1) can happen and (2) can benefit end users. So what should we do? Can we do these updates without a formal release as we did before, or does it make sense to follow a release process with the corresponding vote for the docker images? or are there other alternatives? Regards, Ismaël
Re: [VOTE] Release 1.9.3, release candidate #1
+1 (non-binding) - checked/verified signatures and hashes - checked the release note - checked that there are no missing artifacts in staging area - built from source using Scala 2.11 and Scala 2.12, both succeeded - ran a couple of end-to-end tests locally, all succeeded - went through all commits checked in between 1.9.3 and 1.9.2, making sure all issues set the "fixVersion" property - started a cluster, WebUI was accessible, submitted a wordcount job which ran successfully, no suspicious log output - the web PR looks good Best, Leonard Xu > On 22 Apr 2020, at 17:58, Dian Fu wrote: > > +1 (non-binding) > > - verified the checksum and signature > - checked the release note > - checked that there are no new dependencies introduced since 1.9.2 which > may affect the license (only bump kafka from 0.10.2.1 to 0.10.2.2 which > doesn't affect the license) > - checked that there are no missing artifacts in staging area > - checked that the pyflink package could be pip installed > > Regards, > Dian > > On Wed, Apr 22, 2020 at 3:35 PM Fabian Paul > wrote: > >> +1 (non-binding) >> >> - Verified signature >> - Built from source (Java8) >> - Run custom jobs on Kubernetes >> >> Regards, >> Fabian >> >>> On 18. Apr 2020, at 04:37, Dian Fu wrote: >>> >>> Hi everyone, >>> >>> Please review and vote on the release candidate #1 for the version 1.9.3, >>> as follows: >>> [ ] +1, Approve the release >>> [ ] -1, Do not approve the release (please provide specific comments) >>> >>> >>> The complete staging area is available for your review, which includes: >>> * JIRA release notes [1], >>> * the official Apache source release and binary convenience releases to >> be >>> deployed to dist.apache.org [2], which are signed with the key with >>> fingerprint 6B6291A8502BA8F0913AE04DDEB95B05BF075300 [3], >>> * all artifacts to be deployed to the Maven Central Repository [4], >>> * source code tag "release-1.9.3-rc1" [5], >>> * website pull request listing the new release and adding announcement >> blog >>> post [6]. >>> >>> The vote will be open for at least 72 hours. It is adopted by majority >> approval, with at least 3 PMC affirmative votes. >>> >>> Thanks, >>> Dian >>> >>> [1] >> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346867 >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.3-rc1/ >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS >>> [4] >> https://repository.apache.org/content/repositories/orgapacheflink-1353/ >>> [5] >> https://github.com/apache/flink/commit/6d23b2c38c7a8fd8855063238c744923e1985a63 >>> [6] https://github.com/apache/flink-web/pull/329 >> >>
Re: [DISCUSS] Intermediary releases of the flink-docker images
We can create additional releases independent of Flink, but they will have to go through a formal release process in any case. On 22/04/2020 14:55, Ismaël Mejía wrote: Hello, I wanted to discuss about a subject that was shortly mentioned during the docker unification thread (and in some other PR) that is the release of docker images independent of major Flink releases. In the past when the docker images were maintained outside of the Apache repository we usually did intermediary releases to fix issues or to add new functionalities that were independent of the Flink releases and specific of the docker images. There are two major cases when this happened: 1. If the upstream official docker images maintainers requested us to do the changes or there was some breakage because some upstream change (this is rare but has happened in the past). 2. If we wanted to fix issues or add new functionality to the images that was independent of the full Flink release. We have been working on having Java 11 based images available and this is an example of (2), where we want to publish these images based on the already released 1.10.0 version. So I would like to know your opinion on how should we proceed in the future. Ideally we should wait for the major release, but the reality is that (1) can happen and (2) can benefit end users. So what should we do? Can we do these updates without a formal release as we did before, or does it make sense to follow a release process with the corresponding vote for the docker images? or are there other alternatives? Regards, Ismaël
[jira] [Created] (FLINK-17327) Kafka unavailability could cause Flink TM shutdown
Jun Qin created FLINK-17327: --- Summary: Kafka unavailability could cause Flink TM shutdown Key: FLINK-17327 URL: https://issues.apache.org/jira/browse/FLINK-17327 Project: Flink Issue Type: Bug Affects Versions: 1.10.0 Reporter: Jun Qin Steps to reproduce: # Start a Flink 1.10 standalone cluster # Run a Flink job which reads from one Kafka topic and writes to another topic, with exactly-once checkpointing enabled # Stop all Kafka brokers after a few successful checkpoints When Kafka brokers are down: # {{org.apache.kafka.clients.NetworkClient}} reported that the connection to the broker could not be established # Then, Flink could not complete the snapshot due to {{Timeout expired while initializing transactional state in 6ms}} # After several snapshot failures, Flink reported {{Too many ongoing snapshots. Increase kafka producers pool size or decrease number of concurrent checkpoints.}} # Eventually, Flink tried to cancel the task, which did not succeed within 3 min # Then {{Fatal error occurred while executing the TaskManager. Shutting it down...}} I will attach the logs to show the details. Worth noting that if there were no consumer but only a producer in the task, the behavior is different: # {{org.apache.kafka.clients.NetworkClient}} reported that the connection to the broker could not be established # After {{delivery.timeout.ms}} (2 min by default), the producer reports: {{FlinkKafkaException: Failed to send data to Kafka: Expiring 4 record(s) for output-topic-0:120001 ms has passed since batch creation}} # Flink tried to cancel the upstream tasks and created a new producer # The new producer obviously reported connectivity issues to the brokers # This continues until the Kafka brokers are back # Flink reported {{Too many ongoing snapshots. Increase kafka producers pool size or decrease number of concurrent checkpoints.}} # Flink cancelled the tasks and restarted them # The job continues, and the new checkpoint succeeded # The TM runs all the time in this scenario I set the Kafka transaction timeout to 1 hour just to avoid transaction timeouts. To get a producer-only task, I called {{env.disableOperatorChaining();}} in the second scenario. -- This message was sent by Atlassian Jira (v8.3.4#803005)
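For context, a minimal sketch of the kind of job and configuration described above (placeholder topic names and broker address; the exact job from the report is not available):
{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ExactlyOnceKafkaJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10_000); // exactly-once checkpointing, as in the report

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // placeholder address
        props.setProperty("group.id", "kafka-unavailability-repro");
        // Raise the transaction timeout (the report used 1 hour) so that transactions
        // do not expire while brokers are unavailable or checkpoints are slow.
        props.setProperty("transaction.timeout.ms", String.valueOf(60 * 60 * 1000));

        KafkaSerializationSchema<String> serializationSchema =
                (element, timestamp) ->
                        new ProducerRecord<>("output-topic", element.getBytes(StandardCharsets.UTF_8));

        FlinkKafkaProducer<String> producer = new FlinkKafkaProducer<>(
                "output-topic", serializationSchema, props, FlinkKafkaProducer.Semantic.EXACTLY_ONCE);

        env.addSource(new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props))
                .addSink(producer);

        env.execute("kafka-exactly-once-repro");
    }
}
{code}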
Re: [DISCUSS] Releasing Flink 1.10.1
FLINK-16662 has been fixed just now (thanks @LionelZ and @Klou for the efforts!) and I will prepare the first RC soon, JFYI. Best Regards, Yu On Thu, 16 Apr 2020 at 23:29, Yu Li wrote: > Thanks for the efforts and feedbacks all. > > Now the only left blocker is the below one (newly escalated with a solid > reason), will create RC1 right after it's resolved. > > FLINK-16662 Blink Planner failed to generate JobGraph for POJO DataStream > converting to Table (Cannot determine simple type name) > > Best Regards, > Yu > > > On Thu, 16 Apr 2020 at 20:17, Congxian Qiu wrote: > >> Thanks a lot for driving this, Yu! Looking forward to the first RC of >> 1.10.1 >> FLINK-16576 has been merged into release-1.10 already. >> >> Best, >> Congxian >> >> >> Dian Fu 于2020年4月16日周四 上午10:35写道: >> >> > Thanks a lot for driving this, Yu! Looking forward for the first RC of >> > 1.10.1. >> > >> > > 在 2020年4月16日,上午10:24,jincheng sun 写道: >> > > >> > > Looking forward the first RC of Flink 1.10.1 . >> > > Good job Yu! >> > > >> > > Best, >> > > Jincheng >> > > >> > > >> > > >> > > Jark Wu 于2020年4月15日周三 下午6:28写道: >> > > >> > >> +1 to have a 1.10.1 RC soon. It has been a long time since 1.10.0 is >> > >> released. >> > >> >> > >> Best, >> > >> Jark >> > >> >> > >> On Wed, 15 Apr 2020 at 16:10, Till Rohrmann >> > wrote: >> > >> >> > >>> Great to see that will have the first RC for Flink 1.10.1 soon. >> Thanks >> > a >> > >>> lot for driving this effort Yu! >> > >>> >> > >>> Cheers, >> > >>> Till >> > >>> >> > >>> On Sun, Apr 12, 2020 at 5:03 PM Yu Li wrote: >> > >>> >> > Thanks Weike and all others for the efforts! >> > >> > Here comes the latest status, we are in good shape and plan to >> produce >> > >>> RC1 >> > next week. >> > >> > * Blockers (1 left) >> > - [Closed] FLINK-16018 Improve error reporting when submitting >> batch >> > >>> job >> > (instead of AskTimeoutException) >> > - [Closed] FLINK-16142 Memory Leak causes Metaspace OOM error on >> > >>> repeated >> > job submission >> > - [Closed] FLINK-16170 SearchTemplateRequest >> ClassNotFoundException >> > >>> when >> > use flink-sql-connector-elasticsearch7 >> > - [Closed] FLINK-16262 Class loader problem with >> > FlinkKafkaProducer.Semantic.EXACTLY_ONCE and usrlib directory >> > - [Closed] FLINK-16406 Increase default value for JVM Metaspace to >> > minimise its OutOfMemoryError >> > - [Closed] FLINK-16454 Update the copyright year in NOTICE files >> > - [Closed] FLINK-16705 LocalExecutor tears down MiniCluster before >> > >>> client >> > can retrieve JobResult >> > - [Closed] FLINK-16913 >> ReadableConfigToConfigurationAdapter#getEnum >> > throws UnsupportedOperationException >> > - [Closed] FLINK-16626 Exception encountered when cancelling a >> job in >> > yarn per-job mode >> > - [Fix for 1.10.1 is Done] FLINK-17093 Python UDF doesn't work >> when >> > >> the >> > input column is of composite type >> > - [PR reviewed] FLINK-16576 State inconsistency on restore with >> > >> memory >> > state backends >> > >> > * Critical (1 left) >> > - [Closed] FLINK-16047 Blink planner produces wrong aggregate >> results >> > with state clean up >> > - [Closed] FLINK-16070 Blink planner can not extract correct >> unique >> > >> key >> > for UpsertStreamTableSink >> > - [Fix for 1.10.1 is Done] FLINK-16225 Metaspace Out Of Memory >> should >> > >>> be >> > handled as Fatal Error in TaskManager >> > - [Closed] FLINK-14316 stuck in "Job leader ... 
lost leadership" >> > >> error >> > - [May Postpone] FLINK-16408 Bind user code class loader to >> lifetime >> > >>> of a >> > slot >> > >> > Please let me know if any concerns/comments. Thanks. >> > >> > Best Regards, >> > Yu >> > >> > >> > On Fri, 3 Apr 2020 at 21:35, DONG, Weike >> > >>> wrote: >> > >> > > Hi Yu, >> > > >> > > Thanks for your updates. I am still working on the fix for >> > >> FLINK-16626 >> > and >> > > it is expected to be completed by this Sunday after thorough >> testing. >> > > >> > > Sincerely, >> > > Weike >> > > >> > > On Fri, Apr 3, 2020 at 8:43 PM Yu Li wrote: >> > > >> > >> Updates for 1.10.1 watched issues (we are in good progress and >> > >> almost >> > >> there >> > >> to produce the first RC, thanks all for the efforts): >> > >> >> > >> * Blockers (3 left) >> > >> - [Closed] FLINK-16018 Improve error reporting when submitting >> > >> batch >> > job >> > >> (instead of AskTimeoutException) >> > >> - [Closed] FLINK-16142 Memory Leak causes Metaspace OOM error on >> > >> repeated >> > >> job submission >> > >> - [Closed] FLINK-16170 SearchTemplateRequest >> > >> ClassNotFoundException >> > when >> > >> use flink-sql-connector-elasticsearch7 >> >
Re: [DISCUSS] Adding support for Hadoop 3 and removing flink-shaded-hadoop
+1 for supporting Hadoop 3. I'm not familiar with the shading efforts, thus no comment on dropping the flink-shaded-hadoop. Correct me if I'm wrong. Despite currently the default Hadoop version for compiling is 2.4.1 in Flink, I think this does not mean Flink should support only Hadoop 2.4+. So no matter which Hadoop version we use for compiling by default, we need to use reflection for the Hadoop features/APIs that are not supported in all versions anyway. There're already many such reflections in `YarnClusterDescriptor` and `YarnResourceManager`, and might be more in future. I'm wondering whether we should have a unified mechanism (an interface / abstract class or so) that handles all these kind of Hadoop API reflections at one place. Not necessarily in the scope to this discussion though. Thank you~ Xintong Song On Wed, Apr 22, 2020 at 8:32 PM Chesnay Schepler wrote: > 1) Likely not, as this again introduces a hard-dependency on > flink-shaded-hadoop. > 2) Indeed; this will be something the user/cloud providers have to deal > with now. > 3) Yes. > > As a small note, we can still keep the hadoop-2 version of flink-shaded > around for existing users. > What I suggested was to just not release hadoop-3 versions. > > On 22/04/2020 14:19, Yang Wang wrote: > > Thanks Robert for starting this significant discussion. > > > > Since hadoop3 has been released for long time and many companies have > > already > > put it in production. No matter you are using flink-shaded-hadoop2 or > not, > > currently > > Flink could already run in yarn3(not sure about HDFS). Since the yarn api > > is always > > backward compatible. The difference is we could not benefit from the new > > features > > because we are using hadoop-2.4 as compile dependency. So then we need to > > use > > reflector for new features(node label, tags, etc.). > > > > All in all, i am in in favour of dropping the flink-shaded-hadoop. Just > > have some questions. > > 1. Do we still support "-include-hadoop" profile? If yes, what we will > get > > in the lib dir? > > 2. I am not sure whether dropping the flink-shaded-hadoop will take some > > class conflicts > > problems. If we use "export HADOOP_CLASSPATH=`hadoop classpath`" for the > > hadoop > > env setup, then many jars will be appended to the Flink client classpath. > > 3. The compile hadoop version is still 2.4.1. Right? > > > > > > Best, > > Yang > > > > > > Sivaprasanna 于2020年4月22日周三 下午4:18写道: > > > >> I agree with Aljoscha. Otherwise I can see a lot of tickets getting > created > >> saying the application is not running on YARN. > >> > >> Cheers, > >> Sivaprasanna > >> > >> On Wed, Apr 22, 2020 at 1:00 PM Aljoscha Krettek > >> wrote: > >> > >>> +1 to getting rid of flink-shaded-hadoop. But we need to document how > >>> people can now get a Flink dist that works with Hadoop. Currently, when > >>> you download the single shaded jar you immediately get support for > >>> submitting to YARN via bin/flink run. > >>> > >>> Aljoscha > >>> > >>> > >>> On 22.04.20 09:08, Till Rohrmann wrote: > Hi Robert, > > I think it would be a helpful simplification of Flink's build setup if > >> we > can get rid of flink-shaded-hadoop. Moreover relying only on the > >> vanilla > Hadoop dependencies for the modules which interact with Hadoop/Yarn > >>> sounds > like a good idea to me. > > Adding support for Hadoop 3 would also be nice. I'm not sure, though, > >> how > Hadoop's API's have changed between 2 and 3. It might be necessary to > introduce some bridges in order to make it work. 
> > Cheers, > Till > > On Tue, Apr 21, 2020 at 4:37 PM Robert Metzger > >>> wrote: > > Hi all, > > > > for the upcoming 1.11 release, I started looking into adding support > >> for > > Hadoop 3[1] for Flink. I have explored a little bit already into > >> adding > >>> a > > shaded hadoop 3 into “flink-shaded”, and some mechanisms for > switching > > between Hadoop 2 and 3 dependencies in the Flink build. > > > > However, Chesnay made me aware that we could also go a different > >> route: > >>> We > > let Flink depend on vanilla Hadoop dependencies and stop providing > >>> shaded > > fat jars for Hadoop through “flink-shaded”. > > > > Why? > > - Maintaining properly shaded Hadoop fat jars is a lot of work (we > >> have > > insufficient test coverage for all kinds of Hadoop features) > > - For Hadoop 2, there are already some known and unresolved issues > >> with > >>> our > > shaded jars that we didn’t manage to fix > > > > Users will have to use Flink with Hadoop by relying on vanilla or > > vendor-provided Hadoop dependencies. > > > > What do you think? > > > > Best, > > Robert > > > > [1] https://issues.apache.org/jira/browse/FLINK-11086 > > > >>> > >
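As a concrete illustration of such a unified mechanism, a minimal sketch could gather version-dependent YARN calls behind one helper and invoke them reflectively only when the running Hadoop version provides them. The class and method names here are hypothetical, not Flink's actual implementation; node labels are used as the example because they only exist in newer Hadoop versions:
{code:java}
import java.lang.reflect.Method;

import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

/**
 * Sketch of a single place for version-dependent Hadoop/YARN calls: invoke the method
 * reflectively if the running Hadoop version provides it, otherwise skip the feature.
 */
public final class HadoopCompat {

    private HadoopCompat() {}

    public static void setNodeLabelIfSupported(ApplicationSubmissionContext ctx, String nodeLabel) {
        if (nodeLabel == null) {
            return;
        }
        try {
            // setNodeLabelExpression only exists in newer Hadoop versions.
            Method method = ctx.getClass().getMethod("setNodeLabelExpression", String.class);
            method.invoke(ctx, nodeLabel);
        } catch (NoSuchMethodException e) {
            // Older Hadoop: the feature is not available; a real implementation would log this.
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Failed to set node label expression reflectively", e);
        }
    }
}
{code}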
[jira] [Created] (FLINK-17328) Expose network metric for job vertex in rest api
lining created FLINK-17328: -- Summary: Expose network metric for job vertex in rest api Key: FLINK-17328 URL: https://issues.apache.org/jira/browse/FLINK-17328 Project: Flink Issue Type: Improvement Components: Runtime / Metrics, Runtime / REST Reporter: lining JobDetailsHandler * pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg, inputFloatingBuffersUsageAvg * back-pressured: shows whether the vertex is back pressured (merged over all of its subtasks) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17329) Quickstarts Scala nightly e2e test failed on travis
Yu Li created FLINK-17329: - Summary: Quickstarts Scala nightly e2e test failed on travis Key: FLINK-17329 URL: https://issues.apache.org/jira/browse/FLINK-17329 Project: Flink Issue Type: Bug Components: Quickstarts Affects Versions: 1.10.0, 1.11.0 Reporter: Yu Li Fix For: 1.11.0, 1.10.2 The `Quickstarts Scala nightly end-to-end test` case failed on travis due to failed to download elastic search package: {noformat} Downloading Elasticsearch from https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.1.2.tar.gz ... % Total% Received % Xferd Average Speed TimeTime Time Current Dload Upload Total SpentLeft Speed 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0 33 31.7M 33 10.6M0 0 31.1M 0 0:00:01 --:--:-- 0:00:01 31.0M curl: (56) GnuTLS recv error (-54): Error in the pull function. gzip: stdin: unexpected end of file tar: Unexpected EOF in archive tar: Unexpected EOF in archive tar: Error is not recoverable: exiting now {noformat} https://api.travis-ci.org/v3/job/677803024/log.txt -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17330) Avoid scheduling deadlocks caused by intra-logical-region ALL-to-ALL blocking edges
Zhu Zhu created FLINK-17330: --- Summary: Avoid scheduling deadlocks caused by intra-logical-region ALL-to-ALL blocking edges Key: FLINK-17330 URL: https://issues.apache.org/jira/browse/FLINK-17330 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Affects Versions: 1.11.0 Reporter: Zhu Zhu Fix For: 1.11.0 Imagine a job like this: A --(pipelined FORWARD)--> B --(blocking ALL-to-ALL)--> D A --(pipelined FORWARD)--> C --(pipelined FORWARD)--> D parallelism=2 for all vertices. We will have 2 execution pipelined regions: R1={A1, B1, C1, D1}, R2={A2, B2, C2, D2} R1 has a cross-region input edge (B2->D1). R2 has a cross-region input edge (B1->D2). Scheduling deadlock will happen since we schedule a region only when all its inputs are consumable (i.e. blocking partitions to be finished). Because R1 can be scheduled only if R2 finishes, while R2 can be scheduled only if R1 finishes. To avoid this, one solution is to force a logical pipelined region with intra-region ALL-to-ALL blocking edges to form one only execution pipelined region, so that there would not be cyclic input dependency between regions. Besides that, we should also pay attention to avoid cyclic cross-region POINTWISE blocking edges. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17331) Add NettyMessageContent interface for all the classes which could be written to NettyMessage
Yangze Guo created FLINK-17331: -- Summary: Add NettyMessageContent interface for all the classes which could be written to NettyMessage Key: FLINK-17331 URL: https://issues.apache.org/jira/browse/FLINK-17331 Project: Flink Issue Type: Improvement Reporter: Yangze Guo Currently, there are some classes, e.g. {{JobVertexID}} and {{ExecutionAttemptID}}, that need to be written to {{NettyMessage}}. However, the sizes of these classes in {{ByteBuf}} are directly written in the {{NettyMessage}} class, which is error-prone. If someone edits those classes, there would be no warning or error during the compile phase. I think it would be better to add a {{NettyMessageContent}} (the name could be discussed) interface: {code:java}
public interface NettyMessageContent {
    void writeTo(ByteBuf buf);

    int getContentLen();
}
{code} Regarding {{fromByteBuf}}, since it is a static method, we could not add it to the interface. We might explain it in the JavaDoc of {{NettyMessageContent}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
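As an illustration, assuming the interface above, one of the mentioned classes could implement it roughly as follows (a hypothetical sketch, not an actual patch; the ByteBuf import assumes the shaded Netty used by NettyMessage):
{code:java}
import org.apache.flink.shaded.netty4.io.netty.buffer.ByteBuf;
import org.apache.flink.util.AbstractID;

// Hypothetical: the on-wire size lives next to the serialization logic it must match.
public class JobVertexID extends AbstractID implements NettyMessageContent {

    @Override
    public void writeTo(ByteBuf buf) {
        buf.writeLong(getLowerPart());
        buf.writeLong(getUpperPart());
    }

    @Override
    public int getContentLen() {
        return 2 * Long.BYTES; // must match exactly what writeTo() writes
    }
}
{code}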
[jira] [Created] (FLINK-17332) Fix restart policy not equals to Never for native task manager pods
Canbin Zheng created FLINK-17332: Summary: Fix restart policy not equals to Never for native task manager pods Key: FLINK-17332 URL: https://issues.apache.org/jira/browse/FLINK-17332 Project: Flink Issue Type: Bug Components: Deployment / Kubernetes Affects Versions: 1.10.0, 1.10.1 Reporter: Canbin Zheng Fix For: 1.11.0 Currently, we do not explicitly set the {{RestartPolicy}} for the TaskManager pod in native K8s setups, so it is {{Always}} by default. The task manager pod itself should not restart a failed container; that decision should always be made by the job manager. Therefore, this ticket proposes to set the {{RestartPolicy}} to {{Never}} for the task manager pods. -- This message was sent by Atlassian Jira (v8.3.4#803005)
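For illustration, a minimal sketch of where the restart policy would be set when building the TaskManager pod with the fabric8 client that the native Kubernetes integration is based on (names and image are placeholders, not the actual Flink code):
{code:java}
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.api.model.PodBuilder;

public class TaskManagerPodSketch {

    public static Pod buildTaskManagerPod() {
        return new PodBuilder()
                .withNewMetadata()
                    .withName("flink-taskmanager-1") // placeholder name
                .endMetadata()
                .withNewSpec()
                    .withRestartPolicy("Never") // let the JobManager decide how to react to failures
                    .addNewContainer()
                        .withName("flink-main-container")
                        .withImage("flink:1.10") // placeholder image
                    .endContainer()
                .endSpec()
                .build();
    }
}
{code}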
[jira] [Created] (FLINK-17333) add doc for "create ddl"
Bowen Li created FLINK-17333: Summary: add doc for "create ddl" Key: FLINK-17333 URL: https://issues.apache.org/jira/browse/FLINK-17333 Project: Flink Issue Type: Improvement Components: Documentation, Table SQL / API Reporter: Bowen Li Assignee: Bowen Li Fix For: 1.11.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-17334) Flink does not support UDFs with primitive return types
xin.ruan created FLINK-17334: Summary: Flink does not support UDFs with primitive return types Key: FLINK-17334 URL: https://issues.apache.org/jira/browse/FLINK-17334 Project: Flink Issue Type: Bug Components: Connectors / Hive Affects Versions: 1.10.0 Reporter: xin.ruan Fix For: 1.10.1 We are currently migrating Hive UDFs to Flink. While testing compatibility, we found that Flink cannot support primitive return types like boolean, int, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005)
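For illustration, a hypothetical example of the kind of Hive UDF being migrated, where both the argument and the return type are primitives (the shape reported as unsupported):
{code:java}
import org.apache.hadoop.hive.ql.exec.UDF;

// Hypothetical Hive UDF with primitive (unboxed) argument and return types.
public class AddOneUdf extends UDF {
    public int evaluate(int value) {
        return value + 1;
    }
}
{code}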
[jira] [Created] (FLINK-17335) JDBCUpsertTableSink Upsert mysql exception No value specified for parameter 1
yutao created FLINK-17335: - Summary: JDBCUpsertTableSink Upsert mysql exception No value specified for parameter 1 Key: FLINK-17335 URL: https://issues.apache.org/jira/browse/FLINK-17335 Project: Flink Issue Type: Bug Components: Connectors / JDBC Affects Versions: 1.10.0 Reporter: yutao -- This message was sent by Atlassian Jira (v8.3.4#803005)