Re: Introduction
Welcome Neelesh! On Mon, Aug 1, 2016 at 4:47 AM, Márton Balassi wrote: > Welcome Neelesh, great to have you here. :) > > On Sun, Jul 31, 2016, 11:08 Neelesh Salian wrote: > > > Hello folks, > > > > I am Neelesh Salian; I recently joined the Flink community and I wanted > to > > take this opportunity to formally introduce myself. > > > > I have been working with the Hadoop and Spark ecosystems over the past > two > > years and found Flink really interesting in the Streaming use case. > > > > > > Excited to start working and help build the community. :) > > Thank you. > > Have a great Sunday. > > > > > > -- > > Neelesh Srinivas Salian > > Customer Operations Engineer > > >
[jira] [Created] (FLINK-4285) Non-existing example in Flink quickstart setup documentation
Tzu-Li (Gordon) Tai created FLINK-4285: -- Summary: Non-existing example in Flink quickstart setup documentation Key: FLINK-4285 URL: https://issues.apache.org/jira/browse/FLINK-4285 Project: Flink Issue Type: Bug Components: Documentation, Examples Reporter: Tzu-Li (Gordon) Tai Priority: Minor Fix For: 1.1.0 In https://ci.apache.org/projects/flink/flink-docs-master/quickstart/setup_quickstart.html, the {{examples/streaming/SocketTextStreamWordCount.jar}} example doesn't exist anymore in the distributed package. It was somehow previously removed in https://github.com/apache/flink/commit/986d5368fdb84c44111994b667ce1fc5f9992716. We either add the {{SocketTextStreamWordCount}} example back, or change the documentation to use another example, probably {{SocketWindowWordCount}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Introduction
On Sun, Jul 31, 2016 at 8:07 PM, Neelesh Salian wrote: > I am Neelesh Salian; I recently joined the Flink community and I wanted to > take this opportunity to formally introduce myself. Thanks and welcome! :-)
[jira] [Created] (FLINK-4286) Have Kafka examples that use the Kafka 0.9 connector
Tzu-Li (Gordon) Tai created FLINK-4286: -- Summary: Have Kafka examples that use the Kafka 0.9 connector Key: FLINK-4286 URL: https://issues.apache.org/jira/browse/FLINK-4286 Project: Flink Issue Type: Improvement Components: Examples Reporter: Tzu-Li (Gordon) Tai Priority: Minor The {{ReadFromKafka}} and {{WriteIntoKafka}} examples use the 0.8 connector, and the built example jar is named {{Kafka.jar}} under {{examples/streaming/}} in the distributed package. Since we have different connectors for different Kafka versions, it would be good to have examples for different versions, and package them as {{Kafka08.jar}} and {{Kafka09.jar}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
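For illustration, a minimal sketch of what a 0.9-based example could look like, using the existing {{FlinkKafkaConsumer09}} and {{FlinkKafkaProducer09}} connector classes; the topic names, broker address, group id, and class name below are placeholders for the sketch:

{code}
import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class ReadWriteKafka09 {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "flink-kafka09-example");   // placeholder group

        // read strings from a Kafka 0.9 topic
        DataStream<String> stream = env.addSource(
                new FlinkKafkaConsumer09<String>("input-topic", new SimpleStringSchema(), props));

        // write them back to another topic via the 0.9 producer
        stream.addSink(
                new FlinkKafkaProducer09<String>("localhost:9092", "output-topic", new SimpleStringSchema()));

        env.execute("Kafka 0.9 read/write example");
    }
}
{code}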
Re: Introduction
Hi! Welcome to the community :-)! On 01.08.2016 09:51, Ufuk Celebi wrote: On Sun, Jul 31, 2016 at 8:07 PM, Neelesh Salian wrote: I am Neelesh Salian; I recently joined the Flink community and I wanted to take this opportunity to formally introduce myself. Thanks and welcome! :-)
Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
The Collector is pretty integral in all the other functions that return multiple elements. I honestly don't see us switching away from it, given that it is such a core part of the API. The close() method has, to my best knowledge, not caused issues, yet. I cannot recall anyone mentioning that the close() method confused them, they accidentally called it, etc. I am wondering whether this is more of a theoretical than practical issue. If we move one function away from Collector to be on the safe side for a "maybe change in the future", while keeping Collector in all other functions - I think that fragments the API concept wise more than it improves anything. On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek wrote: > @Stephan: For the Output, should we keep using a Collector (which exposes) > the close() method which should never be called by users or create a new > Output type that only has an "output" method. Collector can also be used > but with a close() method that doesn't do anything. In the long run, I > thought it might be better to switch the type away from Collector. > > Cheers, > Aljoscha > > On Wed, 20 Jul 2016 at 01:25 Maximilian Michels wrote: > > > I think it looks like Beam rather than Hadoop :) > > > > What Stephan meant was that he wanted a dedicated output method in the > > ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose > > the collector. > > > > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek > > wrote: > > > You mean keep the Collector? I don't like that one because it has the > > > close() method that should never be called by the user. > > > > > > We can keep it, though, because all the other user function interfaces > > also > > > expose it to the user. > > > > > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen wrote: > > > > > >> I would actually make the output a separate parameter as well. Pretty > > much > > >> like the old variant, only replacing the "Window" parameter by the > > context > > >> (which contains everything about the window). > > >> It could also be called "WindowInvocationContext" or so. > > >> > > >> The current variant looks too Hadoop to me ;-) Everything done on the > > >> context object, and messy mocking when creating tests. > > >> > > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran < > radu.tudo...@huawei.com> > > >> wrote: > > >> > > >> > Hi, > > >> > > > >> > Sorry - I made a mistake - I was thinking of getting access to the > > >> > collection (mist-read :) collector) of events in the window buffer > in > > >> > order to be able to delete/evict some of them which are not > necessary > > the > > >> > last ones. > > >> > > > >> > > > >> > Radu > > >> > > > >> > -Original Message- > > >> > From: Aljoscha Krettek [mailto:aljos...@apache.org] > > >> > Sent: Monday, July 18, 2016 5:54 PM > > >> > To: dev@flink.apache.org > > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata > > >> > > > >> > What about the collector? This is only used for emitting elements to > > the > > >> > downstream operation. > > >> > > > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran > > >> wrote: > > >> > > > >> > > Hi, > > >> > > > > >> > > I think it looks good and most importantly is that we can extend > it > > in > > >> > > the directions discussed so far. > > >> > > > > >> > > One question though regarding the Collector - are we going to be > > able > > >> > > to delete random elements from the list if this is not exposed as > a > > >> > > collection, at least to the evictor? 
If not, how are we going to > > >> > > extend in the future to cover this case? > > >> > > > > >> > > Regarding the ordering - I also observed that there are situations > > >> > > where elements do not have a logical order. One example is if you > > have > > >> > > high rates of the events. Nevertheless, even if now is not the > time > > >> > > for this, I think in the future we can imagine having also some > data > > >> > > structures that offer some ordering. It can save some computation > > >> > > efforts later in the functions for some use cases. > > >> > > > > >> > > > > >> > > Best regards, > > >> > > > > >> > > > > >> > > -Original Message- > > >> > > From: Aljoscha Krettek [mailto:aljos...@apache.org] > > >> > > Sent: Monday, July 18, 2016 3:45 PM > > >> > > To: dev@flink.apache.org > > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata > > >> > > > > >> > > I incorporated the changes. The proposed interface of > > >> > > ProcessWindowFunction is now this: > > >> > > > > >> > > public abstract class ProcessWindowFunction extends > > >> > > Window> implements Function { > > >> > > > > >> > > public abstract void process(KEY key, Iterable elements, > > >> > > Context > > >> > > ctx) throws Exception; > > >> > > > > >> > > public abstract class Context { > > >> > > public abstract W window(); > > >> > > public abstract void output(OUT value); > > >> > > } > > >> > > } > > >> > > > > >> > > I'
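The generic parameters of the interface quoted above were stripped by the mail archive; a best-guess reconstruction of the proposed sketch (the type parameter names are an assumption) is:

{code}
import org.apache.flink.api.common.functions.Function;
import org.apache.flink.streaming.api.windowing.windows.Window;

public abstract class ProcessWindowFunction<IN, OUT, KEY, W extends Window> implements Function {

    /** Called once per key and window with all elements of the window. */
    public abstract void process(KEY key, Iterable<IN> elements, Context ctx) throws Exception;

    /** Context giving access to window metadata and the output. */
    public abstract class Context {
        public abstract W window();
        public abstract void output(OUT value);
    }
}
{code}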
[jira] [Created] (FLINK-4287) Unable to access secured HBase from a yarn-session.
Niels Basjes created FLINK-4287: --- Summary: Unable to access secured HBase from a yarn-session. Key: FLINK-4287 URL: https://issues.apache.org/jira/browse/FLINK-4287 Project: Flink Issue Type: Improvement Components: YARN Client Affects Versions: 1.0.3 Reporter: Niels Basjes Assignee: Niels Basjes When I start {{yarn-session.sh -n1}} against a Kerberos-secured Yarn+HBase cluster I see this in the messages: {quote} 2016-08-01 09:53:01,763 INFO org.apache.flink.yarn.Utils - Attempting to obtain Kerberos security token for HBase 2016-08-01 09:53:01,763 INFO org.apache.flink.yarn.Utils - HBase is not available (not packaged with this application): ClassNotFoundException : "org.apache.hadoop.hbase.HBaseConfiguration". {quote} As a consequence it has become impossible to access a secured HBase from this yarn session. From what I see now, at least two things need to be done: # Add all relevant HBase parts to the yarn-session.sh scripting. # Add an option to pass a principal and keytab file so the session can last longer than the Kerberos tickets do. (i.e. pass these parameters into a call to {{UserGroupInformation.loginUserFromKeytab(user, keytabFile);}}) I do see that this would leave an important problem open: this yarn session is accessible by everyone on the cluster and, as a consequence, they can run jobs in it that can access all data I have access to. Perhaps this should be a separate JIRA issue? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
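A minimal sketch of the keytab-based login mentioned in the second point, assuming the principal and keytab path would be supplied by the user (the values below are placeholders):

{code}
import org.apache.hadoop.security.UserGroupInformation;

public class KeytabLoginSketch {

    public static void main(String[] args) throws Exception {
        // placeholders; in the proposal these would come from yarn-session.sh options
        String principal = "flink-user@EXAMPLE.COM";
        String keytabFile = "/etc/security/keytabs/flink-user.keytab";

        // log in from the keytab so the session can outlive the initial Kerberos tickets
        UserGroupInformation.loginUserFromKeytab(principal, keytabFile);

        System.out.println("Logged in as: " + UserGroupInformation.getLoginUser());
    }
}
{code}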
Re: [VOTE] Release Apache Flink 1.1.0 (RC1)
Just tried to reproduce the error reported by Aljoscha, but could not. I used a clean checkpoint of the RC1 code and cleaned all local maven caches before the testing. @Aljoscha: Can you reproduce this on your machine? Can you try and clean the maven caches? On Sun, Jul 31, 2016 at 7:31 PM, Ufuk Celebi wrote: > Probably related to shading :( What's strange is that Travis builds > for Hadoop 2.6.3 with the release-1.1 branch do succeed (sometimes... > Travis is super flakey at the moment, because of some corrupted cached > dependencies): https://travis-ci.org/apache/flink/jobs/148348699 > > On Fri, Jul 29, 2016 at 4:19 PM, Aljoscha Krettek > wrote: > > When running "mvn clean verify" with Hadoop version 2.6.1 the > > Zookeeper/Leader Election tests fail with this: > > > > java.lang.NoSuchMethodError: > > > org.apache.curator.utils.PathUtils.validatePath(Ljava/lang/String;)Ljava/lang/String; > > at > > > org.apache.curator.framework.imps.NamespaceImpl.(NamespaceImpl.java:37) > > at > > > org.apache.curator.framework.imps.CuratorFrameworkImpl.(CuratorFrameworkImpl.java:113) > > at > > > org.apache.curator.framework.CuratorFrameworkFactory$Builder.build(CuratorFrameworkFactory.java:124) > > at > > > org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:101) > > at > > > org.apache.flink.runtime.util.ZooKeeperUtils.createLeaderRetrievalService(ZooKeeperUtils.java:143) > > at > > > org.apache.flink.runtime.util.LeaderRetrievalUtils.createLeaderRetrievalService(LeaderRetrievalUtils.java:70) > > at > > > org.apache.flink.runtime.leaderelection.ZooKeeperLeaderRetrievalTest.testTimeoutOfFindConnectingAddress(ZooKeeperLeaderRetrievalTest.java:187) > > > > I'll continue testing other parts and other Hadoop versions. > > > > On Wed, 27 Jul 2016 at 11:51 Ufuk Celebi wrote: > > > >> Dear Flink community, > >> > >> Please vote on releasing the following candidate as Apache Flink version > >> 1.1.0. > >> > >> I've CC'd u...@flink.apache.org as users are encouraged to help > >> testing Flink 1.1.0 for their specific use cases. Please feel free to > >> report issues and successful tests on dev@flink.apache.org. > >> > >> The commit to be voted on: > >> 3a18463 (http://git-wip-us.apache.org/repos/asf/flink/commit/3a18463) > >> > >> Branch: > >> release-1.1.0-rc1 > >> ( > >> > https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.1.0-rc1 > >> ) > >> > >> The release artifacts to be voted on can be found at: > >> http://people.apache.org/~uce/flink-1.1.0-rc1/ > >> > >> The release artifacts are signed with the key with fingerprint 9D403309: > >> http://www.apache.org/dist/flink/KEYS > >> > >> The staging repository for this release can be found at: > >> https://repository.apache.org/content/repositories/orgapacheflink-1098 > >> > >> There is also a Google doc to coordinate the testing efforts. This is > >> a copy of the release document found in our Wiki: > >> > >> > https://docs.google.com/document/d/1cDZGtnGJKLU1fLw8AE_FzkoDLOR8amYT2oc3mD0_lw4/edit?usp=sharing > >> > >> - > >> > >> Thanks to everyone who contributed to this release candidate. > >> > >> The vote is open for the next 3 days (not counting the weekend) and > >> passes if a majority of at least three +1 PMC votes are cast. > >> > >> The vote ends on Monday August 1st, 2016. > >> > >> [ ] +1 Release this package as Apache Flink 1.1.0 > >> [ ] -1 Do not release this package, because ... > >> >
[jira] [Created] (FLINK-4288) Make it possible to unregister tables
Timo Walther created FLINK-4288: --- Summary: Make it possible to unregister tables Key: FLINK-4288 URL: https://issues.apache.org/jira/browse/FLINK-4288 Project: Flink Issue Type: Improvement Components: Table API & SQL Reporter: Timo Walther Table names cannot be changed yet. After registration, you cannot modify the table behind a table name. Maybe this behavior is too restrictive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4289) Source files have executable flag set
Ufuk Celebi created FLINK-4289: -- Summary: Source files have executable flag set Key: FLINK-4289 URL: https://issues.apache.org/jira/browse/FLINK-4289 Project: Flink Issue Type: Bug Reporter: Ufuk Celebi Assignee: Ufuk Celebi Priority: Minor Running {{find . -executable -type f}} lists the following source {{.java}} files as executable: {code} ./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/SumFunction.java ./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/ApplyFunction.java ./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/GatherFunction.java ./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/GatherSumApplyIteration.java ./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/Neighbor.java ./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/library/GSASingleSourceShortestPaths.java ./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/library/GSAConnectedComponents.java ./flink-libraries/flink-gelly-examples/src/test/java/org/apache/flink/graph/test/GatherSumApplyITCase.java ./flink-libraries/flink-gelly-examples/src/main/java/org/apache/flink/graph/examples/GSASingleSourceShortestPaths.java ./flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/CurrentUserRetweet.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Size.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Entities.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/URL.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/UserMention.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Media.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/HashTags.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Symbol.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/Tweet.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/Contributors.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/Coordinates.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/places/BoundingBox.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/places/Attributes.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/places/Places.java ./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/User/Users.java {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4290) CassandraConnectorTest deadlocks
Stephan Ewen created FLINK-4290: --- Summary: CassandraConnectorTest deadlocks Key: FLINK-4290 URL: https://issues.apache.org/jira/browse/FLINK-4290 Project: Flink Issue Type: Bug Components: Streaming Connectors Affects Versions: 1.0.3 Reporter: Stephan Ewen Priority: Critical The {{CassandraConnectorTest}} encountered a full deadlock on my latest test run. Stack dump of the JVM is attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4291) No log entry for unscheduled reporters
Chesnay Schepler created FLINK-4291: --- Summary: No log entry for unscheduled reporters Key: FLINK-4291 URL: https://issues.apache.org/jira/browse/FLINK-4291 Project: Flink Issue Type: Bug Components: Metrics Affects Versions: 1.1.0 Reporter: Chesnay Schepler Priority: Trivial When a non-Scheduled reporter is configured, no log message is printed. It would be nice to print a log message for every instantiated reporter, not just Scheduled ones. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4292) HCatalog project incorrectly set up
Stephan Ewen created FLINK-4292: --- Summary: HCatalog project incorrectly set up Key: FLINK-4292 URL: https://issues.apache.org/jira/browse/FLINK-4292 Project: Flink Issue Type: Bug Components: Batch Connectors and Input/Output Formats Affects Versions: 1.0.3 Reporter: Stephan Ewen Assignee: Stephan Ewen Priority: Critical Fix For: 1.1.1, 1.2.0 The {{HCatalog}} project shows errors in IntelliJ, because it is missing the Scala SDK dependency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4293) Malformatted Apache Headers
Stephan Ewen created FLINK-4293: --- Summary: Malformatted Apache Headers Key: FLINK-4293 URL: https://issues.apache.org/jira/browse/FLINK-4293 Project: Flink Issue Type: Bug Affects Versions: 1.0.3 Reporter: Stephan Ewen Several files contain this header: {code} /** * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file * distributed with this work for additional information * regarding copyright ownership. The ASF licenses this file * to you under the Apache License, Version 2.0 (the * "License"); you may not use this file except in compliance * with the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. */ {code} The correct header format should be: {code} /* * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file * distributed with this work for additional information * regarding copyright ownership. The ASF licenses this file * to you under the Apache License, Version 2.0 (the * "License"); you may not use this file except in compliance * with the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. */ {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Introduction
Welcome to the community Neelesh :-) On Mon, Aug 1, 2016 at 3:53 PM, Kevin Jacobs wrote: > Hi! > > Welcome to the community :-)! > > > > On 01.08.2016 09:51, Ufuk Celebi wrote: > >> On Sun, Jul 31, 2016 at 8:07 PM, Neelesh Salian >> wrote: >> >>> I am Neelesh Salian; I recently joined the Flink community and I wanted >>> to >>> take this opportunity to formally introduce myself. >>> >> Thanks and welcome! :-) >> > >
[jira] [Created] (FLINK-4294) Allow access of composite type fields
Timo Walther created FLINK-4294: --- Summary: Allow access of composite type fields Key: FLINK-4294 URL: https://issues.apache.org/jira/browse/FLINK-4294 Project: Flink Issue Type: New Feature Components: Table API & SQL Reporter: Timo Walther Assignee: Timo Walther Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It would be better to access individual fields of composite types, too. e.g. {code} SELECT composite.name FROM composites SELECT tuple.f0 FROM tuples 'f0.getField(0) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4295) CoGroupSortTranslationTest#testGroupSortTuplesDefaultCoGroup fails to run
Chesnay Schepler created FLINK-4295: --- Summary: CoGroupSortTranslationTest#testGroupSortTuplesDefaultCoGroup fails to run Key: FLINK-4295 URL: https://issues.apache.org/jira/browse/FLINK-4295 Project: Flink Issue Type: Bug Components: Tests Affects Versions: 1.1.0 Reporter: Chesnay Schepler Priority: Minor When running the test you always get an exception; 08/01/2016 15:03:34 Job execution switched to status RUNNING. 08/01/2016 15:03:34 DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566) (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to SCHEDULED 08/01/2016 15:03:34 DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566) (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to DEPLOYING 08/01/2016 15:03:34 DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566) (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to SCHEDULED 08/01/2016 15:03:34 DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566) (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to DEPLOYING 08/01/2016 15:03:34 DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566) (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to RUNNING 08/01/2016 15:03:34 DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566) (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to RUNNING 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(1/4) switched to SCHEDULED 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(3/4) switched to SCHEDULED 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(3/4) switched to DEPLOYING 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(1/4) switched to DEPLOYING 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(4/4) switched to SCHEDULED 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(4/4) switched to DEPLOYING 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(2/4) switched to SCHEDULED 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(2/4) switched to DEPLOYING 08/01/2016 15:03:35 DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566) (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to FINISHED 08/01/2016 15:03:35 DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566) (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to FINISHED 08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(3/4) switched to RUNNING 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(1/4) switched to RUNNING 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(2/4) switched to RUNNING 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(4/4) switched to RUNNING 08/01/2016 15:03:35 DataSink (collect())(4/4) switched to SCHEDULED 08/01/2016 15:03:35 DataSink (collect())(1/4) switched to SCHEDULED 08/01/2016 15:03:35 DataSink (collect())(4/4) switched to DEPLOYING 08/01/2016 15:03:35 DataSink (collect())(1/4) switched to DEPLOYING 08/01/2016 15:03:35 DataSink (collect())(3/4) switched to SCHEDULED 08/01/2016 15:03:35 DataSink (collect())(3/4) switched to DEPLOYING 08/01/2016 15:03:35 CoGroup (CoGroup at org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(4/4) switched to FINISHED 08/01/2016 15:03:
[jira] [Created] (FLINK-4296) Scheduler accepts more tasks than it has task slots available
Maximilian Michels created FLINK-4296: - Summary: Scheduler accepts more tasks than it has task slots available Key: FLINK-4296 URL: https://issues.apache.org/jira/browse/FLINK-4296 Project: Flink Issue Type: Bug Components: JobManager, TaskManager Affects Versions: 1.1.0 Reporter: Maximilian Michels Priority: Critical Fix For: 1.1.0, 1.2.0 Flink's scheduler doesn't support queued scheduling but expects to find all necessary task slots upon scheduling. If it does not it throws an error. Due to some changes in the latest master, this seems to be broken. Flink accepts jobs with {{parallelism > total number of task slots}}, schedules and deploys tasks in all available task slots, and leaves the remaining tasks lingering forever. Easy to reproduce: {code} ./bin/flink run -p TASK_SLOTS+n {code} where {{TASK_SLOTS}} is the number of total task slots of the cluster and {{n>=1}}. Here, {{p=11}}, {{TASK_SLOTS=10}}: {noformat} Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123 Using address localhost:6123 to connect to JobManager. JobManager web interface address http://localhost:8081 Starting execution of program Executing EnumTriangles example with default edges data set. Use --edges to specify file input. Printing result to stdout. Use --output to specify output path. Submitting job with JobID: cd0c0b4cbe25643d8d92558168cfc045. Waiting for job completion. 08/01/2016 12:12:12 Job execution switched to status RUNNING. 08/01/2016 12:12:12 CHAIN DataSource (at getDefaultEdgeDataSet(EnumTrianglesData.java:57) (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at main(EnumTriangles.java:108))(1/1) switched to SCHEDULED 08/01/2016 12:12:12 CHAIN DataSource (at getDefaultEdgeDataSet(EnumTrianglesData.java:57) (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at main(EnumTriangles.java:108))(1/1) switched to DEPLOYING 08/01/2016 12:12:12 CHAIN DataSource (at getDefaultEdgeDataSet(EnumTrianglesData.java:57) (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at main(EnumTriangles.java:108))(1/1) switched to RUNNING 08/01/2016 12:12:12 CHAIN DataSource (at getDefaultEdgeDataSet(EnumTrianglesData.java:57) (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at main(EnumTriangles.java:108))(1/1) switched to FINISHED 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(1/11) switched to SCHEDULED 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(3/11) switched to SCHEDULED 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(2/11) switched to SCHEDULED 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(7/11) switched to SCHEDULED 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(7/11) switched to DEPLOYING 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(6/11) switched to SCHEDULED 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(4/11) switched to SCHEDULED 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(5/11) switched to SCHEDULED 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(4/11) switched to DEPLOYING 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(3/11) switched to DEPLOYING 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(9/11) switched to SCHEDULED 08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(9/11) switched to DEPLOYING 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(5/11) switched to DEPLOYING 08/01/2016 12:12:12 GroupReduce (GroupReduce at main(EnumTriangles.java:112))(1/11) switched to DEPLOYING 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(1/11) switched to SCHEDULED 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(1/11) switched to DEPLOYING 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(2/11) switched to SCHEDULED 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(2/11) switched to DEPLOYING 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(3/11) switched to SCHEDULED 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(3/11) switched to DEPLOYING 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(4/11) switched to SCHEDULED 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(4/11) switched to DEPLOYING 08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(5/11) switched to SCHEDULED 08/01/2016 12:12:12 Join(Join at main
Re: [VOTE] Release Apache Flink 1.1.0 (RC1)
Thanks for the new release candidate Ufuk! Found two issues during testing: 1) Scheduling: The Flink scheduler accepts (it shouldn't) jobs with parallelism > total number of task slots, schedules tasks in all available task slots, and leaves the remaining tasks lingering forever. Haven't had time to investigate much, but a bit more details here: => JIRA: https://issues.apache.org/jira/browse/FLINK-4296 2) Yarn encoding issues with special characters in automatically determined location of the far jar => JIRA: https://issues.apache.org/jira/browse/FLINK-4297 => Fix: https://github.com/apache/flink/pull/2320 Otherwise, looks pretty good so far :) On Mon, Aug 1, 2016 at 10:27 AM, Stephan Ewen wrote: > Just tried to reproduce the error reported by Aljoscha, but could not. > I used a clean checkpoint of the RC1 code and cleaned all local maven caches > before the testing. > > @Aljoscha: Can you reproduce this on your machine? Can you try and clean the > maven caches? > > On Sun, Jul 31, 2016 at 7:31 PM, Ufuk Celebi wrote: >> >> Probably related to shading :( What's strange is that Travis builds >> for Hadoop 2.6.3 with the release-1.1 branch do succeed (sometimes... >> Travis is super flakey at the moment, because of some corrupted cached >> dependencies): https://travis-ci.org/apache/flink/jobs/148348699 >> >> On Fri, Jul 29, 2016 at 4:19 PM, Aljoscha Krettek >> wrote: >> > When running "mvn clean verify" with Hadoop version 2.6.1 the >> > Zookeeper/Leader Election tests fail with this: >> > >> > java.lang.NoSuchMethodError: >> > >> > org.apache.curator.utils.PathUtils.validatePath(Ljava/lang/String;)Ljava/lang/String; >> > at >> > >> > org.apache.curator.framework.imps.NamespaceImpl.(NamespaceImpl.java:37) >> > at >> > >> > org.apache.curator.framework.imps.CuratorFrameworkImpl.(CuratorFrameworkImpl.java:113) >> > at >> > >> > org.apache.curator.framework.CuratorFrameworkFactory$Builder.build(CuratorFrameworkFactory.java:124) >> > at >> > >> > org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:101) >> > at >> > >> > org.apache.flink.runtime.util.ZooKeeperUtils.createLeaderRetrievalService(ZooKeeperUtils.java:143) >> > at >> > >> > org.apache.flink.runtime.util.LeaderRetrievalUtils.createLeaderRetrievalService(LeaderRetrievalUtils.java:70) >> > at >> > >> > org.apache.flink.runtime.leaderelection.ZooKeeperLeaderRetrievalTest.testTimeoutOfFindConnectingAddress(ZooKeeperLeaderRetrievalTest.java:187) >> > >> > I'll continue testing other parts and other Hadoop versions. >> > >> > On Wed, 27 Jul 2016 at 11:51 Ufuk Celebi wrote: >> > >> >> Dear Flink community, >> >> >> >> Please vote on releasing the following candidate as Apache Flink >> >> version >> >> 1.1.0. >> >> >> >> I've CC'd u...@flink.apache.org as users are encouraged to help >> >> testing Flink 1.1.0 for their specific use cases. Please feel free to >> >> report issues and successful tests on dev@flink.apache.org. 
>> >> >> >> The commit to be voted on: >> >> 3a18463 (http://git-wip-us.apache.org/repos/asf/flink/commit/3a18463) >> >> >> >> Branch: >> >> release-1.1.0-rc1 >> >> ( >> >> >> >> https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.1.0-rc1 >> >> ) >> >> >> >> The release artifacts to be voted on can be found at: >> >> http://people.apache.org/~uce/flink-1.1.0-rc1/ >> >> >> >> The release artifacts are signed with the key with fingerprint >> >> 9D403309: >> >> http://www.apache.org/dist/flink/KEYS >> >> >> >> The staging repository for this release can be found at: >> >> https://repository.apache.org/content/repositories/orgapacheflink-1098 >> >> >> >> There is also a Google doc to coordinate the testing efforts. This is >> >> a copy of the release document found in our Wiki: >> >> >> >> >> >> https://docs.google.com/document/d/1cDZGtnGJKLU1fLw8AE_FzkoDLOR8amYT2oc3mD0_lw4/edit?usp=sharing >> >> >> >> - >> >> >> >> Thanks to everyone who contributed to this release candidate. >> >> >> >> The vote is open for the next 3 days (not counting the weekend) and >> >> passes if a majority of at least three +1 PMC votes are cast. >> >> >> >> The vote ends on Monday August 1st, 2016. >> >> >> >> [ ] +1 Release this package as Apache Flink 1.1.0 >> >> [ ] -1 Do not release this package, because ... >> >> > >
[jira] [Created] (FLINK-4297) Yarn client can't determine fat jar location if path contains spaces
Maximilian Michels created FLINK-4297: - Summary: Yarn client can't determine fat jar location if path contains spaces Key: FLINK-4297 URL: https://issues.apache.org/jira/browse/FLINK-4297 Project: Flink Issue Type: Bug Components: YARN Client Reporter: Maximilian Michels Assignee: Maximilian Michels Fix For: 1.1.0, 1.2.0 The code that automatically determines the fat jar path through the ProtectionDomain of the Yarn class, receives a possibly URL encoded path string. We need to decode using the system locale encoding, otherwise we can receive errors of the following when spaces are in the file path: {noformat} Caused by: java.io.FileNotFoundException: File file:/Users/max/Downloads/release-testing/flink-1.1.0-rc1/flink-1.1.0/build%20target/lib/flink-dist_2.11-1.1.0.jar does not exist at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:511) at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337) at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289) at org.apache.hadoop.fs.LocalFileSystem.copyFromLocalFile(LocalFileSystem.java:82) at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1836) at org.apache.flink.yarn.Utils.setupLocalResource(Utils.java:129) at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:616) at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:365) ... 6 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
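A sketch of the kind of decoding the issue describes, assuming the raw path is taken from the class's ProtectionDomain (the actual fix in the linked pull request may look different):

{code}
import java.io.File;
import java.net.URLDecoder;
import java.nio.charset.Charset;

public class FatJarLocationSketch {

    public static void main(String[] args) throws Exception {
        // location of the jar/classes containing this class; may be URL-encoded
        String rawPath = FatJarLocationSketch.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation()
                .getPath();

        // decode e.g. "build%20target" back to "build target" using the system encoding
        String decodedPath = URLDecoder.decode(rawPath, Charset.defaultCharset().name());

        System.out.println(new File(decodedPath).getAbsolutePath());
    }
}
{code}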
[jira] [Created] (FLINK-4298) Clean up Storm Compatibility Dependencies
Stephan Ewen created FLINK-4298: --- Summary: Clean up Storm Compatibility Dependencies Key: FLINK-4298 URL: https://issues.apache.org/jira/browse/FLINK-4298 Project: Flink Issue Type: Bug Components: Storm Compatibility Affects Versions: 1.0.3, 1.1.0 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 1.2.0 The {{flink-storm}} project contains unnecessary (transitive) dependencies: - Google Guava - Ring Clojure Servlet Bindings Particularly the last one is frequently troublesome in Maven builds (unstable downloads) and is not required, because the Storm Compatibility layer does not start a web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4299) Show loss of job manager in Client
Ufuk Celebi created FLINK-4299: -- Summary: Show loss of job manager in Client Key: FLINK-4299 URL: https://issues.apache.org/jira/browse/FLINK-4299 Project: Flink Issue Type: Improvement Components: Client Reporter: Ufuk Celebi Fix For: 1.1.0 If the client loses the connection to a job manager and the job recovers from this, the client will only print the job status as {{RUNNING}} again. It is hard to actually notice that something went wrong and a job manager was lost. {code} ... 08/01/2016 14:35:43 Flat Map -> Sink: Unnamed(8/8) switched to RUNNING 08/01/2016 14:35:43 Source: Custom Source(6/8) switched to RUNNING <-- EVERYTHING'S RUNNING --> 08/01/2016 14:40:40 Job execution switched to status RUNNING <--- JOB MANAGER FAIL OVER 08/01/2016 14:40:40 Source: Custom Source(1/8) switched to SCHEDULED 08/01/2016 14:40:40 Source: Custom Source(1/8) switched to DEPLOYING 08/01/2016 14:40:40 Source: Custom Source(2/8) switched to SCHEDULED ... {code} After {{14:35:43}} everything is running and the client does not print any execution state updates. When the job manager fails, the job will be recovered and enter the running state again eventually (at 14:40:40), but the user might never notice this. I would like to improve on this by printing some messages about the state of the job manager connection. For example, between {{14:35:43}} and {{14:40:40}} it might say that the job manager connection was lost, a new one established, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4300) Improve error message for different Scala versions of JM and Client
Timo Walther created FLINK-4300: --- Summary: Improve error message for different Scala versions of JM and Client Key: FLINK-4300 URL: https://issues.apache.org/jira/browse/FLINK-4300 Project: Flink Issue Type: Improvement Components: Client, JobManager Reporter: Timo Walther If a user runs a job (e.g. via RemoteEnvironment) with different Scala versions of JobManager and Client, the job is not executed and has no proper error message. The Client fails only with a meaningless warning: {code} 16:59:58,677 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@127.0.0.1:6123] has failed, address is now gated for [5000] ms. Reason is: [Disassociated]. {code} JobManager log only contains the following warning: {code} 2016-08-01 16:59:58,664 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@192.168.1.142:63372] has failed, address is now gated for [5000] ms. Reason is: [scala.Option; local class incompatible: stream classdesc serialVersionUID = -2062608324514658839, local class serialVersionUID = -114498752079829388]. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Release Apache Flink 1.1.0 (RC1)
This is also a major issue for batch with off-heap memory and memory preallocation turned off: https://issues.apache.org/jira/browse/FLINK-4094 Not hard to fix though as we simply need to reliably clear the direct memory instead of relying on garbage collection. Another possible fix is to maintain memory pools independently of the preallocation mode. I think this is fine because preallocation:false suggests that no memory will be preallocated but not that memory will be freed once acquired.
[jira] [Created] (FLINK-4301) Parameterize Flink version in Quickstart bash script
Timo Walther created FLINK-4301: --- Summary: Parameterize Flink version in Quickstart bash script Key: FLINK-4301 URL: https://issues.apache.org/jira/browse/FLINK-4301 Project: Flink Issue Type: Improvement Components: Documentation Reporter: Timo Walther The Flink version is hard-coded in the quickstart script (for Scala and Java). Thus, even if a user is in the Flink 1.0 docs, the scripts produce a quickstart for the Flink 1.1 release. It would be better if the one-liner in the documentation contained the version, so that a Flink 1.0 quickstart is built in the 1.0 documentation and a 1.1 quickstart in the 1.1 documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-4302) Add JavaDocs to MetricConfig
Ufuk Celebi created FLINK-4302: -- Summary: Add JavaDocs to MetricConfig Key: FLINK-4302 URL: https://issues.apache.org/jira/browse/FLINK-4302 Project: Flink Issue Type: Improvement Components: Metrics Affects Versions: 1.1.0 Reporter: Ufuk Celebi {{MetricConfig}} has no comments at all. If you want to implement a custom reporter, its {{open}} method takes a {{MetricConfig}} as its argument. It would be helpful to add at least a class-level JavaDoc stating where the config values come from, etc. [~Zentol] what do you think? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
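For context, a sketch of how a custom reporter receives its {{MetricConfig}}; the reporter class, key names, and default values below are made up, and the typed getters are used under the assumption that {{MetricConfig}} provides them:

{code}
import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;

public class MyReporter implements MetricReporter {

    private String host;
    private int port;

    @Override
    public void open(MetricConfig config) {
        // the values originate from the reporter's entries in flink-conf.yaml
        this.host = config.getString("host", "localhost");
        this.port = config.getInteger("port", 9999);
    }

    @Override
    public void close() {}

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {}

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {}
}
{code}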
Re: [DISCUSS] FLIP-5 Only send data to each taskmanager once for broadcasts
Hi Felix! Hope this helps_ Concerning (1.1) - The producer does not think in term of number of target TaskManagers. That number can, after all, change in the presence of a failure and recovery. The producer should, for its own result, not care how many consumers it will have (Tasks), but produce it only once. Concerning (1.2) - Only "blocking" intermediate results can be consumed multiple times. Data sent to broadcast variables must thus be always a blocking intermediate result. Greetings, Stephan On Wed, Jul 27, 2016 at 11:33 AM, Felix Neutatz wrote: > Hi Stephan, > > thanks for the great ideas. First I have some questions: > > 1.1) Does every task generate an intermediate result partition for every > target task or is that already implemented in a way so that there are only > as many intermediate result partitions per task manager as target tasks? > (Example: There are 2 task managers with 2 tasks each. Do we get 4 > intermediate result partitions per task manager or do we get 8?) > 1.2) How can I consume an intermediate result partition multiple times? > When I tried that I got the following exception: > Caused by: java.lang.IllegalStateException: Subpartition 0 of > dbe284e3b37c1df1b993a3f0a6020ea6@ce9fc38f08a5cc9e93431a9cbf740dcf is being > or already has been consumed, but pipelined subpartitions can only be > consumed once. > at > > org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.createReadView(PipelinedSubpartition.java:179) > at > > org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.createReadView(PipelinedSubpartition.java:36) > at > > org.apache.flink.runtime.io.network.partition.ResultPartition.createSubpartitionView(ResultPartition.java:348) > at > > org.apache.flink.runtime.io.network.partition.ResultPartitionManager.createSubpartitionView(ResultPartitionManager.java:81) > at > > org.apache.flink.runtime.io.network.netty.PartitionRequestServerHandler.channelRead0(PartitionRequestServerHandler.java:98) > at > > org.apache.flink.runtime.io.network.netty.PartitionRequestServerHandler.channelRead0(PartitionRequestServerHandler.java:41) > at > > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > > My status update: Since Friday I am implementing your idea described in > (2). Locally this approach already works (for less than 170 iterations). I > will investigate further to solve that issue. > > But I am still not sure how to implement (1). Maybe we introduce a similar > construct like the BroadcastVariableManager to share the RecordWriter among > all tasks of a taskmanager. I am interested in your thoughts :) > > Best regards, > Felix > > 2016-07-22 17:25 GMT+02:00 Stephan Ewen : > > > Hi Felix! > > > > Interesting suggestion. Here are some thoughts on the design. > > > > The two core changes needed to send data once to the TaskManagers are: > > > > (1) Every sender needs to produce its stuff once (rather than for every > > target task), there should not be redundancy there. > > (2) Every TaskManager should request the data once, other tasks in the > > same TaskManager pick it up from there. > > > > > > The current receiver-initialted pull model is actually a good abstraction > > for that, I think. > > > > Lets look at (1): > > > > - Currently, the TaskManagers have a separate intermediate result > > partition for each target slot. They should rather have one intermediate > > result partition (saves also repeated serialization) that is consumed > > multiple times. 
> > > > - Since the results that are to be broadcasted are always "blocking", > > they can be consumed (pulled) multiples times. > > > > Lets look at (2): > > > > - The current BroadcastVariableManager has the functionality to let the > > first accessor of the BC-variable materialize the result. > > > > - It could be changed such that only the first accessor creates a > > RecordReader, so the others do not even request the stream. That way, the > > TaskManager should pull only one stream from each producing task, which > > means the data is transferred once. > > > > > > That would also work perfectly with the current failure / recovery model. > > > > What do you think? > > > > Stephan > > > > > > On Fri, Jul 22, 2016 at 2:59 PM, Felix Neutatz > > wrote: > > > > > Hi everybody, > > > > > > I want to improve the performance of broadcasts in Flink. Therefore > Till > > > told me to start a FLIP on this topic to discuss how to go forward to > > solve > > > the current issues for broadcasts. > > > > > > The problem in a nutshell: Instead of sending data to each taskmanager > > only > > > once, at the moment the data is sent to each task. This means if there > > are > > > 3 slots on each taskmanager we will send the data 3 times instead of > > once. > > > > > > There are multiple ways to tackle this problem and I started to do some > > > research and investigate. You can follow my thought process here: > > > > > > > > > > > > https://cwiki
Re: [VOTE] Release Apache Flink 1.1.0 (RC1)
I tried it again now. I did: rm -r .m2/repository mvn clean verify -Dhadoop.version=2.6.0 failed again. Also with versions 2.6.1 and 2.6.3. On Mon, 1 Aug 2016 at 08:23 Maximilian Michels wrote: > This is also a major issue for batch with off-heap memory and memory > preallocation turned off: > https://issues.apache.org/jira/browse/FLINK-4094 > Not hard to fix though as we simply need to reliably clear the direct > memory instead of relying on garbage collection. Another possible fix > is to maintain memory pools independently of the preallocation mode. I > think this is fine because preallocation:false suggests that no memory > will be preallocated but not that memory will be freed once acquired. >
Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
Alright, that seems reasonable. I updated the doc to add the Collector to the method signature again. On Mon, 1 Aug 2016 at 00:59 Stephan Ewen wrote: > The Collector is pretty integral in all the other functions that return > multiple elements. I honestly don't see us switching away from it, given > that it is such a core part of the API. > > The close() method has, to my best knowledge, not caused issues, yet. I > cannot recall anyone mentioning that the close() method confused them, they > accidentally called it, etc. > I am wondering whether this is more of a theoretical than practical issue. > > If we move one function away from Collector to be on the safe side for a > "maybe change in the future", while keeping Collector in all other > functions - I think that fragments the API concept wise more than it > improves anything. > > On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek > wrote: > > > @Stephan: For the Output, should we keep using a Collector (which > exposes) > > the close() method which should never be called by users or create a new > > Output type that only has an "output" method. Collector can also be used > > but with a close() method that doesn't do anything. In the long run, I > > thought it might be better to switch the type away from Collector. > > > > Cheers, > > Aljoscha > > > > On Wed, 20 Jul 2016 at 01:25 Maximilian Michels wrote: > > > > > I think it looks like Beam rather than Hadoop :) > > > > > > What Stephan meant was that he wanted a dedicated output method in the > > > ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose > > > the collector. > > > > > > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek < > aljos...@apache.org> > > > wrote: > > > > You mean keep the Collector? I don't like that one because it has the > > > > close() method that should never be called by the user. > > > > > > > > We can keep it, though, because all the other user function > interfaces > > > also > > > > expose it to the user. > > > > > > > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen wrote: > > > > > > > >> I would actually make the output a separate parameter as well. > Pretty > > > much > > > >> like the old variant, only replacing the "Window" parameter by the > > > context > > > >> (which contains everything about the window). > > > >> It could also be called "WindowInvocationContext" or so. > > > >> > > > >> The current variant looks too Hadoop to me ;-) Everything done on > the > > > >> context object, and messy mocking when creating tests. > > > >> > > > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran < > > radu.tudo...@huawei.com> > > > >> wrote: > > > >> > > > >> > Hi, > > > >> > > > > >> > Sorry - I made a mistake - I was thinking of getting access to the > > > >> > collection (mist-read :) collector) of events in the window buffer > > in > > > >> > order to be able to delete/evict some of them which are not > > necessary > > > the > > > >> > last ones. > > > >> > > > > >> > > > > >> > Radu > > > >> > > > > >> > -Original Message- > > > >> > From: Aljoscha Krettek [mailto:aljos...@apache.org] > > > >> > Sent: Monday, July 18, 2016 5:54 PM > > > >> > To: dev@flink.apache.org > > > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata > > > >> > > > > >> > What about the collector? This is only used for emitting elements > to > > > the > > > >> > downstream operation. 
> > > >> > > > > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran < > radu.tudo...@huawei.com> > > > >> wrote: > > > >> > > > > >> > > Hi, > > > >> > > > > > >> > > I think it looks good and most importantly is that we can extend > > it > > > in > > > >> > > the directions discussed so far. > > > >> > > > > > >> > > One question though regarding the Collector - are we going to be > > > able > > > >> > > to delete random elements from the list if this is not exposed > as > > a > > > >> > > collection, at least to the evictor? If not, how are we going to > > > >> > > extend in the future to cover this case? > > > >> > > > > > >> > > Regarding the ordering - I also observed that there are > situations > > > >> > > where elements do not have a logical order. One example is if > you > > > have > > > >> > > high rates of the events. Nevertheless, even if now is not the > > time > > > >> > > for this, I think in the future we can imagine having also some > > data > > > >> > > structures that offer some ordering. It can save some > computation > > > >> > > efforts later in the functions for some use cases. > > > >> > > > > > >> > > > > > >> > > Best regards, > > > >> > > > > > >> > > > > > >> > > -Original Message- > > > >> > > From: Aljoscha Krettek [mailto:aljos...@apache.org] > > > >> > > Sent: Monday, July 18, 2016 3:45 PM > > > >> > > To: dev@flink.apache.org > > > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata > > > >> > > > > > >> > > I incorporated the changes. The proposed interface of > > > >> > > ProcessWindowFunction is now this: > > > >>
Re: [VOTE] Release Apache Flink 1.1.0 (RC1)
I think that FLINK-4094 is nice to fix but not a release blocker since we know how to prevent this situation (setting preallocation to true). On Mon, Aug 1, 2016 at 11:56 PM, Aljoscha Krettek wrote: > I tried it again now. I did: > > rm -r .m2/repository > mvn clean verify -Dhadoop.version=2.6.0 > > failed again. Also with versions 2.6.1 and 2.6.3. > > On Mon, 1 Aug 2016 at 08:23 Maximilian Michels wrote: > > > This is also a major issue for batch with off-heap memory and memory > > preallocation turned off: > > https://issues.apache.org/jira/browse/FLINK-4094 > > Not hard to fix though as we simply need to reliably clear the direct > > memory instead of relying on garbage collection. Another possible fix > > is to maintain memory pools independently of the preallocation mode. I > > think this is fine because preallocation:false suggests that no memory > > will be preallocated but not that memory will be freed once acquired. > > >
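For reference, the workaround mentioned above corresponds to the following flink-conf.yaml entries (key names to the best of my knowledge for this release line):

{code}
# use off-heap memory for batch operators
taskmanager.memory.off-heap: true
# preallocate the memory pool so direct memory is not repeatedly allocated and collected
taskmanager.memory.preallocate: true
{code}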
Map Reduce Sorting
Hi, I have a question regarding when data points are sorted when applying a simple MapReduce job. I have the following code: data = readFromSource() data.map().groupBy(0).reduce(...) This code will be translated into the following execution plan: map -> combiner -> hash partitioning and sorting on 0 -> reduce. If I am right, the combiner first sorts the data, then it applies the combine function, and then it partitions the result. Now the partitions are consumed by the reducers. For each mapper/combiner machine, the reducer has an input gateway. For example, if the mappers and combiners run on 10 machines, then each reducer has 10 input gateways. Now, the reducer consumes the data via a MutableObjectIterator. This iterator first consumes data from one input gateway, then from the next, and so on. Is the data of a single input gateway already sorted? Because the combiner function has sorted the data already. Is the order of the data points maintained after they are sent through the network? In my code, the MutableObjectIterator instances are subclasses of NormalizedKeySorter. Does this mean that the data from an input gateway is first sorted before it is handed over to the reduce function? Is this because the order of the data points is not maintained after sending through the network? It would be nice if someone could answer my question. If my assumptions are wrong, please correct me :) BR, Hilmi -- == Hilmi Yildirim, M.Sc. Researcher DFKI GmbH Intelligente Analytik für Massendaten DFKI Projektbüro Berlin Alt-Moabit 91c D-10559 Berlin Phone: +49 30 23895 1814 E-Mail: hilmi.yildi...@dfki.de - Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender) Dr. Walter Olthoff Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes Amtsgericht Kaiserslautern, HRB 2313 -
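For reference, a minimal, self-contained version of the pipeline described above, written against the DataSet API (the input data and the identity map are made up for the sketch):

{code}
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class GroupReduceSortingExample {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // stand-in for readFromSource()
        DataSet<Tuple2<String, Integer>> data = env.fromElements(
                new Tuple2<String, Integer>("a", 1),
                new Tuple2<String, Integer>("b", 2),
                new Tuple2<String, Integer>("a", 3));

        // map -> (combine) -> hash partition + sort on field 0 -> reduce
        data.map(new MapFunction<Tuple2<String, Integer>, Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> map(Tuple2<String, Integer> value) {
                    return value; // identity map, stands in for the real map logic
                }
            })
            .groupBy(0)
            .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> reduce(Tuple2<String, Integer> a, Tuple2<String, Integer> b) {
                    return new Tuple2<String, Integer>(a.f0, a.f1 + b.f1);
                }
            })
            .print();
    }
}
{code}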
Re: [VOTE] Release Apache Flink 1.1.0 (RC1)
Which Maven version are you using? On Mon, Aug 1, 2016 at 5:56 PM, Aljoscha Krettek wrote: > I tried it again now. I did: > > rm -r .m2/repository > mvn clean verify -Dhadoop.version=2.6.0 > > failed again. Also with versions 2.6.1 and 2.6.3. > > On Mon, 1 Aug 2016 at 08:23 Maximilian Michels wrote: > >> This is also a major issue for batch with off-heap memory and memory >> preallocation turned off: >> https://issues.apache.org/jira/browse/FLINK-4094 >> Not hard to fix though as we simply need to reliably clear the direct >> memory instead of relying on garbage collection. Another possible fix >> is to maintain memory pools independently of the preallocation mode. I >> think this is fine because preallocation:false suggests that no memory >> will be preallocated but not that memory will be freed once acquired. >>
[jira] [Created] (FLINK-4303) Add CEP examples
Timo Walther created FLINK-4303: --- Summary: Add CEP examples Key: FLINK-4303 URL: https://issues.apache.org/jira/browse/FLINK-4303 Project: Flink Issue Type: Improvement Components: CEP Reporter: Timo Walther Neither CEP Java nor CEP Scala contains a runnable example. The example on the website is also not runnable without adding some additional code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Release Apache Flink 1.1.0 (RC1)
@Aljoscha: Have you made sure you have a clean maven cache (remove the .m2/repository/org/apache/flink folder)? On Mon, Aug 1, 2016 at 5:56 PM, Aljoscha Krettek wrote: > I tried it again now. I did: > > rm -r .m2/repository > mvn clean verify -Dhadoop.version=2.6.0 > > failed again. Also with versions 2.6.1 and 2.6.3. > > On Mon, 1 Aug 2016 at 08:23 Maximilian Michels wrote: > > > This is also a major issue for batch with off-heap memory and memory > > preallocation turned off: > > https://issues.apache.org/jira/browse/FLINK-4094 > > Not hard to fix though as we simply need to reliably clear the direct > > memory instead of relying on garbage collection. Another possible fix > > is to maintain memory pools independently of the preallocation mode. I > > think this is fine because preallocation:false suggests that no memory > > will be preallocated but not that memory will be freed once acquired. > > >
Re: Map Reduce Sorting
Hi Hilmi, the results of the combiner are usually not completely sorted and if they are this property is not leveraged. This is due to the following reasons: 1) a sort-combiner only sorts as much data as fits into memory. If there is more data, the result consists of multiple sorted sequences. 2) since recently, Flink features a hash-based combiner which is usually more efficient and does not produce sorted output. 3) Flink's pipelined shipping strategy would require that the receiver merges the result records from all senders on the fly while receiving data via the network. In case of a straggling sender task all other senders would be blocked due to backpressure. In addition, this would only work if the combiner does a full sort and not several in-memory sorts. So, a Reducer will always do a full sort of all received data before applying the Reduce function (if available, a combiner is applied before data is written to disk in case of an external sort). Hope this helps, Fabian 2016-08-01 18:25 GMT+02:00 Hilmi Yildirim : > Hi, > > I have a question regarding when data points are sorted when applying a > simple Map Reduce Job. > > I have the following code: > > data = readFromSource() > > data.map().groupBy(0).reduce(...) > > This code will be translated into the following execution plan: > > map -> combiner -> hash partitioning and sorting on 0 -> reduce. > > > If I am right then the combiner firstly sorts the data, then it applies > the combine function, and then it partitions the result. > > Now the partitions are consumed by the reducers. For each mapper/combiner > machine, the reducer has an input gateway. For example, the mappers and > combiners run on 10 machines, then each reducer has 10 input gateways. Now, > the reducer consumes the data via a MutableObjectIterator. This iterator > firstly consumes data from one input gateway, then from the other and so > on. Is the data of a single input gateway already sorted? Because the > combiner function has sorted the data already. Is the order of the data > points maintained after they are sent through the network? > > In my code, the MutableObjectIterator instances are subclasses of > NormalizedKeySorter. Does this mean that the data from an input gateway is > firstly sorted before it is handover to the reduce function? Is this > because the order of the data points is not mainted after sending through > the network? > > > It would be nice if someone can answer my question. If my assumptions are > wrong, please correct me :) > > > BR, > > Hilmi > > > > > -- > == > Hilmi Yildirim, M.Sc. > Researcher > > DFKI GmbH > Intelligente Analytik für Massendaten > DFKI Projektbüro Berlin > Alt-Moabit 91c > D-10559 Berlin > Phone: +49 30 23895 1814 > > E-Mail: hilmi.yildi...@dfki.de > > - > Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH > Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern > > Geschaeftsfuehrung: > Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender) > Dr. Walter Olthoff > > Vorsitzender des Aufsichtsrats: > Prof. Dr. h.c. Hans A. Aukes > > Amtsgericht Kaiserslautern, HRB 2313 > - > >
Re: [VOTE] Release Apache Flink 1.1.0 (RC1)
@Ufuk: 3.3.9, that's probably it because that messes with the shading, right? @Stephan: Yes, even did a "rm -r .m2/repository". But the maven version is most likely the reason. On Mon, 1 Aug 2016 at 10:59 Stephan Ewen wrote: > @Aljoscha: Have you made sure you have a clean maven cache (remove the > .m2/repository/org/apache/flink folder)? > > On Mon, Aug 1, 2016 at 5:56 PM, Aljoscha Krettek > wrote: > > > I tried it again now. I did: > > > > rm -r .m2/repository > > mvn clean verify -Dhadoop.version=2.6.0 > > > > failed again. Also with versions 2.6.1 and 2.6.3. > > > > On Mon, 1 Aug 2016 at 08:23 Maximilian Michels wrote: > > > > > This is also a major issue for batch with off-heap memory and memory > > > preallocation turned off: > > > https://issues.apache.org/jira/browse/FLINK-4094 > > > Not hard to fix though as we simply need to reliably clear the direct > > > memory instead of relying on garbage collection. Another possible fix > > > is to maintain memory pools independently of the preallocation mode. I > > > think this is fine because preallocation:false suggests that no memory > > > will be preallocated but not that memory will be freed once acquired. > > > > > >