Re: Introduction

2016-08-01 Thread Stephan Ewen
Welcome Neelesh!


On Mon, Aug 1, 2016 at 4:47 AM, Márton Balassi 
wrote:

> Welcome Neelesh, great to have you here. :)
>
> On Sun, Jul 31, 2016, 11:08 Neelesh Salian  wrote:
>
> > Hello folks,
> >
> > I am Neelesh Salian; I recently joined the Flink community and I wanted
> to
> > take this opportunity to formally introduce myself.
> >
> > I have been working with the Hadoop and Spark ecosystems over the past
> two
> > years and found Flink really interesting in the Streaming use case.
> >
> >
> > Excited to start working and help build the community. :)
> > Thank you.
> > Have a great Sunday.
> >
> >
> > --
> > Neelesh Srinivas Salian
> > Customer Operations Engineer
> >
>


[jira] [Created] (FLINK-4285) Non-existing example in Flink quickstart setup documentation

2016-08-01 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-4285:
--

 Summary: Non-existing example in Flink quickstart setup 
documentation
 Key: FLINK-4285
 URL: https://issues.apache.org/jira/browse/FLINK-4285
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Examples
Reporter: Tzu-Li (Gordon) Tai
Priority: Minor
 Fix For: 1.1.0


In 
https://ci.apache.org/projects/flink/flink-docs-master/quickstart/setup_quickstart.html,
the {{examples/streaming/SocketTextStreamWordCount.jar}} example doesn't exist 
anymore in the distributed package.

It was previously removed in 
https://github.com/apache/flink/commit/986d5368fdb84c44111994b667ce1fc5f9992716.
We should either add the {{SocketTextStreamWordCount}} example back or change the 
documentation to use another example, probably {{SocketWindowWordCount}}.
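If the docs are switched to {{SocketWindowWordCount}}, the quickstart commands 
could look roughly like this (a sketch; the {{--port}} argument is an assumption 
about that example's CLI):

{code}
# start a local text server that the example reads from
$ nc -l 9000

# in another shell: submit the example against the local cluster
$ bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
{code}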



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Introduction

2016-08-01 Thread Ufuk Celebi
On Sun, Jul 31, 2016 at 8:07 PM, Neelesh Salian  wrote:
> I am Neelesh Salian; I recently joined the Flink community and I wanted to
> take this opportunity to formally introduce myself.

Thanks and welcome! :-)


[jira] [Created] (FLINK-4286) Have Kafka examples that use the Kafka 0.9 connector

2016-08-01 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-4286:
--

 Summary: Have Kafka examples that use the Kafka 0.9 connector
 Key: FLINK-4286
 URL: https://issues.apache.org/jira/browse/FLINK-4286
 Project: Flink
  Issue Type: Improvement
  Components: Examples
Reporter: Tzu-Li (Gordon) Tai
Priority: Minor


The {{ReadFromKafka}} and {{WriteIntoKafka}} examples use the 0.8 connector, 
and the built example jar is named {{Kafka.jar}} under {{examples/streaming/}} 
in the distributed package.

Since we have different connectors for different Kafka versions, it would be 
good to have examples for different versions, and package them as 
{{Kafka08.jar}} and {{Kafka09.jar}}.
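A rough sketch of how a 0.9 example could look (imports and the surrounding main 
method omitted; topic names and properties are placeholders, and it assumes the 
existing {{FlinkKafkaConsumer09}} / {{FlinkKafkaProducer09}} connector classes):

{code}
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
props.setProperty("group.id", "kafka09-example");

// read from Kafka with the 0.9 consumer
DataStream<String> messages = env.addSource(
    new FlinkKafkaConsumer09<>("input-topic", new SimpleStringSchema(), props));

// write back with the 0.9 producer
messages.addSink(
    new FlinkKafkaProducer09<>("localhost:9092", "output-topic", new SimpleStringSchema()));

env.execute("Kafka 0.9 Example");
{code}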



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Introduction

2016-08-01 Thread Kevin Jacobs

Hi!

Welcome to the community :-)!


On 01.08.2016 09:51, Ufuk Celebi wrote:

On Sun, Jul 31, 2016 at 8:07 PM, Neelesh Salian  wrote:

I am Neelesh Salian; I recently joined the Flink community and I wanted to
take this opportunity to formally introduce myself.

Thanks and welcome! :-)




Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

2016-08-01 Thread Stephan Ewen
The Collector is pretty integral in all the other functions that return
multiple elements. I honestly don't see us switching away from it, given
that it is such a core part of the API.

The close() method has, to my best knowledge, not caused issues, yet. I
cannot recall anyone mentioning that the close() method confused them, they
accidentally called it, etc.
I am wondering whether this is more of a theoretical than practical issue.

If we move one function away from Collector to be on the safe side for a
"maybe change in the future", while keeping Collector in all other
functions - I think that fragments the API concept wise more than it
improves anything.

On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek 
wrote:

> @Stephan: For the Output, should we keep using a Collector (which exposes)
> the close() method which should never be called by users or create a new
> Output type that only has an "output" method. Collector can also be used
> but with a close() method that doesn't do anything. In the long run, I
> thought it might be better to switch the type away from Collector.
>
> Cheers,
> Aljoscha
>
> On Wed, 20 Jul 2016 at 01:25 Maximilian Michels  wrote:
>
> > I think it looks like Beam rather than Hadoop :)
> >
> > What Stephan meant was that he wanted a dedicated output method in the
> > ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
> > the collector.
> >
> > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek 
> > wrote:
> > > You mean keep the Collector? I don't like that one because it has the
> > > close() method that should never be called by the user.
> > >
> > > We can keep it, though, because all the other user function interfaces
> > also
> > > expose it to the user.
> > >
> > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen  wrote:
> > >
> > >> I would actually make the output a separate parameter as well. Pretty
> > much
> > >> like the old variant, only replacing the "Window" parameter by the
> > context
> > >> (which contains everything about the window).
> > >> It could also be called "WindowInvocationContext" or so.
> > >>
> > >> The current variant looks too Hadoop to me ;-) Everything done on the
> > >> context object, and messy mocking when creating tests.
> > >>
> > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <
> radu.tudo...@huawei.com>
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > Sorry - I made a mistake - I was thinking of getting access to the
> > >> > collection (mist-read :) collector) of events in the window buffer
> in
> > >> > order to be able to delete/evict some of them which are not
> necessary
> > the
> > >> > last ones.
> > >> >
> > >> >
> > >> > Radu
> > >> >
> > >> > -Original Message-
> > >> > From: Aljoscha Krettek [mailto:aljos...@apache.org]
> > >> > Sent: Monday, July 18, 2016 5:54 PM
> > >> > To: dev@flink.apache.org
> > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > >> >
> > >> > What about the collector? This is only used for emitting elements to
> > the
> > >> > downstream operation.
> > >> >
> > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran 
> > >> wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > I think it looks good and most importantly is that we can extend
> it
> > in
> > >> > > the directions discussed so far.
> > >> > >
> > >> > > One question though regarding the Collector - are we going to be
> > able
> > >> > > to delete random elements from the list if this is not exposed as
> a
> > >> > > collection, at least to the evictor? If not, how are we going to
> > >> > > extend in the future to cover this case?
> > >> > >
> > >> > > Regarding the ordering - I also observed that there are situations
> > >> > > where elements do not have a logical order. One example is if you
> > have
> > >> > > high rates of the events. Nevertheless, even if now is not the
> time
> > >> > > for this, I think in the future we can imagine having also some
> data
> > >> > > structures that offer some ordering. It can save some computation
> > >> > > efforts later in the functions for some use cases.
> > >> > >
> > >> > >
> > >> > > Best regards,
> > >> > >
> > >> > >
> > >> > > -Original Message-
> > >> > > From: Aljoscha Krettek [mailto:aljos...@apache.org]
> > >> > > Sent: Monday, July 18, 2016 3:45 PM
> > >> > > To: dev@flink.apache.org
> > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > >> > >
> > >> > > I incorporated the changes. The proposed interface of
> > >> > > ProcessWindowFunction is now this:
> > >> > >
> > >> > > public abstract class ProcessWindowFunction<IN, OUT, KEY, W extends
> > >> > > Window> implements Function {
> > >> > >
> > >> > >     public abstract void process(KEY key, Iterable<IN> elements,
> > >> > >         Context ctx) throws Exception;
> > >> > >
> > >> > >     public abstract class Context {
> > >> > >         public abstract W window();
> > >> > >         public abstract void output(OUT value);
> > >> > >     }
> > >> > > }
> > >> > >
> > >> > > I'

[jira] [Created] (FLINK-4287) Unable to access secured HBase from a yarn-session.

2016-08-01 Thread Niels Basjes (JIRA)
Niels Basjes created FLINK-4287:
---

 Summary: Unable to access secured HBase from a yarn-session.
 Key: FLINK-4287
 URL: https://issues.apache.org/jira/browse/FLINK-4287
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Affects Versions: 1.0.3
Reporter: Niels Basjes
Assignee: Niels Basjes


When I start {{yarn-session.sh -n1}} against a Kerberos secured Yarn+HBase 
cluster I see this in the messages:

{quote}
2016-08-01 09:53:01,763 INFO  org.apache.flink.yarn.Utils   
- Attempting to obtain Kerberos security token for HBase
2016-08-01 09:53:01,763 INFO  org.apache.flink.yarn.Utils   
- HBase is not available (not packaged with this application): 
ClassNotFoundException : "org.apache.hadoop.hbase.HBaseConfiguration".
{quote}

As a consequence, it has become impossible to access a secured HBase from this 
yarn session.

From what I see now, at least two things need to be done:
# Add all relevant HBase parts to the yarn-session.sh scripting.
# Add an option to pass a principal and keytab file so the session can last 
longer than the Kerberos tickets do (i.e., pass these parameters into a call to 
{{UserGroupInformation.loginUserFromKeytab(user, keytabFile);}}); see the sketch 
below.
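A sketch of what (2) could look like in the session bootstrap code (the CLI 
option names and the {{cmd}} variable are assumptions for illustration):

{code}
// hypothetical options, e.g. parsed with commons-cli from -principal / -keytab flags
String principal = cmd.getOptionValue("principal");
String keytabPath = cmd.getOptionValue("keytab");

// log in from the keytab so the session is not bound to the initial ticket lifetime
UserGroupInformation.loginUserFromKeytab(principal, keytabPath);
{code}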

I do see that this would leave an important problem open:

This yarn-session is accessible by everyone on the cluster, and as a consequence 
they can run jobs in it that can access all the data I have access to. Perhaps 
this should be a separate JIRA issue?





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Release Apache Flink 1.1.0 (RC1)

2016-08-01 Thread Stephan Ewen
Just tried to reproduce the error reported by Aljoscha, but could not.
I used a clean checkout of the RC1 code and cleaned all local Maven
caches before the testing.

@Aljoscha: Can you reproduce this on your machine? Can you try and clean
the maven caches?

On Sun, Jul 31, 2016 at 7:31 PM, Ufuk Celebi  wrote:

> Probably related to shading :( What's strange is that Travis builds
> for Hadoop 2.6.3 with the release-1.1 branch do succeed (sometimes...
> Travis is super flakey at the moment, because of some corrupted cached
> dependencies): https://travis-ci.org/apache/flink/jobs/148348699
>
> On Fri, Jul 29, 2016 at 4:19 PM, Aljoscha Krettek 
> wrote:
> > When running "mvn clean verify" with Hadoop version 2.6.1 the
> > Zookeeper/Leader Election tests fail with this:
> >
> > java.lang.NoSuchMethodError:
> >
> org.apache.curator.utils.PathUtils.validatePath(Ljava/lang/String;)Ljava/lang/String;
> > at
> >
> org.apache.curator.framework.imps.NamespaceImpl.(NamespaceImpl.java:37)
> > at
> >
> org.apache.curator.framework.imps.CuratorFrameworkImpl.(CuratorFrameworkImpl.java:113)
> > at
> >
> org.apache.curator.framework.CuratorFrameworkFactory$Builder.build(CuratorFrameworkFactory.java:124)
> > at
> >
> org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:101)
> > at
> >
> org.apache.flink.runtime.util.ZooKeeperUtils.createLeaderRetrievalService(ZooKeeperUtils.java:143)
> > at
> >
> org.apache.flink.runtime.util.LeaderRetrievalUtils.createLeaderRetrievalService(LeaderRetrievalUtils.java:70)
> > at
> >
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderRetrievalTest.testTimeoutOfFindConnectingAddress(ZooKeeperLeaderRetrievalTest.java:187)
> >
> > I'll continue testing other parts and other Hadoop versions.
> >
> > On Wed, 27 Jul 2016 at 11:51 Ufuk Celebi  wrote:
> >
> >> Dear Flink community,
> >>
> >> Please vote on releasing the following candidate as Apache Flink version
> >> 1.1.0.
> >>
> >> I've CC'd u...@flink.apache.org as users are encouraged to help
> >> testing Flink 1.1.0 for their specific use cases. Please feel free to
> >> report issues and successful tests on dev@flink.apache.org.
> >>
> >> The commit to be voted on:
> >> 3a18463 (http://git-wip-us.apache.org/repos/asf/flink/commit/3a18463)
> >>
> >> Branch:
> >> release-1.1.0-rc1
> >> (
> >>
> https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.1.0-rc1
> >> )
> >>
> >> The release artifacts to be voted on can be found at:
> >> http://people.apache.org/~uce/flink-1.1.0-rc1/
> >>
> >> The release artifacts are signed with the key with fingerprint 9D403309:
> >> http://www.apache.org/dist/flink/KEYS
> >>
> >> The staging repository for this release can be found at:
> >> https://repository.apache.org/content/repositories/orgapacheflink-1098
> >>
> >> There is also a Google doc to coordinate the testing efforts. This is
> >> a copy of the release document found in our Wiki:
> >>
> >>
> https://docs.google.com/document/d/1cDZGtnGJKLU1fLw8AE_FzkoDLOR8amYT2oc3mD0_lw4/edit?usp=sharing
> >>
> >> -
> >>
> >> Thanks to everyone who contributed to this release candidate.
> >>
> >> The vote is open for the next 3 days (not counting the weekend) and
> >> passes if a majority of at least three +1 PMC votes are cast.
> >>
> >> The vote ends on Monday August 1st, 2016.
> >>
> >> [ ] +1 Release this package as Apache Flink 1.1.0
> >> [ ] -1 Do not release this package, because ...
> >>
>


[jira] [Created] (FLINK-4288) Make it possible to unregister tables

2016-08-01 Thread Timo Walther (JIRA)
Timo Walther created FLINK-4288:
---

 Summary: Make it possible to unregister tables
 Key: FLINK-4288
 URL: https://issues.apache.org/jira/browse/FLINK-4288
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: Timo Walther


Table names cannot be changed yet: after registration you cannot modify the 
table behind a table name. Maybe this behavior is too restrictive; a sketch of 
the desired behavior follows.
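A hypothetical sketch of the desired behavior ({{unregisterTable}} does not 
exist; the method name is an assumption):

{code}
tableEnv.registerTable("orders", ordersV1);
// currently there is no way to rebind the name "orders"
tableEnv.unregisterTable("orders");           // hypothetical new method
tableEnv.registerTable("orders", ordersV2);   // now possible
{code}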



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4289) Source files have executable flag set

2016-08-01 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4289:
--

 Summary: Source files have executable flag set
 Key: FLINK-4289
 URL: https://issues.apache.org/jira/browse/FLINK-4289
 Project: Flink
  Issue Type: Bug
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi
Priority: Minor


Running {{find . -executable -type f}} lists the following source {{.java}} 
files as executable:

{code}
./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/SumFunction.java
./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/ApplyFunction.java
./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/GatherFunction.java
./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/GatherSumApplyIteration.java
./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/gsa/Neighbor.java
./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java
./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/library/GSASingleSourceShortestPaths.java
./flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/library/GSAConnectedComponents.java
./flink-libraries/flink-gelly-examples/src/test/java/org/apache/flink/graph/test/GatherSumApplyITCase.java
./flink-libraries/flink-gelly-examples/src/main/java/org/apache/flink/graph/examples/GSASingleSourceShortestPaths.java
./flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/CurrentUserRetweet.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Size.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Entities.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/URL.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/UserMention.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Media.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/HashTags.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/entities/Symbol.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/Tweet.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/Contributors.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/tweet/Coordinates.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/places/BoundingBox.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/places/Attributes.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/places/Places.java
./flink-contrib/flink-tweet-inputformat/src/main/java/org/apache/flink/contrib/tweetinputformat/model/User/Users.java
{code}
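A possible cleanup one-liner (a sketch, assuming GNU find on a POSIX system):

{code}
find . -name '*.java' -type f -executable -exec chmod a-x {} +
{code}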



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4290) CassandraConnectorTest deadlocks

2016-08-01 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4290:
---

 Summary: CassandraConnectorTest deadlocks
 Key: FLINK-4290
 URL: https://issues.apache.org/jira/browse/FLINK-4290
 Project: Flink
  Issue Type: Bug
  Components: Streaming Connectors
Affects Versions: 1.0.3
Reporter: Stephan Ewen
Priority: Critical


The {{CassandraConnectorTest}} encountered a full deadlock on my latest test 
run.

Stack dump of the JVM is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4291) No log entry for unscheduled reporters

2016-08-01 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-4291:
---

 Summary: No log entry for unscheduled reporters
 Key: FLINK-4291
 URL: https://issues.apache.org/jira/browse/FLINK-4291
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Affects Versions: 1.1.0
Reporter: Chesnay Schepler
Priority: Trivial


When a non-Scheduled reporter is configured, no log message is printed. It would 
be nice to print a log message for every instantiated reporter, not just 
Scheduled ones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4292) HCatalog project incorrectly set up

2016-08-01 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4292:
---

 Summary: HCatalog project incorrectly set up
 Key: FLINK-4292
 URL: https://issues.apache.org/jira/browse/FLINK-4292
 Project: Flink
  Issue Type: Bug
  Components: Batch Connectors and Input/Output Formats
Affects Versions: 1.0.3
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Priority: Critical
 Fix For: 1.1.1, 1.2.0


The {{HCatalog}} project shows errors in IntelliJ because it is missing the 
Scala SDK dependency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4293) Malformatted Apache Headers

2016-08-01 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4293:
---

 Summary: Malformatted Apache Headers
 Key: FLINK-4293
 URL: https://issues.apache.org/jira/browse/FLINK-4293
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.0.3
Reporter: Stephan Ewen


Several files contain this header:

{code}
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 * 
 * http://www.apache.org/licenses/LICENSE-2.0
 * 
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
{code}

The correct header format should be:

{code}
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
{code}
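A sketch for locating affected files (assumes GNU awk; prints every file whose 
first line is the Javadoc-style {{/**}} opener):

{code}
find . -name '*.java' -exec awk 'FNR == 1 && /^\/\*\*/ { print FILENAME } { nextfile }' {} +
{code}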



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Introduction

2016-08-01 Thread Till Rohrmann
Welcome to the community Neelesh :-)

On Mon, Aug 1, 2016 at 3:53 PM, Kevin Jacobs  wrote:

> Hi!
>
> Welcome to the community :-)!
>
>
>
> On 01.08.2016 09:51, Ufuk Celebi wrote:
>
>> On Sun, Jul 31, 2016 at 8:07 PM, Neelesh Salian 
>> wrote:
>>
>>> I am Neelesh Salian; I recently joined the Flink community and I wanted
>>> to
>>> take this opportunity to formally introduce myself.
>>>
>> Thanks and welcome! :-)
>>
>
>


[jira] [Created] (FLINK-4294) Allow access of composite type fields

2016-08-01 Thread Timo Walther (JIRA)
Timo Walther created FLINK-4294:
---

 Summary: Allow access of composite type fields
 Key: FLINK-4294
 URL: https://issues.apache.org/jira/browse/FLINK-4294
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Reporter: Timo Walther
Assignee: Timo Walther


Currently all Flink CompositeTypes are treated as GenericRelDataTypes. It would 
be better to be able to access individual fields of composite types too, e.g.:

{code}
SELECT composite.name FROM composites
SELECT tuple.f0 FROM tuples
'f0.getField(0)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4295) CoGroupSortTranslationTest#testGroupSortTuplesDefaultCoGroup fails to run

2016-08-01 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-4295:
---

 Summary: 
CoGroupSortTranslationTest#testGroupSortTuplesDefaultCoGroup fails to run
 Key: FLINK-4295
 URL: https://issues.apache.org/jira/browse/FLINK-4295
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.1.0
Reporter: Chesnay Schepler
Priority: Minor


When running the test you always get an exception:

08/01/2016 15:03:34 Job execution switched to status RUNNING.
08/01/2016 15:03:34 DataSource (at 
org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566)
 (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to 
SCHEDULED 
08/01/2016 15:03:34 DataSource (at 
org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566)
 (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to 
DEPLOYING 
08/01/2016 15:03:34 DataSource (at 
org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566)
 (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to 
SCHEDULED 
08/01/2016 15:03:34 DataSource (at 
org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566)
 (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to 
DEPLOYING 
08/01/2016 15:03:34 DataSource (at 
org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566)
 (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to RUNNING 
08/01/2016 15:03:34 DataSource (at 
org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566)
 (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to RUNNING 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(1/4)
 switched to SCHEDULED 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(3/4)
 switched to SCHEDULED 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(3/4)
 switched to DEPLOYING 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(1/4)
 switched to DEPLOYING 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(4/4)
 switched to SCHEDULED 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(4/4)
 switched to DEPLOYING 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(2/4)
 switched to SCHEDULED 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(2/4)
 switched to DEPLOYING 
08/01/2016 15:03:35 DataSource (at 
org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566)
 (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to 
FINISHED 
08/01/2016 15:03:35 DataSource (at 
org.apache.flink.api.scala.ExecutionEnvironment.fromElements(ExecutionEnvironment.scala:566)
 (org.apache.flink.api.java.io.CollectionInputFormat))(1/1) switched to 
FINISHED 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(3/4)
 switched to RUNNING 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(1/4)
 switched to RUNNING 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(2/4)
 switched to RUNNING 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(4/4)
 switched to RUNNING 
08/01/2016 15:03:35 DataSink (collect())(4/4) switched to SCHEDULED 
08/01/2016 15:03:35 DataSink (collect())(1/4) switched to SCHEDULED 
08/01/2016 15:03:35 DataSink (collect())(4/4) switched to DEPLOYING 
08/01/2016 15:03:35 DataSink (collect())(1/4) switched to DEPLOYING 
08/01/2016 15:03:35 DataSink (collect())(3/4) switched to SCHEDULED 
08/01/2016 15:03:35 DataSink (collect())(3/4) switched to DEPLOYING 
08/01/2016 15:03:35 CoGroup (CoGroup at 
org.apache.flink.api.scala.UnfinishedCoGroupOperation.finish(UnfinishedCoGroupOperation.scala:46))(4/4)
 switched to FINISHED 
08/01/2016 15:03:

[jira] [Created] (FLINK-4296) Scheduler accepts more tasks than it has task slots available

2016-08-01 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-4296:
-

 Summary: Scheduler accepts more tasks than it has task slots 
available
 Key: FLINK-4296
 URL: https://issues.apache.org/jira/browse/FLINK-4296
 Project: Flink
  Issue Type: Bug
  Components: JobManager, TaskManager
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Priority: Critical
 Fix For: 1.1.0, 1.2.0


Flink's scheduler doesn't support queued scheduling but expects to find all 
necessary task slots upon scheduling; if it does not, it throws an error. Due to 
some changes in the latest master, this seems to be broken.

Flink accepts jobs with {{parallelism > total number of task slots}}, schedules 
and deploys tasks in all available task slots, and leaves the remaining tasks 
lingering forever.

Easy to reproduce: 
{code}
./bin/flink run -p TASK_SLOTS+n
{code} 

where {{TASK_SLOTS}} is the number of total task slots of the cluster and 
{{n>=1}}.

Here, {{p=11}}, {{TASK_SLOTS=10}}:

{noformat}
Cluster configuration: Standalone cluster with JobManager at 
localhost/127.0.0.1:6123
Using address localhost:6123 to connect to JobManager.
JobManager web interface address http://localhost:8081
Starting execution of program
Executing EnumTriangles example with default edges data set.
Use --edges to specify file input.
Printing result to stdout. Use --output to specify output path.
Submitting job with JobID: cd0c0b4cbe25643d8d92558168cfc045. Waiting for job 
completion.
08/01/2016 12:12:12 Job execution switched to status RUNNING.
08/01/2016 12:12:12 CHAIN DataSource (at 
getDefaultEdgeDataSet(EnumTrianglesData.java:57) 
(org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at 
main(EnumTriangles.java:108))(1/1) switched to SCHEDULED
08/01/2016 12:12:12 CHAIN DataSource (at 
getDefaultEdgeDataSet(EnumTrianglesData.java:57) 
(org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at 
main(EnumTriangles.java:108))(1/1) switched to DEPLOYING
08/01/2016 12:12:12 CHAIN DataSource (at 
getDefaultEdgeDataSet(EnumTrianglesData.java:57) 
(org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at 
main(EnumTriangles.java:108))(1/1) switched to RUNNING
08/01/2016 12:12:12 CHAIN DataSource (at 
getDefaultEdgeDataSet(EnumTrianglesData.java:57) 
(org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at 
main(EnumTriangles.java:108))(1/1) switched to FINISHED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(1/11) switched to SCHEDULED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(3/11) switched to SCHEDULED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(2/11) switched to SCHEDULED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(7/11) switched to SCHEDULED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(7/11) switched to DEPLOYING
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(6/11) switched to SCHEDULED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(4/11) switched to SCHEDULED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(5/11) switched to SCHEDULED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(4/11) switched to DEPLOYING
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(3/11) switched to DEPLOYING
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(9/11) switched to SCHEDULED
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(9/11) switched to DEPLOYING
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(5/11) switched to DEPLOYING
08/01/2016 12:12:12 GroupReduce (GroupReduce at 
main(EnumTriangles.java:112))(1/11) switched to DEPLOYING
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(1/11) 
switched to SCHEDULED
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(1/11) 
switched to DEPLOYING
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(2/11) 
switched to SCHEDULED
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(2/11) 
switched to DEPLOYING
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(3/11) 
switched to SCHEDULED
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(3/11) 
switched to DEPLOYING
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(4/11) 
switched to SCHEDULED
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(4/11) 
switched to DEPLOYING
08/01/2016 12:12:12 Join(Join at main(EnumTriangles.java:114))(5/11) 
switched to SCHEDULED
08/01/2016 12:12:12 Join(Join at main

Re: [VOTE] Release Apache Flink 1.1.0 (RC1)

2016-08-01 Thread Maximilian Michels
Thanks for the new release candidate Ufuk!

Found two issues during testing:

1) Scheduling: The Flink scheduler accepts (it shouldn't) jobs with
parallelism > total number of task slots, schedules tasks in all
available task slots, and leaves the remaining tasks lingering
forever. Haven't had time to investigate much, but a bit more details
here:
=> JIRA: https://issues.apache.org/jira/browse/FLINK-4296

2) Yarn encoding issues with special characters in the automatically
determined location of the fat jar
=> JIRA: https://issues.apache.org/jira/browse/FLINK-4297
=> Fix: https://github.com/apache/flink/pull/2320

Otherwise, looks pretty good so far :)

On Mon, Aug 1, 2016 at 10:27 AM, Stephan Ewen  wrote:
> Just tried to reproduce the error reported by Aljoscha, but could not.
> I used a clean checkout of the RC1 code and cleaned all local Maven caches
> before the testing.
>
> @Aljoscha: Can you reproduce this on your machine? Can you try and clean the
> maven caches?
>
> On Sun, Jul 31, 2016 at 7:31 PM, Ufuk Celebi  wrote:
>>
>> Probably related to shading :( What's strange is that Travis builds
>> for Hadoop 2.6.3 with the release-1.1 branch do succeed (sometimes...
>> Travis is super flakey at the moment, because of some corrupted cached
>> dependencies): https://travis-ci.org/apache/flink/jobs/148348699
>>
>> On Fri, Jul 29, 2016 at 4:19 PM, Aljoscha Krettek 
>> wrote:
>> > When running "mvn clean verify" with Hadoop version 2.6.1 the
>> > Zookeeper/Leader Election tests fail with this:
>> >
>> > java.lang.NoSuchMethodError:
>> >
>> > org.apache.curator.utils.PathUtils.validatePath(Ljava/lang/String;)Ljava/lang/String;
>> > at
>> >
>> > org.apache.curator.framework.imps.NamespaceImpl.(NamespaceImpl.java:37)
>> > at
>> >
>> > org.apache.curator.framework.imps.CuratorFrameworkImpl.(CuratorFrameworkImpl.java:113)
>> > at
>> >
>> > org.apache.curator.framework.CuratorFrameworkFactory$Builder.build(CuratorFrameworkFactory.java:124)
>> > at
>> >
>> > org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:101)
>> > at
>> >
>> > org.apache.flink.runtime.util.ZooKeeperUtils.createLeaderRetrievalService(ZooKeeperUtils.java:143)
>> > at
>> >
>> > org.apache.flink.runtime.util.LeaderRetrievalUtils.createLeaderRetrievalService(LeaderRetrievalUtils.java:70)
>> > at
>> >
>> > org.apache.flink.runtime.leaderelection.ZooKeeperLeaderRetrievalTest.testTimeoutOfFindConnectingAddress(ZooKeeperLeaderRetrievalTest.java:187)
>> >
>> > I'll continue testing other parts and other Hadoop versions.
>> >
>> > On Wed, 27 Jul 2016 at 11:51 Ufuk Celebi  wrote:
>> >
>> >> Dear Flink community,
>> >>
>> >> Please vote on releasing the following candidate as Apache Flink
>> >> version
>> >> 1.1.0.
>> >>
>> >> I've CC'd u...@flink.apache.org as users are encouraged to help
>> >> testing Flink 1.1.0 for their specific use cases. Please feel free to
>> >> report issues and successful tests on dev@flink.apache.org.
>> >>
>> >> The commit to be voted on:
>> >> 3a18463 (http://git-wip-us.apache.org/repos/asf/flink/commit/3a18463)
>> >>
>> >> Branch:
>> >> release-1.1.0-rc1
>> >> (
>> >>
>> >> https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.1.0-rc1
>> >> )
>> >>
>> >> The release artifacts to be voted on can be found at:
>> >> http://people.apache.org/~uce/flink-1.1.0-rc1/
>> >>
>> >> The release artifacts are signed with the key with fingerprint
>> >> 9D403309:
>> >> http://www.apache.org/dist/flink/KEYS
>> >>
>> >> The staging repository for this release can be found at:
>> >> https://repository.apache.org/content/repositories/orgapacheflink-1098
>> >>
>> >> There is also a Google doc to coordinate the testing efforts. This is
>> >> a copy of the release document found in our Wiki:
>> >>
>> >>
>> >> https://docs.google.com/document/d/1cDZGtnGJKLU1fLw8AE_FzkoDLOR8amYT2oc3mD0_lw4/edit?usp=sharing
>> >>
>> >> -
>> >>
>> >> Thanks to everyone who contributed to this release candidate.
>> >>
>> >> The vote is open for the next 3 days (not counting the weekend) and
>> >> passes if a majority of at least three +1 PMC votes are cast.
>> >>
>> >> The vote ends on Monday August 1st, 2016.
>> >>
>> >> [ ] +1 Release this package as Apache Flink 1.1.0
>> >> [ ] -1 Do not release this package, because ...
>> >>
>
>


[jira] [Created] (FLINK-4297) Yarn client can't determine fat jar location if path contains spaces

2016-08-01 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-4297:
-

 Summary: Yarn client can't determine fat jar location if path 
contains spaces
 Key: FLINK-4297
 URL: https://issues.apache.org/jira/browse/FLINK-4297
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Reporter: Maximilian Michels
Assignee: Maximilian Michels
 Fix For: 1.1.0, 1.2.0


The code that automatically determines the fat jar path through the 
ProtectionDomain of the Yarn class receives a possibly URL-encoded path string. 
We need to decode it using the system locale encoding; otherwise we can get 
errors like the following when spaces are in the file path: 

{noformat}
Caused by: java.io.FileNotFoundException: File 
file:/Users/max/Downloads/release-testing/flink-1.1.0-rc1/flink-1.1.0/build%20target/lib/flink-dist_2.11-1.1.0.jar
 does not exist
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:511)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
at 
org.apache.hadoop.fs.LocalFileSystem.copyFromLocalFile(LocalFileSystem.java:82)
at 
org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1836)
at org.apache.flink.yarn.Utils.setupLocalResource(Utils.java:129)
at 
org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:616)
at 
org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:365)
... 6 more
{noformat}
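A sketch of the decoding step described above (ignoring the checked 
{{UnsupportedEncodingException}} for brevity; {{clazz}} stands for the Yarn 
client class whose code source locates the fat jar):

{code}
String rawPath = clazz.getProtectionDomain().getCodeSource().getLocation().getPath();
// turn e.g. "build%20target" back into "build target" using the default encoding
String decodedPath = URLDecoder.decode(rawPath, Charset.defaultCharset().name());
{code}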



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4298) Clean up Storm Compatibility Dependencies

2016-08-01 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4298:
---

 Summary: Clean up Storm Compatibility Dependencies
 Key: FLINK-4298
 URL: https://issues.apache.org/jira/browse/FLINK-4298
 Project: Flink
  Issue Type: Bug
  Components: Storm Compatibility
Affects Versions: 1.0.3, 1.1.0
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.2.0


The {{flink-storm}} project contains unnecessary (transitive) dependencies

  - Google Guava
  - Ring Clojure Servlet Bindings

Particularly the last one is frequently troublesome in Maven builds (unstable 
downloads) and is not required, because the Storm Compatibility layer does not 
start a web UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4299) Show loss of job manager in Client

2016-08-01 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4299:
--

 Summary: Show loss of job manager in Client
 Key: FLINK-4299
 URL: https://issues.apache.org/jira/browse/FLINK-4299
 Project: Flink
  Issue Type: Improvement
  Components: Client
Reporter: Ufuk Celebi
 Fix For: 1.1.0


If the client loses the connection to a job manager and the job recovers from 
this, the client will only print the job status as {{RUNNING}} again. It is 
hard to actually notice that something went wrong and a job manager was lost.

{code}
...
08/01/2016 14:35:43 Flat Map -> Sink: Unnamed(8/8) switched to RUNNING
08/01/2016 14:35:43 Source: Custom Source(6/8) switched to RUNNING
<-- EVERYTHING'S RUNNING -->
08/01/2016 14:40:40 Job execution switched to status RUNNING <--- JOB 
MANAGER FAIL OVER
08/01/2016 14:40:40 Source: Custom Source(1/8) switched to SCHEDULED
08/01/2016 14:40:40 Source: Custom Source(1/8) switched to DEPLOYING
08/01/2016 14:40:40 Source: Custom Source(2/8) switched to SCHEDULED
...
{code}

After {{14:35:43}} everything is running and the client does not print any 
execution state updates. When the job manager fails, the job will be recovered 
and eventually enter the running state again (at {{14:40:40}}), but the user 
might never notice this.

I would like to improve on this by printing some messages about the state of 
the job manager connection. For example, between {{14:35:43}} and {{14:40:40}} 
it might say that the job manager connection was lost, a new one established, 
etc.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4300) Improve error message for different Scala versions of JM and Client

2016-08-01 Thread Timo Walther (JIRA)
Timo Walther created FLINK-4300:
---

 Summary: Improve error message for different Scala versions of JM 
and Client
 Key: FLINK-4300
 URL: https://issues.apache.org/jira/browse/FLINK-4300
 Project: Flink
  Issue Type: Improvement
  Components: Client, JobManager
Reporter: Timo Walther


If a user runs a job (e.g. via RemoteEnvironment) with different Scala versions 
of the JobManager and Client, the job is not executed and no proper error 
message is shown. 

The Client fails only with a meaningless warning:
{code}
16:59:58,677 WARN  akka.remote.ReliableDeliverySupervisor   
 - Association with remote system [akka.tcp://flink@127.0.0.1:6123] has failed, 
address is now gated for [5000] ms. Reason is: [Disassociated].
{code}

JobManager log only contains the following warning:
{code}
2016-08-01 16:59:58,664 WARN  akka.remote.ReliableDeliverySupervisor
- Association with remote system 
[akka.tcp://flink@192.168.1.142:63372] has failed, address is now gated for 
[5000] ms. Reason is: [scala.Option; local class incompatible: stream classdesc 
serialVersionUID = -2062608324514658839, local class serialVersionUID = 
-114498752079829388].
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Release Apache Flink 1.1.0 (RC1)

2016-08-01 Thread Maximilian Michels
This is also a major issue for batch with off-heap memory and memory
preallocation turned off:
https://issues.apache.org/jira/browse/FLINK-4094
Not hard to fix though as we simply need to reliably clear the direct
memory instead of relying on garbage collection. Another possible fix
is to maintain memory pools independently of the preallocation mode. I
think this is fine because preallocation:false suggests that no memory
will be preallocated but not that memory will be freed once acquired.


[jira] [Created] (FLINK-4301) Parameterize Flink version in Quickstart bash script

2016-08-01 Thread Timo Walther (JIRA)
Timo Walther created FLINK-4301:
---

 Summary: Parameterize Flink version in Quickstart bash script
 Key: FLINK-4301
 URL: https://issues.apache.org/jira/browse/FLINK-4301
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Timo Walther


The Flink version is hard-coded in the quickstart script (for Scala and Java). 
Thus, even if a user is in the Flink 1.0 docs, the script produces a quickstart 
for the Flink 1.1 release. It would be better if the one-liner in the 
documentation contained the version, so that a Flink 1.0 quickstart is built 
from the 1.0 documentation and a 1.1 quickstart from the 1.1 documentation; see 
the sketch below.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4302) Add JavaDocs to MetricConfig

2016-08-01 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4302:
--

 Summary: Add JavaDocs to MetricConfig
 Key: FLINK-4302
 URL: https://issues.apache.org/jira/browse/FLINK-4302
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.1.0
Reporter: Ufuk Celebi


{{MetricConfig}} has no comments at all. If you want to implement a custom 
reporter and implement its {{open}} method, a {{MetricConfig}} is its argument. 
It would be helpful to add a class-level JavaDoc stating where the config 
values come from, etc.

[~Zentol] what do you think?
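A sketch of what the class-level JavaDoc could say (wording is a suggestion, not 
existing documentation):

{code}
/**
 * A {@link java.util.Properties} map that holds the user-supplied configuration
 * entries for a single metrics reporter, as passed to the reporter's
 * {@code open(MetricConfig)} method.
 */
{code}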



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] FLIP-5 Only send data to each taskmanager once for broadcasts

2016-08-01 Thread Stephan Ewen
Hi Felix!

Hope this helps!

Concerning (1.1) - The producer does not think in term of number of target
TaskManagers. That number can, after all, change in the presence of a
failure and recovery. The producer should, for its own result, not care how
many consumers it will have (Tasks), but produce it only once.

Concerning (1.2) - Only "blocking" intermediate results can be consumed
multiple times. Data sent to broadcast variables must thus always be a
blocking intermediate result.

Greetings,
Stephan


On Wed, Jul 27, 2016 at 11:33 AM, Felix Neutatz 
wrote:

> Hi Stephan,
>
> thanks for the great ideas. First I have some questions:
>
> 1.1) Does every task generate an intermediate result partition for every
> target task or is that already implemented in a way so that there are only
> as many intermediate result partitions per task manager as target tasks?
> (Example: There are 2 task managers with 2 tasks each. Do we get 4
> intermediate result partitions per task manager or do we get 8?)
> 1.2) How can I consume an intermediate result partition multiple times?
> When I tried that I got the following exception:
> Caused by: java.lang.IllegalStateException: Subpartition 0 of
> dbe284e3b37c1df1b993a3f0a6020ea6@ce9fc38f08a5cc9e93431a9cbf740dcf is being
> or already has been consumed, but pipelined subpartitions can only be
> consumed once.
> at
>
> org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.createReadView(PipelinedSubpartition.java:179)
> at
>
> org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.createReadView(PipelinedSubpartition.java:36)
> at
>
> org.apache.flink.runtime.io.network.partition.ResultPartition.createSubpartitionView(ResultPartition.java:348)
> at
>
> org.apache.flink.runtime.io.network.partition.ResultPartitionManager.createSubpartitionView(ResultPartitionManager.java:81)
> at
>
> org.apache.flink.runtime.io.network.netty.PartitionRequestServerHandler.channelRead0(PartitionRequestServerHandler.java:98)
> at
>
> org.apache.flink.runtime.io.network.netty.PartitionRequestServerHandler.channelRead0(PartitionRequestServerHandler.java:41)
> at
>
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>
> My status update: Since Friday I am implementing your idea described in
> (2). Locally this approach already works (for less than 170 iterations). I
> will investigate further to solve that issue.
>
> But I am still not sure how to implement (1). Maybe we introduce a similar
> construct like the BroadcastVariableManager to share the RecordWriter among
> all tasks of a taskmanager. I am interested in your thoughts :)
>
> Best regards,
> Felix
>
> 2016-07-22 17:25 GMT+02:00 Stephan Ewen :
>
> > Hi Felix!
> >
> > Interesting suggestion. Here are some thoughts on the design.
> >
> > The two core changes needed to send data once to the TaskManagers are:
> >
> >   (1) Every sender needs to produce its stuff once (rather than for every
> > target task), there should not be redundancy there.
> >   (2) Every TaskManager should request the data once, other tasks in the
> > same TaskManager pick it up from there.
> >
> >
> > The current receiver-initiated pull model is actually a good abstraction
> > for that, I think.
> >
> > Lets look at (1):
> >
> >   - Currently, the TaskManagers have a separate intermediate result
> > partition for each target slot. They should rather have one intermediate
> > result partition (saves also repeated serialization) that is consumed
> > multiple times.
> >
> >   - Since the results that are to be broadcasted are always "blocking",
> > they can be consumed (pulled)  multiples times.
> >
> > Lets look at (2):
> >
> >   - The current BroadcastVariableManager has the functionality to let the
> > first accessor of the BC-variable materialize the result.
> >
> >   - It could be changed such that only the first accessor creates a
> > RecordReader, so the others do not even request the stream. That way, the
> > TaskManager should pull only one stream from each producing task, which
> > means the data is transferred once.
> >
> >
> > That would also work perfectly with the current failure / recovery model.
> >
> > What do you think?
> >
> > Stephan
> >
> >
> > On Fri, Jul 22, 2016 at 2:59 PM, Felix Neutatz 
> > wrote:
> >
> > > Hi everybody,
> > >
> > > I want to improve the performance of broadcasts in Flink. Therefore
> Till
> > > told me to start a FLIP on this topic to discuss how to go forward to
> > solve
> > > the current issues for broadcasts.
> > >
> > > The problem in a nutshell: Instead of sending data to each taskmanager
> > only
> > > once, at the moment the data is sent to each task. This means if there
> > are
> > > 3 slots on each taskmanager we will send the data 3 times instead of
> > once.
> > >
> > > There are multiple ways to tackle this problem and I started to do some
> > > research and investigate. You can follow my thought process here:
> > >
> > >
> > >
> >
> https://cwiki

Re: [VOTE] Release Apache Flink 1.1.0 (RC1)

2016-08-01 Thread Aljoscha Krettek
I tried it again now. I did:

rm -r .m2/repository
mvn clean verify -Dhadoop.version=2.6.0

failed again. Also with versions 2.6.1 and 2.6.3.

On Mon, 1 Aug 2016 at 08:23 Maximilian Michels  wrote:

> This is also a major issue for batch with off-heap memory and memory
> preallocation turned off:
> https://issues.apache.org/jira/browse/FLINK-4094
> Not hard to fix though as we simply need to reliably clear the direct
> memory instead of relying on garbage collection. Another possible fix
> is to maintain memory pools independently of the preallocation mode. I
> think this is fine because preallocation:false suggests that no memory
> will be preallocated but not that memory will be freed once acquired.
>


Re: [DISCUSS] FLIP-2 Extending Window Function Metadata

2016-08-01 Thread Aljoscha Krettek
Alright, that seems reasonable. I updated the doc to add the Collector to
the method signature again.

On Mon, 1 Aug 2016 at 00:59 Stephan Ewen  wrote:

> The Collector is pretty integral in all the other functions that return
> multiple elements. I honestly don't see us switching away from it, given
> that it is such a core part of the API.
>
> The close() method has, to my best knowledge, not caused issues, yet. I
> cannot recall anyone mentioning that the close() method confused them, they
> accidentally called it, etc.
> I am wondering whether this is more of a theoretical than practical issue.
>
> If we move one function away from Collector to be on the safe side for a
> "maybe change in the future", while keeping Collector in all other
> functions - I think that fragments the API concept wise more than it
> improves anything.
>
> On Sun, Jul 31, 2016 at 7:10 PM, Aljoscha Krettek 
> wrote:
>
> > @Stephan: For the Output, should we keep using a Collector (which
> exposes)
> > the close() method which should never be called by users or create a new
> > Output type that only has an "output" method. Collector can also be used
> > but with a close() method that doesn't do anything. In the long run, I
> > thought it might be better to switch the type away from Collector.
> >
> > Cheers,
> > Aljoscha
> >
> > On Wed, 20 Jul 2016 at 01:25 Maximilian Michels  wrote:
> >
> > > I think it looks like Beam rather than Hadoop :)
> > >
> > > What Stephan meant was that he wanted a dedicated output method in the
> > > ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
> > > the collector.
> > >
> > > On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <
> aljos...@apache.org>
> > > wrote:
> > > > You mean keep the Collector? I don't like that one because it has the
> > > > close() method that should never be called by the user.
> > > >
> > > > We can keep it, though, because all the other user function
> interfaces
> > > also
> > > > expose it to the user.
> > > >
> > > > On Tue, 19 Jul 2016 at 15:22 Stephan Ewen  wrote:
> > > >
> > > >> I would actually make the output a separate parameter as well.
> Pretty
> > > much
> > > >> like the old variant, only replacing the "Window" parameter by the
> > > context
> > > >> (which contains everything about the window).
> > > >> It could also be called "WindowInvocationContext" or so.
> > > >>
> > > >> The current variant looks too Hadoop to me ;-) Everything done on
> the
> > > >> context object, and messy mocking when creating tests.
> > > >>
> > > >> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <
> > radu.tudo...@huawei.com>
> > > >> wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > >> > Sorry - I made a mistake - I was thinking of getting access to the
> > > >> > collection (mist-read :) collector) of events in the window buffer
> > in
> > > >> > order to be able to delete/evict some of them which are not
> > necessary
> > > the
> > > >> > last ones.
> > > >> >
> > > >> >
> > > >> > Radu
> > > >> >
> > > >> > -Original Message-
> > > >> > From: Aljoscha Krettek [mailto:aljos...@apache.org]
> > > >> > Sent: Monday, July 18, 2016 5:54 PM
> > > >> > To: dev@flink.apache.org
> > > >> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > > >> >
> > > >> > What about the collector? This is only used for emitting elements
> to
> > > the
> > > >> > downstream operation.
> > > >> >
> > > >> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <
> radu.tudo...@huawei.com>
> > > >> wrote:
> > > >> >
> > > >> > > Hi,
> > > >> > >
> > > >> > > I think it looks good and most importantly is that we can extend
> > it
> > > in
> > > >> > > the directions discussed so far.
> > > >> > >
> > > >> > > One question though regarding the Collector - are we going to be
> > > able
> > > >> > > to delete random elements from the list if this is not exposed
> as
> > a
> > > >> > > collection, at least to the evictor? If not, how are we going to
> > > >> > > extend in the future to cover this case?
> > > >> > >
> > > >> > > Regarding the ordering - I also observed that there are
> situations
> > > >> > > where elements do not have a logical order. One example is if
> you
> > > have
> > > >> > > high rates of the events. Nevertheless, even if now is not the
> > time
> > > >> > > for this, I think in the future we can imagine having also some
> > data
> > > >> > > structures that offer some ordering. It can save some
> computation
> > > >> > > efforts later in the functions for some use cases.
> > > >> > >
> > > >> > >
> > > >> > > Best regards,
> > > >> > >
> > > >> > >
> > > >> > > -Original Message-
> > > >> > > From: Aljoscha Krettek [mailto:aljos...@apache.org]
> > > >> > > Sent: Monday, July 18, 2016 3:45 PM
> > > >> > > To: dev@flink.apache.org
> > > >> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
> > > >> > >
> > > >> > > I incorporated the changes. The proposed interface of
> > > >> > > ProcessWindowFunction is now this:
> > > >> 

Re: [VOTE] Release Apache Flink 1.1.0 (RC1)

2016-08-01 Thread Till Rohrmann
I think that FLINK-4094 is nice to fix but not a release blocker since we
know how to prevent this situation (setting preallocation to true).

On Mon, Aug 1, 2016 at 11:56 PM, Aljoscha Krettek 
wrote:

> I tried it again now. I did:
>
> rm -r .m2/repository
> mvn clean verify -Dhadoop.version=2.6.0
>
> failed again. Also with versions 2.6.1 and 2.6.3.
>
> On Mon, 1 Aug 2016 at 08:23 Maximilian Michels  wrote:
>
> > This is also a major issue for batch with off-heap memory and memory
> > preallocation turned off:
> > https://issues.apache.org/jira/browse/FLINK-4094
> > Not hard to fix though as we simply need to reliably clear the direct
> > memory instead of relying on garbage collection. Another possible fix
> > is to maintain memory pools independently of the preallocation mode. I
> > think this is fine because preallocation:false suggests that no memory
> > will be preallocated but not that memory will be freed once acquired.
> >
>


Map Reduce Sorting

2016-08-01 Thread Hilmi Yildirim

Hi,

I have a question regarding when data points are sorted when applying a 
simple MapReduce job.


I have the following code:

data = readFromSource()

data.map().groupBy(0).reduce(...)

This code will be translated into the following execution plan:

map -> combiner -> hash partitioning and sorting on 0 -> reduce.
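For concreteness, a runnable version of this sketch could look as follows
(inside a main(String[]) throws Exception; the source data, tuple types, and
functions are illustrative stand-ins):

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    DataSet<Tuple2<String, Integer>> data = env.fromElements(
            new Tuple2<>("a", 1), new Tuple2<>("b", 2), new Tuple2<>("a", 3));

    data.map(new MapFunction<Tuple2<String, Integer>, Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> map(Tuple2<String, Integer> t) {
                return t; // identity; stands in for the real map logic
            }
        })
        .groupBy(0) // key on the first tuple field
        .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> reduce(Tuple2<String, Integer> a,
                                                  Tuple2<String, Integer> b) {
                return new Tuple2<>(a.f0, a.f1 + b.f1); // sum values per key
            }
        })
        .print(); // triggers execution and prints the result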


If I am right, the combiner first sorts the data, then applies the 
combine function, and then partitions the result.


Now the partitions are consumed by the reducers. For each 
mapper/combiner machine, the reducer has an input gateway. For example, 
if the mappers and combiners run on 10 machines, then each reducer has 
10 input gateways. The reducer consumes the data via a 
MutableObjectIterator, which first consumes data from one input 
gateway, then from the next, and so on. Is the data of a single input 
gateway already sorted, given that the combine function has already 
sorted it? Is the order of the data points maintained after they are 
sent through the network?


In my code, the MutableObjectIterator instances are subclasses of 
NormalizedKeySorter. Does this mean that the data from an input gateway 
is first sorted before it is handed over to the reduce function? Is 
this because the order of the data points is not maintained after being 
sent through the network?



It would be nice if someone could answer my question. If my assumptions 
are wrong, please correct me :)



BR,

Hilmi




--
==
Hilmi Yildirim, M.Sc.
Researcher

DFKI GmbH
Intelligente Analytik für Massendaten
DFKI Projektbüro Berlin
Alt-Moabit 91c
D-10559 Berlin
Phone: +49 30 23895 1814

E-Mail: hilmi.yildi...@dfki.de

-
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern

Geschaeftsfuehrung:
Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
Dr. Walter Olthoff

Vorsitzender des Aufsichtsrats:
Prof. Dr. h.c. Hans A. Aukes

Amtsgericht Kaiserslautern, HRB 2313
-



Re: [VOTE] Release Apache Flink 1.1.0 (RC1)

2016-08-01 Thread Ufuk Celebi
Which Maven version are you using?

On Mon, Aug 1, 2016 at 5:56 PM, Aljoscha Krettek  wrote:
> I tried it again now. I did:
>
> rm -r .m2/repository
> mvn clean verify -Dhadoop.version=2.6.0
>
> failed again. Also with versions 2.6.1 and 2.6.3.
>
> On Mon, 1 Aug 2016 at 08:23 Maximilian Michels  wrote:
>
>> This is also a major issue for batch with off-heap memory and memory
>> preallocation turned off:
>> https://issues.apache.org/jira/browse/FLINK-4094
>> Not hard to fix though as we simply need to reliably clear the direct
>> memory instead of relying on garbage collection. Another possible fix
>> is to maintain memory pools independently of the preallocation mode. I
>> think this is fine because preallocation:false suggests that no memory
>> will be preallocated but not that memory will be freed once acquired.
>>


[jira] [Created] (FLINK-4303) Add CEP examples

2016-08-01 Thread Timo Walther (JIRA)
Timo Walther created FLINK-4303:
---

 Summary: Add CEP examples
 Key: FLINK-4303
 URL: https://issues.apache.org/jira/browse/FLINK-4303
 Project: Flink
  Issue Type: Improvement
  Components: CEP
Reporter: Timo Walther


Neither CEP Java nor CEP Scala contains a runnable example. The example on the 
website is also not runnable without adding additional code.
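A minimal example along these lines could look as follows (the Event type, key
field, and pattern are illustrative assumptions; API as of the 1.1 line, where
where() takes a FilterFunction and select() receives one event per pattern
state; imports from org.apache.flink.cep and java.util.Map omitted):

    StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();
    DataStream<Event> input = env.fromElements(
        new Event(1, 20.0), new Event(1, 120.0));

    Pattern<Event, ?> pattern = Pattern.<Event>begin("first")
        .where(new FilterFunction<Event>() {
            @Override
            public boolean filter(Event e) {
                return e.getTemperature() > 100.0; // match overheating events
            }
        });

    DataStream<String> alerts = CEP.pattern(input.keyBy("id"), pattern)
        .select(new PatternSelectFunction<Event, String>() {
            @Override
            public String select(Map<String, Event> matched) {
                return "alert: " + matched.get("first"); // one event per state
            }
        });

    alerts.print();
    env.execute("CEP example");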



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Release Apache Flink 1.1.0 (RC1)

2016-08-01 Thread Stephan Ewen
@Aljoscha: Have you made sure you have a clean maven cache (remove the
.m2/repository/org/apache/flink folder)?

On Mon, Aug 1, 2016 at 5:56 PM, Aljoscha Krettek 
wrote:

> I tried it again now. I did:
>
> rm -r .m2/repository
> mvn clean verify -Dhadoop.version=2.6.0
>
> failed again. Also with versions 2.6.1 and 2.6.3.
>
> On Mon, 1 Aug 2016 at 08:23 Maximilian Michels  wrote:
>
> > This is also a major issue for batch with off-heap memory and memory
> > preallocation turned off:
> > https://issues.apache.org/jira/browse/FLINK-4094
> > Not hard to fix though as we simply need to reliably clear the direct
> > memory instead of relying on garbage collection. Another possible fix
> > is to maintain memory pools independently of the preallocation mode. I
> > think this is fine because preallocation:false suggests that no memory
> > will be preallocated but not that memory will be freed once acquired.
> >
>


Re: Map Reduce Sorting

2016-08-01 Thread Fabian Hueske
Hi Hilmi,

The results of the combiner are usually not completely sorted, and even if
they are, this property is not leveraged.
This is due to the following reasons:
1) a sort-combiner only sorts as much data as fits into memory. If there is
more data, the result consists of multiple sorted sequences.
2) Flink recently added a hash-based combiner, which is usually more
efficient and does not produce sorted output.
3) Flink's pipelined shipping strategy would require that the receiver merge
the result records from all senders on the fly while receiving data via the
network. In case of a straggling sender task, all other senders would be
blocked due to backpressure. In addition, this would only work if the
combiner did a full sort and not several in-memory sorts.

So, a Reducer will always do a full sort of all received data before
applying the Reduce function (if available, a combiner is applied before
data is written to disk in case of an external sort).
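If it helps: the hash-based combiner mentioned in 2) can be selected
explicitly via a combine hint on the reduce operator (available on the 1.1
line; treat the exact API as approximate, and MySumReducer as a placeholder):

    // CombineHint is org.apache.flink.api.common.operators.base
    //   .ReduceOperatorBase.CombineHint
    DataSet<Tuple2<String, Integer>> result = input
        .groupBy(0)
        .reduce(new MySumReducer())
        .setCombineHint(CombineHint.HASH); // use hashing instead of sorting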

Hope this helps,
Fabian

2016-08-01 18:25 GMT+02:00 Hilmi Yildirim :

> Hi,
>
> I have a question about when data points are sorted when applying a
> simple MapReduce job.
>
> I have the following code:
>
> data = readFromSource()
>
> data.map().groupBy(0).reduce(...)
>
> This code will be translated into the following execution plan:
>
> map -> combiner -> hash partitioning and sorting on 0 -> reduce.
>
>
> If I am right, the combiner first sorts the data, then applies the
> combine function, and then partitions the result.
>
> Now the partitions are consumed by the reducers. For each mapper/combiner
> machine, the reducer has an input gateway. For example, if the mappers and
> combiners run on 10 machines, then each reducer has 10 input gateways. The
> reducer consumes the data via a MutableObjectIterator, which first consumes
> data from one input gateway, then from the next, and so on. Is the data of
> a single input gateway already sorted, given that the combine function has
> already sorted it? Is the order of the data points maintained after they
> are sent through the network?
>
> In my code, the MutableObjectIterator instances are subclasses of
> NormalizedKeySorter. Does this mean that the data from an input gateway is
> first sorted before it is handed over to the reduce function? Is this
> because the order of the data points is not maintained after being sent
> through the network?
>
>
> It would be nice if someone could answer my question. If my assumptions are
> wrong, please correct me :)
>
>
> BR,
>
> Hilmi
>
>
>
>
> --
> ==
> Hilmi Yildirim, M.Sc.
> Researcher
>
> DFKI GmbH
> Intelligente Analytik für Massendaten
> DFKI Projektbüro Berlin
> Alt-Moabit 91c
> D-10559 Berlin
> Phone: +49 30 23895 1814
>
> E-Mail: hilmi.yildi...@dfki.de
>
> -
> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
>
> Geschaeftsfuehrung:
> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
> Dr. Walter Olthoff
>
> Vorsitzender des Aufsichtsrats:
> Prof. Dr. h.c. Hans A. Aukes
>
> Amtsgericht Kaiserslautern, HRB 2313
> -
>
>


Re: [VOTE] Release Apache Flink 1.1.0 (RC1)

2016-08-01 Thread Aljoscha Krettek
@Ufuk: 3.3.9, that's probably it because that messes with the shading,
right?

@Stephan: Yes, I even did a "rm -r .m2/repository". But the Maven version is
most likely the reason.
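For reference, the workaround documented for Maven 3.3.x at the time was a
two-step build, since the shade plugin does not properly shade dependencies
in a single pass on that version (commands from the Flink build docs of that
era; double-check against the current docs):

    mvn clean install -DskipTests
    cd flink-dist
    mvn clean install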

On Mon, 1 Aug 2016 at 10:59 Stephan Ewen  wrote:

> @Aljoscha: Have you made sure you have a clean maven cache (remove the
> .m2/repository/org/apache/flink folder)?
>
> On Mon, Aug 1, 2016 at 5:56 PM, Aljoscha Krettek 
> wrote:
>
> > I tried it again now. I did:
> >
> > rm -r .m2/repository
> > mvn clean verify -Dhadoop.version=2.6.0
> >
> > failed again. Also with versions 2.6.1 and 2.6.3.
> >
> > On Mon, 1 Aug 2016 at 08:23 Maximilian Michels  wrote:
> >
> > > This is also a major issue for batch with off-heap memory and memory
> > > preallocation turned off:
> > > https://issues.apache.org/jira/browse/FLINK-4094
> > > Not hard to fix though as we simply need to reliably clear the direct
> > > memory instead of relying on garbage collection. Another possible fix
> > > is to maintain memory pools independently of the preallocation mode. I
> > > think this is fine because preallocation:false suggests that no memory
> > > will be preallocated but not that memory will be freed once acquired.
> > >
> >
>