Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-13 Thread Robert Metzger
Hey,
JIRA was down for quite a while yesterday. Sadly, I don't think we can
merge the change because it's API-breaking.
One of the promises of the 1.0 release is that we are not breaking any APIs
in the 1.x.y series of Flink. We can fix those issues with a 2.x release.

On Sun, Mar 13, 2016 at 5:27 AM, Márton Balassi 
wrote:

> The JIRA issue is FLINK-3610.
>
> On Sat, Mar 12, 2016 at 8:39 PM, Márton Balassi 
> wrote:
>
> >
> > I have just come across a shortcoming of the streaming Scala API: it
> > completely lacks the Scala implementation of the DataStreamSink and
> > instead the Java version is used. [1]
> >
> > I would regard this as a bug that needs a fix for 1.0.1. Unfortunately
> > this is also api-breaking.
> >
> > Will post it to JIRA shortly - but issues.apache.org is unresponsive for
> > me currently. Wanted to raise the issue here as it might affect the api.
> >
> > [1] https://github.com/apache/flink/blob/master/flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/DataStream.scala#L928-L929
> >
>


Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-13 Thread Márton Balassi
OK, if that is what we promised, let's stick to that.
Would you then suggest opening a release-2.0 branch and merging it there?



Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-13 Thread Robert Metzger
I think it's too early to fork off a 2.0 branch. I have absolutely no idea
when a 2.0 release becomes relevant; it could easily be a year from now.

The API stability guarantees don't forbid adding new methods. Maybe we can
find a good way to resolve the issue without changing the signature of
existing methods.
And for tracking API breaking changes, maybe it makes sense to create a
2.0.0 version in JIRA and set the "fix-for" for the issue to 2.0.
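
To make the "add a new method, keep the old signature" idea concrete, here
is a minimal sketch in Java. It uses stand-in classes only, not Flink's real
types: JavaSinkHandle, WrappedSinkHandle, StreamSketch and addWrappedSink
are all hypothetical names chosen for illustration. The existing method
keeps returning the old type, and a second method returns the wrapper.

```java
import java.util.function.Consumer;

// Stand-in for the Java API's sink handle; NOT Flink's real DataStreamSink.
class JavaSinkHandle {
    private String name = "unnamed";

    JavaSinkHandle name(String newName) {
        this.name = newName;
        return this;
    }

    String getName() {
        return name;
    }
}

// Hypothetical wrapper type that a Scala-facing API could return instead.
class WrappedSinkHandle {
    private final JavaSinkHandle inner;

    WrappedSinkHandle(JavaSinkHandle inner) {
        this.inner = inner;
    }

    // Delegates to the wrapped handle but keeps callers on the wrapper type.
    WrappedSinkHandle name(String newName) {
        inner.name(newName);
        return this;
    }

    JavaSinkHandle javaSink() {
        return inner;
    }
}

// Stand-in for the stream class whose public API must stay compatible.
class StreamSketch {
    // Existing method: signature and return type are left untouched.
    JavaSinkHandle addSink(Consumer<String> sink) {
        return new JavaSinkHandle();
    }

    // New method added next to it (the name is hypothetical).
    WrappedSinkHandle addWrappedSink(Consumer<String> sink) {
        return new WrappedSinkHandle(addSink(sink));
    }
}

class AddSinkSketch {
    public static void main(String[] args) {
        StreamSketch stream = new StreamSketch();
        // Old callers compile unchanged; new callers opt in to the wrapper.
        JavaSinkHandle oldStyle = stream.addSink(record -> { });
        WrappedSinkHandle newStyle = stream.addWrappedSink(record -> { }).name("out");
        System.out.println(oldStyle.getName() + " / " + newStyle.javaSink().getName());
    }
}
```

The trade-off is that two methods then do nearly the same thing, and the
new one needs a different name because a method cannot be overloaded on its
return type alone.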



Re: Accessing the Configuration

2016-03-13 Thread Michal Fijolek
Isn't that what RichFunction.open(Configuration parameters) is for?



--
View this message in context: 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Accessing-the-Configuration-tp10696p10756.html
Sent from the Apache Flink Mailing List archive at Nabble.com.


Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-13 Thread Márton Balassi
This approach means that we will have API-breaking changes lingering around
in random people's remote repos that will be untrackable in no time, and
people will not be able to build on each other's changes. Whoever has
the pleasure of eventually merging those together will be on the receiving
end of some great fun. I see two alternatives: either fork a "traditional"
release-2.0 already, which I agree would be overkill to maintain with
all the commits flowing in for a year or so, or at least dedicate
a branch to centrally collecting the API-breaking changes made since the
release-1.0 API.

I appreciate the API-extending methodology and would certainly love to
follow it myself, but the concrete issue is that the return type
of a function in the public API is not right. We can add methods with the
correct return type, but that is borderline uglier than the current
solution.



Re: Accessing the Configuration

2016-03-13 Thread Robert Metzger
No, the RichFunction.open(Configuration) parameter is a feature of the
DataSet API. It's for passing an operator-specific configuration through the
".withParameters(Configuration)" method.

For more information, just search for "withParameters" here:
https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/index.html
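
To illustrate the mechanism only, here is a small self-contained Java model
of how a configuration attached to one operator reaches the function's
open() hook. None of these classes are Flink classes: ParamsSketch,
RichFilterSketch, OperatorSketch and ThresholdFilter are hypothetical
stand-ins, and the "limit" key is made up for the example. The real API is
described in the documentation linked above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for a key/value configuration; NOT Flink's Configuration class.
class ParamsSketch {
    private final Map<String, Integer> values = new HashMap<>();

    void setInteger(String key, int value) {
        values.put(key, value);
    }

    int getInteger(String key, int defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }
}

// Stand-in for a "rich" user function with an open() lifecycle hook.
abstract class RichFilterSketch {
    void open(ParamsSketch parameters) { }

    abstract boolean filter(int value);
}

// Stand-in for a single operator that carries its own parameters.
class OperatorSketch {
    private final RichFilterSketch function;
    private ParamsSketch parameters = new ParamsSketch();

    OperatorSketch(RichFilterSketch function) {
        this.function = function;
    }

    // Attaches a configuration to this operator only.
    OperatorSketch withParameters(ParamsSketch params) {
        this.parameters = params;
        return this;
    }

    List<Integer> run(List<Integer> input) {
        // open() is called once with the operator's parameters, before any record.
        function.open(parameters);
        List<Integer> out = new ArrayList<>();
        for (int value : input) {
            if (function.filter(value)) {
                out.add(value);
            }
        }
        return out;
    }
}

// User function that reads its threshold in open().
class ThresholdFilter extends RichFilterSketch {
    private int limit;

    @Override
    void open(ParamsSketch parameters) {
        limit = parameters.getInteger("limit", 0);
    }

    @Override
    boolean filter(int value) {
        return value > limit;
    }
}

class WithParametersSketch {
    public static void main(String[] args) {
        ParamsSketch params = new ParamsSketch();
        params.setInteger("limit", 2);
        List<Integer> kept = new OperatorSketch(new ThresholdFilter())
                .withParameters(params)
                .run(Arrays.asList(1, 2, 3, 4));
        System.out.println(kept);
    }
}
```

The point of the model is that the configuration is scoped to the operator
it was attached to, which is why it does not serve as a general way of
accessing the global configuration.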



Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-13 Thread Chesnay Schepler

On 13.03.2016 12:14, Robert Metzger wrote:
> I think it's too early to fork off a 2.0 branch. I have absolutely no idea
> when a 2.0 release becomes relevant, could be easily a year from now.

At first I was going to agree with Robert, but then... I mean, the issue
with not allowing breaking changes is that effectively we won't work on
these issues until 2.0 comes around, since otherwise the contributor would
have to stash that change themselves in their own repository for
god-knows-how-long. Chances are that work will go to waste anyway because
they forget or delete it.

Having a central place (not necessarily a separate branch, maybe a repo
with a separate branch for every commit) where we can stash this work
could prove useful; instead of starting to work on these issues all at
once for 2.0, we could save some work by only having to rebase them in one
way or another.

> And for tracking API breaking changes, maybe it makes sense to create a
> 2.0.0 version in JIRA and set the "fix-for" for the issue to 2.0.

+1 for adding a 2.0.0 version tag. This is the perfect use-case for it.




Re: Accessing the Configuration

2016-03-13 Thread Michal Fijolek
Oh, I see now.
Thanks!





Re: [streaming, scala] Scala DataStream#addSink returns Java DataStreamSink

2016-03-13 Thread Gyula Fóra
Hi,

I think this is an important question that will surely come up again in
the future.

I see your point, Robert, that we have promised API compatibility for 1.x.y
releases, but I am not sure that this should cover things that are clearly
just unintended errors in the API on our side.

I am not sure what the right action would be for issues like this in
the future.

Gyula



[jira] [Created] (FLINK-3611) Wrong link in CONTRIBUTING.md

2016-03-13 Thread Martin Junghanns (JIRA)
Martin Junghanns created FLINK-3611:
---

             Summary: Wrong link in CONTRIBUTING.md
                 Key: FLINK-3611
                 URL: https://issues.apache.org/jira/browse/FLINK-3611
             Project: Flink
          Issue Type: Bug
          Components: Documentation
    Affects Versions: 1.0.1
            Reporter: Martin Junghanns
            Assignee: Martin Junghanns
            Priority: Trivial


It's {{[Coding Guidelines](http://flink.apache.org/coding-guidelines.html)}}
instead of
{{[Coding Guidelines](http://flink.apache.org/contribute-code.html#coding-guidelines)}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Flink-1.0.0 JobManager is not running in Docker Container on AWS

2016-03-13 Thread Deepak Jha
Hi Stephan & Ufuk,
Thanks for your response.

Yes, there is a way to run Docker (net = host mode) in which the
guest machine's network stack is shared with the Docker container.
Unfortunately, it's not supported by AWS ECS.

I do have one more question for you. Can you please explain to me what
happens when TaskManagers register themselves with the JobManager in HA mode?
Does each TaskManager get connected to the JobManager on a separate port? The
reason I'm asking is that if I run 2 TaskManagers (in separate Docker
containers), they are able to attach themselves to the JobManager (another
Docker container; Flink HA setup using a remote ZooKeeper cluster), but soon
after that they get disconnected. The logs are not very helpful either. I
suspect that each TaskManager gets connected on a new port, and since Docker
does not expose all ports by default, this may happen. I do not see this
happen when I do not use Docker containers.

Here is the log I saw on the JobManager:

2016-03-12 08:55:55,010 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.r.instance.InstanceManager - Registered TaskManager at 5673db03e679 (akka.tcp://flink@172.17.0.3:6121/user/taskmanager) as 7eafcfddd6bd084f2ec5a32594603f4f. Current number of registered hosts is 1. Current number of alive task slots is 1.
2016-03-12 08:57:42,676 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.r.instance.InstanceManager - Registered TaskManager at 7200a7da4da7 (akka.tcp://flink@172.17.0.3:6121/user/taskmanager) as 320338e15a7a44ee64dc03a40f04fcd7. Current number of registered hosts is 2. Current number of alive task slots is 2.
2016-03-12 08:57:48,422 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.runtime.jobmanager.JobManager - Task manager akka.tcp://flink@172.17.0.3:6121/user/taskmanager terminated.
2016-03-12 08:57:48,422 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.r.instance.InstanceManager - Unregistered task manager akka.tcp://flink@172.17.0.3:6121/user/taskmanager. Number of registered task managers 1. Number of available slots 1.
2016-03-12 08:58:01,417 PST [WARN]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] a.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@172.17.0.3:6121] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2016-03-12 08:58:01,451 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.runtime.jobmanager.JobManager - Task manager akka.tcp://flink@172.17.0.3:6121/user/taskmanager wants to disconnect, because TaskManager akka://flink/user/taskmanager is disassociating.
2016-03-12 08:58:01,451 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.r.instance.InstanceManager - Unregistered task manager akka.tcp://flink@172.17.0.3:6121/user/taskmanager. Number of registered task managers 0. Number of available slots 0.
2016-03-12 08:58:01,465 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.r.instance.InstanceManager - Registered TaskManager at 7200a7da4da7 (akka.tcp://flink@172.17.0.3:6121/user/taskmanager) as b5dbbc829854afa3ec5d8f0b6f9dbd03. Current number of registered hosts is 1. Current number of alive task slots is 1.
2016-03-12 08:58:03,383 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.runtime.jobmanager.JobManager - Task manager akka.tcp://flink@172.17.0.3:6121/user/taskmanager terminated.
2016-03-12 08:58:03,384 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.r.instance.InstanceManager - Unregistered task manager akka.tcp://flink@172.17.0.3:6121/user/taskmanager. Number of registered task managers 0. Number of available slots 0.
2016-03-12 08:58:04,988 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.r.instance.InstanceManager - Registering TaskManager at akka.tcp://flink@172.17.0.3:6121/user/taskmanager which was marked as dead earlier because of a heart-beat timeout.
2016-03-12 08:58:04,988 PST [INFO]  ec2-54-173-231-120.compute-1.a [flink-akka.actor.default-dispatcher-20] o.a.f.r.instance.InstanceManager - Registered TaskManager at 7200a7da4da7 (akka.tcp://flink@172.17.0.3:6121/user/taskmanager) as eac0ce12e6ec885863d3438d691f4ab2. Current number of registered hosts is 1. Current number of alive task slots is 1.
2016-03-12 08:58:21,382 PST [WARN]  ec2-54-173-231-120.compute-1.a
[flink-akka.actor.default-dispatcher-20]
a.remote.ReliableDeliverySupervisor - Association with remote system
[akka.tcp://flink@172.17.0.3:6121] has