Re: [DISCUSS] Release 1.0.1 Bugfix release

2016-03-23 Thread Chiwan Park
+1 Regards, Chiwan Park > On Mar 23, 2016, at 11:24 PM, Robert Metzger wrote: > > +1 > > I just went through the master and release-1.0 branch, and most important > fixes are already in the release-1.0 branch. > I would also move this commit into the release branch: > "[FLINK-3636] Add Throttl

Re: Proposal: YARN session per-job Kerberos authentication

2016-03-23 Thread Maximilian Michels
Hi Stefano, Sounds great. Please go ahead! Note that Flink already provides the proposed feature for per-job Yarn clusters. However, it is a valuable addition to realize this feature for the Yarn session. The only blocker that I can think of is probably this PR which changes a lot of the Yarn cla

RE: Aggregation Design Questions

2016-03-23 Thread Lisonbee, Todd
I wrote another design for a summarize() function on DataSet. https://issues.apache.org/jira/browse/FLINK-3664 I think this would be a better place for me to start than working on generic Aggregations. (I could move ahead with it immediately and there are no tricky decisions if people more or l

[jira] [Created] (FLINK-3664) Create a method to easily Summarize a DataSet

2016-03-23 Thread Todd Lisonbee (JIRA)
Todd Lisonbee created FLINK-3664: Summary: Create a method to easily Summarize a DataSet Key: FLINK-3664 URL: https://issues.apache.org/jira/browse/FLINK-3664 Project: Flink Issue Type: Impro

Re: RollingSink

2016-03-23 Thread Vijay Srinivasaraghavan
Hi Aljoscha, It was my bad that I have copied some wrong class files during one of the step. I have retried the same steps that I mentioned earlier and with that I am able to see all the debug statements that I have added to the RollingSink.. I have found another interesting issue here. I am usin

Aggregation Design Questions

2016-03-23 Thread Lisonbee, Todd
Hello, I'm working on adding Standard Deviation and others to the list of Aggregations, https://issues.apache.org/jira/browse/FLINK-3613 Unfortunately, I didn't get very far because the general design of Aggreation on DataSets needs to change and each solution seems to have drawbacks. For exam

[jira] [Created] (FLINK-3663) FlinkKafkaConsumerBase.logPartitionInfo is missing a log marker

2016-03-23 Thread Niels Zeilemaker (JIRA)
Niels Zeilemaker created FLINK-3663: --- Summary: FlinkKafkaConsumerBase.logPartitionInfo is missing a log marker Key: FLINK-3663 URL: https://issues.apache.org/jira/browse/FLINK-3663 Project: Flink

Proposal: YARN session per-job Kerberos authentication

2016-03-23 Thread Stefano Baghino
Hello everybody, some of us at Radicalbit spent the last few weeks experimenting to improve the understanding of the compatibility of Flink with secure cluster environments and with Kerberos in particular. We’ve found a possible area of improvement and would like to work on it as part of our effo

Re: RollingSink

2016-03-23 Thread Aljoscha Krettek
Hmm, that’s strange. Could you maybe send one of the TaskManager logs? Cheers, Aljoscha > On 23 Mar 2016, at 15:28, Vijay wrote: > > Yes, I have updated on all cluster nodes and restarted entire cluster. > > Do you see any problems with the steps that I followed? > > Regards, > Vijay > > Sen

Re: Streaming KV store abstraction

2016-03-23 Thread Gyula Fóra
Hi! Sorry for the late answer, I completely missed this email. (Thanks Robert for pointing out). You won't be able to use that project as it was dependent on an earlier snapshot version that still had completely different state semantics. I don't think it is realistic that I will re-implment this

Re: RollingSink

2016-03-23 Thread Vijay
Yes, I have updated on all cluster nodes and restarted entire cluster. Do you see any problems with the steps that I followed? Regards, Vijay Sent from my iPhone > On Mar 23, 2016, at 7:18 AM, Aljoscha Krettek wrote: > > Hi, > did you update the log4j.properties file on all nodes where the T

Re: [DISCUSS] Release 1.0.1 Bugfix release

2016-03-23 Thread Robert Metzger
+1 I just went through the master and release-1.0 branch, and most important fixes are already in the release-1.0 branch. I would also move this commit into the release branch: "[FLINK-3636] Add ThrottledIterator to WindowJoin jar" https://github.com/apache/flink/commit/f09d68a05efb4afeb7b8498d352

Re: RollingSink

2016-03-23 Thread Aljoscha Krettek
Hi, did you update the log4j.properties file on all nodes where the TaskManagers run and did you restart the whole cluster? Cheers, Aljoscha > On 23 Mar 2016, at 15:02, Vijay wrote: > > Hi Aljoscha, > > I am using standalone flink cluster (3 node). I am running flink job by > submitting/uploa

Re: RollingSink

2016-03-23 Thread Vijay
Hi Aljoscha, I am using standalone flink cluster (3 node). I am running flink job by submitting/uploading jar through Flink UI. I have built flink from maven and modified the RollingSink code to add new debug statements. I have also packaged the streaming file system connector package (includi

Re: RollingSink

2016-03-23 Thread Aljoscha Krettek
Hi, what where the steps you took? By the way, are you running this on yarn or in standalone mode? How are you starting the Flink job? Do you still don’t see DEBUG entries in the log? Cheers, Aljoscha > On 23 Mar 2016, at 14:32, Vijay wrote: > > I have changed the properties file but it did no

Re: RollingSink

2016-03-23 Thread Vijay
I have changed the properties file but it did not help. Regards, Vijay Sent from my iPhone > On Mar 23, 2016, at 5:39 AM, Aljoscha Krettek wrote: > > Ok, then you should be able to change the log level to DEBUG in > conf/log4j.properties. > >> On 23 Mar 2016, at 12:41, Vijay wrote: >> >> I

[jira] [Created] (FLINK-3661) Make Scala 2.11.x the default Scala version

2016-03-23 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3661: - Summary: Make Scala 2.11.x the default Scala version Key: FLINK-3661 URL: https://issues.apache.org/jira/browse/FLINK-3661 Project: Flink Issue Typ

Re: [DISCUSS] Release 1.0.1 Bugfix release

2016-03-23 Thread Maximilian Michels
+1 On Wed, Mar 23, 2016 at 12:19 PM, Till Rohrmann wrote: > +1 > > On Wed, Mar 23, 2016 at 11:24 AM, Stephan Ewen wrote: > >> Yes, there is also the Rich Scala Window Functions, and the tests that used >> to address wrong JAR directories. >> >> On Wed, Mar 23, 2016 at 11:15 AM, Ufuk Celebi wrot

[jira] [Created] (FLINK-3662) Bump Akka version to 2.4.x for Scala 2.11.x

2016-03-23 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3662: - Summary: Bump Akka version to 2.4.x for Scala 2.11.x Key: FLINK-3662 URL: https://issues.apache.org/jira/browse/FLINK-3662 Project: Flink Issue Typ

Re: RollingSink

2016-03-23 Thread Aljoscha Krettek
Ok, then you should be able to change the log level to DEBUG in conf/log4j.properties. > On 23 Mar 2016, at 12:41, Vijay wrote: > > I think only the ERROR category gets displayed in the log file > > Regards, > Vijay > > Sent from my iPhone > >> On Mar 23, 2016, at 2:30 AM, Aljoscha Krettek

Re: a typical ML algorithm flow

2016-03-23 Thread Theodore Vasiloudis
Just realized what I wrote is wrong and probably doesn't apply here. The problem I described relates to modifying a *secondary* dataset as you iterate over a primary one. Taking SGD as an example, you would iterate over a weights dataset, modifying it using the native Flink iterations that Till

Re: [DISCUSS] Release 1.0.1 Bugfix release

2016-03-23 Thread Till Rohrmann
+1 On Wed, Mar 23, 2016 at 11:24 AM, Stephan Ewen wrote: > Yes, there is also the Rich Scala Window Functions, and the tests that used > to address wrong JAR directories. > > On Wed, Mar 23, 2016 at 11:15 AM, Ufuk Celebi wrote: > > > Big +1, let's get this rolling... ;) > > > > On Wed, Mar 23,

Re: [DISCUSS] Release 1.0.1 Bugfix release

2016-03-23 Thread Stephan Ewen
Yes, there is also the Rich Scala Window Functions, and the tests that used to address wrong JAR directories. On Wed, Mar 23, 2016 at 11:15 AM, Ufuk Celebi wrote: > Big +1, let's get this rolling... ;) > > On Wed, Mar 23, 2016 at 11:14 AM, Aljoscha Krettek > wrote: > > Hi, > > I’m aware of one

[jira] [Created] (FLINK-3660) Measure latency of elements and expose it through web interface

2016-03-23 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3660: - Summary: Measure latency of elements and expose it through web interface Key: FLINK-3660 URL: https://issues.apache.org/jira/browse/FLINK-3660 Project: Flink

Re: [DISCUSS] Release 1.0.1 Bugfix release

2016-03-23 Thread Ufuk Celebi
Big +1, let's get this rolling... ;) On Wed, Mar 23, 2016 at 11:14 AM, Aljoscha Krettek wrote: > Hi, > I’m aware of one critical fix and one somewhat critical fix since 1.0.0. One > concerns data loss in the RollingSink, the other is a bug in a window > trigger. I would like to release a bugfix

[DISCUSS] Release 1.0.1 Bugfix release

2016-03-23 Thread Aljoscha Krettek
Hi, I’m aware of one critical fix and one somewhat critical fix since 1.0.0. One concerns data loss in the RollingSink, the other is a bug in a window trigger. I would like to release a bugfix release since some people are restricted to using released versions and are also depending on the Rolli

Re: a typical ML algorithm flow

2016-03-23 Thread Till Rohrmann
Hi Dmitriy, I’m not sure whether I’ve understood your question correctly, so please correct me if I’m wrong. So you’re asking whether it is a problem that stat1 = A.map.reduce A = A.update.map(stat1) are executed on the same input data set A and whether we have to cache A for that, right? I ass

[jira] [Created] (FLINK-3659) Allow ConnectedStreams to Be Keyed on Only One Side

2016-03-23 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3659: --- Summary: Allow ConnectedStreams to Be Keyed on Only One Side Key: FLINK-3659 URL: https://issues.apache.org/jira/browse/FLINK-3659 Project: Flink Issue

Re: a typical ML algorithm flow

2016-03-23 Thread Theodore Vasiloudis
Hello Dmitriy, If I understood correctly what you are basically talking about modifying a DataSet as you iterate over it. AFAIK this is currently not possible in Flink, and indeed it's a real bottleneck for ML algorithms. This is the reason our current SGD implementation does a pass over the whol

[jira] [Created] (FLINK-3658) Allow the FlinkKafkaProducer to send data to multiple topics

2016-03-23 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3658: - Summary: Allow the FlinkKafkaProducer to send data to multiple topics Key: FLINK-3658 URL: https://issues.apache.org/jira/browse/FLINK-3658 Project: Flink

Re: Behavior of lib directory shipping on YARN

2016-03-23 Thread Stefano Baghino
Thanks for pointing out Max's work (awesome PR, btw). It actually seem to have introduced an environment variable regarding ship directories, it would be good to have his feedback on this. On Tue, Mar 22, 2016 at 10:24 PM, Ufuk Celebi wrote: > On Tue, Mar 22, 2016 at 8:42 PM, Stefano Baghino >