Re: API request to submit job takes over 1hr

2016-06-13 Thread Tzu-Li (Gordon) Tai
Hi Shannon, Thanks for your investigation on the issue and the JIRA. There's actually a previous JIRA on this problem already: https://issues.apache.org/jira/browse/FLINK-4023. Would you be ok with tracking this issue on FLINK-4023, and close FLINK-4069 as a duplicate issue? As you can see, I've a

Re: API request to submit job takes over 1hr

2016-06-13 Thread Shannon Carey
Robert, Thanks for your thoughtful response. 1. I understand your concern. User code is not guaranteed to respond to thread interrupts. So no matter what you do, you may end up with a stuck thread. But I think we can improve the user experience. First, we can update the documentation to ma

Re: Gelly scatter/gather

2016-06-13 Thread Vasiliki Kalavri
Hi Alieh, the VertexUpdateFunction and the MessagingFunction both have a method "getSuperstepNumber()" which will give you the current iteration number. -Vasia. On 13 June 2016 at 18:06, Alieh Saeedi wrote: > Hi > Is it possible to access iteration number in gelly scatter/gather? > > thanks in

Re: HBase reads and back pressure

2016-06-13 Thread Fabian Hueske
Do the backpressure metrics indicate that the sink function is blocking? 2016-06-13 16:58 GMT+02:00 Christophe Salperwyck < christophe.salperw...@gmail.com>: > To continue, I implemented the ws.apply(new SummaryStatistics(), new > YourFoldFunction(), new YourWindowFunction()); > > It works fine w

Re: Strange behavior of DataStream.countWindow

2016-06-13 Thread Fabian Hueske
If I understood you correctly, you want to compute windows in parallel without using a key. Are you aware that the results of such a computation is not deterministic and kind of arbitrary? If that is still OK for you, you can use a mapper to assign the current parallel index as a key field, i.e.,

Gelly scatter/gather

2016-06-13 Thread Alieh Saeedi
HiIs it possible to access iteration number in gelly scatter/gather? thanks in advance

Re: Accessing StateBackend snapshots outside of Flink

2016-06-13 Thread Maximilian Michels
+1 to what Aljoscha said. We should rather fix this programmatically. On Mon, Jun 13, 2016 at 4:25 PM, Aljoscha Krettek wrote: > Hi Josh, > I think RocksDB does not allow accessing a data base instance from more than > one process concurrently. Even if it were possible I would highly recommend >

Re: HBase reads and back pressure

2016-06-13 Thread Christophe Salperwyck
To continue, I implemented the ws.apply(new SummaryStatistics(), new YourFoldFunction(), new YourWindowFunction()); It works fine when there is no sink, but when I put an HBase sink it seems that the sink, somehow, blocks the flow. The sink writes very little data into HBase and when I limit my in

Custom Barrier?

2016-06-13 Thread Paul Wilson
Hi, I've been evaluating Flink and wondering if it was possible to define a window that is based on characteristics of the data (data driven) but not contained in the data stream directly. Consider 'nested events' where lower level events belong to a wider event where the wider event serves only

Re: Accessing StateBackend snapshots outside of Flink

2016-06-13 Thread Aljoscha Krettek
Hi Josh, I think RocksDB does not allow accessing a data base instance from more than one process concurrently. Even if it were possible I would highly recommend not to fiddle with Flink state internals (in RocksDB or elsewhere) from the outside. All kinds of things might be going on at any given m

[jira] Akshay Shingote shared "FLINK-4059: Requirement of Streaming layer for Complex Event Processing" with you

2016-06-13 Thread Akshay Shingote (JIRA)
Akshay Shingote shared an issue with you > Requirement of Streaming layer for Complex Event Processing > --- > > Key: FLINK-4059 > URL: https://issues.apac

Re: HBase reads and back pressure

2016-06-13 Thread Maximilian Michels
Thanks! On Mon, Jun 13, 2016 at 12:34 PM, Christophe Salperwyck wrote: > Hi, > I vote on this issue and I agree this would be nice to have. > > Thx! > Christophe > > 2016-06-13 12:26 GMT+02:00 Aljoscha Krettek : >> >> Hi, >> I'm afraid this is currently a shortcoming in the API. There is this ope

Re: HBase reads and back pressure

2016-06-13 Thread Christophe Salperwyck
Hi, I vote on this issue and I agree this would be nice to have. Thx! Christophe 2016-06-13 12:26 GMT+02:00 Aljoscha Krettek : > Hi, > I'm afraid this is currently a shortcoming in the API. There is this open > Jira issue to track it: https://issues.apache.org/jira/browse/FLINK-3869. > We can't

Re: HBase reads and back pressure

2016-06-13 Thread Christophe Salperwyck
Hi Max, In fact the Put would be the output of my WindowFunction. I saw Aljoscha replied, seems I will need to create another intermediate class to handle that. But it is fine. Thx for help! Christophe 2016-06-13 12:25 GMT+02:00 Maximilian Michels : > Hi Christophe, > > A fold function has two

Re: Arrays values in keyBy

2016-06-13 Thread Ufuk Celebi
Would make sense to update the Javadocs for the next release. On Mon, Jun 13, 2016 at 11:19 AM, Aljoscha Krettek wrote: > Yes, this is correct. Right now we're basically using .hashCode() for > keying. (Which can be problematic in some cases.) > > Beam, for example, clearly specifies that the enc

Re: NotSerializableException

2016-06-13 Thread Aljoscha Krettek
Nope, I think there is neither a fix nor an open issue for this right now. On Mon, 13 Jun 2016 at 11:31 Maximilian Michels wrote: > Is there an issue or a fix for proper use of the ClojureCleaner in > CoGroup.where()? > > On Fri, Jun 10, 2016 at 8:07 AM, Aljoscha Krettek > wrote: > > Hi, > > ye

Re: HBase reads and back pressure

2016-06-13 Thread Aljoscha Krettek
Hi, I'm afraid this is currently a shortcoming in the API. There is this open Jira issue to track it: https://issues.apache.org/jira/browse/FLINK-3869. We can't fix it before Flink 2.0, though, because we have to keep the API stable on the Flink 1.x release line. Cheers, Aljoscha On Mon, 13 Jun 2

Re: HBase reads and back pressure

2016-06-13 Thread Maximilian Michels
Hi Christophe, A fold function has two inputs: The state and a record to update the state with. So you can update the SummaryStatistics (state) with each Put (input). Cheers, Max On Mon, Jun 13, 2016 at 11:04 AM, Christophe Salperwyck wrote: > Thanks for the feedback and sorry that I can't try

Re: Accessing StateBackend snapshots outside of Flink

2016-06-13 Thread Maximilian Michels
Hi Josh, I'm not a RocksDB expert but the workaround you described should work. Just bear in mind that accessing RocksDB concurrently with a Flink job can result in an inconsistent state. Make sure to perform atomic updates and clear the RocksDB cache for the item. Cheers, Max On Mon, Jun 13, 20

Re: Application log on Yarn FlinkCluster

2016-06-13 Thread Maximilian Michels
Hi Theofilos, Flink doesn't send the local client output to the Yarn cluster. I think this will only change once we move the entire execution of the Job to the cluster framework. All output of the actual Flink job should be within the JobManager or TaskManager logs. There is something wrong with

Re: How to maintain the state of a variable in a map transformation.

2016-06-13 Thread Maximilian Michels
Hi Ravikumar, In short: No, you can't use closures to maintain a global state. If you want to keep an always global state, you'll have to use parallelism 1 or an external data store to keep that global state. Is it possible to break up your global state into a set of local states which can be com

Re: NotSerializableException

2016-06-13 Thread Maximilian Michels
Is there an issue or a fix for proper use of the ClojureCleaner in CoGroup.where()? On Fri, Jun 10, 2016 at 8:07 AM, Aljoscha Krettek wrote: > Hi, > yes, I was talking about a Flink bug. I forgot to mention the work-around > that Stephan mentioned. > > On Thu, 9 Jun 2016 at 20:38 Stephan Ewen wr

Re: Arrays values in keyBy

2016-06-13 Thread Aljoscha Krettek
Yes, this is correct. Right now we're basically using .hashCode() for keying. (Which can be problematic in some cases.) Beam, for example, clearly specifies that the encoded form of a value should be used for all comparisons/hashing. This is more well defined but can lead to slow performance in so

Re: HBase reads and back pressure

2016-06-13 Thread Christophe Salperwyck
Thanks for the feedback and sorry that I can't try all this straight away. Is there a way to have a different function than: WindowFunction() I would like to return a HBase Put and not a SummaryStatistics. So something like this: WindowFunction() Christophe 2016-06-09 17:47 GMT+02:00 Fabian Hue

Re: Accessing StateBackend snapshots outside of Flink

2016-06-13 Thread Josh
Hello, I have a follow-up question to this: since Flink doesn't support state expiration at the moment (e.g. expiring state which hasn't been updated for a certain amount of time), would it be possible to clear up old UDF states by: - store a 'last_updated" timestamp in the state value - periodical