Re: [DISCUSS] FLIP-8: Rescalable Non-Partitioned State

2016-08-12 Thread Ufuk Celebi
I will update the design doc with more details for the Checkpointed variants and remove Option 2 (I think that's an orthogonal thing). The way I see it now, we should have a base CheckpointedBase interface and have the current Checkpointed interface be a subclass for non-repartitionable state. Then we

[jira] [Created] (FLINK-4390) Add throws clause verification to RpcCompletenessTest

2016-08-12 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-4390: Summary: Add throws clause verification to RpcCompletenessTest Key: FLINK-4390 URL: https://issues.apache.org/jira/browse/FLINK-4390 Project: Flink Issue Typ

Re: [DISCUSS] FLIP-8: Rescalable Non-Partitioned State

2016-08-12 Thread Gyula Fóra
Hi Aljoscha, Yes, this is pretty much how I think about it as well. Basically the state in this case would be computed from the side inputs with the same state update logic on all operators. I think it is important that operators compute their own state or at least observe all state changes otherwi

Re: [DISCUSS] FLIP-8: Rescalable Non-Partitioned State

2016-08-12 Thread Aljoscha Krettek
Hi Gyula, I was thinking about this as well, in the context of side-inputs, which would be a generalization of your use case, if I'm not mistaken. In my head I was calling it global state. Essentially, this state would be the same on all operators and when checkpointing you would only have to check

[jira] [Created] (FLINK-4389) Expose metrics to Webfrontend

2016-08-12 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-4389: Summary: Expose metrics to Webfrontend Key: FLINK-4389 URL: https://issues.apache.org/jira/browse/FLINK-4389 Project: Flink Issue Type: Sub-task

Re: [DISCUSS] FLIP-8: Rescalable Non-Partitioned State

2016-08-12 Thread Gyula Fóra
Hi, Let me try to explain what I mean by broadcast states. I think it is a very common pattern that people broadcast control messages to operators that also receive normal input events. Some examples: broadcast a model for prediction, broadcast some information that should be the same at all subt

Re: [DISCUSS] FLIP-8: Rescalable Non-Partitioned State

2016-08-12 Thread Ufuk Celebi
Comments inline. On Thu, Aug 11, 2016 at 8:06 PM, Gyula Fóra wrote: > Option 1: > I think the main problem here is sending all the state everywhere will not > scale at all. I think this will even fail for some internal Flink operators > (window timers I think are kept like this, maybe I'm wrong he

[jira] [Created] (FLINK-4388) Race condition during initialization of MemorySegmentFactory

2016-08-12 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4388: Summary: Race condition during initialization of MemorySegmentFactory Key: FLINK-4388 URL: https://issues.apache.org/jira/browse/FLINK-4388 Project: Flink Iss

Re: Conceptual difference Windows and DataSet

2016-08-12 Thread Stephan Ewen
Hi Kevin! The windows in Flink's DataStream API are organized by key. The reason is that the windows are very flexible, and each key can form different windows than the other (think sessions per user - each session starts and stops differently). There has been discussion about introducing somethi

Re: [DISCUSS] Breaking Savepoint Compatibility from 1.1 to 1.2

2016-08-12 Thread Maximilian Michels
Hi Aljoscha, I'm not very deep into the state backend implementation. However, I think a breaking change is unavoidable with the new key groups. The only way we can achieve backwards compatibility is to include a translator from the old state format to the new one. As you already mentioned, this

Re: expose side output stream

2016-08-12 Thread Aljoscha Krettek
Hi Chen, could you maybe share the code that you have so far? If you want, you can start a Google Doc and then we can work together on fleshing out an API/implementation that we can present to the Flink community as a FLIP. Cheers, Aljoscha On Thu, 11 Aug 2016 at 14:40 Stephan Ewen wrote: > Hi

Re: [DISCUSS] Streaming connector contributions

2016-08-12 Thread Maximilian Michels
Hi Robert, We had this discussion before when I suggested using an external repository to manage connectors. Since then, I have come to the conclusion that the overhead of maintaining two source repositories along with maintaining code and integration, documentation, and CI, is not worth the effor

[jira] [Created] (FLINK-4387) Instability in KvStateClientTest.testClientServerIntegration()

2016-08-12 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-4387: Summary: Instability in KvStateClientTest.testClientServerIntegration() Key: FLINK-4387 URL: https://issues.apache.org/jira/browse/FLINK-4387 Project: Flink

[DISCUSS] (Meta)data Driven Window Triggers

2016-08-12 Thread Kevin Jacobs
Hi, Today I will be giving a presentation about Apache Flink, and in terms of the use cases at my company, Apache Flink performs better than Apache Spark. There is only one issue I encountered, and that is the lack of support for (Meta)data Driven Window Triggers. I would like to start a disc