Re: Reinterpreting a pre-partitioned data stream as keyed stream

2020-04-02 Thread KristoffSC
Hi, sorry for the long wait. Answering your questions: 1 - yes; 2 - thanks; 3 - right, understood; 4 - well, in general I want to understand how this works, so that in the future I can modify my job, for example by extracting CPU-heavy operators into separate tasks. Actually, in my job some of my operators are ch…

Re: Reinterpreting a pre-partitioned data stream as keyed stream

2020-02-02 Thread Guowei Ma
Hi, 1. Is the key that is used by the keyBy after point 1 precisely the same as the key used at 4a and 4b? If yes, I think you could use reinterpretAsKeyedStream to avoid the shuffle. 2. You could use SingleOutputStreamOperator::setChainingStrategy to disable the chain, or use rebalance/shu…
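The point of the first question above can be illustrated outside of Flink: if the upstream keyBy (point 1) and the downstream consumers (4a/4b) extract the same key and the partition function is unchanged, a second keyBy would send every record to the partition it already lives on, so the shuffle moves nothing; that is the redundancy reinterpretAsKeyedStream lets you declare. A minimal sketch, where `partitionOf` is a hypothetical stand-in for the real partitioner, not Flink's actual hash:

```java
import java.util.Arrays;
import java.util.List;

public class RedundantShuffleSketch {
    // Stand-in partitioner: same idea as hashing the key modulo parallelism.
    static int partitionOf(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 4;
        List<String> keys = Arrays.asList("order-1", "order-2", "order-3");
        for (String key : keys) {
            int upstream = partitionOf(key, parallelism);   // after the first keyBy
            int downstream = partitionOf(key, parallelism); // what a second keyBy would compute
            // Identical key + identical partition function => no record moves.
            assert upstream == downstream;
            System.out.println(key + " stays on partition " + upstream);
        }
    }
}
```

If the two keys differ, or the partitioner differs, the assertion above no longer holds and a real shuffle is required.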

Re: Reinterpreting a pre-partitioned data stream as keyed stream

2020-01-30 Thread KristoffSC
Hi, thank you for the answer. I think I understand. In my use case I have to keep the order of events for each key, but I don't have to process keys in the same order that I received them. At one point of my pipeline I'm also using a SessionWindow. My Flink environment has operator chaining en…

Re: Reinterpreting a pre-partitioned data stream as keyed stream

2020-01-29 Thread Guowei Ma
Hi Krzysztof, when you use *reinterpretAsKeyedStream* you must yourself guarantee that the partitioning is the same as the one Flink would produce. But before going any further, I think we should know whether the normal DataStream API could satisfy your requirements without using *reinterpretAsKeyedStream*. An opera…
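"The same partitioning as Flink does" is concrete: Flink assigns each key to a key group via a hash of the key's hashCode modulo maxParallelism, and maps key groups to parallel subtasks (see KeyGroupRangeAssignment; the real implementation applies a murmur hash, which the sketch below replaces with simple bit-mixing, so the exact group numbers differ from Flink's):

```java
public class KeyGroupSketch {
    // Stand-in for Flink's murmur hash of key.hashCode(); NOT the real hash,
    // only here to make the sketch self-contained.
    static int spreadHash(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // Key -> key group, in [0, maxParallelism).
    static int keyGroup(Object key, int maxParallelism) {
        return Math.floorMod(spreadHash(key), maxParallelism);
    }

    // Key group -> owning parallel subtask, mirroring Flink's
    // keyGroup * parallelism / maxParallelism mapping.
    static int operatorIndex(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128, parallelism = 4;
        for (String key : new String[]{"a", "b", "c"}) {
            int kg = keyGroup(key, maxParallelism);
            System.out.println(key + " -> keyGroup " + kg
                    + " -> subtask " + operatorIndex(kg, maxParallelism, parallelism));
        }
    }
}
```

Reproducing this routing by hand upstream of reinterpretAsKeyedStream is exactly the guarantee the answer above asks for; if any record lands on the wrong subtask, keyed state and timers silently end up split across instances.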

Reinterpreting a pre-partitioned data stream as keyed stream

2020-01-28 Thread KristoffSC
Hi all, we have a use case where the order of received events matters and should be kept across the pipeline. Our pipeline will be parallelized. We can key the stream just after the Source operator, but in order to keep the ordering in subsequent operators we would have to keep the stream keyed. Obviou…
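The ordering property this question relies on can be sketched without Flink: once a stream is keyed, all events with the same key flow through the same parallel instance in FIFO order, so per-key order survives as long as no re-partitioning (e.g. a rebalance) happens in between. A minimal simulation, with a hypothetical hash partitioner standing in for keyBy:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PerKeyOrderSketch {
    // Route (key, payload) events to partitions by key hash, preserving FIFO
    // arrival order inside each partition, as a keyed exchange does.
    static Map<Integer, List<String>> route(String[][] events, int parallelism) {
        Map<Integer, List<String>> partitions = new TreeMap<>();
        for (String[] e : events) {
            int p = Math.floorMod(e[0].hashCode(), parallelism);
            partitions.computeIfAbsent(p, k -> new ArrayList<>()).add(e[0] + e[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[][] events = {{"a","1"},{"b","1"},{"a","2"},{"b","2"},{"a","3"}};
        // Every "a" event lands on one partition, in arrival order a1, a2, a3.
        System.out.println(route(events, 2));
    }
}
```

Interleaving across different keys is not preserved (and need not be), which is why keeping the stream keyed between operators is sufficient for this use case.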