Hi Fabian,

Thank you. Yes, the steps are just map functions. I will do it that way, passing records via method calls, to make it faster.
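A minimal sketch of the difference described in the quoted reply below, in plain Java rather than Flink code. The class name, the two map steps, and the use of DataOutputStream as a stand-in serializer are all assumptions made for illustration. Flink uses its own serializers and also sends the bytes over the network, which this sketch leaves out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Function;

// Plain-JDK illustration only: this is not Flink code and uses no Flink API.
class ChainingSketch {

    // Two hypothetical map steps standing in for step1 and step2.
    static final Function<Long, Long> STEP1 = x -> x + 1;
    static final Function<Long, Long> STEP2 = x -> x * 2;

    // Chained case: the record goes from step1 to step2 by a method call.
    static long chained(long record) {
        return STEP2.apply(STEP1.apply(record));
    }

    // Stand-in for one hand-off between non-chained steps:
    // serialize to bytes, then deserialize. Network transfer is omitted.
    static long roundTrip(long record) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                out.writeLong(record);
            }
            byte[] bytes = buffer.toByteArray();
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
                return in.readLong();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Non-chained case, mirroring
    // src -> rebalance -> step1 -> rebalance -> step2 -> rebalance -> sink:
    // each of the three hops pays for a round trip.
    static long rebalanced(long record) {
        long afterStep1 = STEP1.apply(roundTrip(record));
        long afterStep2 = STEP2.apply(roundTrip(afterStep1));
        return roundTrip(afterStep2);
    }

    public static void main(String[] args) {
        int n = 100_000;
        long sum = 0;

        long t0 = System.nanoTime();
        for (long i = 0; i < n; i++) {
            sum += chained(i);
        }
        long t1 = System.nanoTime();
        for (long i = 0; i < n; i++) {
            sum += rebalanced(i);
        }
        long t2 = System.nanoTime();

        // Timings vary by machine and JIT warm-up; only the relative gap is of interest.
        System.out.println("chained:    " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("rebalanced: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("checksum:   " + sum);
    }
}
```

Both paths compute the same result for every record. The only difference is the per-hop serialization work, which is the part of the cost that chaining avoids.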
On Fri, Aug 10, 2018, 5:58 AM Fabian Hueske <fhue...@gmail.com> wrote:

> Hi,
>
> Elias and Paul have good points.
> I think the performance degradation is mostly due to the lack of function
> chaining in the rebalance case.
>
> If all steps are just map functions, they can be chained in the
> no-rebalance case.
> That means records are passed via function calls.
> If you add rebalancing, records will be passed between map functions via
> serialization, network transfer, and deserialization.
> This is of course much more expensive than calling a method.
>
> Best, Fabian
>
> 2018-08-10 4:25 GMT+02:00 Paul Lam <paullin3...@gmail.com>:
>
>> Hi Antonio,
>>
>> AFAIK, there are two reasons for this:
>>
>> 1. Rebalancing itself brings latency because it takes time to
>> redistribute the elements.
>> 2. Rebalancing also messes up the order in the Kafka topic partitions,
>> and often makes an event-time window wait longer to trigger in case you’re
>> using the event time characteristic.
>>
>> Best Regards,
>> Paul Lam
>>
>>
>>
>> On Aug 10, 2018, at 05:49, antonio saldivar <ansal...@gmail.com> wrote:
>>
>> Hello
>>
>> Sending ~450 elements per second (the values are in milliseconds, start
>> to end)
>> I went from:
>> with Rebalance
>> +------------+
>> | AVGWINDOW  |
>> +------------+
>> | 32131.0853 |
>> +------------+
>>
>> to this without rebalance
>>
>> +------------+
>> | AVGWINDOW  |
>> +------------+
>> | 70.2077    |
>> +------------+
>>
>> On Thu, Aug 9, 2018 at 17:42, Elias Levy (<
>> fearsome.lucid...@gmail.com>) wrote:
>>
>>> What do you consider a lot of latency? The rebalance will require
>>> serializing / deserializing the data as it gets distributed. Depending on
>>> the complexity of your records and the efficiency of your serializers, that
>>> could have a significant impact on your performance.
>>>
>>> On Thu, Aug 9, 2018 at 2:14 PM antonio saldivar <ansal...@gmail.com>
>>> wrote:
>>>
>>>> Hello
>>>>
>>>> Does anyone know why adding "rebalance()" to my .map steps adds
>>>> a lot of latency compared to not having rebalance?
>>>>
>>>>
>>>> I have 44 Kafka partitions in my topic and 44 Flink task managers.
>>>>
>>>> The execution plan looks like this when I add rebalance, but it is adding a
>>>> lot of latency:
>>>>
>>>> kafka-src -> rebalance -> step1 -> rebalance -> step2 -> rebalance ->
>>>> kafka-sink
>>>>
>>>> Thank you
>>>> regards
>>>>
>>>>
>>
>