Hi,

Another way to ensure ordering is to add a logical version number to each message, so that an earlier version can never override a later one. Relying on timestamps depends on your NTP servers working correctly.
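To make the version-number idea concrete, here is a minimal sketch in plain Python (not Flink code). The names `Position`, `LatestPositionStore`, `account_id`, `version` and `quantity` are hypothetical and only for illustration; it assumes the producer assigns a version that increases per account. In a Flink job the same comparison would presumably sit in per-key state after the keyBy.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Position:
    account_id: str  # hypothetical key field
    version: int     # logical version assigned by the producer, increasing per account
    quantity: int


class LatestPositionStore:
    """Keeps only the highest-version position seen for each account."""

    def __init__(self) -> None:
        self._latest: Dict[str, Position] = {}

    def apply(self, update: Position) -> bool:
        """Apply an update; return False if it is stale and was ignored."""
        current = self._latest.get(update.account_id)
        if current is not None and update.version <= current.version:
            # An older (or duplicate) version must not override newer state.
            return False
        self._latest[update.account_id] = update
        return True

    def get(self, account_id: str) -> Optional[Position]:
        return self._latest.get(account_id)
```

With this guard, an update that arrives late is simply dropped, so the result no longer depends on arrival order or on clock synchronisation between producers.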
On Sun, Jul 29, 2018 at 3:52 PM Niels Basjes <ni...@basjes.nl> wrote:

> Hi,
>
> The basic thing is that you will only get the messages in a guaranteed
> order if the order is maintained in all steps from creation to use.
> In Kafka, order is only guaranteed for messages in the same partition.
> So if you need them in order by account, the producing system must use
> the accountid as the key to force a specific account into a specific
> Kafka partition.
> Then the Flink Kafka source will read them sequentially in the right
> order, but in order to KEEP them in that order you should really do a keyBy
> immediately after reading and use only keyed streams from that point
> onwards.
> As soon as you shuffle or key by a different key, the ordering
> within an account is no longer guaranteed.
>
> In general I always put a very accurate timestamp in all of my events
> (epoch milliseconds, in some cases even epoch microseconds) so I can always
> check if an order problem occurred.
>
> Niels Basjes
>
> On Sun, Jul 29, 2018 at 9:25 AM, Congxian Qiu <qcx978132...@gmail.com>
> wrote:
>
>> Hi,
>> Maybe the messages of the same key should be in the *same partition* of
>> the Kafka topic.
>>
>> 2018-07-29 11:01 GMT+08:00 Hequn Cheng <chenghe...@gmail.com>:
>>
>>> Hi Harshvardhan,
>>> If 1. the messages exist on the same topic, 2. there is no rebalance,
>>> and 3. you keyBy on the same field with the same value, the answer is yes.
>>>
>>> Best, Hequn
>>>
>>> On Sun, Jul 29, 2018 at 3:56 AM, Harshvardhan Agrawal <
>>> harshvardhan.ag...@gmail.com> wrote:
>>>
>>>> Hey,
>>>>
>>>> The messages will exist on the same topic. I intend to keyBy on the
>>>> same field. The question is whether the two messages will be mapped to
>>>> the same task manager and the same slot. Also, will they be processed in
>>>> the correct order given that they have the same keys?
>>>>
>>>> On Fri, Jul 27, 2018 at 21:28 Hequn Cheng <chenghe...@gmail.com> wrote:
>>>>
>>>>> Hi Harshvardhan,
>>>>>
>>>>> There are a number of factors to consider.
>>>>> 1. The consecutive Kafka messages must exist in the same Kafka topic.
>>>>> 2. The data should not be rebalanced. For example, operators should
>>>>> be chained in order to avoid rebalancing.
>>>>> 3. If you perform keyBy(), you should keyBy on a field for which the
>>>>> two consecutive messages share the same value.
>>>>>
>>>>> Best, Hequn
>>>>>
>>>>> On Sat, Jul 28, 2018 at 12:11 AM, Harshvardhan Agrawal <
>>>>> harshvardhan.ag...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We are currently using Flink to process financial data. We are
>>>>>> getting position data from Kafka and we enrich the positions with
>>>>>> account and product information. We are using ingestion time while
>>>>>> processing events. The question I have is: say I key the position
>>>>>> datastream by account number. If I have two consecutive Kafka messages
>>>>>> with the same account and product info, where the second one is an
>>>>>> updated position of the first one, does Flink guarantee that the
>>>>>> messages will be processed on the same slot in the same worker? We
>>>>>> want to ensure that we don't process them out of order.
>>>>>>
>>>>>> Thank you!
>>>>>> --
>>>>>> Regards,
>>>>>> Harshvardhan
>>>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Harshvardhan
>>
>> --
>> Blog:http://www.klion26.com
>> GTalk:qcx978132955
>> Follow your heart
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes

--
Liu, Renjie
Software Engineer, MVAD
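P.S. As a rough illustration of the two points Niels makes in the quoted thread (key by account so one account always lands in one partition, and carry a timestamp so ordering problems can be detected), here is a small sketch in plain Python. It is not Kafka or Flink code: `partition_for` and `out_of_order_events` are hypothetical helpers, and the CRC32 hash is only a stand-in, since Kafka's default partitioner uses a different hash function.

```python
import zlib
from typing import Dict, Iterable, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically.

    Stand-in hash for illustration only; the point is that the same key
    always maps to the same partition while the partition count is fixed.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def out_of_order_events(events: Iterable[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return events whose timestamp is older than one already seen for that account.

    `events` are (account_id, epoch_millis) pairs in arrival order.
    """
    latest: Dict[str, int] = {}
    late: List[Tuple[str, int]] = []
    for account_id, ts in events:
        if account_id in latest and ts < latest[account_id]:
            late.append((account_id, ts))
        else:
            latest[account_id] = ts
    return late
```

The check only compares events within one account, which matches the guarantee discussed above: ordering holds per key, not across keys.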