Yes, Kafka for both source and sink, which makes monitoring what goes into and out of Flink easy.

Michael

> On Apr 26, 2018, at 5:27 PM, Dhruv Kumar <gargdhru...@gmail.com> wrote:
>
> Ok, that answers my questions.
>
> What are you keeping the source and sink as? Is it Kafka for both?
>
> --------------------------------------------------
> Dhruv Kumar
> PhD Candidate
> Department of Computer Science and Engineering
> University of Minnesota
> www.dhruvkumar.me <http://www.dhruvkumar.me/>
>
>> On Apr 26, 2018, at 16:37, TechnoMage <mla...@technomage.com> wrote:
>>
>> Yes, NTP can still have skew. It may be measured in fractions of a second,
>> but with Flink that can be significant if you care about sub-second latency
>> accuracy. Since I have a 20-stage stream with 0.002 second latency, it can
>> matter.
>>
>> Back pressure is the limiting of input due to the inability of downstream
>> tasks to accept input. For example, if you have a map that reads from a
>> database to enhance an element, that may limit the performance of earlier
>> steps, as they cannot push elements to it faster than it can read from the
>> database. This can flow all the way back to the source and slow records
>> coming into the system.
>>
>> Michael
>>
>>> On Apr 26, 2018, at 12:38 PM, Dhruv Kumar <gargdhru...@gmail.com> wrote:
>>>
>>> What do you mean by the time skew from one machine (source) to
>>> another (sink)? Do you mean the system clocks of the source and sink
>>> may not be in sync? If I regularly use NTP to keep the system clocks in
>>> sync, will time skew still happen?
>>>
>>> Could you also elaborate on what you mean by back pressure on the source
>>> and how it will impact the latency calculations?
>>>
>>> Sorry if these are trivial questions. I am a bit new to real-world
>>> streaming systems.
>>>
>>>> On Apr 26, 2018, at 13:26, TechnoMage <mla...@technomage.com> wrote:
>>>>
>>>> In a single-machine system this may work ok. In a multi-machine system
>>>> it is less reliable, because the time skew from one machine (source) to
>>>> another (sink) can affect the measurements. It also does not account
>>>> for back pressure on the source. We are using an external process that
>>>> reads the source and the output of the sink in parallel and measures the
>>>> latency on a single system clock. It does account for those issues, but
>>>> of course does not account for delivery delays in the messaging system
>>>> (Kafka in our case). But it does measure real-world latency as seen by
>>>> the rest of the system, which is ultimately what matters to us.
>>>>
>>>> Michael
>>>>
>>>>> On Apr 26, 2018, at 12:01 PM, Dhruv Kumar <gargdhru...@gmail.com> wrote:
>>>>>
>>>>> Hi
>>>>>
>>>>> I was trying to compute the end-to-end latency for each record
>>>>> processed by Flink. By end-to-end latency, I mean the difference
>>>>> between the time at which the record entered the Flink system (arrived
>>>>> at the source) and the time at which the record is finally emitted
>>>>> into the sink. What is the best way to measure this? I was thinking of
>>>>> doing the following:
>>>>> 1. Add the current system timestamp to the record when the record
>>>>>    arrives at Flink.
>>>>> 2. Add the current system timestamp to the record when the record is
>>>>>    finally being emitted into the sink.
>>>>> 3. Take the difference between 2 and 1 offline when all the records
>>>>>    have been written into the sink.
>>>>>
>>>>> Does this sound ok?
>>>>>
>>>>> Also, if I use the Processing time characteristic for this end-to-end
>>>>> latency, will it be fine?
>>>>>
>>>>> Thanks
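
For anyone finding this thread later, here is a minimal sketch of the "external process on a single clock" idea described in the quoted replies. It is not the code we run; it only shows the bookkeeping. It assumes every record carries a unique id that is visible on both the source and the sink side, and the names LatencyObserver, on_source and on_sink are made up for illustration. Wiring it to real Kafka consumers is left out, and if the two topics are read from separate threads the calls would need a lock.

```python
import time
from typing import Callable, Dict, List, Optional


class LatencyObserver:
    """Pairs source and sink sightings of a record using one local clock.

    Hypothetical helper: both timestamps come from the observer's own clock,
    so clock skew between the source and sink machines does not enter the
    measurement. Delivery delay in the messaging system is still included.
    """

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._seen_at_source: Dict[str, float] = {}
        self.latencies: List[float] = []

    def on_source(self, record_id: str) -> None:
        # Call when the record is observed on the source topic.
        # Only the first sighting of an id is kept.
        if record_id not in self._seen_at_source:
            self._seen_at_source[record_id] = self._clock()

    def on_sink(self, record_id: str) -> Optional[float]:
        # Call when the record is observed on the sink topic.
        # Returns the latency in seconds, or None for an unknown id.
        started = self._seen_at_source.pop(record_id, None)
        if started is None:
            return None
        latency = self._clock() - started
        self.latencies.append(latency)
        return latency
```

In use, the process would call on_source(id) for each message it reads from the input topic and on_sink(id) for each message it reads from the output topic, then summarize the latencies list (mean, percentiles) however is useful.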