Hi Stephan,

I'm already familiar with the latency markers in Flink 1.2, but one question about them still bothers me: how does Flink measure end-to-end latency when dealing with e.g. aggregations?
Suppose you have a topology ingesting data from Kafka, and you want to output the frequency per key. In this case, the sink is just given tuples of (key: String, frequency: Int).

> On 25 Jan 2017, at 16:11, Stephan Ewen <se...@apache.org> wrote:
>
> Hi!
>
> There are new latency metrics in Flink 1.2 that you can use. They are
> sampled, so not on every record.
>
> You can always attach your own timestamps, in order to measure the latency of
> specific records.
>
> Stephan
>
>
> On Fri, Dec 16, 2016 at 5:02 PM, Meghashyam Sandeep V
> <vr1meghash...@gmail.com> wrote:
> Hi Stephan,
>
> Thanks for your answer. Is there a way to get the metrics such as latency of
> each message in the stream? For eg. I have a Kafka source, Cassandra sink
> and I do some processing in between. I would like to know how long does it
> take for each message from the beginning(entering flink streaming from kafka)
> to end(sending/executing the query).
>
> On Fri, Dec 16, 2016 at 7:36 AM, Stephan Ewen <se...@apache.org> wrote:
> Hi!
>
> I am not sure there exists a recommended benchmarking tool. Performance
> comparisons depend heavily on the scenarios you are looking at: Simple event
> processing, shuffles (grouping aggregation), joins, small state, large state,
> etc...
>
> As far as I know, most people try to write a "mock" version of a job that is
> representative for the jobs they want to run, and test with that.
>
> That said, I agree that it would actually be helpful to collect some jobs in
> a form of "evaluation suite".
>
> Stephan
>
>
>
> On Thu, Dec 15, 2016 at 6:11 PM, Meghashyam Sandeep V
> <vr1meghash...@gmail.com> wrote:
> Hi There,
>
> We are evaluating Flink streaming for real time data analysis. I have my
> flink job running in EMR with Yarn. What are the possible benchmarking tools
> that work best with Flink? I couldn't find this information in the Apache
> website.
>
> Thanks,
> Sandeep
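
To make the aggregation case concrete, here is a minimal sketch of the custom-timestamp approach mentioned in the quoted reply. It is plain Python and involves no Flink API. The names `Count`, `count_per_key`, and `sink_latencies` are hypothetical, and so is the choice to carry both the oldest and the newest ingestion timestamp through the aggregate. This is only one possible convention for defining the latency of an aggregated result, and it does not describe how Flink's latency markers behave.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple


@dataclass
class Count:
    """Aggregate handed to the sink, extended with ingestion timestamps."""
    key: str
    frequency: int
    first_ingest_ms: int  # ingestion time of the oldest contributing record
    last_ingest_ms: int   # ingestion time of the newest contributing record


def count_per_key(records: Iterable[Tuple[str, int]]) -> Iterator[Count]:
    """Emit an updated count for every incoming (key, ingest_ms) record."""
    state: Dict[str, Count] = {}
    for key, ingest_ms in records:
        prev = state.get(key)
        if prev is None:
            cur = Count(key, 1, ingest_ms, ingest_ms)
        else:
            cur = Count(
                key,
                prev.frequency + 1,
                min(prev.first_ingest_ms, ingest_ms),
                max(prev.last_ingest_ms, ingest_ms),
            )
        state[key] = cur
        yield cur


def sink_latencies(count: Count, now_ms: int) -> Tuple[int, int]:
    """Return (latency of the newest input, latency of the oldest input)."""
    return now_ms - count.last_ingest_ms, now_ms - count.first_ingest_ms


if __name__ == "__main__":
    # Timestamps are made-up millisecond values for illustration.
    records = [("a", 1000), ("b", 1010), ("a", 1040)]
    for result in count_per_key(records):
        print(result.key, result.frequency, sink_latencies(result, now_ms=1100))
```

With this convention the sink can report two different numbers for the same (key, frequency) tuple, which is exactly why the aggregation case is ambiguous: the latency of the record that triggered the update, and the age of the oldest record folded into the result.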