Hi Stephan,

I’m already familiar with the latency markers in Flink 1.2, but one question
about them still bothers me: how does Flink measure end-to-end latency when
the job contains, e.g., aggregations?

Suppose you have a topology ingesting data from Kafka, and you want to output
the frequency per key. In this case, the sink only receives tuples of
(key: String, frequency: Int), so an output record no longer corresponds to a
single input record.
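
To make the question concrete, here is a rough plain-Java sketch (no Flink API
involved) of what I imagine "attaching your own timestamps" would look like
for this job. The names LatencySketch, Freq, add and latencyMillis are made up
for illustration, and keeping the newest ingestion timestamp per key is only
one assumption about what "latency of an aggregate" could mean:

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch, not Flink code: carry a timestamp through a per-key count.
final class LatencySketch {

    // Hypothetical aggregate: the running count plus the ingestion timestamp
    // of the newest record that contributed to it.
    static final class Freq {
        final String key;
        final int frequency;
        final long lastIngestMillis;

        Freq(String key, int frequency, long lastIngestMillis) {
            this.key = key;
            this.frequency = frequency;
            this.lastIngestMillis = lastIngestMillis;
        }
    }

    // Per-key state, standing in for whatever keyed state the real job uses.
    private final Map<String, Freq> state = new HashMap<>();

    // Called once per input record; ingestMillis is a timestamp we attach
    // ourselves when the record is read from the source.
    Freq add(String key, long ingestMillis) {
        Freq old = state.get(key);
        int count = (old == null) ? 1 : old.frequency + 1;
        long newest = (old == null)
                ? ingestMillis
                : Math.max(old.lastIngestMillis, ingestMillis);
        Freq updated = new Freq(key, count, newest);
        state.put(key, updated);
        return updated;
    }

    // At the sink: time between ingesting the newest contributing record
    // and emitting the aggregate that contains it.
    static long latencyMillis(Freq out, long sinkMillis) {
        return sinkMillis - out.lastIngestMillis;
    }
}
```

With this, the sink would see (key, frequency, lastIngestMillis) instead of
(key, frequency) and could report latencyMillis(out, System.currentTimeMillis()).
Keeping the oldest contributing timestamp instead would measure something
different, which is exactly the ambiguity I am asking about.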

> On 25 Jan 2017, at 16:11, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi!
> 
> There are new latency metrics in Flink 1.2 that you can use. They are 
> sampled, so they are not measured on every record.
> 
> You can always attach your own timestamps, in order to measure the latency of 
> specific records.
> 
> Stephan
> 
> 
> On Fri, Dec 16, 2016 at 5:02 PM, Meghashyam Sandeep V 
> <vr1meghash...@gmail.com <mailto:vr1meghash...@gmail.com>> wrote:
> Hi Stephan,
> 
> Thanks for your answer. Is there a way to get metrics such as the latency of 
> each message in the stream? For example, I have a Kafka source and a Cassandra 
> sink, and I do some processing in between. I would like to know how long each 
> message takes from the beginning (entering Flink streaming from Kafka) to the 
> end (sending/executing the query). 
> 
> On Fri, Dec 16, 2016 at 7:36 AM, Stephan Ewen <se...@apache.org 
> <mailto:se...@apache.org>> wrote:
> Hi!
> 
> I am not sure there is a recommended benchmarking tool. Performance 
> comparisons depend heavily on the scenarios you are looking at: simple event 
> processing, shuffles (grouping aggregation), joins, small state, large state, 
> etc.
> 
> As far as I know, most people try to write a "mock" version of a job that is 
> representative of the jobs they want to run, and test with that.
> 
> That said, I agree that it would actually be helpful to collect some jobs in 
> the form of an "evaluation suite".
> 
> Stephan
> 
> 
> 
> On Thu, Dec 15, 2016 at 6:11 PM, Meghashyam Sandeep V 
> <vr1meghash...@gmail.com <mailto:vr1meghash...@gmail.com>> wrote:
> Hi There,
> 
> We are evaluating Flink streaming for real-time data analysis. I have my 
> Flink job running on EMR with YARN. Which benchmarking tools work best with 
> Flink? I couldn't find this information on the Apache website. 
> 
> Thanks,
> Sandeep
> 
> 
> 
