Re: Measuring latency in a DataStream

2016-05-03 Thread Robert Schmidtke
After fixing the clock issue on the application level, the latency is as expected. Thanks again!
Robert

On Tue, May 3, 2016 at 9:54 AM, Robert Schmidtke wrote:
> Hi Igor, thanks for your reply.
>
> As for your first point I'm not sure I understand correctly. I'm ingesting
> records at a rate of

Re: Measuring latency in a DataStream

2016-05-03 Thread Robert Schmidtke
Hi Igor, thanks for your reply.

As for your first point, I'm not sure I understand correctly. I'm ingesting records at a rate of about 50k records per second, and those records are fairly small. If I add a timestamp to each of them, I will have a lot more data, which is not exactly what I want. In

Re: Measuring latency in a DataStream

2016-05-02 Thread Igor Berman
1. Why are you doing a join instead of something like System.currentTimeMillis()? In the end you have a tuple of your data with a timestamp anyway, so why not just wrap your data in a Tuple2 that also carries the creation timestamp?
2. Are you sure that the consumer and producer machines' clocks are in sync? you
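
For illustration, the wrapping idea from point 1 might look roughly like the sketch below. It is not code from this thread: the class `LatencySketch` and the helpers `stamp` and `latencyMillis` are hypothetical names, and a plain `Map.Entry` stands in for Flink's `Tuple2` so that the example runs with the JDK alone.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

public class LatencySketch {

    // Hypothetical helper: pair a record with its creation time (ms since epoch).
    // Map.Entry is used here as a stand-in for a Tuple2<T, Long>.
    public static <T> Map.Entry<T, Long> stamp(T record, long createdAtMillis) {
        return new SimpleImmutableEntry<>(record, createdAtMillis);
    }

    // Hypothetical helper: latency as seen by the consumer at time nowMillis.
    // The result is only meaningful if producer and consumer clocks agree.
    public static long latencyMillis(Map.Entry<?, Long> stamped, long nowMillis) {
        return nowMillis - stamped.getValue();
    }

    public static void main(String[] args) {
        // Producer side: attach the creation timestamp when the record is created.
        Map.Entry<String, Long> stamped = stamp("payload", System.currentTimeMillis());

        // Consumer side: subtract the creation timestamp from the local clock.
        long latency = latencyMillis(stamped, System.currentTimeMillis());
        System.out.println("latency in ms: " + latency);
    }
}
```

If the two sides run on different machines, a clock offset shifts every measurement by the same amount, which is why point 2 matters: a negative or implausibly large latency usually points to unsynchronized clocks rather than to the pipeline itself.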

Measuring latency in a DataStream

2016-05-02 Thread Robert Schmidtke
Hi everyone, I have implemented a way to measure latency in a DataStream (I hope): I'm consuming a Kafka topic and I'm union'ing the resulting stream with a custom source that emits a (machine-local) timestamp every 1000ms (using currentTimeMillis). On the consuming end I'm distinguishing between