Re: Review Request 34170: Patch for KAFKA-2191

Jay Kreps Fri, 15 May 2015 10:54:10 -0700


> On May 13, 2015, 11:50 p.m., Aditya Auradkar wrote:
> > clients/src/main/java/org/apache/kafka/common/metrics/stats/SampledStat.java,
> >  line 45
> > <https://reviews.apache.org/r/34170/diff/2/?file=958605#file958605line45>
> >
> >     I think this is a good catch.
> >     
> >     Just so I understand, this is adding new samples for blank intervals of 
> > time when record was not called. As an optimization, I think you should 
> > only add "MetricConfig.samples" number of samples. In the rare case, there 
> > is no activity for 10 mins (say), this will add 10sec*6*10 = 600 samples 
> > which will be purged immediately on the next record call.
> 
> Jay Kreps wrote:
>     This is a really good point. It is totally possible for a metric to track 
> activity on a topic that has no writes for a month, the first write would 
> then cause you to cycle through a month of samples. The logic around 
> correctly skipping these windows and calculating the correct window boundry 
> needs to be carefully worked out.
> 
> Jay Kreps wrote:
>     Actually maybe you can elaborate on why this is needed at all? In the 
> current code if the current sample is complete we add a new sample. Your code 
> adds lots of samples. But why do we need that? Isn't purging obsolete samples 
> handled on measurement already? I think that is more elegant. You must see 
> some issue there, maybe you can explain?
> 
> Dong Lin wrote:
>     Adi: Good catch! I will update the patch to fix the problem.
>     
>     Jay: Sure. Here is the problem I find with the current code.
>     
>     - The current code implements SampledStat.advance in a way that is 
> probably different from users' expectation: typically when rate is sampled, 
> time will be divided into consecutive slots of equal length and samples are 
> expired in unit of these time slots, which assures the user the rate is 
> measured on samples collected in the past expireAge period. The current 
> implementation doesn't provide such an easy-to-understand interpretations, 
> since the effective sampled period can be anywhere between 0 and expireAge.
>     
>     - As one extreme example, say we call rate.measure after expireAge (i.e. 
> config.samples() * config.timeWindowMs()) has passed since last sample. Then 
> purgeObsoleteSamples will reset all samples and elapsed will be 0 as a 
> result. This is problematic: rate should instead keep the information that we 
> haven’t observed any sample during the past expireAge.
>     
>     Does it make sense?

Hey Dong, I think what you are saying is that the last sample is partial, that 
is it doesn't cover a full window yet. This is true and is by design. That is 
the whole reason there are multiple samples--to stabalize the partial estimate. 
The ideal window would be a backwards looking sliding window of N ms from the 
time of measurement but that would be computationally unfeasible.

I don't understand your example. If no events have occurred in config.samples() 
* config.timeWindowMs() then the observed rate is 0, right? I think you are 
saying that the time estimate should be the full window, but that is fixed by 
changing the other formula as I suggested.

- Jay

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34170/#review83687
-----------------------------------------------------------

On May 14, 2015, 7:34 a.m., Dong Lin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34170/
> -----------------------------------------------------------
> 
> (Updated May 14, 2015, 7:34 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2191
>     https://issues.apache.org/jira/browse/KAFKA-2191
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> KAFKA-2191; Measured rate should not be infinite
> 
> 
> Diffs
> -----
> 
>   clients/src/main/java/org/apache/kafka/common/metrics/stats/Rate.java 
> 98429da34418f7f1deba1b5e44e2e6025212edb3 
>   
> clients/src/main/java/org/apache/kafka/common/metrics/stats/SampledStat.java 
> b341b7daaa10204906d78b812fb05fd27bc69373 
> 
> Diff: https://reviews.apache.org/r/34170/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Dong Lin
> 
>

Re: Review Request 34170: Patch for KAFKA-2191

Reply via email to