> On March 23, 2014, 11:57 p.m., Jun Rao wrote:
> > Thanks for the patch. Some comments.
> > 
> > 1. For producer level stats (ie., not node or topic level), for jmx beans, 
> > the package name becomes kafka.producer (since the bean name is 
> > kafka.producer:type=client-id). For the topic and node level stats, jmx 
> > beans have the package name kafka.producer.client-id. Perhaps we can name 
> > the producer level stats under kafka.producer.client-id.aggregate?
> > 
> > 2. Could we make the metrics.timeWindowNs and metrics.samples configurable?
> > 
> > 3. Also, in Metrics, what's the purpose of eventWindow? It's not being used 
> > and I am not sure if there is a case when we will roll the window by count, 
> > instead of size.
> > 
> > 4. When testing using console producer, I sent 1 message and expected the 
> > topic level message rate to be non-zero for at least 30 secs (on average, I 
> > expect the non-zero value to last 45 secs). However, in multiple tests, the 
> > number drops to 0 well before 30 secs.
> > 
> > 5. Metric: Our measurement returns a double. However, we don't synchronize 
> > btw the reader and the writer of the double. Since double access is not 
> > guaranteed to be atomic in java, we could see incorrect value in the output.
> > 
> > 6. Not sure why we need to change the license header. Perhaps we can have a 
> > separate jira to standardize all headers if there is inconsistency?
> 
> Jay Kreps wrote:
>     1. Yes, that makes sense.
>     2. Yes, will do.
>     3. The windows are defined by a time AND event count whichever is 
> triggered first. I am not currently using this but it allows for more 
> responsive metrics updates for metrics where data is plentiful. That is 1000 
> measurements should be enough to get a reasonably low variance measurement so 
> you could make your updates more repsonsive by setting the window to be 1000 
> or 30 seconds, so that frequent metrics would potentially shrink the window 
> to just a few seconds.
>     4. There was a bug that Guozhang caught which I think was causing this. 
> Will be fixed in the next patch.
>     5. Access to the metric value is synchronized by the sensor in 
> KafkaMetric.
>     6. The licenses weren't at the default comment length. I can file a patch 
> to change them all.
> 
> Jun Rao wrote:
>     5. The synchronization in KafkaMetric only handles the reads of the 
> metric. The updates don't seem to be synchronized on the same lock, right?
>     
>     7. Also, noticed that when calculating a reading, we used the elapsed 
> time since the oldest update time, which may not align on the window 
> boundary. Say we have a window of 30 secs and an update comes in at 20 secs. 
> When calculating a rate, we will only account for 10 secs in that window. 
> However, actually only 1 event shows up in the whole 30 secs window.
> 
> Jay Kreps wrote:
>     5. The sensor's lock is used for the metrics the sensor owns so all reads 
> and writes go through that. The sensors acquires and releases the lock for 
> all metric updates and the metric acquires it for reads.
>     7. You raise a good point, though I'm not sure if your description is 
> right either. Currently I compute the elapsed time across all samples first. 
> However if one of the samples is very old (e.g. in the case where there have 
> been no updates) this elapsed time will be very long. However when I compute 
> the value across samples I will purge the old samples. This is backwards, I 
> need to first compute the value and do the purge, then compute the elapsed 
> time off the remaining samples so that these two numbers are comparable.
> 
> Jun Rao wrote:
>     5. Got it. It wasn't obvious to me that KafkaMetrics.lock is actually the 
> sensor lock. Perhaps we could rename it to sensorLock?
>     
>     7. Yes, that's another issue. My original question is a bit different. 
> Say we keep one sample with a window size of 30 secs. At time 20 sec an event 
> is recorded. We will use 1/(now - 20) to calculate the rate. However, there 
> wasn't any update from 0 to 20. So, may be the rate should be 1/(now - 0)?

5. Yeah it is a little tricky. I'd rather not rename as in the case of metrics 
which have no sensor associated with them that doesn't make sense.
7. I don't think I follow what you are saying. If over the course of 20 seconds 
we see one event, the rate is 1/20th per second. Discarding time in which we 
saw no events would artificially decrease the rate--that is if we get one event 
every 5 seconds we should not report the rate as 1 per second, but rather 1/5 
per second. Maybe I am misunderstanding.


- Jay


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19388/#review38261
-----------------------------------------------------------


On March 20, 2014, 12:30 a.m., Jay Kreps wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19388/
> -----------------------------------------------------------
> 
> (Updated March 20, 2014, 12:30 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1251
>     https://issues.apache.org/jira/browse/KAFKA-1251
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> KAFKA-1251: Add metrics to the producer.
> 
> 
> Diffs
> -----
> 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 1ac69436f117800815b8d50f042e9e2a29364b43 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/Metadata.java
>  33d62a4b83fbab5b22b91b22f6b744af1c98d262 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java
>  673b2962771c28ceb3c7a6c0fd6f69521bd7ed16 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordBatch.java
>  038a05a94b795ec0a95b2d40a89222394b5a74c4 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
> 565331dfb9cd1d65be37ed97830aa42e44d2e127 
>   
> clients/src/main/java/org/apache/kafka/clients/tools/ProducerPerformance.java 
> 3ebbb804242be6a001b3bae6524afccc85a87602 
>   clients/src/main/java/org/apache/kafka/common/metrics/Metrics.java 
> 6db2dfbe94c940efa37463298f0b0b1893e646e1 
>   clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java 
> 7e4849b7a148009c8a878349d7f0239108ccad8c 
>   clients/src/main/java/org/apache/kafka/common/metrics/stats/Rate.java 
> 3b0454f26490d1f4a2a80efb00165fc72587fbf8 
>   
> clients/src/main/java/org/apache/kafka/common/metrics/stats/SampledStat.java 
> f8b413a8c273cdad56177fbc6971fece4feb86b3 
>   clients/src/main/java/org/apache/kafka/common/network/ByteBufferSend.java 
> 9305b61ddeaa2bb400cbbb6d3c99c8ecaade6b8f 
>   clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java 
> 51d4892dfc18580e5e213d386c5de387a47d3c6b 
>   clients/src/main/java/org/apache/kafka/common/network/Selector.java 
> 983963200ce81614577cd6182a5d2f10c22b95d4 
>   clients/src/test/java/org/apache/kafka/clients/producer/MetadataTest.java 
> 09a5355d25a3b94c8e23caa2adc77cb1c59368b9 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/RecordAccumulatorTest.java
>  ed5690641a22fbe4bd91b0c6055d465944b08c06 
>   clients/src/test/java/org/apache/kafka/clients/producer/SenderTest.java 
> 12c9500ce4387306ab5aa7a5781b4aca52b86604 
>   clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java 
> 90e2dcf5434db546387302fb0219edfdb363592e 
>   clients/src/test/java/org/apache/kafka/test/MetricsBench.java 
> 7239b4a56e93f019e66aa2cf2aa9b04c26908bfd 
> 
> Diff: https://reviews.apache.org/r/19388/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jay Kreps
> 
>

Reply via email to