> On March 23, 2014, 11:57 p.m., Jun Rao wrote: > > Thanks for the patch. Some comments. > > > > 1. For producer level stats (ie., not node or topic level), for jmx beans, > > the package name becomes kafka.producer (since the bean name is > > kafka.producer:type=client-id). For the topic and node level stats, jmx > > beans have the package name kafka.producer.client-id. Perhaps we can name > > the producer level stats under kafka.producer.client-id.aggregate? > > > > 2. Could we make the metrics.timeWindowNs and metrics.samples configurable? > > > > 3. Also, in Metrics, what's the purpose of eventWindow? It's not being used > > and I am not sure if there is a case when we will roll the window by count, > > instead of size. > > > > 4. When testing using console producer, I sent 1 message and expected the > > topic level message rate to be non-zero for at least 30 secs (on average, I > > expect the non-zero value to last 45 secs). However, in multiple tests, the > > number drops to 0 well before 30 secs. > > > > 5. Metric: Our measurement returns a double. However, we don't synchronize > > btw the reader and the writer of the double. Since double access is not > > guaranteed to be atomic in java, we could see incorrect value in the output. > > > > 6. Not sure why we need to change the license header. Perhaps we can have a > > separate jira to standardize all headers if there is inconsistency? > > Jay Kreps wrote: > 1. Yes, that makes sense. > 2. Yes, will do. > 3. The windows are defined by a time AND event count whichever is > triggered first. I am not currently using this but it allows for more > responsive metrics updates for metrics where data is plentiful. That is 1000 > measurements should be enough to get a reasonably low variance measurement so > you could make your updates more repsonsive by setting the window to be 1000 > or 30 seconds, so that frequent metrics would potentially shrink the window > to just a few seconds. > 4. There was a bug that Guozhang caught which I think was causing this. > Will be fixed in the next patch. > 5. Access to the metric value is synchronized by the sensor in > KafkaMetric. > 6. The licenses weren't at the default comment length. I can file a patch > to change them all. > > Jun Rao wrote: > 5. The synchronization in KafkaMetric only handles the reads of the > metric. The updates don't seem to be synchronized on the same lock, right? > > 7. Also, noticed that when calculating a reading, we used the elapsed > time since the oldest update time, which may not align on the window > boundary. Say we have a window of 30 secs and an update comes in at 20 secs. > When calculating a rate, we will only account for 10 secs in that window. > However, actually only 1 event shows up in the whole 30 secs window. > > Jay Kreps wrote: > 5. The sensor's lock is used for the metrics the sensor owns so all reads > and writes go through that. The sensors acquires and releases the lock for > all metric updates and the metric acquires it for reads. > 7. You raise a good point, though I'm not sure if your description is > right either. Currently I compute the elapsed time across all samples first. > However if one of the samples is very old (e.g. in the case where there have > been no updates) this elapsed time will be very long. However when I compute > the value across samples I will purge the old samples. This is backwards, I > need to first compute the value and do the purge, then compute the elapsed > time off the remaining samples so that these two numbers are comparable. > > Jun Rao wrote: > 5. Got it. It wasn't obvious to me that KafkaMetrics.lock is actually the > sensor lock. Perhaps we could rename it to sensorLock? > > 7. Yes, that's another issue. My original question is a bit different. > Say we keep one sample with a window size of 30 secs. At time 20 sec an event > is recorded. We will use 1/(now - 20) to calculate the rate. However, there > wasn't any update from 0 to 20. So, may be the rate should be 1/(now - 0)?
5. Yeah it is a little tricky. I'd rather not rename as in the case of metrics which have no sensor associated with them that doesn't make sense. 7. I don't think I follow what you are saying. If over the course of 20 seconds we see one event, the rate is 1/20th per second. Discarding time in which we saw no events would artificially decrease the rate--that is if we get one event every 5 seconds we should not report the rate as 1 per second, but rather 1/5 per second. Maybe I am misunderstanding. - Jay ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19388/#review38261 ----------------------------------------------------------- On March 20, 2014, 12:30 a.m., Jay Kreps wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/19388/ > ----------------------------------------------------------- > > (Updated March 20, 2014, 12:30 a.m.) > > > Review request for kafka. > > > Bugs: KAFKA-1251 > https://issues.apache.org/jira/browse/KAFKA-1251 > > > Repository: kafka > > > Description > ------- > > KAFKA-1251: Add metrics to the producer. > > > Diffs > ----- > > clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java > 1ac69436f117800815b8d50f042e9e2a29364b43 > > clients/src/main/java/org/apache/kafka/clients/producer/internals/Metadata.java > 33d62a4b83fbab5b22b91b22f6b744af1c98d262 > > clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java > 673b2962771c28ceb3c7a6c0fd6f69521bd7ed16 > > clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordBatch.java > 038a05a94b795ec0a95b2d40a89222394b5a74c4 > > clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java > 565331dfb9cd1d65be37ed97830aa42e44d2e127 > > clients/src/main/java/org/apache/kafka/clients/tools/ProducerPerformance.java > 3ebbb804242be6a001b3bae6524afccc85a87602 > clients/src/main/java/org/apache/kafka/common/metrics/Metrics.java > 6db2dfbe94c940efa37463298f0b0b1893e646e1 > clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java > 7e4849b7a148009c8a878349d7f0239108ccad8c > clients/src/main/java/org/apache/kafka/common/metrics/stats/Rate.java > 3b0454f26490d1f4a2a80efb00165fc72587fbf8 > > clients/src/main/java/org/apache/kafka/common/metrics/stats/SampledStat.java > f8b413a8c273cdad56177fbc6971fece4feb86b3 > clients/src/main/java/org/apache/kafka/common/network/ByteBufferSend.java > 9305b61ddeaa2bb400cbbb6d3c99c8ecaade6b8f > clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java > 51d4892dfc18580e5e213d386c5de387a47d3c6b > clients/src/main/java/org/apache/kafka/common/network/Selector.java > 983963200ce81614577cd6182a5d2f10c22b95d4 > clients/src/test/java/org/apache/kafka/clients/producer/MetadataTest.java > 09a5355d25a3b94c8e23caa2adc77cb1c59368b9 > > clients/src/test/java/org/apache/kafka/clients/producer/RecordAccumulatorTest.java > ed5690641a22fbe4bd91b0c6055d465944b08c06 > clients/src/test/java/org/apache/kafka/clients/producer/SenderTest.java > 12c9500ce4387306ab5aa7a5781b4aca52b86604 > clients/src/test/java/org/apache/kafka/common/network/SelectorTest.java > 90e2dcf5434db546387302fb0219edfdb363592e > clients/src/test/java/org/apache/kafka/test/MetricsBench.java > 7239b4a56e93f019e66aa2cf2aa9b04c26908bfd > > Diff: https://reviews.apache.org/r/19388/diff/ > > > Testing > ------- > > > Thanks, > > Jay Kreps > >