Hi Guozhang, I also had the same thought about using the existing "dropped-records" metrics. However, I think in this case it would be better to use a new metric because dropped idempotent updates is an optimization, they do not represent missed records. The dropped idempotent updates in general do not change the result and so do not need a warn log message. Whereas dropped records due to expired windows, serialization errors, or lateness might be something concerning that need a warn log message.
Looking at the metrics, you would be happy to see "dropped-idempotent-updates" increase, because that means Streams gets rid of no-ops downstream, but you would be concerned if "dropped-records" would increase, because that means your records or the configuration of your app has issues. The "dropped-idempotent-updates" metric could also be an indication that you could further optimize your setup, by getting rid of idempotent updates further upstream. Best, Bruno On Tue, Mar 3, 2020 at 7:58 AM Guozhang Wang <wangg...@gmail.com> wrote: > > Hello Richard, > > Thanks for the KIP. I once reviewed it and was concerned about its effects > on stream time advancing. After reading the updated KIP I think it has > answered a lot of them already. > > I have a couple minor comments still, otherwise I'm +1: > > 1) I want to clarify that for operations resulted in KTables (not only > aggregations, but consider KTable#filter that may also result in a new > KTable), even if we drop emissions to the downstream topics we would still > append to the corresponding changelog if timestamp has changed. This is > because the timestamps on the changelog is read by the standby tasks which > relies on them to infer its own stream time advancing. > > 2) About the metrics, in KIP-444 we are consolidating all types of > scenarios that can cause dropped records to the same metrics: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-557%3A+Add+emit+on+change+support+for+Kafka+Streams > > late-records-drop: INFO at processor node level, replaced by INFO > task-level "dropped-records". > > skipped-records: INFO at thread and processor node level, replaced by INFO > task-level "dropped-records". > > expired-window-record-drop: DEBUG at state store level, replaced by INFO > task-level "dropped-records". > > The main idea is that instead of using different metrics to indicate > different types of scenarios, and users just alert on that single metrics. > When alert triggers, they can look into the log4j for its causes (we made > sure that all sensor recordings of this metric would be associated with a > warning log4j). > > So I'd suggest that instead of introducing a new per-node > "dropped-idempotent-updates", we just piggy-back on the existing task-level > metric; unless we think that idempotent drops are more frequent than others > and also they do not worth a warning log, in that case we can consider > break this metric down with different tags for example. > > Guozhang > > On Mon, Mar 2, 2020 at 1:59 PM Richard Yu <yohan.richard...@gmail.com> > wrote: > > > Hi all, > > > > Thanks for the votes so far! > > @Matthias or @Guozhang Wang <guozh...@confluent.io> it would be great to > > also get your input on this KIP. > > > > It looks to be pretty close to completion, so the finishing touches are all > > we need. :) > > > > Best, > > Richard > > > > On Mon, Mar 2, 2020 at 11:45 AM Ghassan Yammine < > > ghassan.yamm...@bazaarvoice.com> wrote: > > > > > Hello all, > > > > > > +1 (non-binding) > > > > > > Thanks, > > > > > > Ghassan > > > > > > On 3/2/20, 12:43 PM, "Bruno Cadonna" <br...@confluent.io> wrote: > > > > > > EXTERNAL: This email originated from outside of Bazaarvoice. Do not > > > click any links or open any attachments unless you trust the sender and > > > know the content is safe. > > > > > > > > > Hi Richard, > > > > > > +1 (non-binding) > > > > > > Best, > > > Bruno > > > > > > On Mon, Mar 2, 2020 at 4:33 PM John Roesler <vvcep...@apache.org> > > > wrote: > > > > > > > > Hi Richard, > > > > > > > > Thanks for the KIP! > > > > > > > > I'm +1 (binding) > > > > > > > > -john > > > > > > > > On Thu, Feb 27, 2020, at 14:40, Richard Yu wrote: > > > > > Hi all, > > > > > > > > > > I am proposing a new optimization to Kafka Streams which would > > > greatly > > > > > reduce the number of idempotent updates (or no-ops) in the Kafka > > > Streams > > > > > DAG. > > > > > A number of users have been interested in this feature, so it > > > would be nice > > > > > to pass this one in. > > > > > > > > > > For information, the KIP is described below: > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-557%3A+Add+emit+on+change+support+for+Kafka+Streams > > > > > > > > > > We aim to make Kafka Streams more efficient by adopting the "emit > > > on > > > > > change" reporting strategy. > > > > > > > > > > Please cast your vote! > > > > > > > > > > Best, > > > > > Richard > > > > > > > > > > > > > > > > > > > -- > -- Guozhang