Thanks Sophie for the KIP, a few quick thoughts:

1) The end-to-end latency includes both the processing latency of the task and the latency spent sitting in intermediate topics. Like Boyang mentioned above, I feel that the latency metric of a task A actually measures the latency of the sub-topology up to, but not including, the processing of A itself, which is a bit weird.
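Just to double-check my reading of the recording point, here is a rough sketch of what I picture -- purely illustrative, the class and method names below are made up and not from the KIP:

import java.time.Duration;

/**
 * Illustrative sketch only (not the KIP's implementation): what a task-level
 * end-to-end latency / record-staleness sample looks like if it is taken
 * right before the task processes the record.
 */
public class E2eLatencySketch {

    /**
     * @param recordTimestampMs the event timestamp carried by the record
     * @param wallClockMs       the current system wall-clock time
     * @return the latency accumulated so far: time spent in the source topic,
     *         upstream sub-topologies and intermediate topics, but NOT the
     *         processing time of the task that is about to run
     */
    static Duration e2eLatencyBeforeProcessing(final long recordTimestampMs, final long wallClockMs) {
        return Duration.ofMillis(wallClockMs - recordTimestampMs);
    }

    public static void main(final String[] args) {
        final long recordTimestampMs = System.currentTimeMillis() - 1_500L; // produced 1.5s ago
        System.out.println("e2e latency observed at task A = "
            + e2eLatencyBeforeProcessing(recordTimestampMs, System.currentTimeMillis()).toMillis() + " ms");
    }
}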
Maybe my feeling comes from the name "latency" itself: today we already have several "latency" metrics which measure the elapsed system time for processing a record, whereas here we are comparing the system wall-clock time with the record timestamp. Maybe we can consider renaming it to "record-staleness" (note we already have a "record-lateness" metric), in which case recording at the system time before we start processing the record sounds more natural.

2) With that in mind, I'm wondering whether the processor-node-level DEBUG metric is worth adding, given that we already have a task-level processing latency metric. Basically, a specific node's e2e latency is roughly the task-level e2e latency plus the task-level processing latency. Personally I think having a task-level record-staleness metric is sufficient.
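To illustrate why I think the node-level metric may be redundant, here is a rough user-side sketch that combines the two task-level metrics -- again purely illustrative, and the metric names are placeholders, not the KIP's final names:

import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;

import java.util.Map;

/**
 * Illustrative sketch only: approximating a node-level e2e value from a
 * task-level e2e/staleness metric plus the task-level processing latency.
 * The metric names below are placeholders, not the KIP's final names.
 */
public class NodeE2eApproximation {

    // Look up a metric by name from the client's metrics registry.
    static double metricValue(final KafkaStreams streams, final String name) {
        for (final Map.Entry<MetricName, ? extends Metric> entry : streams.metrics().entrySet()) {
            if (entry.getKey().name().equals(name)) {
                return ((Number) entry.getValue().metricValue()).doubleValue();
            }
        }
        throw new IllegalArgumentException("metric not found: " + name);
    }

    static double approximateNodeE2eMs(final KafkaStreams streams) {
        final double taskE2eMs = metricValue(streams, "record-staleness-avg");   // hypothetical name
        final double taskProcessMs = metricValue(streams, "process-latency-avg"); // placeholder name
        // A node near the end of the task sees roughly the task's e2e latency plus
        // the time the task spent processing the record before it reached that node.
        return taskE2eMs + taskProcessMs;
    }
}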
Guozhang

On Wed, May 13, 2020 at 11:46 AM Sophie Blee-Goldman <sop...@confluent.io> wrote:

> 1. I felt that 50% was not a particularly useful gauge for this specific
> metric, as it's presumably most useful at putting an *upper* bound on the
> latency you can reasonably expect to see. I chose percentiles that would
> hopefully give a good sense of what *most* records will experience, and
> what *close to all* records will.
>
> However I'm not married to these specific numbers and could be convinced.
> I would be especially interested in hearing from users on this.
>
> 2. I'm inclined not to include the "hop-to-hop latency" in this KIP, since
> users can always compute it themselves by subtracting the previous node's
> end-to-end latency. I guess we could do it either way since you can always
> compute one from the other, but I think the end-to-end latency feels more
> valuable, as its main motivation is not to debug bottlenecks in the
> topology but to give users a sense of how long it takes a record to be
> reflected in certain parts of the topology. For example, this might be
> useful for users who are wondering roughly when a record that was just
> produced will be included in their IQ results. Debugging is just a nice
> side effect -- but maybe I didn't make that clear enough in the KIP's
> motivation.
>
> 3. Good question, I should address this in the KIP. The short answer is
> "yes", we will include late records. I added a paragraph to the end of the
> Proposed Changes section explaining the reasoning here, please let me know
> if you have any concerns.
>
> 4. Assuming you're referring to the existing metric "process-latency",
> that metric reflects the time for the literal Node#process method to run,
> whereas this metric would always be measured relative to the event
> timestamp.
>
> That said, the naming collision there is pretty confusing, so I've renamed
> the metrics in this KIP to "end-to-end-latency", which I feel better
> reflects the nature of the metric anyway.
>
> Thanks for the feedback!
>
> On Wed, May 13, 2020 at 10:21 AM Boyang Chen <reluctanthero...@gmail.com> wrote:
>
> > Thanks for the KIP Sophie. Getting the E2E latency is important for
> > understanding the bottlenecks of the application.
> >
> > A couple of questions and ideas:
> >
> > 1. Could you clarify the rationale for picking the 75, 99 and max
> > percentiles? Normally I also see the 50 and 90 percentiles used in
> > production systems.
> >
> > 2. The current latency being computed is cumulative, i.e. if a record
> > goes through A -> B -> C, then P(C) = T(B->C) + P(B) = T(B->C) + T(A->B)
> > + T(A) and so on, where P() represents the captured latency and T()
> > represents the time for transiting the record between two nodes,
> > including processing time. For monitoring purposes, maybe T(B->C) and
> > T(A->B) are more natural to view as "hop-to-hop latency"; otherwise, if
> > there is a spike in T(A->B), both P(B) and P(C) are affected at the same
> > time. In the same spirit, the E2E latency is meaningful only when the
> > record exits from the sink, as this marks the whole time the record spent
> > inside the funnel. Do you think we could have separate treatment for sink
> > nodes and other nodes, so that other nodes only count the time since
> > receiving the record from the last hop? I'm not proposing a solution
> > here, just want to discuss this alternative to see if it is reasonable.
> >
> > 3. As we are going to monitor late arrival records as well, they would
> > create some really spiky graphs when out-of-order records are interleaved
> > with on-time records. Should we also supply a smoothed version of the
> > latency metrics, or should users just take care of that themselves?
> >
> > 4. Regarding this new metric, we haven't discussed its relation to our
> > existing processing latency metrics. Could you add some context on the
> > comparison and a simple `when to use which` guide?
> >
> > Boyang
> >
> > On Tue, May 12, 2020 at 7:28 PM Sophie Blee-Goldman <sop...@confluent.io> wrote:
> >
> > > Hey all,
> > >
> > > I'd like to kick off discussion on KIP-613, which aims to add
> > > end-to-end latency metrics to Streams. Please take a look:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-613%3A+Add+end-to-end+latency+metrics+to+Streams
> > >
> > > Cheers,
> > > Sophie

-- 
-- Guozhang