Matthias, Thanks again for the detailed explanation.
My customer doesn't want this information, but I do. They want simple aggregations (Tell me how many events happened on this stream between time 0-30, 30-60, 60-90, etc). Sometimes the aggregations don't fire at the time I would expect. This is where I want to interrogate the stream prior to the aggregation so I can see the timestamps of the partitions feeding the aggregation, and maybe that will help me pinpoint that I have a topic/partition that isn't being used(maybe due to a bad partitioning scheme on my part?). Does that make sense? Dan On Mon, Apr 2, 2018 at 8:27 PM Matthias J. Sax <matth...@confluent.io> wrote: > Atm, Kafka does not ship any ready-to-use transformers. All operators > that are ready-to-use are provided via the DSL only and their > implementation itself is not public API. > > The question arises though, if a transformer like this would be the best > way to expose this information or if there might be a better way if we > include something like this in the library directly. > > I guess it also depends on the use case in detail. Can you elaborate why > your customer wants this information? If it's something that is > generally useful/applicable, we are of course interested. > > We are always open for suggestions and happy if people contribute. If > you are interested, please subscribe to the dev list and we can discuss > there. Or just create a JIRA with a feature request as starting point. > Whatever suits you better. > > > -Matthias > > > On 4/2/18 3:00 PM, dan bress wrote: > > Hi Matthias, > > > > Yes, I'm aware of Wall-clock time punctuation, but that is not the > behavior > > my customers want, so I can't use that unfortunately... > > > > OK. I'm going to try and write a utility "StreamTimeInspector" > transformer > > that will do as you suggest and give me the stats the I want. If we find > > this useful at my company I'll consider submitting it as a PR. > > > > Is there any current precedent around transformers like this in > > KafkaStreams? Transformers you insert for interrogation only? > > > > Dan > > > > On Mon, Apr 2, 2018 at 2:10 PM Matthias J. Sax <matth...@confluent.io> > > wrote: > > > >> Dan, > >> > >> Kafka Streams supports two types of punctuation: event-time and > >> wall-clock time punctuation. Wall-clock time punctuation will be called > >> even if event-time does not progress and even if there is no new input > >> data available for processing. Not sure if this is what you are looking > >> for. > >> > >> Beside this, Processor API exposes record metadata information. Thus, > >> you can plugin a custom `transform()` step to access and maintain the > >> information/stats you are interested in (The provided `context` object > >> from `init()` is your friend for this.) and maybe make it queryable via > >> Interactive Queries? > >> > >> Hope this helps. > >> > >> > >> -Matthias > >> > >> On 4/2/18 11:26 AM, dan bress wrote: > >>> I have an app that is consuming multiple topics, and punctuating on > >> stream > >>> time. I know that punctuate is driven by the min time of all the > >>> partitions of all the topics driving the transformer that I am > >> punctuating > >>> on. When I deploy my app and punctuate is not called as I expect, what > >>> tools do I have to understand where time is per > >>> instance/thread/topic/partition? Does Kafka Streams expose stats for > >> this? > >>> > >>> I would like something like a stat of something like: > >>> > >>> Kafka Streams Instance / Kafka Stream Thread / Topic / Partition / > >>> EventTime = 1522693495291 > >>> > >>> Then I can look at which topic/partition is lagging to debug my > problem. > >>> > >>> Does something like this exist? Is this a reasonable feature request? > >>> > >>> Thanks! > >>> Dan > >>> > >> > >> > > > >