Re: [DISCUSS] KIP-606: Add Metadata Context to MetricsReporter

2020-05-22 Thread Thomas Becker
This looks useful, I think the only nit I would pick would be to name the MetricsReporter method contextChanged (past tense), which seems more conventional for methods like this. On Tue, 2020-05-05 at 16:58 -0700, Xavier Léauté wrote: [EXTERNAL EMAIL] Attention: This email was sent from outsid

Re: [DISCUSS] KIP-566: Add rebalance callbacks to ConsumerInterceptor

2020-05-05 Thread Thomas Becker
Bumping to get and get some attention on this KIP before initiating a vote. Using ConsumerInterceptor for its intended purpose quite difficult without this. On Mon, 2020-02-10 at 15:50 +, Thomas Becker wrote: [EXTERNAL EMAIL] Attention: This email was sent from outside TiVo. DO NOT CLICK

Re: [DISCUSS] KIP-566: Add rebalance callbacks to ConsumerInterceptor

2020-02-10 Thread Thomas Becker
Bumping this again for visibility. If no one has any comments, maybe I'll just start the VOTE thread? On Wed, 2020-01-29 at 22:24 +0000, Thomas Becker wrote: [EXTERNAL EMAIL] Attention: This email was sent from outside TiVo. DO NOT CLICK any links or attachments unless you expected

Re: [KAFKA-557] Add emit on change support for Kafka Streams

2020-02-04 Thread Thomas Becker
case because it's mostly metadata, though to be honest we haven't looked at headers much (mostly because, and to your point, support seems to be lacking). I feel like there would be other cases where this feature could be valuable, but I admit I can't come up with anything right thi

Re: [KAFKA-557] Add emit on change support for Kafka Streams

2020-02-03 Thread Thomas Becker
help think of a use case that isn’t also served by filtering or mapping beforehand? Thanks for helping to design this feature! -John On Fri, Jan 31, 2020, at 18:56, yuzhih...@gmail.com<mailto:yuzhih...@gmail.com> wrote: I think this is good idea. On Jan 31, 2020, at 4:49 P

Re: [KAFKA-557] Add emit on change support for Kafka Streams

2020-01-31 Thread Thomas Becker
How do folks feel about allowing the mechanism by which no-ops are detected to be pluggable? Meaning use something like a hash by default, but you could optionally provide an implementation of something to use instead, like a ChangeDetector. This could be useful for example to ignore changes to

Re: [DISCUSS] KIP-566: Add rebalance callbacks to ConsumerInterceptor

2020-01-29 Thread Thomas Becker
from outside TiVo. DO NOT CLICK any links or attachments unless you expected them. Hey Thomas, On Thu, 23 Jan 2020 at 21:17, Thomas Becker wrote: > Hi folks, > I'd like to open the discussion for KIP-566: Add rebalance callbacks to > ConsumerInte

[DISCUSS] KIP-566: Add rebalance callbacks to ConsumerInterceptor

2020-01-23 Thread Thomas Becker
Hi folks, I'd like to open the discussion for KIP-566: Add rebalance callbacks to ConsumerInterceptor. We've been looking to implement some custom metrics via ConsumerInterceptor, and not knowing when partition ownership changes is a significant impediment. I'd appreciate your thoughts. https:/

Create KIP Permission

2020-01-23 Thread Thomas Becker
I'd like permission to create a KIP please. My confluence account is twbecker. This email and any attachments may contain confidential and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any

Re: Proposal for EmbeddedKafkaCluster: Support JUnit 5 Extension in addition to JUnit 4 Rule

2019-12-09 Thread Thomas Becker
fkaCluster into the public API. Particularly, for testing, we need something more efficient and that can be more synchronous, but still presents the same API as a broker. This would be a ton of work to design an build, though, which I assume is why no one has done it. Thanks, -John On Fri

Re: Proposal for EmbeddedKafkaCluster: Support JUnit 5 Extension in addition to JUnit 4 Rule

2019-12-06 Thread Thomas Becker
Personally, I would love to see EmbeddedKafkaCluster moved to a public test artifact, similarly to kafka-streams-test-utils. Having to copy/paste internal classes into your project is...unfortunate. On Fri, 2019-12-06 at 10:42 -0600, John Roesler wrote: [EXTERNAL EMAIL] Attention: This email w

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-07 Thread Thomas Becker
+1 non-binding We've run into issues trying to decorate the AdminClient due it being an abstract class. Will be nice to have consistency with Producer/Consumer as well. On Tue, 2019-06-04 at 17:17 +0100, Andy Coates wrote: Hi folks As there's been no chatter on this KIP I'm assuming it's non-

Re: [VOTE] KIP-349 Priorities for Source Topics

2019-01-24 Thread Thomas Becker
Yes, I think this type of strategy interface would be valuable. On Wed, 2019-01-16 at 15:41 +, Jan Filipiak wrote: On 16.01.2019 14:05, Thomas Becker wrote: I'm going to bow out of this discussion since it's been made clear that the feature is not targeted at streams. But for

Re: [VOTE] KIP-349 Priorities for Source Topics

2019-01-16 Thread Thomas Becker
I'm going to bow out of this discussion since it's been made clear that the feature is not targeted at streams. But for the record, my desire is to have an alternative to the timestamp based message choosing strategy streams currently imposes, and I thought topic prioritization in the consumer c

Re: [VOTE] KIP-349 Priorities for Source Topics

2018-10-08 Thread Thomas Becker
priority topic partitions are at the HW before I can decide if I want to poll the lower priority ones. Right? On Fri, 2018-10-05 at 11:34 -0700, Colin McCabe wrote: On Fri, Oct 5, 2018, at 10:58, Thomas Becker wrote: Colin, Would you mind sharing your vision for how this looks with multiple c

Re: [VOTE] KIP-349 Priorities for Source Topics

2018-10-05 Thread Thomas Becker
Colin, Would you mind sharing your vision for how this looks with multiple consumers? I'm still getting my bearings with the new consumer but it's not immediately obvious to me how this would work. In particular, it doesn't seem particularly easy to know when you are at the high watermark of a t

Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-17 Thread Thomas Becker
ecord, period. Trying to force a temporal relationship between that and an event where the item was viewed is non-sensical. On Mon, 2018-09-17 at 18:18 +0000, Thomas Becker wrote: Hi Matthias, I'm familiar with how the timestamp synchronization currently works. I also submit that it doe

Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-17 Thread Thomas Becker
at---from my point of view---is semantically incorrect. Shameless plug: you might want to read https://www.confluent.io/blog/streams-tables-two-sides-same-coin -Matthias On 9/17/18 8:23 AM, Thomas Becker wrote: For my part, a major use-case for this feature is stream-table joins. Currently, Kafk

Re: [DISCUSS] KIP-349 Priorities for Source Topics

2018-09-17 Thread Thomas Becker
For my part, a major use-case for this feature is stream-table joins. Currently, KafkaStreams does the wrong thing in some cases because the only message choosing strategy available is timestamp-based. So table records that have been updated recently will not be read until the stream records rea

Re: [VOTE] KIP-349 Priorities for Source Topics

2018-08-20 Thread Thomas Becker
I agree with Jan. A strategy interface for choosing processing order is nice, and would hopefully be a step towards getting this in streams. -Tommy On Mon, 2018-08-20 at 12:52 +0200, Jan Filipiak wrote: On 20.08.2018 00:19, Matthias J. Sax wrote: @Nick: A KIP is only accepted if it got 3 bindi

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Thomas Becker
duce out-of-ordering. Guozhang On Tue, Aug 7, 2018 at 9:59 AM, Thomas Becker mailto:thomas.bec...@tivo.com>> wrote: Thanks Guozhang. So in the scenario you describe, where one topic has vastly lower throughput, you're saying that when the lower throughput topic is fully caug

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Thomas Becker
is topic itself should be discussed as a separate KIP, maybe for both Streams and Consumer clients, and hence I intentionally avoid overlapping with it and stays with a static messaging choosing mechanism in my KIP. Guozhang On Tue, Aug 7, 2018 at 4:55 AM, Thomas Becker mailto:thomas.bec...@tivo.c

Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Thomas Becker
+1 (non-binding) We've hit issues with the log cleaner in the past, and this would be a great improvement. On Tue, 2018-08-07 at 12:19 +0100, Stanislav Kozlovski wrote: Hey everybody, I'm starting a vote on KIP-346

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Thomas Becker
This looks like a big step in the right direction IMO. So am I correct in assuming this idle period would only come into play after startup when waiting for initial records to be fetched? In other words, once we have seen records from all topics and have established the stream time processing wi

Re: Builder Pattern for kafka-clients in 2.x ?

2018-07-05 Thread Thomas Becker
Personally, I like the idea of builders for the producer/consumer themselves, but I'm less enthusiastic about one for ProducerRecord. Mostly because I think the following is overly verbose/reads poorly: producer.send(ProducerRecord.builder() .topic("mytopic") .key("Key") .value("the-val"

Re: [DISCUSS] KIP-221: Repartition Topic Hints in Streams

2017-11-06 Thread Thomas Becker
ith either `-repartition` or `-changelog`. Thus, from my point of view, it would make sense to keep the current distinction. -Matthias On 11/6/17 4:45 PM, Thomas Becker wrote: I think this sounds good as well. It's worth clarifying whether topics that are named by the user but created

Re: [DISCUSS] KIP-221: Repartition Topic Hints in Streams

2017-11-06 Thread Thomas Becker
I think this sounds good as well. It's worth clarifying whether topics that are named by the user but created by streams are considered "internal" topics also. On Sun, 2017-11-05 at 23:02 +0100, Matthias J. Sax wrote: My idea was, to relax the requirement for through() that a topic must be creat

Re: [DISCUSS] KIP-211: Revise Expiration Semantics of Consumer Group Offsets

2017-10-19 Thread Thomas Becker
I think it would be helpful to clarify what happens if consumers rejoin an empty group. I would presume that the expiration timer is stopped and reset back to offsets.retention.minutes when it is empty again but the KIP doesn't say. On Wed, 2017-10-18 at 16:45 -0700, Vahid S Hashemian wrote: H

RE: GlobalKTable limitations

2017-05-25 Thread Thomas Becker
c and then constructing the GlobalKTable from the latter? The GlobalKTable has the limitations you mention since it was primarily designed for joins only. We should consider allowing a less restrictive interface if it makes sense. Eno > On 25 May 2017, at 14:48, Thomas Becker wrote: > > We nee

GlobalKTable limitations

2017-05-25 Thread Thomas Becker
We need to do a series of joins against a KTable that we can't co- partition with the stream, so we're looking at GlobalKTable. But the topic backing the table is not ideally keyed for the sort of lookups this particular processor needs to do. Unfortunately, GlobalKTable is very limited in that yo

Re: [VOTE] KIP-138: Change punctuate semantics

2017-05-10 Thread Thomas Becker
+1 On Wed, 2017-05-10 at 10:52 +0100, Michal Borowiecki wrote: > Hi all, > > This vote thread has gone quiet. > > In view of the looming cut-off for 0.11.0.0 I'd like to encourage > anyone > who cares about this to have a look and vote and/or comment on this > proposal. > > Thanks, > > Michał > >

GlobalKTable not checkpointing offsets but reusing store

2017-05-09 Thread Thomas Becker
I'm experimenting with a streams application that does a KStream- GlobalKTable join, and I'm seeing some unexpected behavior when re- running the application. First, it does not appear that the offsets in the topic backing the GlobalKTable are being checkpointed to a file as I expected. This result

Exiting a streams app at end of stream?

2017-05-03 Thread Thomas Becker
We have had a number of situations where we need to migrate data in a Kafka topic to a new topic that is keyed differently. Stream processing is a good fit for this use-case with one exception: there is no easy way to know when your "migration job" is finished. Has any thought been given to adding

Re: [VOTE] KIP-123: Allow per stream/table timestamp extractor

2017-04-24 Thread Thomas Becker
+1 (non-binding) On Tue, 2017-02-28 at 08:59 +, Jeyhun Karimov wrote: > Dear community, > > I'd like to start the vote for KIP-123: > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=6871 > 4788 > > > Cheers, > Jeyhun -- Tommy Becker Senior Software Engineer O +

Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-11 Thread Thomas Becker
- > > 15926036 > > > > > > > > > > > >>>> ], > > > > > > > > >>>> > > > > > > > > >>>>- Our approached involved using the event time &g

Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-04 Thread Thomas Becker
gt; suggest > > enum PunctuationType { > EVENT_TIME, > SYSTEM_TIME, > } > > or similar. Just to keep the door open -- it's easier to add new > stuff > if the name is more generic. > > > -Matthias > > > On 4/4/17 5:30 AM, Thomas Becker wrote: > > &g

Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-04 Thread Thomas Becker
although I don't see an actual use case where you might need > > anything > > else then those two). Hence I also proposed the option to allow > > users > > to, effectively, decide what "stream time" is for them given the > > presence or absence of messages, much

Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-03 Thread Thomas Becker
Although I fully agree we need a way to trigger periodic processing that is independent from whether and when messages arrive, I'm not sure I like the idea of changing the existing semantics across the board. What if we added an additional callback to Processor that can be scheduled similarly to pu

Old producer slow/no recovery on broker failure

2017-02-09 Thread Thomas Becker
We ran into an incident a while back where one of our broker machines abruptly went down (AWS is fun). While the leadership transitions and so forth seemed to work correctly with the remaining brokers, our producers hung shortly thereafter. I should point out that we are using the old Scala produce