Re: Kafka Streams Internal Topic Retention not applied

2018-04-07 Thread Matthias J. Sax
Björn, Couple of answers: >> So, a streams internal topic for aggregation will be of cleanup.policy = >> compact. Yes. (for non-windowed aggregation) However, in your case, you are using a windowed aggregation, and there the policy is "compact,delete". Because for window aggregation, the key s

Re: Kafka Streams Internal Topic Retention not applied

2018-04-07 Thread Björn Häuser
Hello Matthias, thank you very much for your patience. I am still trying to understand the complete picture here. So, a streams internal topic for aggregation will be of cleanup.policy = compact. Which means that while doing aggregation ignoring tombstones records will cause havoc, right? Th

Re: Kafka Streams Internal Topic Retention not applied

2018-04-06 Thread Matthias J. Sax
Björn, broker configs are default config but can be overwritten when a topic is created. And this happens when Kafka Streams creates internal topics. Thus, you need to change the setting Kafka Streams applies when creating topics. Also note: if cleanup.policy = compact, the setting of `log.retent

Re: Kafka Streams Internal Topic Retention not applied

2018-04-06 Thread Björn Häuser
Hello Guozhang thanks. So after reading much more docs I still do not have the complete picture. These are our relevant settings from kafka broker configuration: log.cleanup.policy=delete # set log.retention.bytes to 15 gb log.retention.bytes=16106127360 # set log.retention.hours to 30 days log

Re: Kafka Streams Internal Topic Retention not applied

2018-03-28 Thread Guozhang Wang
Hello, You can set the topic-level configs via the StreamsConfig#topicPrefix(String), please find the following web docs (search for KIP-173): https://kafka.apache.org/documentation/streams/upgrade-guide#streams_api_changes_100 Guozhang On Wed, Mar 28, 2018 at 3:23 AM, Björn Häuser wrote:

Kafka Streams Internal Topic Retention not applied

2018-03-28 Thread Björn Häuser
Hello Everyone, we are running a Kafka Streams Application with does time window aggregates (using kafka 1.0.0). Unfortunately one of the changelog topics is now growing quite a bit in size maxing out the brokers. I did not find any settings in the kafka stream properties to configure retentio

Re: Kafka streams internal topic data retention

2017-04-30 Thread Sachin Mittal
topicConfigMap.put("segment.ms", "720"); And using Stores.create().enableLogging(topicConfigMap) you pass this map for creating internal topic (atleast for change log topics this is the way). Sachin On Sun, Apr 30, 2017 at 5:28 PM, Shimi Kiviti wrote: > Hi > > W

Kafka streams internal topic data retention

2017-04-30 Thread Shimi Kiviti
Hi Where can I find what is the Kafka streams internal topic data retention time and how to change it Thanks, Shimi

Re: Kafka Streams internal topic naming

2016-11-18 Thread Srikanth
Thanks Matthias/Michael/Guozhang! Using app id may help to some extent. Will have to think & test this through. Good to know there will be more direct support for this in the future. May be it will play well with KIP-37. Srikanth On Fri, Nov 18, 2016 at 1:12 PM, Guozhang Wang wrote: > Srikanth

Re: Kafka Streams internal topic naming

2016-11-18 Thread Guozhang Wang
Srikanth, Are you checking to see if you can manually set the internal topic names to follow your own naming convention in your shared cluster? For that the current answer is no, as Streams are trying to abstract users from worrying about them since they are treated as "internals" anyways. But I t

Re: Kafka Streams internal topic naming

2016-11-18 Thread Michael Noll
Srikanth, as Matthias said, you can achieve some namespacing effects through the use of (your own in-house) conventions of defining `application.id` across teams. The id is used as the prefix for topics, see http://docs.confluent.io/current/streams/developer-guide.html#required-configuration-para

Re: Kafka Streams internal topic naming

2016-11-17 Thread Matthias J. Sax
The only way to influence the naming is via application.id which you can set as you wish. Hope this is good enough to meet your naming conventions. As Michael mentioned, there is no way to manually specify internal topic names right now. -Matthias On 11/17/16 8:45 AM, Srikanth wrote: > That is r

Re: Kafka Streams internal topic naming

2016-11-17 Thread Srikanth
That is right Michael. Most teams that use kafka library can adhere to certain naming convention. Using streams API will break that. Srikanth On Wed, Nov 16, 2016 at 2:32 PM, Michael Noll wrote: > Srikanth, > > no, there's isn't any API to control the naming of internal topics. > > Is the reaso

Re: Kafka Streams internal topic naming

2016-11-16 Thread Michael Noll
Srikanth, no, there's isn't any API to control the naming of internal topics. Is the reason you're asking for such functionality only/mostly about multi-tenancy issues (as you mentioned in your first message)? -Michael On Wed, Nov 16, 2016 at 8:20 PM, Srikanth wrote: > Hello, > > Does kafka

Kafka Streams internal topic naming

2016-11-16 Thread Srikanth
Hello, Does kafka stream provide an API to control how internal topics are named? Right now it uses appId, operator name, etc. In a shared kafka cluster its common to have naming convention that may require some prefix/suffix. Srikanth

Re: kafka streams internal topic

2016-05-20 Thread Srikanth
Thanks for the details! I do see a pattern where through() is useful both explicitly and implicitly by the DSL. I guess that fits well with kafka streams design of utilizing kafka's strength. Srikanth On Fri, May 20, 2016 at 4:38 AM, Matthias J. Sax wrote: > Hi Srikanth, > > I basically agree

Re: kafka streams internal topic

2016-05-20 Thread Matthias J. Sax
Hi Srikanth, I basically agree on (1). We are still working on configuration options for Kafka Streams. For (2), you would get an error. If the number of partitions is not the same, the join cannot be computed. There is already a ticket to insert a re-partitioning step automatically, in case data

Re: kafka streams internal topic

2016-05-19 Thread Srikanth
Thanks Guozhang for your reply. I have a few follow-ups based on your response. Writing it inline would have made it hard to read. So here is the extract 1) *Internal topics use default retention policy*. Will it be better to add another config for this? Or something like topic.log.retention.hour

Re: kafka streams internal topic

2016-05-19 Thread Guozhang Wang
Hello Srikanth, Thanks for your questions, please see replies inlined. On Tue, May 17, 2016 at 7:36 PM, Srikanth wrote: > Hi, > > I was reading about Kafka streams and trying to understand its programming > model. > Some observations that I wanted to get some clarity on.. > > 1) Joins & aggreg

kafka streams internal topic

2016-05-17 Thread Srikanth
Hi, I was reading about Kafka streams and trying to understand its programming model. Some observations that I wanted to get some clarity on.. 1) Joins & aggregations use an internal topic for shuffle. Source processors will write to this topic with the key used for join. Then it is free to commi