Björn,
Couple of answers:
>> So, a streams internal topic for aggregation will be of cleanup.policy =
>> compact.
Yes. (for non-windowed aggregation)
However, in your case, you are using a windowed aggregation, and there
the policy is "compact,delete". Because for window aggregation, the keys…
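For illustration, a minimal sketch of such a windowed aggregation (recent 3.x API; the input topic name is hypothetical), whose changelog topic Streams creates with "compact,delete":

    import java.time.Duration;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.TimeWindows;

    StreamsBuilder builder = new StreamsBuilder();
    builder.<String, String>stream("events")   // hypothetical input topic
        .groupByKey()
        .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
        .count();
    // the changelog topic backing this windowed store is created with
    // cleanup.policy=compact,delete: old windows age out via retention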
Hello Matthias,
thank you very much for your patience.
I am still trying to understand the complete picture here.
So, a streams internal topic for aggregation will be of cleanup.policy =
compact.
Which means that ignoring tombstone records while doing aggregation will
cause havoc, right?
…
Björn,
Broker configs are defaults, but they can be overwritten when a topic is
created, and that happens when Kafka Streams creates internal topics.
Thus, you need to change the settings Kafka Streams applies when creating
topics.
Also note: if cleanup.policy = compact, the setting of `log.retention…`
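For an internal topic that already exists, one possible way to adjust its retention after the fact is the Java AdminClient. A minimal sketch, assuming clients/brokers 2.3+ for incrementalAlterConfigs; the topic name and size are hypothetical:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.AlterConfigOp;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    Properties props = new Properties();
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
    try (AdminClient admin = AdminClient.create(props)) {
        // hypothetical changelog topic name
        ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC,
                "my-app-some-store-changelog");
        AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry("retention.bytes", "1073741824"),  // 1 GB, illustrative
                AlterConfigOp.OpType.SET);
        // blocks until the brokers apply the change (throws on failure)
        admin.incrementalAlterConfigs(
                Collections.singletonMap(topic, Collections.singletonList(op)))
             .all().get();
    }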
Hello Guozhang
thanks.
So after reading many more docs, I still do not have the complete picture.
These are our relevant settings from kafka broker configuration:
log.cleanup.policy=delete
# set log.retention.bytes to 15 GB
log.retention.bytes=16106127360
# set log.retention.hours to 30 days
log.retention.hours=720
…
Hello,
You can set the topic-level configs via
StreamsConfig#topicPrefix(String); please see the following web docs
(search for KIP-173):
https://kafka.apache.org/documentation/streams/upgrade-guide#streams_api_changes_100
Guozhang
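Concretely, a minimal sketch of the KIP-173 approach (values illustrative): any config set with the StreamsConfig#topicPrefix prefix is applied when Streams creates its internal topics, so it takes effect for topics not yet created:

    import java.util.Properties;
    import org.apache.kafka.common.config.TopicConfig;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");        // hypothetical
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
    // topicPrefix("retention.bytes") expands to "topic.retention.bytes"
    props.put(StreamsConfig.topicPrefix(TopicConfig.RETENTION_BYTES_CONFIG), "1073741824");
    props.put(StreamsConfig.topicPrefix(TopicConfig.SEGMENT_MS_CONFIG), "600000");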
On Wed, Mar 28, 2018 at 3:23 AM, Björn Häuser wrote:
Hello Everyone,
we are running a Kafka Streams application which does time-window aggregates
(using Kafka 1.0.0).
Unfortunately, one of the changelog topics is now growing quite a bit in size,
maxing out the brokers. I did not find any settings in the Kafka Streams
properties to configure retention…
topicConfigMap.put("segment.ms", "720");
And using Stores.create().enableLogging(topicConfigMap) you pass this map
when creating internal topics (at least for changelog topics this is the way).
Sachin
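For newer releases (1.0+), a rough equivalent using the StoreBuilder API; the store name and values below are illustrative:

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.state.KeyValueStore;
    import org.apache.kafka.streams.state.StoreBuilder;
    import org.apache.kafka.streams.state.Stores;

    Map<String, String> changelogConfig = new HashMap<>();
    changelogConfig.put("segment.ms", "600000");          // milliseconds
    changelogConfig.put("retention.bytes", "1073741824");

    StoreBuilder<KeyValueStore<String, Long>> storeBuilder =
        Stores.keyValueStoreBuilder(
                Stores.persistentKeyValueStore("my-store"),  // hypothetical store name
                Serdes.String(), Serdes.Long())
            .withLoggingEnabled(changelogConfig);  // applied to the changelog topic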
On Sun, Apr 30, 2017 at 5:28 PM, Shimi Kiviti wrote:
Hi
Where can I find the Kafka Streams internal topics' data retention
time, and how can I change it?
Thanks,
Shimi
Thanks Matthias/Michael/Guozhang!
Using app id may help to some extent. Will have to think & test this
through.
Good to know there will be more direct support for this in the future.
Maybe it will play well with KIP-37.
Srikanth
On Fri, Nov 18, 2016 at 1:12 PM, Guozhang Wang wrote:
Srikanth,
Are you checking to see whether you can manually set the internal topic names
to follow your own naming convention in your shared cluster? For that, the
current answer is no, as Streams tries to abstract users away from worrying
about them, since they are treated as "internals" anyway. But I…
Srikanth,
as Matthias said, you can achieve some namespacing effects through the use
of (your own in-house) conventions for defining `application.id` across
teams. The id is used as the prefix for topics, see
http://docs.confluent.io/current/streams/developer-guide.html#required-configuration-para
The only way to influence the naming is via application.id which you can
set as you wish. Hope this is good enough to meet your naming conventions.
As Michael mentioned, there is no way to manually specify internal topic
names right now.
-Matthias
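To make the convention concrete, a small sketch (ids and names illustrative). Internal topics follow "<application.id>-<store or operator name>-changelog" and "...-repartition", so a shared prefix in the id namespaces everything:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    Properties props = new Properties();
    // hypothetical in-house convention: <team>.<service>
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "team-billing.order-aggregator");
    // internal topics then look like:
    //   team-billing.order-aggregator-<store name>-changelog
    //   team-billing.order-aggregator-<operator name>-repartition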
On 11/17/16 8:45 AM, Srikanth wrote:
That is right, Michael. Most teams that use the Kafka library can adhere to
a certain naming convention.
Using the Streams API will break that.
Srikanth
On Wed, Nov 16, 2016 at 2:32 PM, Michael Noll wrote:
Srikanth,
no, there isn't any API to control the naming of internal topics.
Is the reason you're asking for such functionality only/mostly about
multi-tenancy issues (as you mentioned in your first message)?
-Michael
On Wed, Nov 16, 2016 at 8:20 PM, Srikanth wrote:
Hello,
Does Kafka Streams provide an API to control how internal topics are named?
Right now it uses appId, operator name, etc.
In a shared Kafka cluster it's common to have a naming convention that may
require some prefix/suffix.
Srikanth
Thanks for the details!
I do see a pattern where through() is useful both explicitly and implicitly
via the DSL. I guess that fits well with Kafka Streams' design of utilizing
Kafka's strengths.
Srikanth
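For the explicit case, a sketch (topic name and helper are hypothetical; in newer releases repartition() replaces through(), and the intermediate topic must be created up front with the desired partition count):

    // 'orders' is assumed to be a KStream<String, String> keyed by order id
    KStream<String, String> byCustomer = orders
        .selectKey((orderId, value) -> extractCustomerId(value))  // hypothetical helper; new key forces a shuffle
        .through("orders-by-customer");  // explicit, user-named intermediate topic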
On Fri, May 20, 2016 at 4:38 AM, Matthias J. Sax wrote:
Hi Srikanth,
I basically agree on (1). We are still working on configuration options
for Kafka Streams.
For (2), you would get an error. If the number of partitions is not the
same, the join cannot be computed. There is already a ticket to insert a
re-partitioning step automatically, in case data…
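For the (2) case, a minimal join sketch under the co-partitioning requirement (topic names illustrative; newer 3.x JoinWindows API); both input topics must have the same number of partitions, otherwise the join fails:

    import java.time.Duration;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.JoinWindows;
    import org.apache.kafka.streams.kstream.KStream;

    StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> left = builder.stream("left-topic");    // e.g. 8 partitions
    KStream<String, String> right = builder.stream("right-topic");  // must also have 8
    left.join(right,
        (l, r) -> l + "," + r,   // combine the two matching values
        JoinWindows.ofTimeDifferenceWithNoGrace(Duration.ofMinutes(1)));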
Thanks Guozhang for your reply.
I have a few follow-ups based on your response. Writing them inline would
have made it hard to read, so here is the extract.
1) *Internal topics use default retention policy*.
Would it be better to add another config for this? Or something like
topic.log.retention.hour…
Hello Srikanth,
Thanks for your questions, please see replies inlined.
On Tue, May 17, 2016 at 7:36 PM, Srikanth wrote:
> 1) Joins & aggreg
Hi,
I was reading about Kafka Streams and trying to understand its programming
model.
Some observations that I wanted to get some clarity on:
1) Joins & aggregations use an internal topic for shuffle. Source
processors will write to this topic with the key used for the join. Then it
is free to commit…