Thanks Bruno and Patrik,
my mistake was that, since auto-creation is disabled on Confluent Cloud,
I had manually created the topics without taking care to change the
cleanup policy.
Because of that, the store changelogs kept every change; in fact, after
enabling the compact policy the storage us…
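For anyone else creating internal topics by hand, a minimal Kotlin sketch of
asking Streams to apply the compact policy to the changelog it creates, by
passing topic configs through Materialized (the store name "my-store" and the
key/value types are hypothetical, not the actual pipeline):

    import org.apache.kafka.common.utils.Bytes
    import org.apache.kafka.streams.kstream.Materialized
    import org.apache.kafka.streams.state.KeyValueStore

    // Hedged sketch: request cleanup.policy=compact for the changelog topic
    // that Streams creates for this store ("my-store" is hypothetical).
    val compactedStore = Materialized
        .`as`<String, Long, KeyValueStore<Bytes, ByteArray>>("my-store")
        .withLoggingEnabled(mapOf("cleanup.policy" to "compact"))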
Hi,
Regarding the I/O: RocksDB has something called write amplification, where
data is rewritten across multiple levels internally to keep reads fast, at
the cost of extra storage and I/O.
This is also the reason the stores can get larger than the topics themselves.
This can be modified by RocksDB se…
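For illustration, a sketch of hooking into those settings via a
RocksDBConfigSetter; the specific values below are hypothetical starting
points, not recommendations (larger memtables generally mean fewer flushes,
trading memory for I/O):

    import org.apache.kafka.streams.StreamsConfig
    import org.apache.kafka.streams.state.RocksDBConfigSetter
    import org.rocksdb.Options
    import java.util.Properties

    // Hedged sketch: tune RocksDB write buffers per store.
    class TunedRocksDBConfig : RocksDBConfigSetter {
        override fun setConfig(storeName: String, options: Options, configs: MutableMap<String, Any>) {
            options.setWriteBufferSize(32 * 1024 * 1024L) // larger memtables -> fewer flushes
            options.setMaxWriteBufferNumber(4)
        }
    }

    val props = Properties()
    props[StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG] = TunedRocksDBConfig::class.java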
Hi Alessandro,
> - how do I specify the retention period of the data? Just by setting the
> max retention time for the changelog topic?
For window and session stores, you can set retention time on a local
state store by using Materialized.withRetention(...). Consult the
javadocs for details. If…
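For example, a minimal sketch of setting a one-day retention on a windowed
store (the store name and the duration are hypothetical):

    import java.time.Duration
    import org.apache.kafka.common.utils.Bytes
    import org.apache.kafka.streams.kstream.Materialized
    import org.apache.kafka.streams.state.WindowStore

    // Hedged sketch: keep windowed state for one day locally. Note the
    // retention must be at least window size + grace period.
    val windowedStore = Materialized
        .`as`<String, Double, WindowStore<Bytes, ByteArray>>("windowed-store")
        .withRetention(Duration.ofDays(1))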
I've just noticed that the store topics created automatically by our streams
app have different cleanup.policy settings.
I think that's the main reason I'm seeing such heavy read/write I/O; a
compact policy instead of delete would make the topics much smaller.
I'll try that to also see the impact on our s…
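One way to double-check each changelog's policy is via the AdminClient; a
sketch, where the broker address and the exact topic name are hypothetical:

    import org.apache.kafka.clients.admin.AdminClient
    import org.apache.kafka.common.config.ConfigResource

    // Hedged sketch: read back cleanup.policy for one changelog topic.
    AdminClient.create(mapOf<String, Any>("bootstrap.servers" to "broker:9092")).use { admin ->
        val resource = ConfigResource(
            ConfigResource.Type.TOPIC,
            "sensors-pipeline-KSTREAM-AGGREGATE-STATE-STORE-0000000001-changelog" // hypothetical
        )
        val config = admin.describeConfigs(listOf(resource)).all().get().getValue(resource)
        println(config.get("cleanup.policy").value())
    }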
Hi Bruno,
Oh I see, I'll try to add a persistent disk to hold the local stores.
I have some other questions then:
- why is it also writing that much?
- how do I specify the retention period of the data? Just by setting the
max retention time for the changelog topic?
- wouldn't it be possible, for examp…
Hi Alessandro,
I am not sure I understand your issue completely. If you start your
streams app in a new container without any existing local state, then
it is expected that the changelog topics are read from the beginning
to restore the local state stores. Am I misunderstanding you?
Best,
Bruno
I think I'm having this issue again, though this time it only happens on
some state stores.
Here you can find the code and the logs
https://gist.github.com/alex88/1d60f63f0eee9f1568d89d5e1900fffc
We first saw that our Confluent Cloud bill went up 10x, then saw that our
streams processor was rest…
I'm not sure; one thing I know for sure is that in the cloud control panel,
on the consumer lag page, the offset didn't reset on the input topic, so it
was probably something after that.
Anyway, thanks a lot for the help; if we experience that again I'll try to
add more verbose logging to better un…
Honestly, I cannot think of an issue that was fixed in 2.2.1 but present in
2.2.0 that could be correlated with your observations:
https://issues.apache.org/jira/issues/?filter=-1&jql=project%20%3D%20KAFKA%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.2.1%20AND%20component%20%3D%20streams%20
Yes, that's right.
Could that be the problem? Anyway, since upgrading from 2.2.0 to 2.2.1 we
haven't experienced the problem anymore.
Regards
--
Alessandro Tagliapietra
On Thu, Jun 6, 2019 at 10:50 AM Guozhang Wang wrote:
> That's right, but local state is used as a "materialized view"
So if you can live with the extra delay at the beginning of an app restart,
a stateful set is not required.
From: Guozhang Wang
Sent: Thursday, June 6, 2019 10:13 AM
To: users@kafka.apache.org
Subject: Re: Streams reprocessing whole topic when deployed but not
That's right, but local state is used as a "materialized view" of your
changelog topics: if you have nothing locally, then it has to bootstrap
from the beginning of your changelog topic.
But I think your question was about the source "sensors-input" topic, not
the changelog topic. I looked at the…
Isn't the windowing state stored in the additional state store topics that
I had to create manually?
Like these I have here:
sensors-pipeline-KSTREAM-AGGREGATE-STATE-STORE-01-changelog
sensors-pipeline-KTABLE-SUPPRESS-STATE-STORE-04-changelog
Thank you
--
Alessandro Tagliapi…
If you deploy your streams app into a Docker container, you need to make
sure the local state directories are preserved, since otherwise whenever you
restart, all the state would be lost and Streams would have to bootstrap from
scratch. E.g., if you are using K8s for cluster management, you'd better use…
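Concretely, the local state location is controlled by the state.dir config,
so pointing it at a persistent volume mount is one way to survive container
restarts (the path below is a hypothetical mount point):

    import java.util.Properties
    import org.apache.kafka.streams.StreamsConfig

    val props = Properties()
    // Hedged sketch: keep RocksDB state on a mount that survives restarts
    // ("/data/kafka-streams" is a hypothetical persistent volume path).
    props[StreamsConfig.STATE_DIR_CONFIG] = "/data/kafka-streams"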
Hi Guozhang,
sorry, by "app" i mean the stream processor app, the one shown in
pipeline.kt.
The app reads a topic of data sent by a sensor each second and generates a
20 second window output to another topic.
My "problem" is that when running locally with my local kafka setup, let's
say I stop it
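For context, roughly the shape of such a pipeline; this is a hedged sketch,
not the actual pipeline.kt: the value type, the aggregation, and the topic
names other than "sensors-input" are assumptions, while the suppress step
matches the SUPPRESS changelog topic mentioned earlier in the thread:

    import java.time.Duration
    import org.apache.kafka.streams.KeyValue
    import org.apache.kafka.streams.StreamsBuilder
    import org.apache.kafka.streams.kstream.Suppressed
    import org.apache.kafka.streams.kstream.TimeWindows

    // Hedged sketch: 20-second windows, final results only via suppress.
    val builder = StreamsBuilder()
    builder.stream<String, Double>("sensors-input")
        .groupByKey()
        .windowedBy(TimeWindows.of(Duration.ofSeconds(20)).grace(Duration.ZERO))
        .count()
        .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
        .toStream()
        .map { windowedKey, value -> KeyValue(windowedKey.key(), value) }
        .to("sensors-output") // hypothetical output topic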
Hello Alessandro,
What did you do to `restart the app online`? I'm not sure I follow the
difference between "restart the streams app" and "restart the app online"
in your description.
Guozhang
On Wed, Jun 5, 2019 at 10:42 AM Alessandro Tagliapietra <
tagliapietra.alessan...@gmail.com> wrote:
Hello everyone,
I have a small streams app; the configuration and part of the code I'm using
can be found here:
https://gist.github.com/alex88/6b7b31c2b008817a24f63246557099bc
There's also the log from when the app is started locally and when it is
started on our servers connecting to the Confluent c…