Thanks Matthias. My doubt is on a more fundamental level, but I'll ask on a
separate thread as per Eno's recommendation.
On Tue, May 16, 2017 at 3:26 PM Matthias J. Sax
wrote:
Just to add to this:
Streams creates re-partitioning topics lazily, meaning that when the key
(potentially) changes, we only set a flag but do not add the
re-partitioning topic yet. On groupByKey() or join() we check whether this
"repartitioning required" flag is set and add the topic.
Thus, if you have a
st
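The lazy flag-then-materialize pattern described above can be sketched in plain Java. This is a toy illustration, not the actual Kafka Streams internals; all class and method names here are hypothetical:

```java
// Toy sketch of the "repartitioning required" flag pattern: a key-changing
// operation only sets a flag, and the repartition topic is added lazily
// when a key-dependent operation (groupByKey/join) actually needs it.
// NOT real Kafka Streams code; names are illustrative.
import java.util.ArrayList;
import java.util.List;

public class LazyRepartitionSketch {
    static class ToyStream {
        boolean repartitionRequired = false;          // set on (potential) key change
        final List<String> topology = new ArrayList<>();

        ToyStream selectKey() {                       // key may change -> only flag it
            repartitionRequired = true;
            topology.add("selectKey");
            return this;
        }

        ToyStream groupByKey() {                      // key-dependent op -> check flag
            if (repartitionRequired) {
                topology.add("repartition-topic");    // added lazily, only here
                repartitionRequired = false;
            }
            topology.add("groupByKey");
            return this;
        }
    }

    public static void main(String[] args) {
        ToyStream s = new ToyStream().selectKey().groupByKey();
        System.out.println(s.topology); // [selectKey, repartition-topic, groupByKey]
    }
}
```

Note that a plain groupByKey() with no prior key change would add no repartition topic at all, which is the point of the lazy check.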
(it's preferred to create another email thread for a different topic to make it
easier to look back)
Yes, there could be room for optimizations, e.g., see this:
http://mail-archives.apache.org/mod_mbox/kafka-users/201705.mbox/%3cCAJikTEUHR=r0ika6vlf_y+qajxg8f_q19og_-s+q-gozpqb...@mail.gmail.com%
Follow-up doubt (let me know if a new question should be created).
Do we always need a repartitioning topic?
If I'm reading things correctly, when we change the key of a record we need
to make sure the new key falls on the same partition that we are processing.
This makes a lot of sense if after such
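The partitioning concern above can be made concrete with a minimal, self-contained sketch. Plain `hashCode` stands in for Kafka's actual default partitioner, so this only illustrates the principle, not Kafka's exact hashing:

```java
// Why a key change (usually) forces repartitioning: records are assigned to
// partitions by hashing the key, so a new key generally maps to a different
// partition than the one the current task is processing. hashCode() here is
// a stand-in for Kafka's real partitioner (which uses murmur2).
public class KeyChangePartitionDemo {
    static int partitionFor(String key, int numPartitions) {
        return Math.abs(key.hashCode()) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        int before = partitionFor("user-42", numPartitions);
        int after  = partitionFor("region-eu", numPartitions); // key changed
        System.out.println("before=" + before + " after=" + after);
        // If before != after, the record must travel through a repartition
        // topic so a single downstream partition sees ALL records for the
        // new key; otherwise the aggregation would be split and wrong.
    }
}
```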
Very useful links, thank you.
Part of my original misunderstanding was that the at-least-once guarantee
was considered fulfilled if the record reached a sink node.
Thanks for all the feedback, you may consider my question answered.
Feel free to ask further questions about the use case if found in
Yes.
It is basically "documented", as Streams guarantees at-least-once
semantics. Thus, we make sure you will not lose any data in case of
failure (i.e., the overall guarantee is documented).
To achieve this, we always flush before we commit offsets. (This is not
explicitly documented as it's an
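The flush-before-commit ordering can be sketched as follows. This is a conceptual illustration of why the ordering yields at-least-once delivery, not Kafka Streams' internal commit code; the method names are illustrative:

```java
// Sketch of flush-before-commit: state/output is made durable BEFORE input
// offsets are committed. A crash between the two steps means records are
// re-processed on restart (duplicates possible) rather than lost.
// Illustrative only; not Kafka Streams internals.
import java.util.ArrayList;
import java.util.List;

public class AtLeastOnceSketch {
    final List<String> log = new ArrayList<>();

    void flushStores()   { log.add("flush"); }   // persist pending writes
    void commitOffsets() { log.add("commit"); }  // record consumption progress

    void commitCycle() {
        flushStores();   // step 1: make all output/state durable
        commitOffsets(); // step 2: only then mark the input as consumed
        // A crash after step 1 but before step 2 replays the input on
        // restart: at-least-once, never silent data loss.
    }

    public static void main(String[] args) {
        AtLeastOnceSketch s = new AtLeastOnceSketch();
        s.commitCycle();
        System.out.println(s.log); // [flush, commit]
    }
}
```

Reversing the two steps would give at-most-once instead: a crash after committing but before flushing would drop the unflushed records.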
I think I now understand what Matthias meant when he said "If you use a
global remote store, you would not need to back up your changes in a
changelog topic, as the store would not be lost in case of failure".
I had the misconception that if a state store threw an exception during
"flush", all messag
Replies in line as well
On Sat, May 13, 2017 at 3:25 AM Eno Thereska wrote:
Hi João,
Some answers inline:
Thanks for the comments, here are some clarifications:
I did look at interactive queries; if I understood them correctly, they mean
that my state store must hold all the results in order for it to be
queried, either in memory or on disk (RocksDB).
1. The retention policy on my aggregate operati
Hi,
I am not sure about your overall setup. Are your stores local (similar
to RocksDB) or are you using one global remote store? If you use a
global remote store, you would not need to back up your changes in a
changelog topic, as the store would not be lost in case of failure.
Also (in case that yo
Hi there,
A couple of general comments, plus some answers:
- general comment: have you thought of using Interactive Queries to directly
query the aggregated data, without needing to store it in an external database
(see this blog:
https://www.confluent.io/blog/unifying-stream-processing-and-i
On a stream definition I perform an "aggregate" which is configured with a
state store.
*Goal*: Persist the aggregation results into a database, e.g. MySQL or
MongoDB.
*Working approach*:
I have created a custom StateStore backed by a changelog topic, like the
built-in state stores. Whenever the sto
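The buffer-then-flush shape of such a store can be sketched with plain Java. The "database" below is an in-memory map standing in for MySQL/MongoDB, and the class and method names are hypothetical, not the actual Streams StateStore SPI:

```java
// Hedged sketch of a store that buffers aggregation results and writes them
// to an external database on flush(). A HashMap stands in for MySQL/MongoDB;
// names are illustrative, not the Kafka Streams StateStore interface.
import java.util.HashMap;
import java.util.Map;

public class DbBackedStoreSketch {
    final Map<String, Long> pending  = new HashMap<>(); // dirty, unflushed entries
    final Map<String, Long> database = new HashMap<>(); // stand-in for the external DB

    void put(String key, long aggregate) {
        pending.put(key, aggregate);   // cheap in-memory update on every record
    }

    void flush() {
        database.putAll(pending);      // batch write at commit time
        pending.clear();
        // In the approach described above, a changelog topic (or a global
        // remote store) is what makes these buffered writes recoverable if
        // the instance fails before flush() completes.
    }

    public static void main(String[] args) {
        DbBackedStoreSketch store = new DbBackedStoreSketch();
        store.put("key-a", 3L);
        store.put("key-a", 5L);        // later aggregate overwrites the earlier one
        store.flush();
        System.out.println(store.database.get("key-a")); // 5
    }
}
```

Batching per flush (rather than writing on every put) keeps the external database off the per-record hot path, which matters for high-throughput aggregations.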