Jon,
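You don't need all the data for a topic in one place, because the data is
partitioned by key: every record with a given key lands in the same
partition, and so reaches the same state-store instance. Each state-store
instance is therefore de-duplicating its own subset of the key set.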
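
A minimal sketch of the idea, in plain Python rather than the Kafka Streams
API. It assumes duplicates share the same record key and that the topic is
partitioned by that key. NUM_PARTITIONS, partition_for and DedupTask are
hypothetical names for illustration, and the CRC32 hash only stands in for
the producer's real partitioner.

```python
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count, for illustration only


def partition_for(key: str) -> int:
    """Map a key to a partition with a deterministic hash.

    Stand-in for the producer's partitioner; any deterministic hash of the
    key shows the same property: equal keys always share a partition.
    """
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


class DedupTask:
    """One task with its own local 'state store' (a set of seen keys)."""

    def __init__(self):
        self.seen = set()

    def process(self, key, value):
        # Drop the record if this task has already seen the key.
        if key in self.seen:
            return None
        self.seen.add(key)
        return (key, value)


def run(records):
    """Route each record to the task owning its partition, then dedup."""
    tasks = [DedupTask() for _ in range(NUM_PARTITIONS)]
    output = []
    for key, value in records:
        result = tasks[partition_for(key)].process(key, value)
        if result is not None:
            output.append(result)
    return output, tasks


records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5), ("a", 6)]
unique, tasks = run(records)
print(unique)  # [('a', 1), ('b', 2), ('c', 4)]
```

No task ever needs another task's store: a repeated key can only show up at
the task that already recorded it. If duplicates could arrive under
different keys, the stream would have to be re-keyed first for this to hold.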
Thanks,
Damian

On Mon, 27 Mar 2017 at 13:47 Jon Yeargers <jon.yearg...@cedexis.com> wrote:

> I've been (re)reading this document (
> http://docs.confluent.io/3.2.0/streams/developer-guide.html#state-stores)
> hoping to better understand StateStores. At the top of the section there is
> a tantalizing note implying that one could do deduplication using a store.
>
> At present we are using Redis for this as it gives us a shared location. I've
> been of the mind that a given store was local to a streams instance. To
> truly support deduplication I would think one would need access to _all_
> the data for a topic and not just on a per-partition basis.
>
> Am I completely misunderstanding this?
>
