Thanks a lot. it solved my issue.

On Wed, Apr 27, 2022 at 3:19 PM Jan Lukavský <je...@seznam.cz> wrote:

> Hi Sigalit,
>
> if I understand correctly, what you want is a deduplication of outputs
> coming from your GBK logic, is that correct? If do, you can use a
> stateful DoFn and a ValueState holding the last value seen per key in
> global window. There is an implementation of this approach in
> Deduplicate.KeyedValues [1].
>
>   Jan
>
> [1]
>
> https://beam.apache.org/releases/javadoc/2.38.0/org/apache/beam/sdk/transforms/Deduplicate.KeyedValues.html
>
> On 4/27/22 13:36, Sigalit Eliazov wrote:
> > Hi all
> > i have the following scenario:
> > a. a pipeline that reads messages from kafka and a session window with
> > 1 minute duration.
> > b.  groupbykey in order to aggregate the data
> > c. for each 'group' i do some calculation and build a new event to
> > send to kafka.
> >
> > the output of this cycle is
> > key1 - value1
> > key2 - value2
> >
> > If a new message arrives with the same key i would like to have a
> > logic that checks
> > if the current message is : key1-value1 don't send (because it was
> > already sent).
> > Currently we implemented this using DB (postgres).
> > the performance of this implementation is not very good.
> > Is there any way to implement this without any external state?
> >
> > thanks a lot
> > Sigalit
>

Reply via email to