Thanks a lot. It solved my issue.

On Wed, Apr 27, 2022 at 3:19 PM Jan Lukavský <je...@seznam.cz> wrote:
> Hi Sigalit,
>
> if I understand correctly, what you want is a deduplication of outputs
> coming from your GBK logic, is that correct? If so, you can use a
> stateful DoFn and a ValueState holding the last value seen per key in
> the global window. There is an implementation of this approach in
> Deduplicate.KeyedValues [1].
>
> Jan
>
> [1]
> https://beam.apache.org/releases/javadoc/2.38.0/org/apache/beam/sdk/transforms/Deduplicate.KeyedValues.html
>
> On 4/27/22 13:36, Sigalit Eliazov wrote:
> > Hi all,
> > I have the following scenario:
> > a. a pipeline that reads messages from Kafka and applies a session
> >    window with 1 minute duration.
> > b. a GroupByKey in order to aggregate the data.
> > c. for each 'group' I do some calculation and build a new event to
> >    send to Kafka.
> >
> > The output of this cycle is
> > key1 - value1
> > key2 - value2
> >
> > If a new message arrives with the same key, I would like to have
> > logic that checks whether the current message is key1-value1 and,
> > if so, does not send it (because it was already sent).
> > Currently we implement this using a DB (Postgres).
> > The performance of this implementation is not very good.
> > Is there any way to implement this without any external state?
> >
> > Thanks a lot,
> > Sigalit
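
For readers of the archive, the "last value seen per key" check that Jan describes can be sketched in plain Java. This is only a model of the logic and does not use the Beam API: the class name LastValueDedup, the method shouldSend, and the HashMap are hypothetical stand-ins. In a real pipeline the map would be replaced by a ValueState declared in a stateful DoFn, and the runner would scope that state to the current key and window.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/**
 * Plain-Java model of "send only if the value for this key changed".
 * Hypothetical illustration; not Beam code and not the implementation
 * of Deduplicate.KeyedValues.
 */
class LastValueDedup {
    // Stands in for a per-key ValueState holding the last value sent.
    private final Map<String, String> lastSentPerKey = new HashMap<>();

    /**
     * Returns true if (key, value) differs from the last value sent for
     * that key, and records it as the new last value. Returns false if
     * the same value was already sent for that key.
     */
    boolean shouldSend(String key, String value) {
        String previous = lastSentPerKey.get(key);
        if (Objects.equals(previous, value)) {
            return false; // duplicate of what was already sent
        }
        lastSentPerKey.put(key, value);
        return true;
    }

    public static void main(String[] args) {
        LastValueDedup dedup = new LastValueDedup();
        String[][] events = {
            {"key1", "value1"},
            {"key1", "value1"}, // repeated output for key1
            {"key2", "value2"},
            {"key1", "value3"}, // key1 changed, so it is sent again
        };
        for (String[] e : events) {
            String action = dedup.shouldSend(e[0], e[1]) ? "send" : "skip";
            System.out.println(action + " " + e[0] + " - " + e[1]);
        }
    }
}
```

Note that this model never forgets a key, so the map grows without bound. How long to retain the last value per key is a design choice that the thread does not settle.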