How about just keeping a buffer, flushing it after every 100 messages,
and also flushing whatever is left in the buffer in finish_bundle?
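
A minimal sketch of that buffering logic, in plain Python so it runs
without Beam installed. The names BatchBuffer, add, flush and batches are
made up for illustration and are not part of any Beam API. The assumption
is that in a real DoFn, add() would be called from process() and flush()
from finish_bundle().

```python
from typing import Iterable, Iterator, List, Optional


class BatchBuffer:
    """Hypothetical helper: collects IDs and releases them in batches."""

    def __init__(self, max_size: int = 100) -> None:
        self.max_size = max_size
        self._ids: List[object] = []

    def add(self, message_id: object) -> Optional[List[object]]:
        """Buffer one ID. Return a full batch once max_size is reached, else None."""
        self._ids.append(message_id)
        if len(self._ids) >= self.max_size:
            return self.flush()
        return None

    def flush(self) -> List[object]:
        """Return whatever is buffered (possibly empty) and reset the buffer."""
        batch, self._ids = self._ids, []
        return batch


def batches(ids: Iterable[object], max_size: int = 100) -> Iterator[List[object]]:
    """Simulate one bundle: add() per element, then a final flush()."""
    buffer = BatchBuffer(max_size)
    for message_id in ids:
        full = buffer.add(message_id)  # what process() would do
        if full is not None:
            yield full
    rest = buffer.flush()  # what finish_bundle() would do
    if rest:
        yield rest
```

Each yielded batch is what you would send to the endpoint in one request.
Two things to keep in mind if you move this into a DoFn: anything emitted
from finish_bundle in the Python SDK has to be wrapped in a WindowedValue,
and batches never span bundles, so the last batch of a bundle can be
smaller than 100.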


On Fri, Apr 12, 2024 at 21.23 Ruben Vargas <[email protected]> wrote:

> Hello guys
>
> Maybe this question was already answered, but I cannot find it and
> want some more input on this topic.
>
> I have some messages that don't have any particular key candidate,
> except the ID, but I don't want to use it because the idea is to
> group multiple IDs in the same batch.
>
> This is my use case:
>
> I have an endpoint where I'm gonna send the message ID, this endpoint
> is gonna return me certain information which I will use to enrich my
> message. In order to avoid calling the endpoint per message I want to
> batch it in 100 and send the 100 IDs in one request (the endpoint
> supports it). I was thinking of using GroupIntoBatches.
>
> - If I choose the ID as the key, my understanding is that it won't
> work in the way I want (because it will form batches of the same ID).
> - Using a constant will be a problem for parallelism, is that correct?
>
> Then my question is, what should I use as a key? Maybe something
> regarding the timestamp? So I can have groups of messages that arrive
> at a certain second?
>
> Any suggestions would be appreciated
>
> Thanks.
>
