How about just keeping a buffer, flushing it once it reaches 100 messages, and also flushing whatever is left over in finish_bundle?
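
Roughly what I have in mind, as a plain-Python sketch rather than a tested Beam transform. The method names mirror the DoFn lifecycle hooks (start_bundle, process, finish_bundle); `BufferingEnricher` and `fetch_enrichment` are made-up names, and `fetch_enrichment` is only a stand-in for your batch endpoint.

```python
from typing import Callable, Iterator, List


def fetch_enrichment(ids: List[str]) -> List[dict]:
    """Hypothetical stand-in for the batch endpoint: one request for many IDs."""
    return [{"id": i, "info": f"info-for-{i}"} for i in ids]


class BufferingEnricher:
    """Buffers IDs and calls the endpoint once per full batch.

    Plain-Python sketch. In a real pipeline this would subclass
    apache_beam.DoFn and keep the same three lifecycle methods.
    """

    def __init__(self, fetch: Callable[[List[str]], List[dict]],
                 batch_size: int = 100):
        self._fetch = fetch
        self._batch_size = batch_size
        self._buffer: List[str] = []

    def start_bundle(self) -> None:
        # Start every bundle with an empty buffer.
        self._buffer = []

    def process(self, message_id: str) -> Iterator[dict]:
        self._buffer.append(message_id)
        # Flush as soon as the buffer holds a full batch.
        if len(self._buffer) >= self._batch_size:
            yield from self._flush()

    def finish_bundle(self) -> Iterator[dict]:
        # Flush the leftover partial batch at the end of the bundle.
        # Note: in a real beam.DoFn, elements emitted from finish_bundle
        # must be wrapped in a WindowedValue (value, timestamp, windows).
        yield from self._flush()

    def _flush(self) -> List[dict]:
        if not self._buffer:
            return []
        batch, self._buffer = self._buffer, []
        return self._fetch(batch)
```

Feeding 250 IDs through process and then calling finish_bundle makes three endpoint calls, of sizes 100, 100 and 50. No key is needed, so parallelism is left to the runner. The trade-off is that the runner decides bundle sizes, so with small bundles (common in streaming) most batches will hold fewer than 100 IDs.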

On Fri, Apr 12, 2024 at 21.23 Ruben Vargas <[email protected]> wrote:

> Hello guys
>
> Maybe this question was already answered, but I cannot find it and
> want some more input on this topic.
>
> I have some messages that don't have any particular key candidate,
> except the ID, but I don't want to use it because the idea is to
> group multiple IDs in the same batch.
>
> This is my use case:
>
> I have an endpoint where I'm gonna send the message ID, this endpoint
> is gonna return me certain information which I will use to enrich my
> message. In order to avoid fetching the endpoint per message I want to
> batch it in 100 and send the 100 IDs in one request (the endpoint
> supports it). I was thinking on using GroupIntoBatches.
>
> - If I choose the ID as the key, my understanding is that it won't
>   work in the way I want (because it will form batches of the same ID).
> - Use a constant will be a problem for parallelism, is that correct?
>
> Then my question is, what should I use as a key? Maybe something
> regarding the timestamp? so I can have groups of messages that arrive
> at a certain second?
>
> Any suggestions would be appreciated
>
> Thanks.
