Hi Till, thank you for your reply.

> What you can do, though, is to create a custom operator or use a flatMap to build your own windowing operator.

Since my stream wouldn't be keyed, does this mean that I would need to use "Managed Operator State" (aka raw state)?

On Thu, Apr 29, 2021 at 10:34 AM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Yegor,
>
> If you want to use Flink's keyed windowing logic, then you need to insert a keyBy/shuffle operation because Flink currently cannot simply use the partitioning of the Kinesis shards. The reason is that Flink needs to group the keys into the correct key groups in order to support rescaling of the state.
>
> What you can do, though, is to create a custom operator or use a flatMap to build your own windowing operator. This operator could then use the partitioning of the Kinesis shards by simply collecting the events until either 30 seconds or 1000 events are observed.
>
> Cheers,
> Till
>
> On Wed, Apr 28, 2021 at 11:12 AM Yegor Roganov <yegor....@gmail.com> wrote:
>
>> Hello
>>
>> To learn Flink I'm trying to build a simple application where I want to save events coming from Kinesis to S3.
>> I want to subscribe to each shard, and within each shard I want to batch for 30 seconds, or until 1000 events are observed. These batches should then be uploaded to S3.
>> What I don't understand is how to key my source on shard id, and do it in a way that doesn't induce unnecessary shuffling.
>> Is this possible with Flink?
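
For illustration, the "collect until either 30 seconds or 1000 events" logic that Till describes could be sketched roughly as below. This is plain Java with no Flink dependency. The class name `CountOrTimeBatcher` and its methods are hypothetical, not part of any Flink API; the idea is that a flatMap or custom operator would hold one such buffer per parallel instance. The sketch keeps the buffer in ordinary memory, so it would not survive a failure; in a real job the buffer would presumably have to be snapshotted as operator state (for example through `CheckpointedFunction`), which is omitted here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical helper (not a Flink class): buffers events and releases them
 * as one batch once either a size limit or an age limit is reached.
 */
final class CountOrTimeBatcher<T> {
    private final int maxEvents;        // e.g. 1000
    private final long maxAgeMillis;    // e.g. 30_000
    private final List<T> buffer = new ArrayList<>();
    private long firstEventTimeMillis = -1L;

    CountOrTimeBatcher(int maxEvents, long maxAgeMillis) {
        this.maxEvents = maxEvents;
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Adds an event; returns a completed batch, or an empty list if no limit was hit. */
    List<T> add(T event, long nowMillis) {
        if (buffer.isEmpty()) {
            // The age of a batch is measured from its first event.
            firstEventTimeMillis = nowMillis;
        }
        buffer.add(event);
        boolean full = buffer.size() >= maxEvents;
        boolean old = nowMillis - firstEventTimeMillis >= maxAgeMillis;
        if (full || old) {
            return drain();
        }
        return Collections.emptyList();
    }

    /** Meant to be called periodically so that a quiet input still gets flushed. */
    List<T> onTimer(long nowMillis) {
        if (!buffer.isEmpty() && nowMillis - firstEventTimeMillis >= maxAgeMillis) {
            return drain();
        }
        return Collections.emptyList();
    }

    /** Copies the buffered events into a new list and resets the buffer. */
    private List<T> drain() {
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        firstEventTimeMillis = -1L;
        return batch;
    }
}
```

Inside a flatMap one would call `add(event, System.currentTimeMillis())` and emit whatever non-empty batch comes back. Note that a plain flatMap only runs when an event arrives, so the 30-second limit would fire late on a quiet shard; calling something like `onTimer` from a processing-time timer in a custom operator would be one way around that.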