Hi Dan,

For a deeper dive into state backends and how they manage state, or into performance-critical aspects such as state serialization and choosing appropriate state structures, I highly recommend starting with this webinar by my colleague Seth Wiesman:
https://www.youtube.com/watch?v=9GF8Hwqzwnk

Cheers,
Gordon

On Wed, Mar 10, 2021 at 1:58 AM Dan Hill <quietgol...@gmail.com> wrote:

> Hi!
>
> I'm working on a join setup that does fuzzy matching in case the client
> does not send enough parameters to join by a foreign key. There are a few
> ways I can store the state. I'm curious about best practices around this.
> I'm using RocksDB as the state storage.
>
> I was reading the code for IntervalJoin
> <https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/co/IntervalJoinOperator.java>
> and was a little shocked by the implementation. It feels designed for very
> short join intervals.
>
> I read this set of pages
> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html>
> but I'm looking for one level deeper. E.g. what are the performance
> characteristics of different types of state CRUD operations with RocksDB?
> E.g. I could create an extra MapState to act as an index. When is this worth
> it?