Hi Dan,

For a deeper dive into state backends and how they manage state, or
performance critical aspects such as state serialization and choosing
appropriate state structures, I highly recommend starting from this webinar
done by my colleague Seth Weismann:
https://www.youtube.com/watch?v=9GF8Hwqzwnk.

Cheers,
Gordon

On Wed, Mar 10, 2021 at 1:58 AM Dan Hill <quietgol...@gmail.com> wrote:

> Hi!
>
> I'm working on a join setup that does fuzzy matching in case the client
> does not send enough parameters to join by a foreign key.  There's a few
> ways I can store the state.  I'm curious about best practices around this.
> I'm using rocksdb as the state storage.
>
> I was reading the code for IntervalJoin
> <https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/co/IntervalJoinOperator.java>
> and was a little shocked by the implementation.  It feels designed for very
> short join intervals.
>
> I read this set of pages
> <https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html>
> but I'm looking for one level deeper.  E.g. what are performance
> characteristics of different types of state crud operations with rocksdb?
> E.g. I could create extra MapState to act as an index.  When is this worth
> it?
>
>
>

Reply via email to