Got it! Any chance you can open-source some of that? I think it can be extremely useful for the community.
Thank you.

On Thu, Apr 27, 2023 at 8:08 AM Giannis Polyzos <ipolyzos...@gmail.com> wrote:

> Correct, it's some custom code I put together to investigate what gets
> written in RocksDB.
>
> On Thu, Apr 27, 2023 at 6:06 PM Yaroslav Tkachenko <yaros...@goldsky.com>
> wrote:
>
>> Hi Giannis,
>>
>> I'm curious, what tool did you use for this analysis (what the screenshot
>> shows)? Is it something custom?
>>
>> Thank you.
>>
>> On Wed, Apr 26, 2023 at 10:38 PM Giannis Polyzos <ipolyzos...@gmail.com>
>> wrote:
>>
>>> This is really helpful,
>>>
>>> Thanks
>>>
>>> On Thu, Apr 27, 2023 at 5:46 AM Yanfei Lei <fredia...@gmail.com> wrote:
>>>
>>>> Hi Giannis,
>>>>
>>>> Except for the "default" column family (CF), every CF represents a
>>>> state in the RocksDB state backend; the name of a CF is the name of
>>>> its StateDescriptor.
>>>>
>>>> - deduplicate-state is a value state. You can find it in
>>>>   DeduplicateFunctionBase.java and
>>>>   MiniBatchDeduplicateFunctionBase.java, which are used for
>>>>   deduplication.
>>>> - _timer_state/event_user-timers, _timer_state/event_timers,
>>>>   _timer_state/processing_timers and
>>>>   _timer_state/processing_user-timers are created by the internal
>>>>   timer service, which can be found in
>>>>   InternalTimeServiceManagerImpl.java. Here is a blog post [1] on
>>>>   best practices for using timers.
>>>> - timer, next-index, left and right can be found in
>>>>   TemporalRowTimeJoinOperator.java, which implements the logic of
>>>>   the temporal join. This post [2] might be helpful in understanding
>>>>   what happens in a temporal join.
>>>>
>>>> [1]
>>>> https://www.alibabacloud.com/help/en/realtime-compute-for-apache-flink/latest/datastream-timer
>>>> [2]
>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#temporal-joins
>>>>
>>>> Giannis Polyzos <ipolyzos...@gmail.com> wrote on Wed, Apr 26, 2023 at 23:19:
>>>> >
>>>> > I have two input Kafka topics: a compacted one (with upsert-kafka)
>>>> > and a normal one.
>>>> > When I perform a temporal join I notice the following state being
>>>> > created in RocksDB and was hoping someone could help me better
>>>> > understand what everything means.
>>>> >
>>>> > > deduplicate-state: does it refer to duplicate keys found by the
>>>> > > kafka-upsert-connector?
>>>> > > timers: what do timer and _timer_state/event_timers refer to, and
>>>> > > what's the difference between them? Is it to keep track of when
>>>> > > the join results need to be materialised or state needs to be
>>>> > > expired?
>>>> > > next-index: what does it refer to?
>>>> > > left: also I'm curious why the left CF has 407 entries. Are these
>>>> > > records that are being buffered because there is no match on the
>>>> > > right table?
>>>> >
>>>> > Thanks
>>>>
>>>> --
>>>> Best,
>>>> Yanfei
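
For anyone reading this thread later, Yanfei's mapping from column family names to Flink components can be written down as a small lookup. This is only a sketch of the explanation above, not the custom tool Giannis mentioned: the function name `describe_column_family` and the constant `TEMPORAL_JOIN_STATES` are hypothetical, and the names `timer`, `next-index`, `left` and `right` are attributed to the temporal join only because that is the job discussed here (another operator could register state under the same names).

```python
# Hypothetical helper summarising the column-family explanation in this thread.
# It does not read RocksDB; it only classifies column family names.

# State names that, in the temporal-join job discussed here, come from
# TemporalRowTimeJoinOperator (assumption: no other operator reuses them).
TEMPORAL_JOIN_STATES = {"timer", "next-index", "left", "right"}


def describe_column_family(name: str) -> str:
    """Return the Flink component a RocksDB column family name points to."""
    if name == "default":
        # The only column family that does not represent Flink state.
        return "RocksDB default column family (not Flink state)"
    if name.startswith("_timer_state/"):
        # Event-time and processing-time timers, both internal and user timers.
        return "internal timer service (InternalTimeServiceManagerImpl)"
    if name == "deduplicate-state":
        return ("deduplication value state (DeduplicateFunctionBase / "
                "MiniBatchDeduplicateFunctionBase)")
    if name in TEMPORAL_JOIN_STATES:
        return "temporal join state (TemporalRowTimeJoinOperator)"
    # Every other column family is named after the StateDescriptor that created it.
    return "state named after its StateDescriptor"


if __name__ == "__main__":
    for cf in ["default", "deduplicate-state", "_timer_state/event_timers",
               "timer", "next-index", "left", "right"]:
        print(f"{cf}: {describe_column_family(cf)}")
```

The prefix check on `_timer_state/` covers all four timer column families listed above without enumerating them one by one.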