Will definitely do as it's going to be part of a wider Flink course / book
(haven't decided yet on the format) Im putting together.
but I can share before that If you want

On Thu, Apr 27, 2023 at 6:11 PM Yaroslav Tkachenko <yaros...@goldsky.com>
wrote:

> Got it! Any chance you can open-source some of that? I think it can be
> extremely useful for the community.
>
> Thank you.
>
> On Thu, Apr 27, 2023 at 8:08 AM Giannis Polyzos <ipolyzos...@gmail.com>
> wrote:
>
>> Correct, its some custom code i put together to investigate what gets
>> written in rocksdb
>>
>> On Thu, Apr 27, 2023 at 6:06 PM Yaroslav Tkachenko <yaros...@goldsky.com>
>> wrote:
>>
>>> Hi Giannis,
>>>
>>> I'm curious, what tool did you use for this analysis (what the
>>> screenshot shows)? Is it something custom?
>>>
>>> Thank you.
>>>
>>> On Wed, Apr 26, 2023 at 10:38 PM Giannis Polyzos <ipolyzos...@gmail.com>
>>> wrote:
>>>
>>>> This is really helpful,
>>>>
>>>> Thanks
>>>>
>>>> On Thu, Apr 27, 2023 at 5:46 AM Yanfei Lei <fredia...@gmail.com> wrote:
>>>>
>>>>> Hi Giannis,
>>>>>
>>>>> Except “default” Colume Family(CF), all other CFs represent the state
>>>>> in rocksdb state backend, the name of a CF is the name of a
>>>>> StateDescriptor.
>>>>>
>>>>> - deduplicate-state is a value state, you can find it in
>>>>> DeduplicateFunctionBase.java and
>>>>> MiniBatchDeduplicateFunctionBase.java, they are used for
>>>>> deduplication.
>>>>> - _timer_state/event_user-timers, _timer_state/event_timers ,
>>>>> _timer_state/processing_timers and _timer_state/processing_user-timers
>>>>>  are created by internal time service, which can be found in
>>>>> InternalTimeServiceManagerImpl.java. Here is a blog post[1] on best
>>>>> practices for using timers.
>>>>> - timer, next-index, left and right can be found in
>>>>> TemporalRowTimeJoinOperator.java, TemporalRowTimeJoinOperator
>>>>> implements the logic of temporal join, this post[2] might be helpful
>>>>> in understanding what happened to temporal join.
>>>>>
>>>>> [1]
>>>>> https://www.alibabacloud.com/help/en/realtime-compute-for-apache-flink/latest/datastream-timer
>>>>> [2]
>>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#temporal-joins
>>>>>
>>>>> Giannis Polyzos <ipolyzos...@gmail.com> 于2023年4月26日周三 23:19写道:
>>>>> >
>>>>> > I have two input kafka topics - a compacted one (with upsert-kafka)
>>>>> and a normal one.
>>>>> > When I perform a temporal join I notice the following state being
>>>>> created in rocksdb and was hoping someone could help me better understand
>>>>> what everything means
>>>>> >
>>>>> >
>>>>> > > deduplicate-state: does it refer to duplicate keys found by the
>>>>> kafka-upsert-connector?
>>>>> > > timers: what timer and _timer_state/event_timers refer to and
>>>>> whats their difference? Is it to keep track on when the join results need
>>>>> to be materialised or state to be expired?
>>>>> > > next-index: what does it refer to?
>>>>> > > left: also I'm curious why the left cf has 407 entries. Is it
>>>>> records that are being buffered because there is no match on the right
>>>>> table?
>>>>> >
>>>>> > Thanks
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best,
>>>>> Yanfei
>>>>>
>>>>

Reply via email to