Any thoughts on this?

On Fri, Apr 21, 2023 at 4:10 PM Sumanta Majumdar <feel.suma...@gmail.com>
wrote:

> Hi,
>
> Currently we have a streaming use case where we have a flink application
> which runs on a session cluster which is responsible for reading data from
> Kafka source which is basically table transaction events getting ingested
> into the upstream kafka topic which is converted to a row and then
> deduplicated to extract distinct rows and persist via a sink function to an
> external warehouse such as vertica or snowflake schema.
>
> Now initially what we have observed by using TumblingWindow windows
> assigner implementation is that the state sizes are growing unconditionally
> even when we have tuned rocksdb options and provided a good chunk of
> managed memory.
>
> We are able to read more than 150000 records within a period of 4 mins
> which is our time window set based on our requirements.
>
> Now one optimization which I see is suggested through the flink docs in
> order to reduce the state size is to use incremental aggregation using
> reduce or aggregate functions available.
>
> Now I did use aggregate function along with window in order implement the
> same but now I am observing that the consumption rate has reduced
> drastically post this implementation which has increased the overall
> throughput of the pipeline.
>
> Any thoughts as to why this can happen?
>
> --
> Thanks and Regards,
> Sumanta Majumdar
>


-- 
Thanks and Regards,
Sumanta Majumdar

Reply via email to