Hi Fabian,

Yes I can tell you a bit more about the job we are seeing the problem with.
I'll simplify things a bit but this captures the essence:

1. Input datastreams are from a few kafka sources that we intend to join.
2. We wrap the datastreams we want to join into a common container class
and key them on the join key.
3. Union and process the datastreams with a KeyedProcessFunction which
holds the latest value seen for each source in ValueStates, and emits an
output that is the function of the stored ValueStates each time a new value
comes in.
4. We have to support arbitrarily late arriving data, so we don't window,
and just keep everything in ValueState.
5. The state we want to support is very large, on the order of several TBs.

On Wed, Oct 6, 2021 at 10:50 AM Fabian Paul <fabianp...@ververica.com>
wrote:

> Hi Kevin,
>
> Since you are seeing the problem across multiple Flink versions and with
> the default RocksDb and custom configuration it might be related
>  to something else. A lot of different components can allocate direct
> memory i.e. some filesystem implementations, the connectors or some user
> grpc dependency.
>
>
> Can you tell use a bit more about the job you are seeing the problem with?
>
> Best,
> Fabian
>
>

Reply via email to