Hi, Gyula.
It seems related to https://issues.apache.org/jira/browse/FLINK-23346.
We also saw a core dump when using list state after triggering state
migration with the TTL compaction filter enabled. Have you triggered
schema evolution?
It looks like a bug in the RocksDB list state when combined with the TTL
compaction filter.

On Wed, May 17, 2023 at 7:05 PM Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi All!
>
> We are encountering an error on a larger stateful job (around 1 TB+ of
> state) on restore from a rocksdb checkpoint. The taskmanagers keep crashing
> with a segfault that comes from the rocksdb native logic and seems to be
> related to the FlinkCompactionFilter mechanism.
>
> The gist with the full error report:
> https://gist.github.com/gyfora/f307aa570d324d063e0ade9810f8bb25
>
> The core part is here:
> V  [libjvm.so+0x79478f]  Exceptions::
> (Thread*, char const*, int, oopDesc*)+0x15f
> V  [libjvm.so+0x960a68]  jni_Throw+0x88
> C  [librocksdbjni-linux64.so+0x222aa1]
>  JavaListElementFilter::NextUnexpiredOffset(rocksdb::Slice const&, long,
> long) const+0x121
> C  [librocksdbjni-linux64.so+0x6486c1]
>  rocksdb::flink::FlinkCompactionFilter::ListDecide(rocksdb::Slice const&,
> std::string*) const+0x81
> C  [librocksdbjni-linux64.so+0x648bea]
>  rocksdb::flink::FlinkCompactionFilter::FilterV2(int, rocksdb::Slice
> const&, rocksdb::CompactionFilter::ValueType, rocksdb::Slice const&,
> std::string*, std::string*) const+0x14a
>
> Has anyone encountered a similar issue before?
>
> Thanks
> Gyula
>
>

-- 
Best,
Hangxiang.
