Hi Lei,

Have you tried making the key smaller and storing a list of the found keys as the value?
Let's make the operator key a hash of your original key, and store the list of full keys in the state. You can play with the hash length to find the optimal number of keys.

I hope this helps,
Peter

On Fri, Mar 29, 2024, 09:08 Lei Wang <leiwang...@gmail.com> wrote:

> Use RocksDBBackend to store whether the element appeared within the last
> one day, here is the code:
>
> public class DedupFunction extends KeyedProcessFunction<Long, IN, OUT> {
>
>     private ValueState<Boolean> isExist;
>
>     public void open(Configuration parameters) throws Exception {
>         ValueStateDescriptor<Boolean> desc = new ........
>         StateTtlConfig ttlConfig =
>             StateTtlConfig.newBuilder(Time.hours(24)).setUpdateType......
>         desc.enableTimeToLive(ttlConfig);
>         isExist = getRuntimeContext().getState(desc);
>     }
>
>     public void processElement(IN in, .... ) {
>         if (null == isExist.value()) {
>             out.collect(in);
>             isExist.update(true);
>         }
>     }
> }
>
> Because the number of distinct keys is too large (about 10 billion per
> day), there is a performance bottleneck for this operator.
> How can I optimize the performance?
>
> Thanks,
> Lei
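
To make the hashed-key suggestion a bit more concrete, here is a minimal sketch in plain Java. It is not Flink code: a `HashMap` stands in for the keyed state, the 24-hour TTL is left out, and the names `BucketedDedup`, `hashBits`, `bucketOf` and `firstSeen` are made up for illustration. It only shows the idea of keying by a short hash and keeping the full keys inside each bucket.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Plain-Java model of bucketed dedup; a HashMap stands in for Flink keyed state. */
public class BucketedDedup {
    // Hypothetical tuning knob: how many hash bits form the (smaller) operator key.
    private final int hashBits;
    // Bucket key -> full keys already seen in that bucket. TTL is omitted in this sketch.
    private final Map<Integer, Set<Long>> state = new HashMap<>();

    public BucketedDedup(int hashBits) {
        if (hashBits < 1 || hashBits > 31) {
            throw new IllegalArgumentException("hashBits must be in [1, 31]");
        }
        this.hashBits = hashBits;
    }

    /** Maps a full key to its bucket by keeping the low hashBits bits of a mixed hash. */
    public int bucketOf(long fullKey) {
        int h = Long.hashCode(fullKey);
        h ^= (h >>> 16); // fold high bits into the low bits before masking
        return h & ((1 << hashBits) - 1);
    }

    /** Returns true the first time fullKey is seen, false for a duplicate. */
    public boolean firstSeen(long fullKey) {
        return state.computeIfAbsent(bucketOf(fullKey), b -> new HashSet<>()).add(fullKey);
    }

    /** Number of distinct bucket keys currently held in the model state. */
    public int bucketCount() {
        return state.size();
    }

    public static void main(String[] args) {
        BucketedDedup dedup = new BucketedDedup(4); // at most 16 buckets
        long[] input = {1L, 2L, 1L, 17L, 2L, 33L};
        for (long key : input) {
            if (dedup.firstSeen(key)) {
                System.out.println("emit " + key);
            }
        }
        System.out.println("buckets used: " + dedup.bucketCount());
    }
}
```

In a real job the stream would presumably be keyed by the bucket value, and the per-bucket set would live in keyed state, for example a `MapState` with the same TTL config, where TTL applies per entry. Fewer hash bits mean fewer operator keys but more full keys per bucket, so the hash length is the trade-off to experiment with.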