I use the RocksDB state backend to record whether an element has appeared within the last day. Here is the code:

    public class DedupFunction extends KeyedProcessFunction<Long, IN, OUT> {

        // Per-key flag; present only if the key was seen within the TTL window.
        private ValueState<Boolean> isExist;

        public void open(Configuration parameters) throws Exception {
            ValueStateDescriptor<Boolean> desc = new ........
            StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(Time.hours(24)).setUpdateType......
            desc.enableTimeToLive(ttlConfig);
            isExist = getRuntimeContext().getState(desc);
        }

        public void processElement(IN in, .... ) {
            // Emit the element only the first time its key is seen.
            if (null == isExist.value()) {
                out.collect(in);
                isExist.update(true);
            }
        }
    }

Because the number of distinct keys is very large (about 10 billion per day), this operator has become a performance bottleneck. How can I optimize its performance?

Thanks,
Lei