Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5582#discussion_r192337341

    --- Diff: flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackendTest.java ---
    @@ -547,4 +549,30 @@ public boolean accept(File file, String s) {
     			return true;
     		}
     	}
    +
    +	private static class TestRocksDBStateBackend extends RocksDBStateBackend {
    +
    +		public TestRocksDBStateBackend(AbstractStateBackend checkpointStreamBackend, boolean enableIncrementalCheckpointing) {
    +			super(checkpointStreamBackend, enableIncrementalCheckpointing);
    +		}
    +
    +		@Override
    +		public <K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
    +			Environment env,
    +			JobID jobID,
    +			String operatorIdentifier,
    +			TypeSerializer<K> keySerializer,
    +			int numberOfKeyGroups,
    +			KeyGroupRange keyGroupRange,
    +			TaskKvStateRegistry kvStateRegistry) throws IOException {
    +
    +			AbstractKeyedStateBackend<K> keyedStateBackend = super.createKeyedStateBackend(
    +				env, jobID, operatorIdentifier, keySerializer, numberOfKeyGroups, keyGroupRange, kvStateRegistry);
    +
    +			// We ignore the range deletions on production, but when we are running the tests we shouldn't ignore it.
    --- End diff --

    As far as I can see, this only happens in the case where there is only one handle and we are only interested in a subset of the key-groups. Unfortunately, that should be the common case when scaling out. I am wondering if we should not prefer normal deletes over range deletes, because what happens if we later take a snapshot from a database that used range deletes? Will the keys all be gone in both the full and the incremental snapshot case? If the performance of normal deletes is not terrible, that might be the cleaner approach for as long as range deletes do not work properly or have potential negative side effects. What is your opinion on this?
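
    For illustration only, a minimal sketch of the two alternatives discussed above against the plain RocksDB Java API; the names `db`, `columnFamily`, `beginKeyGroupPrefix` and `endKeyGroupPrefix` are hypothetical and not identifiers from this PR, and this is not the code under review:

    import org.rocksdb.ColumnFamilyHandle;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksDBException;
    import org.rocksdb.RocksIterator;

    // Sketch: dropping all entries outside the key-group range assigned to this backend.
    final class KeyGroupClippingSketch {

    	// Alternative 1: a single range delete with [begin, end) semantics.
    	static void clipWithRangeDelete(
    			RocksDB db,
    			ColumnFamilyHandle columnFamily,
    			byte[] beginKeyGroupPrefix,
    			byte[] endKeyGroupPrefix) throws RocksDBException {

    		db.deleteRange(columnFamily, beginKeyGroupPrefix, endKeyGroupPrefix);
    	}

    	// Alternative 2: iterate the same range and issue normal (point) deletes.
    	// The iterator reads from a consistent view taken at creation time, so the
    	// concurrent point deletes do not affect the iteration itself.
    	static void clipWithPointDeletes(
    			RocksDB db,
    			ColumnFamilyHandle columnFamily,
    			byte[] beginKeyGroupPrefix,
    			byte[] endKeyGroupPrefix) throws RocksDBException {

    		try (RocksIterator iterator = db.newIterator(columnFamily)) {
    			iterator.seek(beginKeyGroupPrefix);
    			while (iterator.isValid()
    					&& compareUnsigned(iterator.key(), endKeyGroupPrefix) < 0) {
    				db.delete(columnFamily, iterator.key());
    				iterator.next();
    			}
    		}
    	}

    	// Lexicographic comparison of unsigned bytes, matching RocksDB's default key order.
    	private static int compareUnsigned(byte[] a, byte[] b) {
    		int len = Math.min(a.length, b.length);
    		for (int i = 0; i < len; i++) {
    			int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
    			if (cmp != 0) {
    				return cmp;
    			}
    		}
    		return Integer.compare(a.length, b.length);
    	}
    }

    Whether the range tombstone written by the first alternative is honored when the database is later snapshotted, fully or incrementally, is exactly the open question raised above; the second alternative sidesteps it at the cost of one delete per key.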
---