jojochuang commented on code in PR #9813:
URL: https://github.com/apache/ozone/pull/9813#discussion_r3017456381
##########
pom.xml:
##########
@@ -2570,6 +2430,17 @@
<id>apache.snapshots.https</id>
<url>https://repository.apache.org/content/repositories/snapshots</url>
</repository>
+ <repository>
+ <releases>
+ <enabled>false</enabled>
+ </releases>
+ <snapshots>
+ <enabled>true</enabled>
+ <updatePolicy>always</updatePolicy>
+ </snapshots>
+ <id>central.sonatype.snapshots</id>
+ <url>https://central.sonatype.com/repository/maven-snapshots/</url>
+ </repository>
Review Comment:
no longer needed?
##########
pom.xml:
##########
@@ -203,7 +204,9 @@
<reflections.version>0.10.2</reflections.version>
<reload4j.version>1.2.26</reload4j.version>
<restrict-imports.enforcer.version>3.0.0</restrict-imports.enforcer.version>
- <rocksdb.version>7.7.3</rocksdb.version>
+ <rocksdb.source.version>10.10.1</rocksdb.source.version>
+ <rocksdb.version>10.10.1</rocksdb.version>
+ <rocksdbjni.groupId>org.rocksdb</rocksdbjni.groupId>
Review Comment:
just curious why do we need to make the dependency group a variable here.
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/filter/ReclaimableRenameEntryFilter.java:
##########
@@ -83,6 +83,11 @@ private final boolean
isRenameEntryReclaimable(Table.KeyValue<String, String> re
if (previousTable != null) {
String prevDbKey = renameEntry.getValue();
WithObjectID withObjectID = previousTable.getIfExist(prevDbKey);
+ if (withObjectID == null) {
Review Comment:
is this a change of behavior in rocksdb?
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/DBProfile.java:
##########
@@ -87,6 +87,7 @@ public ManagedBlockBasedTableConfig
getBlockBasedTableConfig() {
ManagedBlockBasedTableConfig config = new ManagedBlockBasedTableConfig();
config.setBlockCache(new ManagedLRUCache(blockCacheSize))
.setBlockSize(blockSize)
+ .setFormatVersion(BLOCK_BASED_TABLE_FORMAT_VERSION)
Review Comment:
adding format version field is backward compatible.
##########
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/KeyManagerImpl.java:
##########
@@ -966,9 +966,22 @@ private <T> CheckedFunction<KeyManager, T, IOException>
getPreviousSnapshotOzone
OmBucketInfo bucketInfo, long objectId, String currentKeyPath,
Function<KeyManager, Table<String, T>> table) throws IOException {
String renameKey =
metadataManager.getRenameKey(bucketInfo.getVolumeName(),
bucketInfo.getBucketName(), objectId);
- String renamedKey =
metadataManager.getSnapshotRenamedTable().getIfExist(renameKey);
- return (previousSnapshotKM) -> table.apply(previousSnapshotKM).get(
- renamedKey != null ? renamedKey : currentKeyPath);
+ Table<String, String> renamedTable =
metadataManager.getSnapshotRenamedTable();
+ String renamedKey = renamedTable.getIfExist(renameKey);
Review Comment:
if getIfExist() is definitive, we wouldn't need to make this change?
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBTable.java:
##########
@@ -141,15 +141,15 @@ public byte[] getSkipCache(byte[] bytes) throws
RocksDatabaseException {
@Override
public byte[] getIfExist(byte[] key) throws RocksDatabaseException {
rdbMetrics.incNumDBKeyGetIfExistChecks();
- final Supplier<byte[]> value = db.keyMayExist(family, key);
Review Comment:
@smengcl can you put the above keyMayExist() behavior change in the PR
description.
The goal is to ensure the Ozone API does not change semantics after RocksDB
update.
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBTable.java:
##########
@@ -141,15 +141,15 @@ public byte[] getSkipCache(byte[] bytes) throws
RocksDatabaseException {
@Override
public byte[] getIfExist(byte[] key) throws RocksDatabaseException {
rdbMetrics.incNumDBKeyGetIfExistChecks();
- final Supplier<byte[]> value = db.keyMayExist(family, key);
Review Comment:
from commit history:
https://github.com/apache/ozone/pull/9813/changes/24dd41512e7342337e01cc421af6e1e676cba45f
Fix snapshot existence checks after RocksDB 9.10.0 keyMayExist behavior
change
RocksDB 9.10.0 changed DB::KeyMayExist behavior semantics to follow its
function comment. Ozone snapshot code paths were treating
keyMayExist/getIfExist misses as definitive, which could misclassify existing
keys in snapshot-related flows under RocksDB >= 9.10.0 and cause test failures.
Treat keyMayExist/getIfExist as hints and fall back to point reads before
deciding not-found:
- RDBTable: verify with get()/get(ByteBuffer) on inconclusive keyMayExist
- TypedTable: in codec-buffer isExist path, fallback to full get before
returning false
- KeyManagerImpl and ReclaimableRenameEntryFilter: use getSkipCache fallback
for snapshot rename lookups
RocksDB changelog for reference:
https://github.com/facebook/rocksdb/releases/tag/v9.10.0
Behavior Changes
DB::KeyMayExist() now follows its function comment, which means value
parameter can be null, and it will be set only if value_found is passed in.
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/RDBTable.java:
##########
@@ -141,15 +141,15 @@ public byte[] getSkipCache(byte[] bytes) throws
RocksDatabaseException {
@Override
public byte[] getIfExist(byte[] key) throws RocksDatabaseException {
Review Comment:
does this API provide definitive result?
if not, the javadoc needs to update
##########
hadoop-hdds/managed-rocksdb/src/main/java/org/apache/hadoop/hdds/utils/db/managed/ManagedBloomFilter.java:
##########
@@ -28,6 +28,16 @@
public class ManagedBloomFilter extends BloomFilter {
private final UncheckedAutoCloseable leakTracker = track(this);
+ @Override
Review Comment:
Looks unnecessary to me
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/TypedTable.java:
##########
@@ -179,11 +179,17 @@ public boolean isExist(KEY key) throws
RocksDatabaseException, CodecException {
} else if (keyCodec.supportCodecBuffer()) {
// keyCodec.supportCodecBuffer() is enough since value is not needed.
try (CodecBuffer inKey = keyCodec.toDirectCodecBuffer(key)) {
- // Use zero capacity buffer since value is not needed.
+ // Try the lightweight "existence only" check first.
try (CodecBuffer outValue = CodecBuffer.getEmptyBuffer()) {
- return getFromTableIfExist(inKey, outValue) != null;
+ if (getFromTableIfExist(inKey, outValue) != null) {
Review Comment:
shouldn't this Ozone API be definitive?
##########
hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/db/TypedTable.java:
##########
@@ -179,11 +179,17 @@ public boolean isExist(KEY key) throws
RocksDatabaseException, CodecException {
} else if (keyCodec.supportCodecBuffer()) {
// keyCodec.supportCodecBuffer() is enough since value is not needed.
try (CodecBuffer inKey = keyCodec.toDirectCodecBuffer(key)) {
- // Use zero capacity buffer since value is not needed.
+ // Try the lightweight "existence only" check first.
try (CodecBuffer outValue = CodecBuffer.getEmptyBuffer()) {
- return getFromTableIfExist(inKey, outValue) != null;
+ if (getFromTableIfExist(inKey, outValue) != null) {
+ return true;
+ }
}
}
+ // keyMayExist/getIfExist can still be false-negative in some RocksDB
+ // configurations (observed with snapshot/checkpoint DBs), so confirm
with
+ // a regular point lookup before returning false.
+ return getFromTable(key) != null;
Review Comment:
Is this really necessary? This means an additional lookup.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]