gardenia commented on PR #8871: URL: https://github.com/apache/ozone/pull/8871#issuecomment-3224601180
We are just updating the design doc based on the latest feedback and the
discussion from last week's community call. I have realized I was left with
some uncertainty over the ledger schema which I thought would be good to flesh
out here before re-publishing the design doc.
My original strawman schema as discussed in last week's call:
```
message OperationInfo {
optional int64 trxLogIndex = 1;
optional string op = 2;
optional string volumeName = 3;
optional string bucketName = 4;
optional string keyName = 5;
optional uint64 creationTime = 6;
repeated hadoop.hdds.KeyValue extraArgs = 7;
}
```
@errose28 made a good point that the use of the extraArgs field (effectively
a Map<String, String>) for miscellaneous fields (such as the rename "to" path
or acl fields) would lead to a brittle schema management.
@errose28 you referenced a pattern where such things can be provided as a
bunch of "optional" fields (for examples, as per OMRequest) but it wasn't clear
exactly what you had in mind therefore I thought it would be useful to discuss
a few possible intrepretations:
1. use the full request object in the ledger item schema, e.g. :
```
message OperationInfo {
required int64 trxLogIndex = 1;
required Type cmdType = 2; // Type of the command
required uint64 creationTime = 3;
optional CreateKeyRequest createKeyRequest = 4;
optional RenameKeyRequest renameKeyRequest = 5;
optional DeleteKeyRequest deleteKeyRequest = 6;
optional CommitKeyRequest commitKeyRequest = 7;
optional CreateDirectoryRequest createDirectoryRequest = 8;
optional CreateFileRequest createFileRequest = 9;
}
```
initially this is what I inferred that you meant but I may have
misunderstood. At any rate - I'm not sure this is ideal - for a couple of
reasons:
a. it pulls in the entire request object into the ledger which may be
overkill
b. it is a bit loose about what it provides to consumers of the ledger and
so those consumers could end up using fields they shouldn't and end up
depending on them.
2. a less loose interpretation
```
message CreateKeyOperationArgs {
}
message RenameKeyOperationArgs {
required string toKeyName = 1;
}
message DeleteKeyOperationArgs {
}
message CommitKeyOperationArgs {
}
message CreateDirectoryOperationArgs {
}
message CreateFileOperationArgs {
required bool isRecursive = 2;
required bool isOverwrite = 3;
}
message OperationInfo {
optional int64 trxLogIndex = 1;
required Type cmdType = 2; // Type of the command
optional string volumeName = 3;
optional string bucketName = 4;
optional string keyName = 5;
optional uint64 creationTime = 6;
optional CreateKeyOperationArgs createKeyArgs = 7;
optional RenameKeyOperationArgs renameKeyArgs = 8;
optional DeleteKeyOperationArgs deleteKeyArgs = 9;
optional CommitKeyOperationArgs commitKeyArgs = 10;
optional CreateDirectoryOperationArgs createDirectoryArgs = 11;
optional CreateFileOperationArgs createFileArgs = 12;
}
```
This is effectively the same "optional" trick but rather than copy in the
full request (from the OmRequest) we cherry-pick the fields which are exposed
to the ledger via explicitly defined helper messages (RenameKeyOperationArgs,
CreateFileOperationArgs etc) and so consumers have a clear contract as to what
they can consume. This provides stronger typing around what extra fields are
provided and this also makes schema changes simpler.
It could be argued that #1 gives more flexibility to consumers of the
ledger but I feel like a more explicit contract (as per #2) is ultimately
better and to keep it minimal to begin with makes sense. Therefore I would
have a preference for #2.
But I would like to verify that I correctly understood your feedback
@errose28 so please let me know what you (and others) think (or if you have a
different suggestion).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
