gortiz commented on code in PR #15919:
URL: https://github.com/apache/pinot/pull/15919#discussion_r2137677116
##########
pinot-common/src/main/java/org/apache/pinot/common/datablock/ZeroCopyDataBlockSerde.java:
##########
@@ -254,21 +265,37 @@ private long calculateEndOffset(DataBuffer buffer, Header header) {
return currentOffset;
}
+ /// Deserializes the exceptions and metadata from the stream.
@VisibleForTesting
- static Map<Integer, String> deserializeExceptions(PinotInputStream stream, Header header)
+ static ErrorsAndMetadata deserializeExceptions(PinotInputStream stream, Header header)
throws IOException {
if (header._exceptionsLength == 0) {
- return new HashMap<>();
+ return ErrorsAndMetadata.EMPTY;
}
+ long currentOffset = header.getExceptionsStart();
+
stream.seek(header.getExceptionsStart());
int numExceptions = stream.readInt();
- Map<Integer, String> exceptions = new HashMap<>(HashUtil.getHashMapCapacity(numExceptions));
+ // We reserve extra space for the fake error codes storing stageId, workerId and serverId
+ Map<Integer, String> exceptions = new HashMap<>(HashUtil.getHashMapCapacity(numExceptions + 3));
for (int i = 0; i < numExceptions; i++) {
int errCode = stream.readInt();
String errMessage = stream.readInt4UTF();
exceptions.put(errCode, errMessage);
}
- return exceptions;
+
+ long readOffset = stream.getCurrentOffset() - currentOffset;
+ if (readOffset >= header._exceptionsLength) {
+ return new ErrorsAndMetadata(exceptions, -1, -1, "");
+ }
+ int errorMetadataVersion = stream.readInt();
+ if (errorMetadataVersion != ERROR_METADATA_VERSION) {
+ return new ErrorsAndMetadata(exceptions, -1, -1, "");
+ }
+ int stageId = stream.readInt();
+ int workerId = stream.readInt();
+ String serverId = stream.readInt4UTF();
Review Comment:
I'm using "" (the empty string).
I think it is better: it may avoid NPEs, and it simplifies the serialization
code, since writing an empty string is simpler than writing a null value. If
needed, we can convert this empty string to null when converting it to a block.
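To illustrate the idea, here is a minimal sketch of the empty-string convention. It is not the PR's code: the class and method names are hypothetical, and a plain `ByteBuffer` with a 4-byte length prefix stands in for Pinot's stream classes, as an assumption about the wire layout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these names exist in Pinot.
final class ServerIdSentinelSketch {
  private ServerIdSentinelSketch() {
  }

  // Writes the server id as a 4-byte length followed by UTF-8 bytes.
  // A missing id is written as "", so no separate null marker is needed.
  static byte[] serialize(String serverId) {
    byte[] utf8 = (serverId == null ? "" : serverId).getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(Integer.BYTES + utf8.length)
        .putInt(utf8.length)
        .put(utf8)
        .array();
  }

  // Reads the server id back and maps "" to null at the boundary where
  // a nullable value is wanted (for example, when building a block).
  static String deserializeToNullable(byte[] bytes) {
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    byte[] utf8 = new byte[buffer.getInt()];
    buffer.get(utf8);
    String serverId = new String(utf8, StandardCharsets.UTF_8);
    return serverId.isEmpty() ? null : serverId;
  }
}
```

With this shape the writer never branches on null in the wire format, and only the caller that needs a nullable value pays for the conversion.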
##########
pinot-common/src/main/java/org/apache/pinot/common/datablock/ZeroCopyDataBlockSerde.java:
##########
@@ -254,21 +265,37 @@ private long calculateEndOffset(DataBuffer buffer, Header header) {
return currentOffset;
}
+ /// Deserializes the exceptions and metadata from the stream.
@VisibleForTesting
- static Map<Integer, String> deserializeExceptions(PinotInputStream stream, Header header)
+ static ErrorsAndMetadata deserializeExceptions(PinotInputStream stream, Header header)
throws IOException {
if (header._exceptionsLength == 0) {
- return new HashMap<>();
+ return ErrorsAndMetadata.EMPTY;
}
+ long currentOffset = header.getExceptionsStart();
+
stream.seek(header.getExceptionsStart());
int numExceptions = stream.readInt();
- Map<Integer, String> exceptions = new HashMap<>(HashUtil.getHashMapCapacity(numExceptions));
+ // We reserve extra space for the fake error codes storing stageId, workerId and serverId
+ Map<Integer, String> exceptions = new HashMap<>(HashUtil.getHashMapCapacity(numExceptions + 3));
for (int i = 0; i < numExceptions; i++) {
int errCode = stream.readInt();
String errMessage = stream.readInt4UTF();
exceptions.put(errCode, errMessage);
}
- return exceptions;
+
+ long readOffset = stream.getCurrentOffset() - currentOffset;
+ if (readOffset >= header._exceptionsLength) {
+ return new ErrorsAndMetadata(exceptions, -1, -1, "");
Review Comment:
We cannot use EMPTY, as exceptions could be non-empty.
##########
pinot-common/src/main/java/org/apache/pinot/common/datablock/MetadataBlock.java:
##########
@@ -69,6 +71,17 @@ public static MetadataBlock newEosWithStats(List<DataBuffer> statsByStage) {
public MetadataBlock(List<DataBuffer> statsByStage) {
Review Comment:
> Is there scenario where stageId and workerId are unavailable?
Yes. MetadataBlock is used to serialize any EOS, whether it succeeded or
failed, but stageId and workerId are only used on error blocks. We could
include them in successful blocks as well, but that wouldn't be very useful and
would make SuccessMseBlock stateful, which is not great given that it can
currently be used as a singleton.
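As a rough illustration of that trade-off, the sketch below contrasts a stateless success block, which can be a shared singleton, with an error block that carries per-failure state. All type names here are hypothetical and do not mirror the actual Pinot classes.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch: these types are illustrative, not Pinot's.
interface EosSketch {
  boolean isError();
}

// A success EOS has no per-instance state, so one shared instance is enough.
enum SuccessEosSketch implements EosSketch {
  INSTANCE;

  @Override
  public boolean isError() {
    return false;
  }
}

// An error EOS carries state that differs per failure, so it cannot be shared.
final class ErrorEosSketch implements EosSketch {
  final Map<Integer, String> errors;
  final int stageId;
  final int workerId;
  final String serverId;

  ErrorEosSketch(Map<Integer, String> errors, int stageId, int workerId, String serverId) {
    this.errors = Collections.unmodifiableMap(errors);
    this.stageId = stageId;
    this.workerId = workerId;
    this.serverId = serverId;
  }

  @Override
  public boolean isError() {
    return true;
  }
}
```

Adding stageId and workerId to the success type would force a new instance per worker, which is the cost the comment above wants to avoid.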
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]