cshuo commented on code in PR #13408:
URL: https://github.com/apache/hudi/pull/13408#discussion_r2139076809
##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java:
##########
@@ -183,6 +187,14 @@ public Option<Predicate> getKeyFilterOpt() {
return keyFilterOpt;
}
+ public SizeEstimator<BufferedRecord<T>> getRecordSizeEstimator() {
+ return new HoodieRecordSizeEstimator<>(schemaHandler.getRequiredSchema());
+ }
+
+ public CustomSerializer<BufferedRecord<T>> getRecordSerializer() {
+ return new DefaultSerializer<>();
Review Comment:
I ran a local microbenchmark comparing `DefaultSerializer` with
`BufferedRecordSerializer(DefaultRecordSerializer)` on 1 million Flink
`GenericRowData` records. The `DefaultRecordSerializer` used in the test:
```
public class DefaultRecordSerializer<T> implements RecordSerializer<T> {
  @Override
  public byte[] serialize(T record) {
    try {
      return SerializationUtils.serialize(record);
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  @Override
  public T deserialize(byte[] bytes, int schemaId) {
    return SerializationUtils.deserialize(bytes);
  }
}
```
Records:
```
GenericRowData record = new GenericRowData(5);
record.setField(0, "lily");
record.setField(1, 23);
record.setField(2, "shanghai");
record.setField(3, 1000L);
record.setField(4, "20240101");
BufferedRecord<GenericRowData> bufferedRecord =
    new BufferedRecord<>("lily", 1000L, record, 1, false);
```
Results:
~~Legacy default: 1439s~~
Legacy default: 1082s
Legacy default: 982s
Legacy default: 969s
Legacy default: 958s
Avg: 997
~~New: 1164s~~
New: 1164s
New: 1144s
New: 1155s
New: 1165s
Avg: 1157
It seems the legacy `DefaultSerializer` performs slightly better (average
997 vs. 1157, roughly 14% less time), so maybe we should keep
`DefaultSerializer` as the default until we implement an efficient
`RecordSerializer` for the other engine-specific rows.
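
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a round-trip timing harness. It is not the code I used for the numbers above. The class `SerializerBench` and its methods are hypothetical names, and plain JDK serialization stands in for the Hudi serializers so that the snippet has no external dependencies. The two serializers under test would be passed in as the `serialize` and `deserialize` functions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

// Hypothetical harness; not part of Hudi.
public class SerializerBench {

  // Stand-in serializer (assumption): plain JDK object serialization.
  static byte[] javaSerialize(Serializable value) {
    try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
         ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
      out.flush(); // push buffered data into the byte array before reading it
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Stand-in deserializer (assumption): plain JDK object deserialization.
  static Object javaDeserialize(byte[] data) {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  // Runs `warmup` untimed round trips, then times `iterations` round trips.
  // Returns the elapsed wall-clock time of the timed loop in milliseconds.
  static <T> long timeRoundTripsMillis(T record, int warmup, int iterations,
                                       Function<T, byte[]> serialize,
                                       Function<byte[], T> deserialize) {
    long sink = 0; // accumulates results so the JIT cannot drop the work
    for (int i = 0; i < warmup; i++) {
      sink += deserialize.apply(serialize.apply(record)).hashCode();
    }
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      sink += deserialize.apply(serialize.apply(record)).hashCode();
    }
    long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
    if (sink == Long.MIN_VALUE) {
      System.out.println(sink); // practically never taken; keeps `sink` live
    }
    return elapsedMillis;
  }
}
```

A call such as `SerializerBench.timeRoundTripsMillis(record, 10_000, 1_000_000, ser, de)` would be repeated several times per serializer and the results averaged. The warmup loop is there because JIT compilation can distort the first iterations. A dedicated harness such as JMH would give more trustworthy numbers than a hand-rolled loop like this one.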