yihua commented on code in PR #13213:
URL: https://github.com/apache/hudi/pull/13213#discussion_r2057781727
##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java:
##########
@@ -243,39 +234,30 @@ public String getRecordKey(T record, Schema schema) {
/**
* Gets the ordering value in particular type.
*
- * @param recordOption An option of record.
- * @param metadataMap A map containing the record metadata.
- * @param schema The Avro schema of the record.
+ * @param record An option of record.
Review Comment:
So now this record is non-null, correct?
```suggestion
* @param record The record.
```
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/BaseSparkInternalRowReaderContext.java:
##########
@@ -107,6 +107,15 @@ public InternalRow seal(InternalRow internalRow) {
return internalRow.copy();
}
+ @Override
+ public InternalRow toBinaryRow(Schema schema, InternalRow internalRow) {
+ if (internalRow instanceof UnsafeRow) {
+ return internalRow;
+ }
+ final UnsafeProjection unsafeProjection =
HoodieInternalRowUtils.getCachedUnsafeProjection(schema);
Review Comment:
Let's create a follow-up JIRA to see if we can further simplify this part by
considering schema evolution cc @jonvex , without having to get the projection
instance per row.
##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java:
##########
@@ -243,39 +234,30 @@ public String getRecordKey(T record, Schema schema) {
/**
* Gets the ordering value in particular type.
*
- * @param recordOption An option of record.
- * @param metadataMap A map containing the record metadata.
- * @param schema The Avro schema of the record.
+ * @param record An option of record.
+ * @param schema The Avro schema of the record.
* @param orderingFieldName name of the ordering field
* @return The ordering value.
*/
- public Comparable getOrderingValue(Option<T> recordOption,
- Map<String, Object> metadataMap,
+ public Comparable getOrderingValue(T record,
Schema schema,
Option<String> orderingFieldName) {
- if (metadataMap.containsKey(INTERNAL_META_ORDERING_FIELD)) {
- return (Comparable) metadataMap.get(INTERNAL_META_ORDERING_FIELD);
- }
-
- if (!recordOption.isPresent() || orderingFieldName.isEmpty()) {
+ if (orderingFieldName.isEmpty()) {
return DEFAULT_ORDERING_VALUE;
}
- Object value = getValue(recordOption.get(), schema,
orderingFieldName.get());
+ Object value = getValue(record, schema, orderingFieldName.get());
Comparable finalOrderingVal = value != null ?
convertValueToEngineType((Comparable) value) : DEFAULT_ORDERING_VALUE;
- metadataMap.put(INTERNAL_META_ORDERING_FIELD, finalOrderingVal);
return finalOrderingVal;
}
/**
* Constructs a new {@link HoodieRecord} based on the record of
engine-specific type and metadata for merging.
*
- * @param recordOption An option of the record in engine-specific type if
exists.
- * @param metadataMap The record metadata.
+ * @param bufferedRecord buffer record
Review Comment:
revise the javadocs: `record of engine-specific type and metadata for
merging.` -> `record of engine-specific type for merging.`
##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java:
##########
@@ -285,6 +277,14 @@ public abstract HoodieRecord<T>
constructHoodieRecord(Option<T> recordOption,
*/
public abstract T seal(T record);
+ /**
+ * Convert engine specific row into binary format.
+ *
+ * @param record The engine row
+ * @return row with binary format
+ */
+ public abstract T toBinaryRow(T record);
Review Comment:
I also think that the `UnsafeRow` conversion is Spark only and that can be
done in the log record iterator where the Avro record is deserialized to
`UnsafeRow`, without adding the public API method `toBinaryRow` to
`HoodieReaderContext`. However I see there is difficulty due to schema
evolution and other handling that require more refactoring to achieve the goal.
If we want to keep `toBinaryRow` in this PR, mark it as `@Deprecated` or
`@PublicAPIMethod(maturity = ApiMaturityLevel.DEPRECATED)` so further usage is
disallowed.
##########
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java:
##########
@@ -243,39 +234,30 @@ public String getRecordKey(T record, Schema schema) {
/**
* Gets the ordering value in particular type.
*
- * @param recordOption An option of record.
- * @param metadataMap A map containing the record metadata.
- * @param schema The Avro schema of the record.
+ * @param record An option of record.
+ * @param schema The Avro schema of the record.
* @param orderingFieldName name of the ordering field
* @return The ordering value.
*/
- public Comparable getOrderingValue(Option<T> recordOption,
- Map<String, Object> metadataMap,
+ public Comparable getOrderingValue(T record,
Schema schema,
Option<String> orderingFieldName) {
- if (metadataMap.containsKey(INTERNAL_META_ORDERING_FIELD)) {
- return (Comparable) metadataMap.get(INTERNAL_META_ORDERING_FIELD);
- }
-
- if (!recordOption.isPresent() || orderingFieldName.isEmpty()) {
+ if (orderingFieldName.isEmpty()) {
return DEFAULT_ORDERING_VALUE;
}
- Object value = getValue(recordOption.get(), schema,
orderingFieldName.get());
+ Object value = getValue(record, schema, orderingFieldName.get());
Comparable finalOrderingVal = value != null ?
convertValueToEngineType((Comparable) value) : DEFAULT_ORDERING_VALUE;
- metadataMap.put(INTERNAL_META_ORDERING_FIELD, finalOrderingVal);
return finalOrderingVal;
}
/**
* Constructs a new {@link HoodieRecord} based on the record of
engine-specific type and metadata for merging.
*
- * @param recordOption An option of the record in engine-specific type if
exists.
- * @param metadataMap The record metadata.
+ * @param bufferedRecord buffer record
Review Comment:
```suggestion
* @param bufferedRecord {@link BufferedRecord} object with
engine-specific type
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]