yuanfenghu created FLINK-38555:
----------------------------------
Summary: Optimize performance of `RecordUtils.compareObjects()`
method by avoiding unnecessary `toString()` calls for temporal types
(LocalDateTime, LocalDate, Instant, etc.).
Key: FLINK-38555
URL: https://issues.apache.org/jira/browse/FLINK-38555
Project: Flink
Issue Type: Improvement
Components: Flink CDC
Affects Versions: cdc-3.5.0
Reporter: yuanfenghu
Fix For: cdc-3.6.0
Attachments: image-2025-10-24-10-15-18-027.png,
image-2025-10-24-10-15-37-328.png
## Background
While analyzing flame graphs of a Flink CDC MySQL source job, I identified that
`RecordUtils.splitKeyRangeContains()` was a performance bottleneck. Further
investigation revealed that `compareObjects()` was using `toString()` to
compare temporal objects, which is significantly slower than direct comparison.
### Root Cause
In the current implementation:
{code:java}
private static int compareObjects(Object o1, Object o2) {
    if (o1 instanceof Comparable && o1.getClass().equals(o2.getClass())) {
        return ((Comparable) o1).compareTo(o2);
    } else if (isNumericObject(o1) && isNumericObject(o2)) {
        return toBigDecimal(o1).compareTo(toBigDecimal(o2));
    } else {
        return o1.toString().compareTo(o2.toString());
    }
}
{code}
When comparing `LocalDateTime` objects, the first condition fails if the
objects are cast to `Object`, falling through to the `toString()` comparison
path.
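One possible shape for the optimization named in the summary is sketched below. It is an illustration under stated assumptions, not the actual patch: the class name `TemporalCompareSketch` is hypothetical, and `isNumericObject()` and `toBigDecimal()` are simplified stand-ins for the real helpers in `RecordUtils`, whose bodies are not shown in this issue. The sketch checks the temporal types explicitly before the generic branches, so same-type temporal pairs never reach the `toString()` fallback.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical holder class; the real method lives in RecordUtils.
final class TemporalCompareSketch {

    private TemporalCompareSketch() {}

    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compareObjects(Object o1, Object o2) {
        // Fast paths: compare temporal values directly, with no string allocation.
        if (o1 instanceof LocalDateTime && o2 instanceof LocalDateTime) {
            return ((LocalDateTime) o1).compareTo((LocalDateTime) o2);
        } else if (o1 instanceof LocalDate && o2 instanceof LocalDate) {
            return ((LocalDate) o1).compareTo((LocalDate) o2);
        } else if (o1 instanceof LocalTime && o2 instanceof LocalTime) {
            return ((LocalTime) o1).compareTo((LocalTime) o2);
        } else if (o1 instanceof Instant && o2 instanceof Instant) {
            return ((Instant) o1).compareTo((Instant) o2);
        // Remaining branches mirror the current implementation quoted above.
        } else if (o1 instanceof Comparable && o1.getClass().equals(o2.getClass())) {
            return ((Comparable) o1).compareTo(o2);
        } else if (isNumericObject(o1) && isNumericObject(o2)) {
            return toBigDecimal(o1).compareTo(toBigDecimal(o2));
        } else {
            return o1.toString().compareTo(o2.toString());
        }
    }

    // Simplified stand-in (assumption): treats any java.lang.Number as numeric.
    private static boolean isNumericObject(Object o) {
        return o instanceof Number;
    }

    // Simplified stand-in (assumption): converts via the decimal string form.
    private static BigDecimal toBigDecimal(Object o) {
        return new BigDecimal(o.toString());
    }
}
```

The explicit `instanceof` checks keep the existing behavior for every other type, since the original three branches are left in place after the temporal ones.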
### Impact
This method is called extensively during the snapshot phase when evaluating
whether binlog records fall within completed split ranges. For tables with:
- Temporal types (DATETIME, TIMESTAMP, DATE, TIME) as chunk keys
- High binlog throughput during snapshot phase
- Many splits (large tables with small chunk size)
The performance impact can be significant: in some cases this path consumed 80% of CPU time.
!image-2025-10-24-10-15-18-027.png!