github-actions[bot] commented on code in PR #61389:
URL: https://github.com/apache/doris/pull/61389#discussion_r2939726347
##########
fe/fe-core/src/main/java/org/apache/doris/job/offset/jdbc/JdbcSourceOffsetProvider.java:
##########
@@ -535,7 +545,21 @@ private boolean checkNeedSplitChunks(Map<String, String>
sourceProperties) {
if (startMode == null) {
return false;
}
- return DataSourceConfigKeys.OFFSET_INITIAL.equalsIgnoreCase(startMode);
+ return DataSourceConfigKeys.OFFSET_INITIAL.equalsIgnoreCase(startMode)
+ ||
DataSourceConfigKeys.OFFSET_SNAPSHOT.equalsIgnoreCase(startMode);
+ }
+
+ private boolean isSnapshotOnlyMode() {
+ String offset = sourceProperties.get(DataSourceConfigKeys.OFFSET);
+ return DataSourceConfigKeys.OFFSET_SNAPSHOT.equalsIgnoreCase(offset);
+ }
+
+ @Override
+ public boolean hasReachedEnd() {
+ return isSnapshotOnlyMode()
+ && currentOffset != null
+ && currentOffset.snapshotSplit()
+ && remainingSplits.isEmpty();
Review Comment:
**Note:** `hasReachedEnd()` calls `currentOffset.snapshotSplit()` which
requires non-null, non-empty splits (enforced by `Preconditions.checkState` in
`JdbcOffset.snapshotSplit()`). After crash recovery for a completed
snapshot-only job, `currentOffset` has null splits (see comment on
`replayIfNeed`), so this method will throw `IllegalStateException`.
Consider either:
1. Adding a null-safe split check here directly: `currentOffset.getSplits()
!= null && !currentOffset.getSplits().isEmpty() &&
currentOffset.snapshotSplit()`
2. Or ensuring `replayIfNeed` always leaves `currentOffset` in a valid state
with actual splits set.
##########
fe/fe-core/src/main/java/org/apache/doris/job/offset/jdbc/JdbcSourceOffsetProvider.java:
##########
@@ -535,7 +545,21 @@ private boolean checkNeedSplitChunks(Map<String, String>
sourceProperties) {
Review Comment:
**Bug (Critical): `currentOffset` created with null/empty splits causes
crash on recovery**
When `isSnapshotOnlyMode()` is true and `lastSnapshotSplits.isEmpty()`, this
code creates a `JdbcOffset` via `new JdbcOffset()` at line 382 but never calls
`setSplits()`. The `splits` field remains `null` (default from
`@NoArgsConstructor`).
Later, `hasReachedEnd()` calls `currentOffset.snapshotSplit()` which does:
```java
Preconditions.checkState(splits != null && !splits.isEmpty(), "splits is
null or empty");
```
This throws `IllegalStateException`. Same crash path via
`hasMoreDataToConsume()` → `currentOffset.snapshotSplit()`.
**Suggested fix:** Either:
1. Set a sentinel split list so `snapshotSplit()` works, or
2. Modify `hasReachedEnd()` to check for null/empty splits directly without
calling `snapshotSplit()`, or
3. In this branch, handle the snapshot-only completed state by transitioning
the job to FINISHED immediately during replay (best approach).
##########
fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/StreamingMultiTblTask.java:
##########
@@ -110,6 +110,11 @@ public void before() throws Exception {
this.status = TaskStatus.RUNNING;
this.startTimeMs = System.currentTimeMillis();
this.runningOffset = offsetProvider.getNextOffset(null,
sourceProperties);
+ if (this.runningOffset == null) {
+ // snapshot-only mode: all splits completed, task exits immediately
+ log.info("streaming multi task {} offset is null (snapshot-only
completed), skip execution", taskId);
+ return;
+ }
log.info("streaming multi task {} get running offset: {}", taskId,
runningOffset.toString());
Review Comment:
**Bug (High): Early return from `before()` when offset is null leaves the
job stuck**
When `getNextOffset()` returns null (snapshot-only completed), this task
returns early from `before()`, then `run()` also returns early, then
`onSuccess()` returns `false` without calling `onStreamTaskSuccess()`. The
`StreamingMultiTblTask.onSuccess()` never calls `onStreamTaskSuccess()` — it
always returns false (line 243).
The job completion signal for multi-table tasks only comes via the external
`commitOffset()` → `successCallback()` → `onStreamTaskSuccess()` callback path,
which is never triggered for a no-op task (no data sent to BE, no offset
committed).
Result: The task finishes silently, the job stays in `RUNNING` with a dead
task reference, eventually times out to `PAUSED`, auto-resumes to `PENDING`,
creates another null-offset task, and loops forever.
**Suggested fix:** In `StreamingMultiTblTask.onSuccess()`, when
`runningOffset == null`, directly call
`streamingInsertJob.onStreamTaskSuccess(this)` (after acquiring the write lock
via the appropriate path). Alternatively, handle this in `before()` by checking
`offsetProvider.hasReachedEnd()` and signaling the job directly.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]