yzeng1618 commented on issue #10452:
URL: https://github.com/apache/seatunnel/issues/10452#issuecomment-3949348572

   This is the same problem as the one reported in 
[10354](https://github.com/apache/seatunnel/issues/10354). I will analyze the 
root cause.
   Preliminary root cause analysis:
   When isStartWithSavePoint=true, the Worker node also tries to fetch the 
checkpoint locally. However, CheckpointService is only initialized on the 
Master, so the lookup on the Worker throws a NullPointerException. The filter 
then wraps this exception into {"status":"fail","message":null}.
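
To illustrate why the response ends up with `message:null`, here is a minimal sketch. It assumes the filter copies `Throwable.getMessage()` straight into the JSON body. `ErrorJsonSketch`, `wrapNaive`, and `wrapReadable` are hypothetical names for illustration, not SeaTunnel's actual filter code. A NullPointerException created without a detail message returns `null` from `getMessage()`, which is also what a JVM-raised NPE does on Java 8 and 11.

```java
public class ErrorJsonSketch {

    // Hypothetical naive wrapper (NOT the real SeaTunnel filter):
    // copies the exception message into the body as-is.
    static String wrapNaive(Throwable t) {
        return "{\"status\":\"fail\",\"message\":" + quote(t.getMessage()) + "}";
    }

    // Hypothetical alternative: fall back to the exception class name
    // when the exception carries no message.
    static String wrapReadable(Throwable t) {
        String msg = t.getMessage() != null ? t.getMessage() : t.getClass().getSimpleName();
        return "{\"status\":\"fail\",\"message\":" + quote(msg) + "}";
    }

    // Minimal JSON string quoting; a null value becomes the JSON literal null.
    static String quote(String s) {
        if (s == null) {
            return "null";
        }
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // An NPE constructed without a message has getMessage() == null.
        Throwable npe = new NullPointerException();
        System.out.println(wrapNaive(npe));
        System.out.println(wrapReadable(npe));
    }
}
```

With the fallback, the client at least sees which exception type occurred, though the real fix is to avoid the null service on the Worker in the first place.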
   
   ```mermaid
   sequenceDiagram
     participant C as Client
     participant W as Worker REST
     participant M as Master
     C->>W: POST /submit-job?isStartWithSavePoint=true
     W->>W: build() -> getCheckpointService()
     Note over W: checkpointService == null (current)
     W-->>C: HTTP 500, message:null
   
     Note over W,M: After the fix
     W->>M: GetJobCheckpointOperation(jobId)
     M-->>W: checkpoint data
     W->>M: SubmitJobOperation
     M-->>C: 200 / 4xx (readable error)
   
   ```
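
The "after the fix" half of the diagram could be sketched roughly as follows. This assumes a node that has no local checkpoint service forwards the lookup to the Master instead of dereferencing a null reference. `CheckpointLookupSketch`, `resolveCheckpoint`, and both function parameters are hypothetical stand-ins, not SeaTunnel APIs. In the diagram, `masterCall` would correspond to sending `GetJobCheckpointOperation(jobId)`.

```java
import java.util.function.LongFunction;

public class CheckpointLookupSketch {

    // Hypothetical lookup: use the local service when it exists (Master),
    // otherwise delegate to the Master via a remote call (Worker).
    static String resolveCheckpoint(
            long jobId,
            LongFunction<String> localService,  // null on a Worker in this sketch
            LongFunction<String> masterCall) {  // stand-in for the remote operation
        if (localService != null) {
            return localService.apply(jobId);
        }
        return masterCall.apply(jobId);
    }

    public static void main(String[] args) {
        LongFunction<String> master = id -> "checkpoint-from-master-" + id;
        // Worker case: no local service, so the call is forwarded.
        System.out.println(resolveCheckpoint(42L, null, master));
        // Master case: the local service answers directly.
        System.out.println(resolveCheckpoint(42L, id -> "checkpoint-local-" + id, master));
    }
}
```

The explicit null check keeps the Worker path from ever touching an uninitialized service, so a failure would come from the remote call and can carry a readable message.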


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
