ccoffline opened a new issue #6155:
URL: https://github.com/apache/incubator-doris/issues/6155


   **Describe the bug**
   A NullPointerException during editlog replay crashed 3 FEs, and they cannot recover.
   ```
   2021-07-02 04:22:36,862 ERROR (replayer|83) [EditLog.loadJournal():816] Operation Type 29
   java.lang.NullPointerException: null
           at org.apache.doris.consistency.ConsistencyChecker.replayFinishConsistencyCheck(ConsistencyChecker.java:368) ~[palo-fe.jar:3.4.0]
           at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:339) [palo-fe.jar:3.4.0]
           at org.apache.doris.catalog.Catalog.replayJournal(Catalog.java:2560) [palo-fe.jar:3.4.0]
           at org.apache.doris.catalog.Catalog$3.runOneCycle(Catalog.java:2344) [palo-fe.jar:3.4.0]
           at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:3.4.0]
   ```
   
https://github.com/apache/incubator-doris/blob/d6e6c7815b452d0e262b5c5a7a52fce0880c6117/fe/fe-core/src/main/java/org/apache/doris/consistency/ConsistencyChecker.java#L365-L370
   The previous version of this file did not guard against the NPE either, but
it never triggered one.
   
https://github.com/apache/incubator-doris/blob/94a81e52c796150333c54838a889be01934983a4/fe/fe-core/src/main/java/org/apache/doris/consistency/ConsistencyChecker.java#L366-L371
   We infer that this NPE is caused by a change in the write order of the editlog.
We don't have enough logs to prove what is really going on, but one possible
explanation is:
   - `CheckConsistencyJob.tryFinishJob` has already obtained the table and is
trying to lock it.
   - The table is dropped just after `tryFinishJob` obtained it.
   - The operation succeeds on the dropped table and writes an editlog entry.
   - A follower replays this editlog entry, crashes, and never recovers.
   
https://github.com/apache/incubator-doris/blob/d6e6c7815b452d0e262b5c5a7a52fce0880c6117/fe/fe-core/src/main/java/org/apache/doris/consistency/CheckConsistencyJob.java#L244-L270
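
To make the suspected race concrete, here is a minimal sketch of one possible mitigation. It is not the actual Doris code: the class `ConsistencyReplaySketch`, its map-based catalog, and the `long[]` journal entries are all hypothetical stand-ins. It assumes two guards: the master re-checks under the lock that the table still exists before writing the editlog, and the replay side skips an entry whose table or tablet is gone instead of dereferencing null.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are taken from the Doris code base.
final class ConsistencyReplaySketch {
    // Stand-in for the catalog: tableId -> (tabletId -> last check time).
    private final Map<Long, Map<Long, Long>> tables = new HashMap<>();
    // Stand-in for the editlog: each entry is {tableId, tabletId, checkTime}.
    private final List<long[]> journal = new ArrayList<>();

    synchronized void createTable(long tableId, long tabletId) {
        tables.computeIfAbsent(tableId, k -> new HashMap<>()).put(tabletId, 0L);
    }

    synchronized void dropTable(long tableId) {
        tables.remove(tableId);
    }

    // Master side: look the table up again while holding the lock, so that a
    // table dropped in the meantime never produces a journal entry.
    synchronized boolean tryFinishJob(long tableId, long tabletId, long checkTime) {
        Map<Long, Long> tablets = tables.get(tableId);
        if (tablets == null || !tablets.containsKey(tabletId)) {
            return false; // table or tablet is gone: write nothing
        }
        tablets.put(tabletId, checkTime);
        journal.add(new long[] {tableId, tabletId, checkTime});
        return true;
    }

    // Replay side: tolerate entries whose target no longer exists instead of
    // throwing a NullPointerException and stopping the replayer.
    synchronized boolean replayFinishConsistencyCheck(long[] entry) {
        Map<Long, Long> tablets = tables.get(entry[0]);
        if (tablets == null || !tablets.containsKey(entry[1])) {
            System.err.println("skip replay: table " + entry[0]
                    + " or tablet " + entry[1] + " not found");
            return false;
        }
        tablets.put(entry[1], entry[2]);
        return true;
    }

    synchronized int journalSize() {
        return journal.size();
    }

    synchronized Long lastCheckTime(long tableId, long tabletId) {
        Map<Long, Long> tablets = tables.get(tableId);
        return tablets == null ? null : tablets.get(tabletId);
    }

    public static void main(String[] args) {
        ConsistencyReplaySketch fe = new ConsistencyReplaySketch();
        fe.createTable(1L, 10L);
        fe.dropTable(1L);
        // The job finishes after the drop: no journal entry is written.
        System.out.println(fe.tryFinishJob(1L, 10L, 42L));
        // An entry already written by an unguarded master is skipped on replay.
        System.out.println(fe.replayFinishConsistencyCheck(new long[] {1L, 10L, 42L}));
    }
}
```

Either guard alone would avoid the crash in this sketch. The replay-side guard is the one that would let FEs get past an entry that is already in the editlog.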
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


