vagetablechicken opened a new issue #2893: Condition destroy error shouldn't be 
a fatal error
URL: https://github.com/apache/incubator-doris/issues/2893
 
 
   When we stop a BE, it always hits a fatal error:
   
   `F0213 12:33:53.604131 117681 utils.cpp:1124] fail to destroy cond. err=Device or resource busy`
   
   
https://github.com/apache/incubator-doris/blob/fd492e3b6fd729e617536842ba4092911f8afae8/be/src/olap/utils.cpp#L133-L139
   
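   For pthread_cond_destroy, EBUSY means an attempt was made to destroy the 
condition variable while another thread is still referencing it (for example, 
while that thread is blocked in pthread_cond_wait).
   
   This is a common situation when shutting down a multi-threaded process, so 
a single failed attempt shouldn't be fatal.
   How about making it fatal only after several failed attempts? For example: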
   
   ```
   #define PTHREAD_COND_DESTROY_WITH_LOG(condptr) \
       do {\
           int cond_ret = 0;\
           int try_time = 0;\
           /* Retry once per second, up to 20 times, before giving up. */ \
           while (0 != (cond_ret = pthread_cond_destroy(condptr))) {\
               if (try_time++ < 20) { \
                   sleep(1); \
               } else { \
                   LOG(FATAL) << "fail to destroy cond. err=" << strerror(cond_ret); \
               } \
           }\
       } while (0)
   ```
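
The same retry idea can also be sketched as a plain function instead of a macro, which makes the retry policy testable. This is only a sketch under assumptions: `destroy_with_retry` and `destroy_cond_with_retry` are hypothetical names, not existing Doris functions. It retries only on EBUSY and returns the last error code, leaving the choice of log severity to the caller.

```cpp
#include <pthread.h>

#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of Doris): calls destroy_fn until it returns
// something other than EBUSY, or until max_attempts calls have been made.
// Returns 0 on success, otherwise the last error code.
inline int destroy_with_retry(const std::function<int()>& destroy_fn,
                              int max_attempts,
                              std::chrono::milliseconds interval) {
    int ret = EINVAL;  // returned unchanged if max_attempts <= 0
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        ret = destroy_fn();
        if (ret != EBUSY) {
            return ret;  // success (0) or an error that retrying cannot fix
        }
        // Do not sleep after the final failed attempt.
        if (attempt + 1 < max_attempts) {
            std::this_thread::sleep_for(interval);
        }
    }
    return ret;
}

// Hypothetical convenience wrapper for a real condition variable.
inline int destroy_cond_with_retry(pthread_cond_t* cond, int max_attempts,
                                   std::chrono::milliseconds interval) {
    return destroy_with_retry([cond] { return pthread_cond_destroy(cond); },
                              max_attempts, interval);
}
```

A caller could then log a warning rather than a fatal error when the return value is non-zero, since the process is shutting down anyway.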
   
   My test result:
   With this change, stopping the BE waits 10~15s when the BE is idle.
   The cost is a slightly longer shutdown time. Is there a better approach?

