[ 
https://issues.apache.org/jira/browse/KUDU-3635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-3635:
--------------------------------
    Affects Version/s: 1.17.1
                       1.17.0

> kudu CLI tool sometimes crashes on exit with SIGSEGV in OPENSSL_cleanup
> -----------------------------------------------------------------------
>
>                 Key: KUDU-3635
>                 URL: https://issues.apache.org/jira/browse/KUDU-3635
>             Project: Kudu
>          Issue Type: Bug
>          Components: CLI
>    Affects Versions: 1.17.0, 1.18.0, 1.17.1
>            Reporter: Alexey Serbin
>            Priority: Major
>
> The kudu CLI tools sometimes crash on exit with SIGSEGV.
> I haven't had a chance to look at this closely, but the problem seems to be 
> related to the order in which different libraries are cleaned up and to the 
> unexpected overall state of the runtime when the implicitly installed cleanup 
> handler for the OpenSSL library is called.
> Below is a snippet from the output of the 
> {{RebalanceIgnoredTserversTest.Basic}} test scenario.  It was generated by 
> Kudu binaries built in RELEASE configuration on an Ubuntu 18.04.6 LTS machine 
> and run via dist-test, also on Ubuntu 18.04.6 LTS.
> BTW, we have been suppressing TSAN warnings in the OpenSSL cleanup paths for 
> a long time due to a well-known issue in the OpenSSL library (see [this TSAN 
> suppression|https://github.com/apache/kudu/blob/2b9a2012f6d7b59931119dfad03e8d40e3031a0e/src/kudu/util/sanitizer_options.cc#L177-L184]),
>  so there might be other issues in that area that we haven't paid attention 
> to for a long time.
> It's probably time to follow [best practices for at-exit cleanup of 
> applications using 
> OpenSSL|https://developers.redhat.com/articles/2022/10/31/best-practices-application-shutdown-openssl#].
>   In essence, the approach, which works with at least v1.1.1 and newer 
> versions of the OpenSSL library, is to pass the {{OPENSSL_INIT_NO_ATEXIT}} 
> option to {{OPENSSL_init_ssl()}} at initialization and then explicitly call 
> {{OPENSSL_cleanup()}} upon exit/shutdown.
> {noformat}
> *** SIGSEGV (@0x10000562bd5) received by PID 1447 (TID 0x7fb1cda47480) from PID 5647317; stack trace: ***
>     @     0x7fb1d6307980 (unknown) at ??:0
>     @     0x7fb1d5a37873 tcmalloc::ThreadCache::ReleaseToCentralCache() at ??:0
>     @     0x7fb1d5a37be7 tcmalloc::ThreadCache::Scavenge() at ??:0
>     @     0x7fb1d3bce271 OPENSSL_LH_free at ??:0
>     @     0x7fb1d3bacbfd (unknown) at ??:0
>     @     0x7fb1d3bcbe10 OPENSSL_cleanup at ??:0
>     @     0x7fb1d434e161 (unknown) at ??:0
>     @     0x7fb1d434e25a exit at ??:0
>     @     0x7fb1d432cbfe __libc_start_main at ??:0
>     @     0x562bc9f8300a _start at ??:0
> {noformat}
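
The shutdown approach suggested in the description (initialize OpenSSL with
OPENSSL_INIT_NO_ATEXIT, then call OPENSSL_cleanup() explicitly) could be
sketched roughly as below. This is only an illustration, not Kudu code: the
ScopedLibraryInit class and the RunTool entry point are hypothetical names
introduced here. The OpenSSL calls appear only in a comment so that the block
builds without OpenSSL; the sketch assumes a library version that supports
OPENSSL_INIT_NO_ATEXIT.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical RAII helper (not part of Kudu or OpenSSL): runs an init
// callable on construction and the matching cleanup callable on destruction,
// so cleanup happens at a point the application chooses rather than inside
// an atexit handler.
class ScopedLibraryInit {
 public:
  ScopedLibraryInit(const std::function<bool()>& init,
                    std::function<void()> cleanup)
      : cleanup_(std::move(cleanup)) {
    assert(init && "init callable must be set");
    if (!init()) {
      // The destructor does not run for a partially constructed object,
      // so cleanup is skipped when initialization fails.
      throw std::runtime_error("library initialization failed");
    }
  }

  ~ScopedLibraryInit() {
    if (cleanup_) cleanup_();
  }

  ScopedLibraryInit(const ScopedLibraryInit&) = delete;
  ScopedLibraryInit& operator=(const ScopedLibraryInit&) = delete;

 private:
  std::function<void()> cleanup_;
};

// Intended use with OpenSSL (needs <openssl/ssl.h> and <openssl/crypto.h>):
//
//   int main(int argc, char** argv) {
//     ScopedLibraryInit openssl(
//         [] { return OPENSSL_init_ssl(OPENSSL_INIT_NO_ATEXIT, nullptr) == 1; },
//         [] { OPENSSL_cleanup(); });
//     return RunTool(argc, argv);  // RunTool: hypothetical tool entry point
//   }
```

With this shape, the cleanup runs when main() returns normally, before the
exit-time handlers and static destructors. If the process leaves through a
direct call to exit(), the local guard is not destroyed and OpenSSL cleanup is
simply skipped, so code paths that call exit() would need separate handling.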



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
