[ 
https://issues.apache.org/jira/browse/IMPALA-9314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17951922#comment-17951922
 ] 

ASF subversion and git services commented on IMPALA-9314:
---------------------------------------------------------

Commit 78d1c2cd3a4f2c81a9142bc80b8bba9d2e9ff292 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=78d1c2cd3 ]

IMPALA-14049: Fix TSAN issue with HdrHistogram in expr-test

IMPALA-13978 switched HdrHistogram from using a gscoped_ptr
to unique_ptr. This has been causing TSAN issues during the
teardown for expr-test. gscoped_ptr doesn't null out the
pointer when it gets destructed, but unique_ptr does. That
write is a data race with the threads that are still running
and trying to access the metrics.

The full solution would be to have an orderly shutdown of
all the threads before destructing things. That is a large
project that would touch many different components. As a
short-term fix, this avoids the TSAN issue by leaking
the statestore metrics.
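
As an illustration of the leak-instead-of-destroy workaround described
above, here is a minimal C++ sketch. It is not the actual Impala change:
FakeMetric, FakeService and RunLeakDemo are hypothetical names invented
for this example. It assumes the owner holds the metric through a raw
pointer that is never deleted, so destroying the owner neither frees the
metric nor writes to anything a still-running thread reads.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a metric that background threads keep updating.
struct FakeMetric {
  std::atomic<long> value{0};
};

// Hypothetical owner. The metric is allocated once and intentionally never
// deleted, so ~FakeService() does not touch it.
class FakeService {
 public:
  FakeService() : metric_(new FakeMetric()) {}
  ~FakeService() {}  // Deliberately no delete: the metric is leaked.
  FakeMetric* metric() const { return metric_; }

 private:
  FakeMetric* const metric_;
};

// Destroys the owner while a worker thread is still updating the metric,
// then checks that the worker keeps making progress afterwards.
bool RunLeakDemo() {
  std::atomic<bool> stop{false};
  FakeMetric* metric = nullptr;
  std::thread worker;
  {
    FakeService service;
    metric = service.metric();
    // The worker captures the metric pointer by value, so it never reads
    // memory that belongs to the FakeService object itself.
    worker = std::thread([metric, &stop] {
      while (!stop.load()) metric->value.fetch_add(1);
    });
  }  // service is destroyed here; the leaked metric stays valid.

  const long before = metric->value.load();
  while (metric->value.load() == before) std::this_thread::yield();
  const bool still_updating = metric->value.load() > before;

  // The demo stops and joins the worker only to exit cleanly. The point of
  // the workaround is that it does not depend on this ordering.
  stop.store(true);
  worker.join();
  return still_updating;
}
```

The trade-off is that leak checkers may report the allocation, which is
why this is only a short-term measure compared with an orderly shutdown.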

We should consider fixing IMPALA-9314 and implementing
orderly shutdown.

Testing:
 - Ran expr-test in a loop with TSAN and didn't see this
   particular issue. There are other shutdown issues with
   much lower frequency that have different symptoms.

Change-Id: I73c3f4db16c6ffa272f2512e9871db5743be7a54
Reviewed-on: http://gerrit.cloudera.org:8080/22900
Reviewed-by: Riza Suminto <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Thrift server shutdown is racey
> -------------------------------
>
>                 Key: IMPALA-9314
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9314
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Sahil Takiar
>            Priority: Major
>
> There are multiple issues with shutting down the {{ThriftServer}}:
>  * The docs for {{ThriftServer}} say "TODO: shutdown is buggy (which only 
> harms tests)"
>  * TSAN reports a thread leak when using the {{ThriftServer}} in 
> {{statestore-test}}
>  ** Attempts to shut down the {{ThriftServer}} in the {{Statestore}} don't 
> seem to work properly
>  * The {{TAcceptQueueServer}} actually uses a {{volatile}} {{boolean}} called 
> {{stop_}} to coordinate the shutdown (see {{TAcceptQueueServer::stop()}}), 
> which is not thread-safe (TSAN complains about this as well)
>
> According to the docs, this only affects tests, specifically 
> {{statestore-test}}. We should consider whether it is worth fixing the 
> shutdown logic in the {{ThriftServer}}.
> {code:java}
>  WARNING: ThreadSanitizer: thread leak (pid=67752)
>   Thread T39 (tid=67794, finished) created by main thread at:
>     #0 pthread_create 
> /mnt/source/llvm/llvm-5.0.1.src-p1/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:889
>  (statestore-test+0x18903bb)
>     #1 boost::thread::start_thread_noexcept() <null> 
> (statestore-test+0x2f30b59)
>     #2 boost::thread::thread<void (*)(std::string const&, std::string const&, 
> boost::function<void ()>, impala::ThreadDebugInfo const*, 
> impala::Promise<long, (impala::PromiseMode)0>*), std::string, std::string, 
> boost::function<void ()>, impala::ThreadDebugInfo*, 
> impala::Promise<long, (impala::PromiseMode)0>*>(void (*)(std::string const&, 
> std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, 
> impala::Promise<long, (impala::PromiseMode)0>*), std::string, std::string, 
> boost::function<void ()>, impala::ThreadDebugInfo*, 
> impala::Promise<long, (impala::PromiseMode)0>*) 
> /home/systest/Impala/toolchain/boost-1.57.0-p3/include/boost/thread/detail/thread.hpp:419:13
>  (statestore-test+0x1ebf8eb)
>     #3 impala::Thread::StartThread(std::string const&, std::string const&, 
> boost::function<void ()> const&, std::unique_ptr<impala::Thread, 
> std::default_delete<impala::Thread> >*, bool) 
> /home/systest/Impala/be/src/util/thread.cc:317:13 (statestore-test+0x1ebcbd5)
>     #4 impala::Status impala::Thread::Create<void 
> (impala::ThriftServer::ThriftServerEventProcessor::*)(), 
> impala::ThriftServer::ThriftServerEventProcessor*>(std::string const&, 
> std::string const&, void (impala::ThriftServer::ThriftServerEventProcessor::* 
> const&)(), impala::ThriftServer::ThriftServerEventProcessor* const&, 
> std::unique_ptr<impala::Thread, std::default_delete<impala::Thread> >*, bool) 
> /home/systest/Impala/be/src/util/thread.h:81:12 (statestore-test+0x2707497)
>     #5 
> impala::ThriftServer::ThriftServerEventProcessor::StartAndWaitForServer() 
> /home/systest/Impala/be/src/rpc/thrift-server.cc:116:43 
> (statestore-test+0x27046f2)
>     #6 impala::ThriftServer::Start() 
> /home/systest/Impala/be/src/rpc/thrift-server.cc:447:60 
> (statestore-test+0x27063cc)
>     #7 impala::Statestore::Init(int) 
> /home/systest/Impala/be/src/statestore/statestore.cc:478:59 
> (statestore-test+0x1d7a431)
>     #8 impala::StatestoreTest_SmokeTest_Test::TestBody() 
> /home/systest/Impala/be/src/statestore/statestore-test.cc:55:132 
> (statestore-test+0x18fb381)
>     #9 void 
> testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, 
> void>(testing::Test*, void (testing::Test::*)(), char const*) <null> 
> (statestore-test+0x41051e2)
>     #10 __libc_start_main 
> /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291 
> (libc.so.6+0x2082f){code}
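
For the {{stop_}} flag and the thread leak described in the issue, a
minimal C++ sketch of one possible direction follows. It is not the
TAcceptQueueServer code: FakeServer and RunStopDemo are hypothetical
names for this example. It assumes the flag can be a std::atomic<bool>
rather than a volatile bool, and that the serving thread is joined
before its owner goes away.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical server with a serve loop that polls a stop flag.
class FakeServer {
 public:
  // Runs until Stop() is called from another thread.
  void Serve() {
    while (!stop_.load(std::memory_order_acquire)) {
      ++iterations_;
      std::this_thread::yield();
    }
  }

  // Safe to call from any thread: the atomic store is not a data race.
  void Stop() { stop_.store(true, std::memory_order_release); }

  bool stopped() const { return stop_.load(std::memory_order_acquire); }
  long iterations() const { return iterations_; }

 private:
  std::atomic<bool> stop_{false};
  long iterations_ = 0;  // Written only by the serving thread.
};

// Starts the serve loop, requests a stop, and joins the thread so that no
// thread outlives the server object.
bool RunStopDemo() {
  FakeServer server;
  std::thread serving([&server] { server.Serve(); });
  server.Stop();
  serving.join();  // After join, reading iterations() is race-free.
  return server.stopped() && server.iterations() >= 0;
}
```

A volatile bool gives no inter-thread ordering or atomicity guarantees in
C++, which is consistent with TSAN flagging it. Joining the thread is what
addresses the "thread leak" class of report.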


