[ https://issues.apache.org/jira/browse/KUDU-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774567#comment-16774567 ]

Alexey Serbin commented on KUDU-2708:
-------------------------------------

The glibc code may make multiple attempts to find a filename that does not yet 
exist. Under high contention, when many threads create temporary files with the 
same pattern at the same time, such name conflicts might happen often: 
https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/sysdeps/posix/tempname.c#L241
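
To make the retry behavior concrete, here is a rough sketch of an "exclusive 
create with retry" loop. It is a simplified illustration, not the glibc code: 
{{CreateUniqueFile}}, its parameters, and the character set are made up for this 
example. It only shows the general idea that a name collision ({{EEXIST}}) costs 
one extra {{open()}} call per retry.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <random>
#include <string>
#include <unistd.h>

// Simplified illustration only; this is NOT the glibc implementation.
// Replaces the trailing "XXXXXX" in *tmpl with random characters and tries to
// create that file exclusively, retrying with a new name on a collision.
// Returns an open fd on success or -1 on failure. *attempts receives the
// number of names that were tried.
int CreateUniqueFile(std::string* tmpl, int max_attempts, int* attempts) {
  assert(tmpl != nullptr && attempts != nullptr);
  static const char kLetters[] =
      "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
  const std::string kSuffix = "XXXXXX";
  *attempts = 0;
  if (tmpl->size() < kSuffix.size() ||
      tmpl->compare(tmpl->size() - kSuffix.size(), kSuffix.size(),
                    kSuffix) != 0) {
    errno = EINVAL;  // The template must end in "XXXXXX".
    return -1;
  }
  std::mt19937_64 rng(std::random_device{}());
  // sizeof(kLetters) counts the trailing NUL, so the last valid index is -2.
  std::uniform_int_distribution<std::size_t> pick(0, sizeof(kLetters) - 2);
  const std::size_t start = tmpl->size() - kSuffix.size();
  while (*attempts < max_attempts) {
    ++*attempts;
    for (std::size_t i = 0; i < kSuffix.size(); ++i) {
      (*tmpl)[start + i] = kLetters[pick(rng)];
    }
    // O_EXCL makes open() fail with EEXIST if the name is already taken.
    int fd = open(tmpl->c_str(), O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) return fd;
    if (errno != EEXIST) return -1;  // A real error, not a name collision.
  }
  errno = EEXIST;  // Every candidate name collided.
  return -1;
}
```

A caller would pass something like "/tmp/example_XXXXXX" and close and unlink 
the file afterwards. The value left in {{attempts}} shows how many names were 
tried before one was free.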

The {{__gen_tempname}} code in the source files of glibc-2.12-1.149.el6.src.rpm 
(patches applied) looks much like the code at the link above.  A small repro 
would help, but if the problem is indeed multiple threads contending over the 
name of the temporary file, adding a thread identifier to the temporary file 
pattern might help.
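
A possible starting point for such a repro is sketched below. Everything in it 
is an assumption made for illustration: the helper names 
({{TmpTemplateForThisThread}}, {{CreateTmpFilesConcurrently}}), the 
"cmeta_sketch" file name, and the use of {{std::hash}} over the thread id are 
not taken from Kudu. It creates temporary files from several threads via 
{{mkstemp()}}, either with one shared pattern or with a per-thread pattern.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <string>
#include <thread>
#include <vector>
#include <unistd.h>

// Hypothetical helper: builds a mkstemp() template that embeds a number
// derived from the calling thread's id, so that different threads normally
// use different name prefixes.
std::string TmpTemplateForThisThread(const std::string& base) {
  const std::size_t tid =
      std::hash<std::thread::id>()(std::this_thread::get_id());
  return base + ".tmp." + std::to_string(tid) + ".XXXXXX";
}

// Hypothetical repro driver: starts n_threads threads, each of which creates
// (and immediately removes) files_per_thread temporary files in `dir`.
// Returns the number of files that were created successfully.
int CreateTmpFilesConcurrently(const std::string& dir, int n_threads,
                               int files_per_thread, bool per_thread_pattern) {
  std::atomic<int> created(0);
  std::vector<std::thread> threads;
  for (int t = 0; t < n_threads; ++t) {
    threads.emplace_back([&]() {
      const std::string base = dir + "/cmeta_sketch";
      const std::string tmpl = per_thread_pattern
                                   ? TmpTemplateForThisThread(base)
                                   : base + ".tmp.XXXXXX";
      for (int i = 0; i < files_per_thread; ++i) {
        // mkstemp() rewrites the template in place, so use a fresh copy.
        std::vector<char> name(tmpl.begin(), tmpl.end());
        name.push_back('\0');
        int fd = mkstemp(name.data());
        if (fd < 0) continue;
        close(fd);
        unlink(name.data());
        ++created;
      }
    });
  }
  for (auto& th : threads) th.join();
  return created.load();
}
```

Timing the two modes against each other (for example with {{std::chrono}} 
around each call) on a loaded filesystem might give a first hint whether the 
pattern matters. Since this sketch removes each file right away, it says 
nothing by itself about how often names collide in a real cmeta directory.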

However, I suspect the main issue is simply that I/O is performed while holding 
that lock: even just creating a file on a filesystem might be expensive under 
heavy load.

> Possible contention creating temporary files while flushing cmeta during an 
> election storm
> ------------------------------------------------------------------------------------------
>
>                 Key: KUDU-2708
>                 URL: https://issues.apache.org/jira/browse/KUDU-2708
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: Will Berkeley
>            Priority: Major
>
> While investigating consensus queue overflows that happen under heavy 
> write load, I noticed that 6 of the 10 service threads at the time of 
> overflow have stacks like
> {noformat}
> 0x3b6720f710 <unknown>
>            0x1fb900a base::internal::SpinLockDelay()
>            0x1fb8ea7 base::SpinLock::SlowLock()
>             0xb82e25 kudu::consensus::RaftConsensus::RequestVote()
>             0x931555 kudu::tserver::ConsensusServiceImpl::RequestConsensusVote()
>            0x1e28a2c kudu::rpc::GeneratedServiceIf::Handle()
>            0x1e2935a kudu::rpc::ServicePool::RunThread()
>            0x1f9bd91 kudu::Thread::SuperviseThread()
>         0x3b672079d1 start_thread
>         0x3b66ee88fd clone
> {noformat}
> They are waiting on some tablet's Raft consensus instance's {{lock_}} in 
> order to vote. Looking into what might be holding that lock, I see stacks like
> {noformat}
> 0x3b6720f710 <unknown>
>         0x3b66edb2ed __GI_open64
>         0x3b66e63caa __gen_tempname
>            0x1f1cf35 kudu::(anonymous namespace)::PosixEnv::MkTmpFile()
>            0x1f1f662 kudu::(anonymous namespace)::PosixEnv::NewTempRWFile()
>            0x1f8305e kudu::pb_util::WritePBContainerToPath()
>             0xb47932 kudu::consensus::ConsensusMetadata::Flush()
>             0xb74164 kudu::consensus::RaftConsensus::SetVotedForCurrentTermUnlocked()
>             0xb783aa kudu::consensus::RaftConsensus::RequestVoteRespondVoteGranted()
>             0xb836a1 kudu::consensus::RaftConsensus::RequestVote()
>             0x931555 kudu::tserver::ConsensusServiceImpl::RequestConsensusVote()
>            0x1e28a2c kudu::rpc::GeneratedServiceIf::Handle()
>            0x1e2935a kudu::rpc::ServicePool::RunThread()
>            0x1f9bd91 kudu::Thread::SuperviseThread()
>         0x3b672079d1 start_thread
>         0x3b66ee88fd clone
> {noformat}
> Doing some junior spelunking into glibc code, one hypothesis is that we are 
> generating lots of collisions of proposed temporary file names in the cmeta 
> folder because many threads are attempting to flush cmeta at once. The glibc 
> code looks like
> Maybe we could put the thread id into the temporary file name when a thread 
> does a cmeta flush.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
