On Thu, Mar 16, 2017 at 9:25 PM, 许雪寒 <xuxue...@360.cn> wrote:
> Hi, Gregory.
>
> On the other hand, I checked, and the fix 63e44e32974c9bae17bb1bfd4261dcb024ad845c
> should be the one that we need. However, I noticed that this fix has only been
> backported down to v11.0.0. Can we simply apply it to our Hammer
> version (0.94.5)?

There hasn't been much SimpleMessenger churn, so probably. However,
you'll want to follow all the threads around
http://tracker.ceph.com/issues/14120 to make sure you get all the
appropriate patches.
-Greg

>
> -----Original Message-----
> From: 许雪寒
> Sent: March 17, 2017 10:09
> To: 'Gregory Farnum'
> Cc: ceph-users@lists.ceph.com; jiajia zhong
> Subject: Re: Re: [ceph-users] Re: Re: Pipe "deadlock" in Hammer, 0.94.5
>
> I got it. Thanks very much:-)
>
> From: Gregory Farnum [mailto:gfar...@redhat.com]
> Sent: March 17, 2017 2:10
> To: 许雪寒
> Cc: ceph-users@lists.ceph.com; jiajia zhong
> Subject: Re: Re: [ceph-users] Re: Re: Pipe "deadlock" in Hammer, 0.94.5
>
>
> On Thu, Mar 16, 2017 at 3:36 AM 许雪寒 <xuxue...@360.cn> wrote:
> Hi Gregory, is it possible to unlock Connection::lock in Pipe::read_message
> before tcp_read_nonblocking is called? I checked the code again, and it seems
> that the code in tcp_read_nonblocking doesn't need to be protected by
> Connection::lock.
>
> Unfortunately it does. You'll note the memory buffers it's grabbing via the
> Connection? Those need to be protected from changing (either being canceled
> or being set up) while the read is being processed.
> Now, you could probably do something more complicated around the buffer
> update mechanism, or, if you know your applications don't make use of it, you
> could just rip it out entirely. But while that mechanism exists it needs to
> be synchronized.
> -Greg
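>
> A minimal sketch of the race being described, with hypothetical names
> standing in for the Ceph types (conn_lock plays the role of
> Connection::lock, rx_buffers the role of Connection::rx_buffers):
>
>     #include <algorithm>
>     #include <cstddef>
>     #include <cstdint>
>     #include <map>
>     #include <mutex>
>     #include <vector>
>
>     std::mutex conn_lock;                               // stand-in for Connection::lock
>     std::map<uint64_t, std::vector<char>> rx_buffers;   // tid -> caller-supplied buffer
>
>     // Reader thread: copy received bytes into the caller's buffer. The
>     // lock is held across the copy so the buffer cannot be revoked
>     // out from under us mid-read.
>     void read_into_rx_buffer(uint64_t tid, const char* data, std::size_t len) {
>         std::lock_guard<std::mutex> l(conn_lock);
>         auto p = rx_buffers.find(tid);
>         if (p != rx_buffers.end() && p->second.size() >= len)
>             std::copy(data, data + len, p->second.begin());
>     }
>
>     // Client thread: cancel the op and reclaim its buffer. Without the
>     // shared lock, the reader above could still be writing into memory
>     // that is being freed here.
>     void cancel_rx_buffer(uint64_t tid) {
>         std::lock_guard<std::mutex> l(conn_lock);
>         rx_buffers.erase(tid);
>     }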
>
> -----Original Message-----
> From: Gregory Farnum [mailto:gfar...@redhat.com]
> Sent: January 17, 2017 7:14
> To: 许雪寒
> Cc: jiajia zhong; ceph-users@lists.ceph.com
> Subject: Re: Re: [ceph-users] Re: Re: Pipe "deadlock" in Hammer, 0.94.5
>
> On Sat, Jan 14, 2017 at 7:54 PM, 许雪寒 <xuxue...@360.cn> wrote:
>> Thanks for your help:-)
>>
>> I checked the source code again, and in read_message, it does hold the 
>> Connection::lock:
>
> You're correct of course; I wasn't looking and forgot about this bit.
> This was added to deal with client-allocated buffers and/or op cancellation 
> in librados, IIRC, and unfortunately definitely does need to be synchronized 
> — I'm not sure about with pipe lookups, but probably even that. :/
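>
> A matching sketch of the setup side mentioned here, again with
> hypothetical names rather than the actual librados API: the client
> attaches its own buffer per tid, and op cancellation takes it back:
>
>     #include <cstdint>
>     #include <map>
>     #include <mutex>
>     #include <utility>
>     #include <vector>
>
>     std::mutex conn_lock;                               // stand-in for Connection::lock
>     std::map<uint64_t, std::vector<char>> rx_buffers;   // tid -> client-allocated buffer
>
>     // The client posts its own buffer for a pending tid so the reply can
>     // be received directly into it; this mutation is one of the
>     // "being set up" cases that must hold the lock.
>     void post_rx_buffer(uint64_t tid, std::vector<char> buf) {
>         std::lock_guard<std::mutex> l(conn_lock);
>         rx_buffers[tid] = std::move(buf);
>     }
>
>     // Op cancellation takes the buffer back; the same lock orders this
>     // against any in-flight read that is filling the buffer.
>     void cancel_op(uint64_t tid) {
>         std::lock_guard<std::mutex> l(conn_lock);
>         rx_buffers.erase(tid);
>     }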
>
> Unfortunately it looks like you're running a version that didn't come from 
> upstream (I see hash 81d4ad40d0c2a4b73529ff0db3c8f22acd15c398 in another 
> email, which I can't find), so there's not much we can do to help with the 
> specifics of this case — it's fiddly and my guess would be the same as 
> Sage's, which you say is not the case.
> -Greg
>
>>
>>     while (left > 0) {
>>       // wait for data
>>       if (tcp_read_wait() < 0)
>>         goto out_dethrottle;
>>
>>       // get a buffer
>>       connection_state->lock.Lock();
>>       map<ceph_tid_t, pair<bufferlist, int> >::iterator p =
>>         connection_state->rx_buffers.find(header.tid);
>>       if (p != connection_state->rx_buffers.end()) {
>>         if (rxbuf.length() == 0 || p->second.second != rxbuf_version) {
>>           ldout(msgr->cct,10) << "reader seleting rx buffer v "
>>                               << p->second.second << " at offset "
>>                               << offset << " len "
>>                               << p->second.first.length() << dendl;
>>           rxbuf = p->second.first;
>>           rxbuf_version = p->second.second;
>>           // make sure it's big enough
>>           if (rxbuf.length() < data_len)
>>             rxbuf.push_back(buffer::create(data_len - rxbuf.length()));
>>           blp = p->second.first.begin();
>>           blp.advance(offset);
>>         }
>>       } else {
>>         if (!newbuf.length()) {
>>           ldout(msgr->cct,20) << "reader allocating new rx buffer at offset "
>>                               << offset << dendl;
>>           alloc_aligned_buffer(newbuf, data_len, data_off);
>>           blp = newbuf.begin();
>>           blp.advance(offset);
>>         }
>>       }
>>       bufferptr bp = blp.get_current_ptr();
>>       int read = MIN(bp.length(), left);
>>       ldout(msgr->cct,20) << "reader reading nonblocking into "
>>                           << (void*) bp.c_str() << " len " << bp.length()
>>                           << dendl;
>>       int got = tcp_read_nonblocking(bp.c_str(), read);
>>       ldout(msgr->cct,30) << "reader read " << got << " of " << read << dendl;
>>       connection_state->lock.Unlock();
>>       if (got < 0)
>>         goto out_dethrottle;
>>       if (got > 0) {
>>         blp.advance(got);
>>         data.append(bp, 0, got);
>>         offset += got;
>>         left -= got;
>>       } // else we got a signal or something; just loop.
>>     }
>>
>> As shown in the code above, the reading loop first locks
>> connection_state->lock and then calls tcp_read_nonblocking. connection_state
>> is of type PipeConnectionRef, so connection_state->lock is Connection::lock.
>>
>> On the other hand, I'll check whether there are a lot of messages to
>> send, as you suggested. Thanks :-)
>>
>> From: Gregory Farnum [gfar...@redhat.com]
>> Sent: January 14, 2017 9:39
>> To: 许雪寒
>> Cc: jiajia zhong; ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Re: Re: Pipe "deadlock" in Hammer, 0.94.5
>>
>> On Thu, Jan 12, 2017 at 7:58 PM, 许雪寒 <xuxue...@360.cn> wrote:
>>
>> Thank you for your continuous help :-)
>>
>> We are using Hammer 0.94.5, and that is the version of the source code
>> I read.
>>
>> However, on the other hand, if Pipe::do_recv does in fact block, is it
>> reasonable for Pipe::reader_thread to block threads calling
>> SimpleMessenger::submit_message by holding Connection::lock?
>>
>> I think maybe a different mutex should be used in Pipe::read_message rather 
>> than Connection::lock.
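>>
>> As a rough sketch of what such a split might look like (hypothetical
>> names, not the actual SimpleMessenger code):
>>
>>     #include <mutex>
>>
>>     std::mutex conn_lock;        // Connection::lock as used today
>>     std::mutex rx_buffers_lock;  // the narrower mutex being proposed
>>
>>     // Proposed reader step: guard only the rx-buffer state, so that
>>     // senders taking conn_lock no longer wait behind a slow or stalled
>>     // read. (Greg's reply below explains why the buffer-update
>>     // mechanism still ties this to Connection::lock in the real code.)
>>     void reader_read_step() {
>>         std::lock_guard<std::mutex> l(rx_buffers_lock);
>>         // ... select rx buffer, then tcp_read_nonblocking() ...
>>     }
>>
>>     // Sender path: with the split, this contends only with other senders.
>>     void submit_message() {
>>         std::lock_guard<std::mutex> l(conn_lock);
>>         // ... queue the message for the writer thread ...
>>     }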
>>
>> I don't think it does use that lock. Pipe::read_message() is generally
>> called while the pipe_lock is held, but not Connection::lock. (They
>> are separate.)
>>
>> I haven't dug into the relevant OSD code in a while, but I think it's
>> a lot more likely your OSD is just overloaded and is taking a while to send 
>> a lot of different messages, and that the loop it's in doesn't update the 
>> HeartbeatMap or something. Did you check that?
>>
>> -Greg
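>>
>> A generic sketch of the HeartbeatMap idea (hypothetical names, not the
>> Ceph API): a long-running loop has to refresh its deadline periodically,
>> or the watchdog reports the thread as stuck:
>>
>>     #include <atomic>
>>     #include <chrono>
>>
>>     using Clock = std::chrono::steady_clock;
>>
>>     std::atomic<Clock::time_point> deadline;  // polled by a watchdog thread
>>
>>     // Refresh this thread's deadline; the watchdog flags the thread as
>>     // unhealthy once Clock::now() passes the stored deadline.
>>     void reset_timeout(std::chrono::seconds grace) {
>>         deadline.store(Clock::now() + grace);
>>     }
>>
>>     // A long send loop must refresh inside the loop; refreshing only
>>     // before or after it makes a busy-but-healthy thread look hung.
>>     void long_send_loop(int backlog) {
>>         for (int i = 0; i < backlog; ++i) {
>>             // ... send one queued message ...
>>             reset_timeout(std::chrono::seconds(15));
>>         }
>>     }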
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
