Thanks, Dan.  I thought so but wanted to verify.  I'll see if I can work up a 
doc PR to clarify this.
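
For anyone else poking at this, here's a quick way to pull the heartbeat 
peer lists Dan mentions below.  It's only a sketch: the JSON field names 
("osd_stats", "hb_peers") are from memory, so verify them against 
"ceph pg dump -f json-pretty" on your release.  (The 6-second and 
20-second values in the quoted docs are the osd_heartbeat_interval and 
osd_heartbeat_grace options, for anyone hunting for the knobs.)

#!/usr/bin/env python
# Sketch: list each OSD's heartbeat peers from "ceph pg dump -f json".
# Field names are assumed from the PGMap dump format; some releases nest
# the dump under a "pg_map" key, so handle both layouts.
import json
import subprocess

dump = json.loads(subprocess.check_output(
    ["ceph", "pg", "dump", "-f", "json"]))
osd_stats = dump.get("pg_map", dump).get("osd_stats", [])

for stat in osd_stats:
    peers = stat.get("hb_peers", [])
    print("osd.%d heartbeats with %d peers: %s"
          % (stat["osd"], len(peers), peers))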

>> The documentation here:
>> 
>> http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/
>> 
>> says
>> 
>> "Each Ceph OSD Daemon checks the heartbeat of other Ceph OSD Daemons every 6 
>> seconds"
>> 
>> and
>> 
>> " If a neighboring Ceph OSD Daemon doesn’t show a heartbeat within a 20 
>> second grace period, the Ceph OSD Daemon may consider the neighboring Ceph 
>> OSD Daemon down and report it back to a Ceph Monitor,"
>> 
>> I've always thought that each OSD heartbeats with *every* other OSD, which 
>> of course would mean that total heartbeat traffic grows roughly 
>> quadratically.  However, in extended testing we've observed that the number 
>> of other OSDs that a given OSD heartbeats with was < N, which has us 
>> wondering if perhaps only the OSDs with which a given OSD shares PGs are 
>> contacted -- or some other subset.
>> 
> 
> OSDs heartbeat with their peers: the set of OSDs with which they share
> at least one PG.
> You can see each OSD's heartbeat peers (HB_PEERS) in "ceph pg dump" --
> look under the header "OSD_STAT USED  AVAIL TOTAL HB_PEERS..."
> 
> This is one of the nice features of the placement group concept --
> per-OSD heartbeat and peering traffic scales with the number of PGs
> per OSD, which stays roughly constant, rather than with the total
> number of OSDs in the cluster.
> 
> Cheers, Dan
> 
> 
>> I plan to submit a doc fix for mon_osd_min_down_reporters and wanted to 
>> resolve this FUD first.
>> 
>> -- aad
>> 
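
To put a rough number on Dan's scaling point above: with R-way 
replication and about P PGs per OSD, an OSD's distinct heartbeat peers 
are bounded by roughly P * (R - 1) no matter how large the cluster 
gets.  Here's a toy simulation (random placement stands in for CRUSH, 
so treat the numbers as indicative only):

#!/usr/bin/env python
# Toy model: average distinct heartbeat peers per OSD as the cluster
# grows, with PGs-per-OSD held constant at PGS_PER_OSD.
import random

PGS_PER_OSD = 100   # target PGs per OSD
SIZE = 3            # replication factor

for n_osds in (12, 48, 192, 768):
    n_pgs = n_osds * PGS_PER_OSD // SIZE
    peers = dict((osd, set()) for osd in range(n_osds))
    for _ in range(n_pgs):
        acting = random.sample(range(n_osds), SIZE)
        for osd in acting:
            peers[osd].update(o for o in acting if o != osd)
    avg = sum(len(p) for p in peers.values()) / float(n_osds)
    print("%4d OSDs: avg %5.1f heartbeat peers (bound %d)"
          % (n_osds, avg, PGS_PER_OSD * (SIZE - 1)))

The peer count climbs toward the P * (R - 1) bound and then flattens 
out instead of tracking cluster size, which matches what we saw in 
testing.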

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
