>> I agree to configure both interfaces as a bond.
>> From my experience, I see the following advantages for a separate public
>> and cluster network on the bond:
>> the isolation of public network and cluster network traffic makes it
>> easier to monitor client traffic and inter-OSD traffic separately.
>> And if it becomes necessary later, you can also prioritise or limit
>> client traffic via the separate interface.
>> It's also helpful for debugging and analysing issues in the Ceph cluster.

You aren’t wrong.  This, however, can complicate network setup, including 
sideband BMC interfaces and potential MTU mismatches.  With modern releases 
and networking, FWIW, I haven’t seen the DoS issues we used to see between 
client and replication traffic.  YMMV.
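For reference, the split being discussed is just two options in ceph.conf
(or set via "ceph config set global ...").  A minimal sketch, with
placeholder subnets:

    [global]
    # Front-side traffic: clients, MONs, MGRs, RGW
    public_network  = 192.0.2.0/24
    # Back-side traffic: OSD replication, recovery, heartbeats
    cluster_network = 198.51.100.0/24

If cluster_network is unset, everything rides the public network, which is
the single-network setup argued for below.  If you do split them over one
bond (e.g. via VLANs), make sure the MTU matches end to end on both.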

> I have a different experience. When one of the nodes had a cluster network 
> down but a working public network, you get hard-to-troubleshoot issues, 
> especially as a newbie in Ceph (this was a test cluster). You will see slow 
> operations, OSDs on the storage nodes that flag their peer down while the 
> flagged OSD daemon itself responds that it's still running, etc.

Exactly.  “I’m not dead yet!”  Flap flap flap, with performance impact.  This 
is called out in the docs (the troubleshooting section on flapping OSDs).
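If you do land in that state, a rough way to see and stop the flapping
while you fix the link (standard commands, nothing specific to this thread):

    ceph health detail      # shows which OSDs are currently flagged down
    ceph osd set nodown     # stopgap: stop marking OSDs down while debugging
    # ... repair the cluster network ...
    ceph osd unset nodown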

> Ideally you don't want to have "gray" failures like this; you either want 
> your Ceph node to be "UP" or "DOWN", but not something in between. A single 
> public interface will give you that.

Indeed.  And when there are only two network interfaces available, bonding 
them (active/active with an appropriate xmit_hash_policy) will limit 
disruption from a layer 1 issue, provided the two links go to different 
switches.
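A minimal ifupdown sketch of such a bond (NIC names and addresses are
placeholders; LACP across two switches additionally requires MLAG or a
stacked pair):

    auto bond0
    iface bond0 inet static
        address 192.0.2.10/24
        bond-slaves enp1s0f0 enp1s0f1
        bond-mode 802.3ad                # LACP, active/active
        bond-xmit-hash-policy layer3+4   # spread flows across both links
        bond-miimon 100                  # link monitoring interval (ms)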
