> On Mar 26, 2021, at 6:31 AM, Stefan Kooman <ste...@bit.nl> wrote:
> 
> On 3/9/21 4:03 PM, Jesper Lykkegaard Karlsen wrote:
>> Dear Ceph’ers
>> I am about to upgrade the MDS nodes for CephFS in the Ceph cluster (erasure 
>> coded 8+3) that I am administering.
>> Since they will get plenty of memory and CPU cores, I was wondering if it 
>> would be a good idea to move the metadata OSDs (NVMes, currently on the OSD 
>> nodes together with the cephfs_data OSDs (HDD)) to the MDS nodes?
>> Configured as:
>> 4 x MDS, each with a metadata OSD, and the metadata pool set to 4x 
>> replication, so each metadata OSD would hold a complete copy of the metadata.
>> I know the MDS stores a lot of metadata in RAM, but if the metadata OSDs were 
>> on the MDS nodes, would that not bring down latency?
>> Anyway, I am just asking for your opinion on this: pros and cons, or even 
>> better, input from somebody who has actually tried it?
> 
> I doubt you'll gain a lot from this. Data still has to be replicated, so you 
> still pay network latency on writes. And reads would come from the primary 
> OSDs of the CephFS metadata pool, so you would only see gains if you could 
> make all the primary OSDs sit on the single active MDS node. But you would 
> have to do manual tuning with upmap to achieve that.

FWIW, I think primary affinity would be the way to do this rather than upmap, 
though the net result might be mixed, since ops would be directed at only 25% of 
the OSDs: you'd be trading network latency for OSD busy-ness. And as the cluster 
topology changes, one would need to periodically refresh the affinity values.
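
For clarity, this is roughly what I mean (untested sketch; the OSD ids are made 
up, and on older releases you may also need mon_osd_allow_primary_affinity 
enabled before the command takes effect):

import subprocess

# Example ids only: the NVMe metadata OSDs on the host running the active
# MDS vs. the metadata replicas on the other hosts.
LOCAL_OSDS = [0, 1]
REMOTE_OSDS = [2, 3]

def ceph(*args):
    # Run a ceph CLI command and fail loudly on error.
    subprocess.run(["ceph", *args], check=True)

# Prefer the co-located OSDs as primaries for metadata reads...
for osd in LOCAL_OSDS:
    ceph("osd", "primary-affinity", f"osd.{osd}", "1.0")

# ...and demote the remote replicas so they only become primary when nothing
# else is available. This would need re-running whenever the topology or the
# active MDS changes, which is the maintenance burden mentioned above.
for osd in REMOTE_OSDS:
    ceph("osd", "primary-affinity", f"osd.{osd}", "0.0")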

> I think your money is better spent buying more NVMe disks and spreading the 
> load than co-locating that on the MDS nodes.

Agreed.  Complex solutions have a way of being more brittle, and of hitting 
corner cases.  

> If you are planning on multi-active MDS I don't think it would make sense at 
> all.

Unless one provisions multiple filesystems, each pinned to an MDS and backed by 
a unique set of OSDs (a separate CRUSH root?), with affinities managed 
independently? Not sure that’s entirely possible; if it is, it’d be an awful lot 
of complexity.
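
Roughly what I'm picturing, in case anyone wants to poke holes in it (untested 
sketch; all names are hypothetical, the CRUSH roots and data pools are assumed 
to exist already, mds_join_fs needs a reasonably recent release, and older 
releases would also need "ceph fs flag set enable_multiple true"):

import subprocess

def ceph(*args):
    subprocess.run(["ceph", *args], check=True)

# One filesystem per MDS, each with its own metadata pool confined to its own
# CRUSH root of NVMe OSDs.
FILESYSTEMS = ["cephfs-a", "cephfs-b"]

for fs in FILESYSTEMS:
    root = f"{fs}-meta-root"      # pre-built CRUSH root holding this FS's NVMe OSDs
    rule = f"{fs}-meta-rule"
    meta_pool = f"{fs}-metadata"
    data_pool = f"{fs}-data"      # existing data pool

    # Replicated rule limited to that root and the nvme device class.
    ceph("osd", "crush", "rule", "create-replicated", rule, root, "host", "nvme")
    ceph("osd", "pool", "create", meta_pool, "32", "32", "replicated", rule)
    ceph("fs", "new", fs, meta_pool, data_pool)

    # Steer a dedicated MDS daemon (named after the FS here) to this filesystem.
    ceph("config", "set", f"mds.{fs}", "mds_join_fs", fs)

Primary affinity would then have to be managed separately per CRUSH root, which 
is where the complexity really piles up.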
 

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
