On Tue, Mar 12, 2019 at 6:10 AM Stefan Kooman <ste...@bit.nl> wrote:

> Hmm, 6 GiB of RAM is not a whole lot. Especially if you are going to
> increase the amount of OSDs (partitions) like Patrick suggested. By
> default it will take 4 GiB per OSD ... Make sure you set the
> "osd_memory_target" parameter accordingly [1].
>

@Stefan: Not sure I follow you here - each OSD pod has 6GiB of RAM allocated
to it, which accounts for the default 4GiB + 20% mentioned in the docs for
`osd_memory_target` plus a little extra. The pods are running on AWS
i3.2xlarge instances, which have 61GiB total RAM available, leaving plenty
of room for an additional OSD pod to manage the additional partition
created on each node. Why would I need to increase the RAM allocated to
each OSD pod and adjust `osd_memory_target`? Does sticking with the default
values leave me at risk of some other kind of priority inversion issue,
deadlock, etc.?
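
For completeness, if the answer turns out to be that the default does need
bumping, this is roughly what I'd expect the change to look like (a sketch
only - I haven't applied it, and the 5GiB value below is just an
illustration sized to leave ~1GiB of headroom under the 6GiB pod limit):

  # at runtime via the monitors' config store (Mimic/Nautilus-style):
  ceph config set osd osd_memory_target 5368709120   # 5 GiB in bytes

  # or the equivalent in ceph.conf:
  [osd]
  osd_memory_target = 5368709120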

@Patrick: Partitioning the NVMe drives and splitting out the metadata pool
seemed to work perfectly, so thanks for the tip! I was able to scale up to
1000 pods / 500 nodes in my latest load test, each pod reading ~11k files
on a 4-minute interval, and the cluster remained healthy the entire time.
Only the fact that I'd maxed out the compute resources of the control plane
on my dev K8s cluster prevented me from scaling higher.
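
For anyone who finds this thread later, the metadata-pool split was roughly
the following shape (a sketch from memory - the device class, CRUSH rule,
and pool names below are placeholders rather than exactly what's in my
cluster):

  # give the dedicated metadata OSDs their own device class
  ceph osd crush rm-device-class osd.N
  ceph osd crush set-device-class fsmeta osd.N

  # create a CRUSH rule that only selects that class, then point
  # the CephFS metadata pool at it
  ceph osd crush rule create-replicated fsmeta-rule default host fsmeta
  ceph osd pool set cephfs_metadata crush_rule fsmeta-rule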

Thanks,
Zack
