While doing some testing last week, I believe I managed to exhaust the
number of threads available to nova-compute. After some investigation, I
found the pthread_create failure and increased nproc for our Nova user to
what I considered a ridiculous 120,000 threads, after reading that
librados requires a thread per OSD, plus a few for overhead, per VM on
our compute nodes.
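
For anyone who wants to check which limit a process is actually running
under, here is a minimal sketch. It assumes Linux and Python's standard
resource module; the helper names nproc_limits and fmt are made up for
illustration. On Linux, RLIMIT_NPROC counts threads as well as processes
for the user, which is why it matters here.

```python
import resource


def nproc_limits():
    """Return the (soft, hard) RLIMIT_NPROC values for the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return soft, hard


def fmt(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nproc_limits()
    print(f"nproc soft={fmt(soft)} hard={fmt(hard)}")
```

It reports the limits of whichever user runs the script, so it would have
to be run as the Nova user to say anything about nova-compute.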

This made me wonder: how many threads could Ceph possibly need on one of
our compute nodes?

32 cores * an overcommit ratio of 16 = 512 VMs. Assuming each one is
booted from a Ceph volume, 512 * 300 (approximate number of disks in our
soon-to-go-live Ceph cluster) = 153,600 threads.
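
As a rough sketch of that arithmetic (the function name and the
overhead_per_vm parameter are my own, and one thread per OSD per VM is the
assumption from what I read, not something I have verified in librados):

```python
def estimate_librados_threads(cores, overcommit, osd_count, overhead_per_vm=0):
    """Worst-case thread estimate for one compute node.

    Assumes one VM per overcommitted core, each booted from a Ceph volume,
    and one thread per OSD per VM. overhead_per_vm is a hypothetical knob
    for the "plus a few for overhead" part; it defaults to 0 to match the
    back-of-the-envelope figure.
    """
    vms = cores * overcommit
    return vms * (osd_count + overhead_per_vm)


if __name__ == "__main__":
    print(estimate_librados_threads(32, 16, 300))  # prints 153600
```

Changing osd_count shows how the estimate grows linearly with the number
of disks in the cluster.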

So this is where I started to put the truck in reverse. Am I right? What
about when we triple the size of our Ceph cluster? I could easily see a
future where we have 1,000 disks, if not many more, in our cluster. How
do people scale this? Do you RAID to increase the density of your Ceph
cluster? I can only imagine that this will also drastically increase the
amount of resources required on my data nodes.

So... suggestions? Reading?
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
