Hello,

In my company, we currently have the following infrastructure:
- Ceph Luminous
- OpenStack Pike

We have a cluster of 3 OSD nodes, each with the following configuration:

- 1 x Xeon(R) D-2146NT CPU @ 2.30GHz
- 128GB RAM
- 128GB root disk
- 12 x 10TB SATA ST10000NM0146 (OSDs)
- 1 x 375GB Intel Optane SSD DC P4800X (block.db / block.wal)
- Ubuntu 16.04
- 2 x 10Gb network interfaces configured with LACP

The compute nodes have 4 x 10Gb network interfaces with LACP. We also have 4 monitors, each with 4 x 10Gb LACP network interfaces; the monitor nodes sit at approximately 90% CPU idle time with 32GB / 256GB RAM available.

For each OSD disk we created a 33GB partition for block.db and block.wal.

We have recently been facing a number of performance issues. Virtual machines created in OpenStack are experiencing slow writes (approximately 50MB/s). Monitoring on the OSD nodes shows an average of 20% CPU iowait time and 70% CPU idle time, and memory consumption is around 30%. We have no latency issues (9ms average).

My first question is whether what we are seeing could be related to the size of the partitions dedicated to block.db / block.wal. The Ceph documentation recommends that block.db be no smaller than 4% of the block (data) device; in that case, each OSD in my environment would need a block.db of at least 400GB.

My second question is whether configuring the OSDs to keep block.db / block.wal on the mechanical disks themselves would lead to performance degradation.

Regards,
João Victor Rodrigues Soares
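P.S. To make the arithmetic behind my first question explicit, here is a minimal Python sketch. The variable names and the assumption that 10TB means the marketed decimal capacity are mine; the 4% figure is the guideline from the Ceph documentation, applied per OSD:

    # Rough block.db sizing check under the ~4% guideline from the Ceph docs.
    # Assumptions (mine): 10TB is decimal (10,000 GB) and all 12 OSDs on a node
    # share the single 375GB Optane device.

    DATA_DEVICE_TB = 10      # per-OSD data (block) device, decimal TB
    DB_PARTITION_GB = 33     # current block.db/block.wal partition per OSD
    OSDS_PER_NODE = 12
    OPTANE_GB = 375          # shared NVMe device per node

    recommended_db_gb = DATA_DEVICE_TB * 1000 * 0.04   # 4% of 10TB = 400GB
    current_pct = DB_PARTITION_GB / (DATA_DEVICE_TB * 1000) * 100
    print(f"recommended block.db per OSD: {recommended_db_gb:.0f} GB")
    print(f"current block.db per OSD:     {DB_PARTITION_GB} GB ({current_pct:.2f}% of the data device)")
    print(f"NVMe needed per node at 4%:   {recommended_db_gb * OSDS_PER_NODE:.0f} GB (vs {OPTANE_GB} GB available)")

In other words, our current partitions are roughly 0.33% of the data device, and following the 4% guideline would require about 4800GB of fast storage per node, far more than the 375GB Optane provides.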