On 10/6/22 01:57, Dominique Bejean wrote:
> One of our customers has huge servers:
> - Bare-metal
> - 64 CPUs
> - 512 GB RAM
> - 6x2 TB disks in RAID 6 (so 8 TB of disk space available)
> I think the best way to optimize resource usage on these servers is to
> install several Solr instances.
That is not what I would do.
> Do not configure the disks in RAID 6; instead, leave 6 standard volumes
> (more disk space, more I/O available).
> Install 3 or 6 Solr instances, each one using 1 or 2 disk volumes.
RAID10 will get you the best performance. Six 2TB drives in RAID10 have
6TB of usable space. The ONLY disadvantage of RAID10 is that you pay
for twice the usable storage. Disks are relatively cheap, though hard
to get in quantity these days. I would recommend going with the
largest stripe size your hardware can support. 1MB is typically where
that maxes out.
Any use of RAID5 or RAID6 has two major issues: 1) A serious
performance problem that also affects reads if there are ANY writes
happening. 2) If a disk fails, performance across the board is
terrible. When the bad disk is replaced, performance is REALLY terrible
as long as a rebuild is happening, and I have seen a RAID5/6 rebuild
take 24 to 48 hours with 2TB disks on a busy array. It would take even
longer with larger disks.
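To make the capacity tradeoff concrete, here is a quick sketch of the
usable-space arithmetic for six 2TB drives. These are the standard RAID
capacity formulas, nothing Solr-specific:

```python
def usable_tb(level: str, disks: int, size_tb: float) -> float:
    """Usable capacity for common RAID levels (simplified formulas)."""
    if level == "raid10":
        return disks * size_tb / 2      # mirrored pairs: half the raw space
    if level == "raid5":
        return (disks - 1) * size_tb    # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * size_tb    # two disks' worth of parity
    if level == "jbod":
        return disks * size_tb          # independent volumes, no redundancy
    raise ValueError(f"unknown level: {level}")

print(usable_tb("raid10", 6, 2.0))  # 6.0 TB, fast, tolerates one failure per pair
print(usable_tb("raid6", 6, 2.0))   # 8.0 TB, but slow writes and brutal rebuilds
print(usable_tb("jbod", 6, 2.0))    # 12.0 TB as separate volumes, no redundancy
```

So RAID6 only buys 2TB more usable space than RAID10, and you pay for it
in write performance and rebuild time.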
> What I am not sure about is how MMapDirectory will work with several
> Solr instances. Will off-heap memory be correctly managed and shared
> between several Solr instances?
With symlinks or multiple mount points in the solr home, you can have a
single instance handle indexes on multiple storage devices. One
instance has less overhead, particularly in memory, than multiple
instances. Off-heap memory for the disk cache should function as
expected with multiple instances or a single one.
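As a rough sketch of the symlink approach (the paths are examples, not
your actual layout — in production the mount points and solr home would
be real directories, shown here under a temp dir so it runs anywhere):

```shell
# Demo under a temp dir; in production BASE would be "/", with
# /mnt/disk1 and /mnt/disk2 as real mount points and /var/solr/data
# as the solr home.
BASE=$(mktemp -d)
SOLR_HOME="$BASE/var/solr/data"
mkdir -p "$BASE/mnt/disk1/solr/core1" "$BASE/mnt/disk2/solr/core2" "$SOLR_HOME"

# Symlink each core's data directory into the solr home so ONE Solr
# instance serves indexes that physically live on different devices.
ln -s "$BASE/mnt/disk1/solr/core1" "$SOLR_HOME/core1"
ln -s "$BASE/mnt/disk2/solr/core2" "$SOLR_HOME/core2"

ls -l "$SOLR_HOME"
```

The same effect can be had without symlinks by mounting the devices
directly at the core directories inside the solr home.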
Thanks,
Shawn