The ideal JVM size will be influenced by the latency sensitivity of the
application. Large heaps are mostly useful if you need to hold large data
objects in memory; otherwise they fill up with large numbers of small
objects, and that leads to long GC pauses (GC time relates to the number
of objects, not their size). For predictable results in production, it is
important to test in a way that properly measures latency, including some
longer runs, and to have a clear picture of your latency requirement.
Without information on how many machines are available, the size of the
corpus, the nature of the application, and the requirements, I think it's
hard to make solid recommendations regarding cluster layout. There are of
course other issues and costs to managing many instances (ZooKeeper
overhead eventually becomes a problem if this leads you into thousands of
replicas). But the smallest sufficient JVM, plus some sort of safety
margin to defend against growth/change/attack, is usually what you want,
unless that leads to very large numbers of nodes or to difficulties
managing the added complexity. Lots of trade-offs.
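
If it helps to make that measurement concrete, here is a minimal sketch
(class name and sampling interval are just illustrative) that polls the
JVM's own cumulative GC counters while a load test runs:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Rough sketch: sample cumulative GC counts/times inside the JVM under test.
// For real latency testing you would also enable GC logging (e.g. -Xlog:gc*
// on JDK 11+) and run long enough to catch full collections.
public class GcSnapshot {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            for (GarbageCollectorMXBean gc :
                    ManagementFactory.getGarbageCollectorMXBeans()) {
                // getCollectionTime() is cumulative milliseconds in this collector
                System.out.printf("%s: collections=%d, totalTimeMs=%d%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
            Thread.sleep(10_000);   // sample every 10 seconds
        }
    }
}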

On Fri, Oct 7, 2022 at 9:19 AM Shawn Heisey <apa...@elyograg.org> wrote:

> On 10/6/22 15:54, Dominique Bejean wrote:
> > We are starting to investigate performance issues with a new customer.
> > There are several bad practices (commit, sharding, replica count and
> > types, heap size, ...) that can explain these issues, and we will work on
> > them in the next few days. I agree we need to better understand the
> > specific usage and make some tests after fixing the bad practices.
> >
> > Anyway, one of the specific aspects is these huge servers, so I am trying
> > to see what is the best way to use all these resources.
> >
> >
> > * Why do you want to split it up at all?
> >
> > Because one of the bad practices is a huge heap size (80 GB). I am pretty
> > sure this heap size is not required and anyway it doesn't respect the
> > 31 GB limit. After determining the best heap size, if this size is near
> > 31 GB, I imagine it is better to have several Solr JVMs with less heap
> > size. For instance 2 Solr JVMs with 20 GB each or 4 Solr JVMs with
> > 10 GB each.
>
> IMHO, lowering the heap size requirement is the ONLY reason to run more
> than one Solr instance on a server.  But I would go with 2 JVMs at 20GB
> each rather than 4 at 11GB.  Reduce the number of moving parts that you
> must track and manage.  The minimum heap requirement of two JVMs each
> with half the data will be a little bit larger than the minimum
> requirement of one JVM with all the data, due to JVM overhead.  I don't
> have a number for you on how much overhead there is for each JVM.
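
As a side note on the 31 GB point above: whether compressed object pointers
are actually in effect can be checked from inside the JVM. A minimal sketch,
assuming a HotSpot/OpenJDK runtime (the diagnostic bean is HotSpot-specific):

import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Rough check for the ~31-32GB threshold: size the heap past it and HotSpot
// silently turns UseCompressedOops off, so every object reference doubles in
// size. (java -XX:+PrintFlagsFinal -version shows the same flag from the
// command line.)
public class OopsCheck {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean hs =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        VMOption oops = hs.getVMOption("UseCompressedOops");
        System.out.println("UseCompressedOops = " + oops.getValue());
        System.out.println("Max heap (MB) = "
                + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}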
>
> > * MMapDirectory JVM sharing
> >
> > This point is the main reason for my message. If several Solr JVMs are
> > running on one server, will MMapDirectory work fine or will the JVMs
> > fight with each other in order to use off-heap memory?
>
> There would be little or no difference in the competition for disk cache
> memory with one JVM or several.
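
To illustrate why: at the Lucene level each Solr instance's MMapDirectory
just mmap()s its index files, so the cached pages live in the single OS page
cache shared by every process on the box, not inside any JVM heap. A minimal
sketch with a made-up index path:

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.MMapDirectory;

// Sketch: open an index through MMapDirectory. The mapped pages are backed by
// the shared OS page cache, so several JVMs doing this compete for the same
// pool of free memory whether there is one process or four.
public class MmapPeek {
    public static void main(String[] args) throws Exception {
        // Hypothetical core path; substitute a real core's data/index directory.
        try (MMapDirectory dir =
                     new MMapDirectory(Paths.get("/var/solr/data/core1/data/index"));
             DirectoryReader reader = DirectoryReader.open(dir)) {
            System.out.println("Docs visible through mmap: " + reader.maxDoc());
        }
    }
}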
>
> > Storage configuration is the second point that I would like to
> > investigate in order to better share disk resources.
> > Instead of having a single RAID 6 volume, isn't it better to have one
> > distinct non-RAID volume per Solr node (if multiple Solr nodes are
> > running on the server), or multiple non-RAID volumes used by a single
> > Solr JVM (if only one Solr node is running on the server)?
>
> It is true that if each instance is running on its own disk, then what a
> single instance does will have zero effect on another instance.
>
> But RAID10 can mean even better performance than one mirror set for each
> instance.  Here are some "back of the envelope" calculations for you.
> Let's assume that each of the drives has a sustained throughput of 125
> megabytes per second.  Most modern SATA disks can exceed that, and
> high-RPM enterprise SAS disks are faster.  SSD beats them all by a wide
> margin.
>
> If you move to an 8-drive RAID10 array with slower disks like I just
> described, then the array has access to a potential data write rate of
> 500 MB/s, as the array consists of four mirror sets with the volume
> striped across them.  A well-designed RAID controller can potentially
> have an even higher read rate than 500 MB/s, by taking advantage of the
> fact that every bit of data actually exists on two drives, not just
> one.  The highest possible data rates will not always happen, but the
> average throughput will very often exceed the single disk rate of 125 MB/s.
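
The same back-of-the-envelope numbers as a tiny sketch (the 125 MB/s
per-drive figure is the assumed value from above):

// Illustrative arithmetic only: 8 drives in RAID10 form 4 mirror sets,
// with data striped across the sets.
public class RaidEstimate {
    public static void main(String[] args) {
        int perDriveMBps = 125;      // assumed sustained rate per disk
        int drives = 8;
        int mirrorSets = drives / 2; // each pair holds identical data
        System.out.println("Write ceiling: " + mirrorSets * perDriveMBps + " MB/s");
        // Reads can be served from either side of each mirror, so the
        // theoretical ceiling is higher; real averages land in between.
        System.out.println("Read ceiling:  " + drives * perDriveMBps + " MB/s");
    }
}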
>
> There is another important consideration that applies no matter how the
> storage is arranged:  If the machine has sufficient spare memory, most
> of the data that Lucene needs will be sitting in the disk cache at all
> times and transfer at incredible speed, and the amount of data that is
> actually read from disk will be relatively small.  This is the secret to
> stellar Solr performance:  Lots of memory beyond what is needed by
> program heaps, so that disk accesses are not needed very often.
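
A rough way to gauge that headroom is to compare the on-disk index size with
the machine's physical memory; a minimal sketch with a hypothetical data
directory:

import com.sun.management.OperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Sketch: total index bytes on disk vs. physical RAM. Whatever RAM is left
// over after the heaps is what the OS can use to cache these files.
public class CacheHeadroom {
    public static void main(String[] args) throws Exception {
        // Hypothetical Solr data directory; substitute the real one.
        Path dataDir = Paths.get("/var/solr/data");
        long indexBytes;
        try (Stream<Path> files = Files.walk(dataDir)) {
            indexBytes = files.filter(Files::isRegularFile)
                              .mapToLong(p -> p.toFile().length())
                              .sum();
        }
        OperatingSystemMXBean os =
                ManagementFactory.getPlatformMXBean(OperatingSystemMXBean.class);
        // getTotalPhysicalMemorySize() is getTotalMemorySize() on JDK 14+.
        System.out.printf("Index on disk: %d MB, physical RAM: %d MB%n",
                indexBytes >> 20, os.getTotalPhysicalMemorySize() >> 20);
    }
}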
>
> Thanks,
> Shawn
>
>

-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
