Hi Shawn,

Thanks for responding so quickly.

The server box is shared by multiple Solr nodes, and each node has more
than 100 GB of index on disk (~2-4 replicas of different collections per
Solr node).

As I understand it, the NRTCachingDirectoryFactory tries to cache as many
segments as possible in memory, but the queries go to different collections
and vary a lot (few repeated query terms), so I think these cached segments
are not very useful here. The RAM left over (apart from what is assigned to
the JVMs) is not enough to cache even 10% of the index for each Solr node
running.

Also, this is an existing Solr installation whose performance I am trying
to improve. As far as I know, NIO is better than plain IO in Java, and I can
increase IOPS and throughput for the disk, so I wanted to find out how the
change would affect it.
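
To make the comparison concrete, here is a minimal sketch of the two read
paths in question, using only the JDK. It is not Lucene's or Solr's code:
the class and method names (ReadPaths, readWithChannel, readWithMmap) are
made up for illustration. The first method does a positional FileChannel
read, which is the style of access NIOFSDirectory is documented to use; the
second maps the file region into memory, which is the mmap style.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration only; not taken from Lucene or Solr.
public class ReadPaths {

    // Positional read: every call copies bytes from the OS page cache
    // (or from disk) into a buffer owned by the JVM.
    static byte[] readWithChannel(Path file, long pos, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(len);
            long offset = pos;
            while (buf.hasRemaining()) {
                int n = ch.read(buf, offset);
                if (n < 0) {
                    break; // reached end of file
                }
                offset += n;
            }
            buf.flip();
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        }
    }

    // Memory-mapped read: the file region is mapped into the process
    // address space, so reads are served from the OS page cache directly.
    // The requested range must lie within the file.
    static byte[] readWithMmap(Path file, long pos, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
            byte[] out = new byte[len];
            map.get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segment", ".bin");
        tmp.toFile().deleteOnExit();
        byte[] data = new byte[16];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Files.write(tmp, data);

        byte[] a = readWithChannel(tmp, 4, 8);
        byte[] b = readWithMmap(tmp, 4, 8);
        System.out.println("same bytes: " + Arrays.equals(a, b));
    }
}
```

Both paths return the same bytes and both depend on the OS page cache; the
difference is only in how the bytes reach the JVM, so neither one changes
the index files themselves.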

Before changing anything, I will try removing the explicit directoryFactory
configuration to see how it works and how it picks the best option for the
underlying OS, as this should not affect the underlying indexed data for
the collections.

Thanks.



Jayesh Shende


On Fri, 4 Aug 2023, 22:35 Shawn Heisey, <apa...@elyograg.org> wrote:

> On 8/4/23 09:56, Jayesh Shende wrote:
> > Using: Solr 8.11.2 with rhel9
> >
> > Currently using "solr.NRTCachingDirectoryFactory" for a collection,
> > the collection has grown big in size, but don't want to add more RAM to
> > the machine (AWS).
> > I can increase IOPS and throughput for the data volume.
> >
> > Was thinking of using "solr.NIOFSDirectoryFactory",
> > but wanted to know how it will impact the existing collection.
> > Maybe it is just a way to read index files, but to be sure: will it
> > affect my existing indexed data?
>
> It's generally not a good idea to explicitly configure the directory
> factory.  That should only be done in very unusual circumstances.  Your
> situation probably does not qualify.
>
> Remove any config for that and let Solr/Lucene pick the class that's
> best for the environment.  It will probably choose
> NRTCachingDirectoryFactory.  If a better option becomes available in a
> newer Solr version, it will most likely be automatically chosen as long
> as the value isn't explicitly configured.
>
> Looking at the source, I cannot tell for sure whether NIOFS uses mmap,
> but I suspect it does not.  For nearly all use cases, you want a
> directory implementation that uses mmap, which the NRTCaching
> implementation does.
>
> Changing the directory factory is very unlikely to cause any problems
> with the existing index.  But I am curious why you want to change
> that... what have you encountered and why do you think you should go
> with a non-default class?
>
> If you have enough memory installed, the disk speed will have very
> little impact on performance.  Disk performance only becomes important
> in situations where you do not have enough spare memory for effective
> disk caching.  Memory is faster than disk, even if the disk is extremely
> fast SSD.
>
> A directory implementation that uses mmap will be the fastest option.
>
> Thanks,
> Shawn
>
