When segment count is around 16 to 20, performance seems OK.
Here is our config:
Solr 7.7.3.
java -version
openjdk version "11.0.11" 2021-04-20
Our current config is 4 Solr nodes with 5 collections. Each collection
is split into two shards, and each collection is replicated.
The nodes are Amazon VMs with 8 GB RAM and 2 CPU cores each. Index
storage is on NVMe-backed volumes.
When there is a high segment count, processes on the systems start
showing very high "i/o wait" with associated idle CPU time according to
top, which suggests to me that CPU core count isn't the main culprit.
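For reference, this is the sort of check I've been using to confirm the
wait is at the device level (iostat is from the sysstat package; exact
column names vary by version):

# extended per-device stats every 5 seconds; high await/%util on the
# index volume while the CPUs sit mostly idle points at I/O rather than CPU
iostat -x 5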
SOLR_JAVA_MEM="-Xms1g -Xmx5g"
GC_TUNE=""
SOLR_LOG_LEVEL="WARN"
vm.swappiness = 0
swap in use = 0
free -h
              total        used        free      shared  buff/cache   available
Mem:           7.7G        6.3G        130M         80M        1.3G        1.0G
Swap:           15G          0B         15G
On 11/3/21 23:19, Shawn Heisey wrote:
On 11/3/2021 1:44 PM, Michael Conrad wrote:
Is there a way to set a max segment count for the built-in merge policy?
I'm having a serious issue where I'm trying to reindex 75 million
documents and the segment count skyrockets, with an associated
significant drop in performance, to the point that we start getting
lots of timeouts.
Is there a way to set the merge policy to try and keep the total
segment count to around 16 or so? (This seems to be close to the max
the hosts can manage without having serious performance issues.)
Solr 7.7.3.
The way to reduce total segment count is to reduce the thresholds that
used to be controlled by mergeFactor. I don't know of a way to
explicitly set the max total count, but the per-tier count will affect
the total count. There will be at least three tiers of merging on
most Solr installs, so the max total segment count will be at least
three times the per-tier setting.
This config represents the defaults for Solr's merging policy:
<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
</mergePolicyFactory>
On some Solr servers that I used to manage, those numbers were set to
35. I regularly saw total segment counts larger than 100. That did
not affect performance in a significant way.
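Purely as an illustration (I have not tested these exact values), a
variant aimed at the ~16 segment target you mention might reduce both
numbers, since roughly three tiers of five segments each lands near 15:

<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
  <!-- illustrative values only; lower per-tier counts mean more frequent merging -->
  <int name="maxMergeAtOnce">5</int>
  <int name="segmentsPerTier">5</int>
</mergePolicyFactory>

Keep in mind that lowering these numbers will increase merge activity
during indexing, which ties into the scheduler note further down.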
If you are seeing significant performance problems it is more likely
one of two problems that have nothing to do with the segment count:
1) Your max heap size is not quite big enough and needs to be
increased. This can lead to severe GC pauses because Java will spend
more time doing GC than running the application.
2) Your index is so big that the amount of free memory on the server
cannot effectively cache it. The fix for that is to add physical
memory, so that more unallocated memory is available to the operating
system. Solr is absolutely reliant on effective index caching for
performance.
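A quick way to sanity-check the first possibility is to watch GC
activity on the running Solr JVM, for example with jstat (the pid below
is a placeholder):

# heap occupancy and GC time, sampled every 5 seconds; rapidly growing
# FGC/FGCT or total GCT suggests the heap is too small
jstat -gcutil <solr-pid> 5000

For the second possibility, the buff/cache column of free -h gives a
rough idea of how much of the index the operating system can actually
hold in memory.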
More of a side note: One problem that you might be having with
indexing millions of documents is that the indexing thread can get
paused when merging becomes heavy. This will be even more likely to
happen if you reduce the numbers in the config that I included above.
The fix for that is to fiddle with the mergeScheduler config.
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <int name="maxMergeCount">6</int>
  <int name="maxThreadCount">1</int>
</mergeScheduler>
Some notes: Go with a maxMergeCount that's at least 6. If your
indexes are on spinning hard disks, leave maxThreadCount at 1. If the
indexes are on SSD, you can increase the thread count, but don't go
too wild. Probably 3 or 4 max, and I would be more likely to choose
2. I have never had indexes on SSD, so I do not know how many threads
are too many.
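For example, on NVMe-backed volumes like yours, a starting point that
follows the notes above (untested, adjust as needed) might be:

<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <!-- at least 6, per the note above -->
  <int name="maxMergeCount">6</int>
  <!-- SSD/NVMe can usually handle more than one merge thread; 2 is conservative -->
  <int name="maxThreadCount">2</int>
</mergeScheduler>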
Thanks,
Shawn