Hello,
We use Solr for search, with documents indexed into one core on each of
several machines. Over time, the index on some machines has grown from 30 GB
to 60 GB and now to 133 GB, while others are hovering around 80 GB and some
are still under 30 GB. We manually control which documents go to which
machine and do not use SolrCloud.

One field in our index has docValues enabled. What we have noticed is that
facet queries on this field take around 10 seconds for roughly the first
call each minute on the large machines, the ones with a ~130 GB index. We
also commit every minute on our servers. We have made sure the machines are
not short of RAM: the ones with a 130 GB index have 256 GB of RAM, so the
segments are all in memory all the time.

Still, the first call after each minute or so takes 10 seconds on the big
shards (index size close to 130 GB), 6 seconds on the 80 GB shards, and less
than 4 seconds on the normal shards, which are under 30 GB.
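
For concreteness, here is a minimal sketch of the kind of request being
timed. The host, the core name (mycore) and the field name (category) are
placeholders, not our real values, and the helper names are made up for
illustration. It builds a facet-only query and reports wall-clock time next
to the QTime that Solr returns in the response header.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def build_facet_url(base_url, core, field, threads=0):
    """Build a /select URL that facets on one field and returns no documents."""
    params = {
        "q": "*:*",
        "rows": 0,               # only the facet counts matter here
        "facet": "true",
        "facet.field": field,    # placeholder field name, see note above
        "wt": "json",
    }
    if threads:
        params["facet.threads"] = threads
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


def time_request(url, fetch=None):
    """Return (wall_seconds, qtime_ms) for a single request.

    `fetch` is any callable mapping a URL to the response body; it defaults
    to a real HTTP GET, and can be swapped out for offline testing.
    """
    if fetch is None:
        fetch = lambda u: urlopen(u).read()
    start = time.perf_counter()
    body = fetch(url)
    elapsed = time.perf_counter() - start
    qtime = json.loads(body)["responseHeader"]["QTime"]
    return elapsed, qtime
```

Calling time_request once right after a commit and again a few seconds later
on the same URL is how the "first call is slow" pattern above shows up.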

How can we get rid of this latency? We have tried setting
DocValuesFormat=Direct, increasing facet.threads, and increasing the heap
size. Is there anything else we can do to get facet queries on the large
shards to under 2 seconds?


Thanks
Arun
