Performance issue in Wildcard Query in Solr 8.9.0

Vishal Patel Mon, 23 Oct 2023 04:00:48 -0700

We are using Solr 8.9.0 in SolrCloud mode with 2 shards, each of which has 
one replica. The cluster uses a 5-node ZooKeeper ensemble.


We have created a collection named documents, and the index size of one shard 
is 21GB. The relevant schema fields are:
<field name="id" type="string" indexed="true" stored="true" required="true" 
       multiValued="false" omitNorms="true" termVectors="false" 
       termPositions="false" termOffsets="false" docValues="true"/>
<field name="doc_ref" type="text_string" indexed="true" stored="true" 
       multiValued="false" omitNorms="true" termVectors="false" 
       termPositions="false" termOffsets="false" 
       omitTermFreqAndPositions="true"/>
<fieldtype name="text_string" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>



We want to find documents whose doc_ref contains the string test, so we query 
doc_ref:*test*. The wildcard query appears to consume a lot of memory and CPU. 
At times the collection has gone into recovery mode, which we believe is 
caused by these wildcard queries.
For better performance, we have implemented ReversedWildcardFilterFactory: 
https://risdenk.github.io/2018/10/25/apache-solr-leading-wildcard-queries-reversedwildcardfilterfactory.html

How should we query after applying ReversedWildcardFilterFactory? We do not 
see any improvement in query execution time when we search in the same manner: 
doc_ref_rev:*test*
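
For reference, a reversed-wildcard setup along the lines of the linked article 
might look like the sketch below. The type name text_string_rev, the copyField 
rule, and the attribute values are illustrative assumptions, not necessarily 
the exact configuration in use here. Note that reversing only turns a leading 
wildcard into a trailing one (*test is matched as tset*); a pattern with 
wildcards on both ends, such as *test*, still begins with a wildcard after 
reversal, so the reversed field would not be expected to speed it up.

```xml
<!-- Sketch only: text_string_rev and the copyField are hypothetical names/rules -->
<fieldType name="text_string_rev" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Indexes the reversed token alongside the original (withOriginal="true");
         the remaining values are illustrative, not tuned for this index -->
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="2" maxPosQuestion="1" minTrailing="2"
            maxFractionAsterisk="0"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="doc_ref_rev" type="text_string_rev" indexed="true" stored="false" 
       multiValued="false"/>
<!-- Assumed: populate the reversed field from doc_ref at index time -->
<copyField source="doc_ref" dest="doc_ref_rev"/>
```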

Can you please suggest the best approach for searching with a wildcard on both 
sides of the string (*test*) when the index is large?

Regards,

Vishal
