On 10/23/23 05:00, Vishal Patel wrote:
We want to search data which contains test. So, we are making our query 
doc_ref:*test*. I think wildcard query is taking high memory and CPU. Sometimes 
we faced issue that collection goes into recovery mode due to usage of wildcard 
query.
For better performance, we have implemented ReversedWildcardFilterFactory: 
https://risdenk.github.io/2018/10/25/apache-solr-leading-wildcard-queries-reversedwildcardfilterfactory.html

As Mikhail indicated, ReversedWildcardFilterFactory is not designed to help with this. It is for leading wildcards, and your query has both leading and trailing wildcards.
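
To illustrate why, here is a rough Python sketch, not Solr's actual implementation. It assumes a toy sorted term list (the sample terms and the `prefix_range` helper are made up for illustration). Indexing reversed terms turns a leading wildcard into a cheap prefix lookup, but a pattern with wildcards on both ends has no fixed prefix in either direction, so every term still has to be checked.

```python
import bisect

def prefix_range(sorted_terms, prefix):
    """Return all terms starting with prefix, via binary search on a sorted list."""
    lo = bisect.bisect_left(sorted_terms, prefix)
    hi = bisect.bisect_left(sorted_terms, prefix + "\uffff")
    return sorted_terms[lo:hi]

# Hypothetical terms in the doc_ref field.
terms = sorted(["abctestxyz", "contest", "test", "testing", "other"])

# What a reversing filter conceptually adds: each term stored backwards.
reversed_terms = sorted(t[::-1] for t in terms)

# Leading wildcard *test becomes the prefix "tset" on the reversed terms.
leading_matches = [t[::-1] for t in prefix_range(reversed_terms, "test"[::-1])]

# *test* has no usable prefix forwards or backwards: scan every term.
both_ends_matches = [t for t in terms if "test" in t]

print(leading_matches)
print(both_ends_matches)
```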

Wildcard queries are particularly resource intensive.

Let's say that doc_ref:*test* matches one million different terms in the doc_ref field. I am not talking about documents, I am talking about terms.

Internally, Solr will do this in two steps: First it will expand the wildcard to retrieve all one million matching terms, and then it will execute the query, which will literally contain one million terms. This is going to consume a lot of CPU and memory.
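
Those two steps can be sketched in Python as follows. This is only a conceptual model under my own assumptions: the `toy_index` dictionary and the `wildcard_search` function are hypothetical, and the real term dictionary and query rewriting in Lucene are considerably more sophisticated.

```python
import fnmatch

# Hypothetical toy inverted index: term -> set of document ids.
toy_index = {
    "abctestxyz": {1},
    "contest": {2, 3},
    "test": {3},
    "other": {4},
}

def wildcard_search(index, pattern):
    # Step 1: expand the wildcard into every matching term.
    # With a leading wildcard this means examining every term in the field.
    expanded = [term for term in index if fnmatch.fnmatchcase(term, pattern)]

    # Step 2: run a query containing all of the expanded terms
    # and combine their document lists.
    doc_ids = set()
    for term in expanded:
        doc_ids |= index[term]
    return expanded, doc_ids

expanded_terms, matching_docs = wildcard_search(toy_index, "*test*")
print(len(expanded_terms), sorted(matching_docs))
```

With four terms this is trivial. With one million matching terms, both the expansion list and the combined query grow accordingly.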

Will "test" be a distinct word in the doc_ref field, or would you also need it to match a value of abctestxyz? If it's a distinct word, you might be better off with a relatively standard analysis chain on a fieldType of TextField and no wildcards.
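
As a rough illustration of the difference, here is a Python sketch in which a simple regex split plus lowercasing stands in for a tokenizer and lowercase filter. The `analyze` function and the sample values are assumptions for the example, not what any particular Solr analysis chain would produce.

```python
import re

def analyze(text):
    # Crude stand-in for tokenizing and lowercasing at index time.
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

# Hypothetical doc_ref values.
sample_docs = {
    1: "Site Test report 42",
    2: "abctestxyz",
    3: "test-plan draft",
}

# Build a term -> document ids index from the analyzed tokens.
word_index = {}
for doc_id, text in sample_docs.items():
    for term in analyze(text):
        word_index.setdefault(term, set()).add(doc_id)

# A plain term query is a single dictionary lookup, with no wildcard expansion.
hits = word_index.get("test", set())
print(sorted(hits))
```

Document 2 is not found here, because "test" is buried inside a longer token. That is the case where a plain term query would not be enough.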

Thanks,
Shawn
