Yakov Sirotkin created SOLR-9974:
------------------------------------
Summary: Use Suffix Arrays for fast search with leading asterisks
Key: SOLR-9974
URL: https://issues.apache.org/jira/browse/SOLR-9974
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Yakov Sirotkin
If a query term starts with an asterisk, the FST has to check every word in
the dictionary, so request processing slows down. This problem can be solved
with the Suffix Array approach. Luckily, a Suffix Array can be constructed from
the existing index after Lucene starts. Unfortunately, Suffix Arrays require a
lot of RAM, so they are used only when a special flag is set:
-Dsolr.suffixArray.enable=true
Suffix Array initialization can be sped up by using several threads; the
number of threads is controlled with
-Dsolr.suffixArray.initialization_treads_count=5
This system property can be omitted; the default value is 5.
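For illustration, the two system properties above could be read roughly as
follows. This is a minimal sketch and is not taken from the attached patch:
the class name SuffixArrayConfig and its methods are hypothetical, and only
the property names and the default of 5 come from the description above.

```java
// Hypothetical helper, not part of the attached patch.
final class SuffixArrayConfig {
    // Property names exactly as given in the issue description.
    static final String ENABLE_PROP = "solr.suffixArray.enable";
    static final String THREADS_PROP = "solr.suffixArray.initialization_treads_count";
    static final int DEFAULT_THREADS = 5;

    // True only when the property is set to "true" (case-insensitive).
    static boolean isEnabled() {
        return Boolean.getBoolean(ENABLE_PROP);
    }

    // Falls back to the default when the property is missing, not a number,
    // or not positive (the fallback for bad values is an assumption).
    static int initThreads() {
        Integer n = Integer.getInteger(THREADS_PROP, DEFAULT_THREADS);
        return n > 0 ? n : DEFAULT_THREADS;
    }
}
```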
The attached patch is the suggested implementation of SuffixArray support. It
works for all terms that start with an asterisk and contain at least 3
consecutive non-wildcard characters. The patch does not change search results;
it affects only performance.
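
To show the general idea (not the patch itself), here is a small sketch of a
suffix array over a term dictionary. It sorts every suffix of every term once,
then finds all terms containing a literal substring with a binary search
instead of scanning the whole dictionary. The class TermSuffixArray and its
methods are hypothetical names, and the naive in-memory construction is an
assumption chosen for brevity; the patch may build and store the array
differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration, not the data structure used by the patch.
final class TermSuffixArray {
    private final String[] terms;
    // Each entry packs (termIndex << 32 | offset); sorted by suffix text.
    private final long[] suffixes;

    TermSuffixArray(String[] terms) {
        this.terms = terms.clone();
        List<Long> all = new ArrayList<>();
        for (int t = 0; t < terms.length; t++) {
            for (int off = 0; off < terms[t].length(); off++) {
                all.add(((long) t << 32) | off);
            }
        }
        // Naive construction: compares materialized suffix strings.
        all.sort((a, b) -> suffix(a).compareTo(suffix(b)));
        suffixes = all.stream().mapToLong(Long::longValue).toArray();
    }

    private String suffix(long packed) {
        return terms[(int) (packed >>> 32)].substring((int) packed);
    }

    // Returns the terms that contain `needle`, in dictionary (input) order.
    List<String> termsContaining(String needle) {
        // Lower bound: first suffix that is >= needle.
        int lo = 0, hi = suffixes.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (suffix(suffixes[mid]).compareTo(needle) < 0) lo = mid + 1;
            else hi = mid;
        }
        // Suffixes that start with needle form a contiguous run from lo.
        TreeSet<Integer> hits = new TreeSet<>();
        for (int i = lo; i < suffixes.length
                && suffix(suffixes[i]).startsWith(needle); i++) {
            hits.add((int) (suffixes[i] >>> 32));
        }
        List<String> out = new ArrayList<>();
        for (int t : hits) out.add(terms[t]);
        return out;
    }
}
```

For a query such as *ana*, termsContaining("ana") yields the candidate terms
directly. For a more complex pattern, the candidates would still need to be
checked against the full wildcard expression, so the results stay the same
and only the number of terms examined changes.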