Tanapol Nearunchorn created SOLR-12084:
------------------------------------------
Summary: ShingleFilter causes threads to consume all available memory
Key: SOLR-12084
URL: https://issues.apache.org/jira/browse/SOLR-12084
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: Schema and Analysis
Affects Versions: 7.0.1, 6.5.1, 6.5
Reporter: Tanapol Nearunchorn
When ShingleFilter is placed in the query analyzer, certain query patterns
cause every request handler thread to hold a large number of SpanNearQuery
objects, which consumes all available memory.
My query analyzer looks like this:
{code:xml}
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory" />
  <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false" />
  <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="0" />
  <filter class="solr.LowerCaseFilterFactory" />
  <filter class="solr.ShingleFilterFactory" tokenSeparator="" maxShingleSize="3" />
</analyzer>
{code}
In my tests with queries, the number of terms passed to ShingleFilter directly
affects Solr's memory usage. When ShingleFilter receives 10-15 terms as input,
processing a single request takes a large amount of memory, and concurrent
requests multiply the problem.
I am not sure how to handle this problem. Could we put an upper limit on the
number of terms produced by ShingleFilter, or is there something we should
optimize?
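As a rough sketch of the upper-limit idea, the number of tokens reaching
ShingleFilter could be capped with the existing solr.LimitTokenCountFilterFactory.
This is an assumption, not a verified fix for this issue: the maxTokenCount
value of 8 is arbitrary, and any tokens beyond the limit would be silently
dropped from the query.
{code:xml}
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory" />
  <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false" />
  <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="0" />
  <filter class="solr.LowerCaseFilterFactory" />
  <!-- Assumption: cap the input to ShingleFilter; 8 is an illustrative value only -->
  <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="8" />
  <filter class="solr.ShingleFilterFactory" tokenSeparator="" maxShingleSize="3" />
</analyzer>
{code}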
Thank you.