[
https://issues.apache.org/jira/browse/SOLR-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michal Danilak updated SOLR-7192:
---------------------------------
Attachment: cz_CZ.zip
> Poor performance of Hunspell with Czech Dictionary
> --------------------------------------------------
>
> Key: SOLR-7192
> URL: https://issues.apache.org/jira/browse/SOLR-7192
> Project: Solr
> Issue Type: Bug
> Components: Schema and Analysis
> Affects Versions: 5.0
> Environment: Linux vld091 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64
> GNU/Linux
> Reporter: Michal Danilak
> Labels: performance
> Attachments: cz_CZ.zip
>
>
> Possibly related to SOLR-3245
> (https://issues.apache.org/jira/browse/SOLR-3245); the symptoms are exactly the
> same.
> HunspellStemFilterFactory with the Czech dictionary is hundreds of times slower
> than CzechStemFilterFactory.
> Analyzer setup:
> <fieldtype name="text_cs" class="solr.TextField">
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1"
>             generateNumberParts="1"
>             catenateWords="0"
>             catenateNumbers="0"
>             catenateAll="0"
>             stemEnglishPossessive="0" />
>     <filter class="solr.HunspellStemFilterFactory"
>             dictionary="cs_CZ.dic"
>             affix="cs_CZ.aff"
>             ignoreCase="true"
>             strictAffixParsing="true" />
>     <filter class="solr.ASCIIFoldingFilterFactory" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>   </analyzer>
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1"
>             generateNumberParts="1"
>             catenateWords="1"
>             catenateNumbers="1"
>             catenateAll="0"
>             stemEnglishPossessive="0" />
>     <filter class="solr.HunspellStemFilterFactory"
>             dictionary="cs_CZ.dic"
>             affix="cs_CZ.aff"
>             ignoreCase="true"
>             strictAffixParsing="true" />
>     <filter class="solr.ASCIIFoldingFilterFactory" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>   </analyzer>
> </fieldtype>
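> For comparison, a minimal sketch of the baseline the slowdown is measured
> against: the same chain with CzechStemFilterFactory in place of Hunspell.
> The field type name text_cs_algorithmic is hypothetical, and using one
> shared analyzer for index and query is an assumption made for brevity:
> <!-- hypothetical baseline field type, for illustration only -->
> <fieldtype name="text_cs_algorithmic" class="solr.TextField">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>     <filter class="solr.LowerCaseFilterFactory" />
>     <!-- algorithmic Czech stemmer; needs no dictionary or affix file -->
>     <filter class="solr.CzechStemFilterFactory" />
>     <filter class="solr.ASCIIFoldingFilterFactory" />
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>   </analyzer>
> </fieldtype>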
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)