Safat,
DirectSolrSpellChecker defaults to Levenshtein distance to measure how close
the query terms are to the terms actually in the index (see
https://en.wikipedia.org/wiki/Levenshtein_distance). This metric is not
English-specific, and it works for many languages.
Assuming this is not appropriate for the Bangla language (sorry for my
ignorance!), you might need to write your own distance metric by implementing
the StringDistance interface. You can then specify your custom class using the
"distanceMeasure" parameter of the spellchecker defined under the
SpellCheckComponent entry in solrconfig.xml:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">fully.qualified.classname.here</str>
    .. etc ..
  </lst>
</searchComponent>
For more information, see:
http://lucene.apache.org/core/5_2_1/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html#setDistance%28org.apache.lucene.search.spell.StringDistance%29
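As a rough illustration of what such a custom class could look like, here is a
minimal sketch, not a tested Solr plugin. It computes Levenshtein distance over
Unicode code points and converts it to a similarity between 0 and 1, which is
the range Lucene's StringDistance.getDistance is documented to return (1 means
identical). The class name CodePointLevenshteinDistance is hypothetical, and the
small StringDistance interface declared at the top is only a stand-in so the
sketch compiles without Lucene on the classpath; in a real plugin you would
import org.apache.lucene.search.spell.StringDistance instead.

```java
// Stand-in for org.apache.lucene.search.spell.StringDistance so that this
// sketch compiles without Lucene on the classpath. In a real plugin, delete
// this interface and import Lucene's instead.
interface StringDistance {
    // Returns a similarity between 0 (maximally different) and 1 (identical).
    float getDistance(String s1, String s2);
}

// Hypothetical example class. For Solr to load it by name, it would need to
// be public, live in its own file and package, and keep a no-argument
// constructor.
class CodePointLevenshteinDistance implements StringDistance {

    @Override
    public float getDistance(String s1, String s2) {
        // Compare code points rather than UTF-16 chars.
        int[] a = s1.codePoints().toArray();
        int[] b = s2.codePoints().toArray();
        int longest = Math.max(a.length, b.length);
        if (longest == 0) {
            return 1.0f; // two empty strings are identical
        }
        // Normalize the edit count into a 0..1 similarity.
        return 1.0f - (float) editDistance(a, b) / longest;
    }

    // Classic two-row dynamic-programming Levenshtein distance.
    private static int editDistance(int[] a, int[] b) {
        int[] prev = new int[b.length + 1];
        int[] curr = new int[b.length + 1];
        for (int j = 0; j <= b.length; j++) {
            prev[j] = j; // turning an empty prefix into b[0..j) takes j inserts
        }
        for (int i = 1; i <= a.length; i++) {
            curr[0] = i; // turning a[0..i) into an empty prefix takes i deletes
            for (int j = 1; j <= b.length; j++) {
                // A language-specific metric could vary this substitution cost.
                int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length];
    }
}
```

If plain edit distance turns out to be a poor fit for Bangla, the substitution
cost is the natural place to experiment, for example by charging less for
characters that are commonly confused. That is a suggestion only; I have not
tried it.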
Finally, if misplaced whitespace in queries is a problem in Bangla, you may
wish to use WordBreakSolrSpellChecker in conjunction with
DirectSolrSpellChecker to correct those problems as well. See the main Solr
example solrconfig.xml for more information
(https://github.com/apache/lucene-solr/blob/branch_5x/solr/example/files/conf/solrconfig.xml).
James Dyer
Ingram Content Group
From: Safat Siddiqui [mailto:[email protected]]
Sent: Monday, July 06, 2015 10:06 PM
To: [email protected]
Subject: Solr Spell checker for non-english language
Hello,
I am using Solr version 4.10.3 and trying to customize it for the Bangla
language. I have already built a Bangla stemmer for Solr indexing, and it works
fine. Now I would like to use Solr's spell checker and suggestion functionality
for Bangla. Which section of "DirectSolrSpellChecker" should I modify? I cannot
find which section causes the difference between English and non-English
languages. Any direction would be very helpful. Thanks in advance.
Regards,
Safat
--
Thanks,
Safat Siddiqui
Student
Department of CSE
Shahjalal University of Science and Technology
Sylhet, Bangladesh.