Hi,
we are using the KStem stemmer, which works fine most of the time but, like
most stemmers, has its problems as well. Has anybody ever come across a list of
common overrides to apply to the stemmer? I know that this depends a lot on
the data that is being indexed, but I was wondering if th
Dear committers,
Recently I wanted to be able to extend wildcard queries over phrases.
To do so, I dived into ComplexPhraseQueryParser.
It turned out that making a small change to that class allows me to
achieve my goal.
Because I thought that change might help others, I opened a Jira issue
and at
An n-gram tokenizer/filter might also work for you:
http://lucene.apache.org/core/7_3_1/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenizer.html
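
To illustrate the idea, here is a minimal sketch in plain Python of what
character n-grams look like. It does not use Lucene; the function name
char_ngrams and its parameters are made up for this example and only mimic
the general behaviour of an n-gram tokenizer.

```python
def char_ngrams(text, min_gram=3, max_gram=3):
    """Return every substring of `text` whose length is between
    min_gram and max_gram, ordered by start position, then by length."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


# The concatenated and the spaced forms share many trigrams,
# which is why an n-gram field can match one against the other.
shared = set(char_ngrams("similarissues")) & set(char_ngrams("similar issues"))
```

Both spellings produce grams such as "sim", "lar" and "iss", so a query on
either form can hit documents that contain the other.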
Regards,
András
On Wed, Jun 20, 2018 at 11:53 AM, Markus Jelsma wrote:
> Hi Egorlex,
>
> Set the tokenSeparator to "" and ShingleFilter w
Hi Egorlex,
Set the tokenSeparator to "" and ShingleFilter will concatenate all shingles
without whitespace. Keep in mind, this will greatly increase the size of the
index so it might not be a good idea to concatenate all pairs of words.
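
As a rough illustration of the effect, here is a small sketch in plain
Python. It is not the Lucene API: the function shingles and its separator
argument are hypothetical stand-ins for ShingleFilter and its tokenSeparator
setting, and the sketch only emits the shingles themselves (the real filter
may also emit the original single tokens, depending on its configuration).

```python
def shingles(tokens, size=2, separator=" "):
    """Join every run of `size` adjacent tokens with `separator`."""
    return [
        separator.join(tokens[i:i + size])
        for i in range(len(tokens) - size + 1)
    ]


tokens = "similar issues are common".split()

# Default-style shingles keep a space between the words.
spaced = shingles(tokens)

# With an empty separator the word pairs are glued together, so the
# indexed text "similar issues" yields the term "similarissues".
joined = shingles(tokens, separator="")
```

Note that the concatenation happens on the indexed text: the filter does not
split "similarissues" apart, it makes "similar issues" additionally
searchable as "similarissues".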
If you want to find "similarissues" with "sim
Thanks for the reply!
Sorry, could you help a little more? According to the example:
"given the phrase “Shingles is a viral disease”, a shingle filter might
produce:
Shingles is
is a
a viral
viral disease
"
I do not quite understand how this ShingleFilter can turn "similarissues"
into "similar issues".
Tha