cture?
(5) provides a way to give more weight to some XML element types during
relevance scoring?
Best regards,
Maciej Gawinecki
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e
  public TokenStream tokenStream(final String fieldName, final Reader reader) {
    return new NGramTokenizer(reader, minGram, maxGram);
  }
}
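For readers unfamiliar with what minGram and maxGram control, here is a minimal plain-Java sketch of character n-gram generation. It does not use Lucene: the class NGramSketch and the method ngrams are hypothetical names made up for illustration, and the order in which grams are produced is an assumption that may differ from what NGramTokenizer emits in a given Lucene version.

```java
import java.util.ArrayList;
import java.util.List;

public class NGramSketch {

    // Hypothetical helper (not part of Lucene): collects every substring of
    // text whose length lies between minGram and maxGram, inclusive.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        if (minGram < 1 || maxGram < minGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        // Walk over every start offset and emit grams of increasing length.
        // This ordering is an assumption; Lucene's own ordering may differ.
        for (int start = 0; start < text.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= text.length(); len++) {
                grams.add(text.substring(start, start + len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Print the 2- and 3-character grams of a sample word.
        System.out.println(ngrams("lucene", 2, 3));
    }
}
```

Larger maxGram values produce more terms per input word, so the index grows accordingly.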
Thank you very much for a solution or any other approach.
Maciej
-
this or it was asked before,
please forgive me; just searching about Lucene indexing performance on
NTFS doesn't help me much...
Best regards,
Maciej
-
it returns "ać". I would
expect that for words it has not been trained for it will return their
original forms, as it happens, for instance, when stemming words like
"xyz".
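To make the expected behaviour concrete, here is a minimal plain-Java sketch of a guard that falls back to the original word when a stemmer's output looks implausible. Everything in it is an assumption made for illustration: the class FallbackStemmer, the method stemOrOriginal, and the shared-prefix heuristic are hypothetical and not part of Lucene or of the stemmer discussed here, and the stemmer itself is passed in as an arbitrary function.

```java
import java.util.function.UnaryOperator;

public class FallbackStemmer {

    // Hypothetical guard (not a Lucene API): keep the stem only if it shares
    // at least minSharedPrefix leading characters with the original word;
    // otherwise return the word unchanged. The prefix rule is a crude
    // heuristic and will reject legitimate stems that change the prefix.
    static String stemOrOriginal(String word, UnaryOperator<String> stemmer,
                                 int minSharedPrefix) {
        String stem = stemmer.apply(word);
        if (stem == null || stem.isEmpty()) {
            return word;
        }
        int shared = 0;
        int limit = Math.min(word.length(), stem.length());
        while (shared < limit && word.charAt(shared) == stem.charAt(shared)) {
            shared++;
        }
        return shared >= minSharedPrefix ? stem : word;
    }

    public static void main(String[] args) {
        // Stand-in stemmer that always answers "ać", mimicking the symptom.
        UnaryOperator<String> broken = w -> "ać";
        System.out.println(stemOrOriginal("xyz", broken, 2));
    }
}
```

This only masks the symptom for unknown words; it does not explain why the stemmer produces such output in the first place.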
With kind regards,
Maciej Gawinecki
Here's a minimal example to reproduce the issue:
package org.apache.l
>
> > You always pass "piwko" for stemming.
>
> I'm afraid that's not correct? You should *never* pass on piwko when
> stemming. :)
Haha, right, one should not mix both.
Anyway, thank you for your original suggestions. Training it with a
bigger corpus of inflection forms seems like a great idea.
> You always pass "piwko" for stemming.
Right, I spotted my mistake right after I posted my question, but I
didn't want to spam the list with too many posts (there's no way to edit
an already posted question on a mailing list :-)). Anyway, the issue still
persists. Here's the corrected version to reproduce it:
imp