Re: Limitations of StempelStemmer

2019-09-10 Thread Dawid Weiss
Hi Maciej, Stempel uses a pretrained heuristic. You can find a longer description at [1] and [2]. The specific reason for the problems you mentioned may be the smaller training dictionary used for the version embedded in Lucene; I honestly don't know. If you need exact stemming/lemmatization then

Limitations of StempelStemmer

2019-09-10 Thread Maciej Gawinecki
Hi, I have just checked out the latest version of Lucene from the Git master branch. I have tried to stem a few words using StempelStemmer for Polish. However, it looks like it cannot handle some words properly, e.g. joyce -> ąć, wielce -> ąć, piwko -> ąć, royce -> ąć, pip -> ąć, xyz -> xyz. 1. I am surprised it

FileNotFoundException with version 4.10.4

2019-09-10 Thread Stuart Goldberg
We have been using version 4.10.4 for quite some time and ran into the following issue. Out of the clear blue, one of our clients sees the exception cited below. We see no prior evidence of anything going awry in our log files. This literally seems to occur out of nowhere. Is there any known issu