Re: How to use Hunspell dictionary to do the reverse of stemming ?

2017-10-24 Thread Robert Muir
On Tue, Oct 24, 2017 at 11:04 AM, julien Blaize wrote: > Hello, > > I am looking for a way to efficiently do the reverse of stemming. > Example: if I give the program the verb "drug", it will give me > "drugged", "drugging", "drugs", "drugstore", etc. To generate the list up-front (for all wor
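One possible reading of "generate the list up-front" is to stem every word of a word list once and keep a map from each stem back to its surface forms. The sketch below illustrates only that idea and is not the approach from the reply, which is cut off in this listing. `toy_stem`, `build_reverse_index`, `expand` and the word list are hypothetical; a real setup would plug in a Hunspell-based stemmer instead.

```python
from collections import defaultdict

def toy_stem(word):
    # Hypothetical stand-in for a real stemmer; it only strips three suffixes.
    for suffix in ("ging", "ged", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def build_reverse_index(words, stem=toy_stem):
    # Stem every word once, up front, and group the surface forms by stem.
    index = defaultdict(set)
    for word in words:
        index[stem(word)].add(word)
    return index

# Assumed sample vocabulary, for illustration only.
WORDS = ["drug", "drugged", "drugging", "drugs", "drugstore", "dog", "dogs"]
REVERSE = build_reverse_index(WORDS)

def expand(stem_word):
    # Look up all known surface forms of a stem; unknown stems give [].
    return sorted(REVERSE.get(stem_word, ()))
```

With this toy stemmer, `expand("drug")` returns the inflected forms but not "drugstore", since the suffix stripper does not relate the compound to its first part.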

[ANNOUNCE] Apache Lucene 5.5.5 released

2017-10-24 Thread Steve Rowe
24 October 2017, Apache Lucene™ 5.5.5 available The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.5. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that require

How to use Hunspell dictionary to do the reverse of stemming ?

2017-10-24 Thread julien Blaize
Hello, I am looking for a way to efficiently do the reverse of stemming. Example: if I give the program the verb "drug", it will give me "drugged", "drugging", "drugs", "drugstore", etc. I have used the program wordforms from hunspell to generate all possible combinations of the input word (e

Re: Accent insensitive search for greek characters

2017-10-24 Thread Robert Muir
Your Greek transform does not work because you use "Lower" instead of case folding. If ICUFoldingFilter works for what you want, but you want to restrict it to Greek, then just restrict it to the Greek region. See the FilteredNormalizer2 and UnicodeSet documentation. And look at how ICUFoldingFil
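The difference between lowercasing and case folding can be seen with the two lowercase sigmas. The sketch below uses only Python's standard library, not ICU or Lucene, so it illustrates the concept rather than the behavior of ICUFoldingFilter. `fold_greek` is a hypothetical helper name.

```python
import unicodedata

def fold_greek(text):
    # Case folding maps both the final sigma and the capital sigma to σ.
    folded = text.casefold()
    # Decompose so that accents become separate combining marks, then drop them.
    decomposed = unicodedata.normalize("NFD", folded)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

# Lowercasing leaves the two lowercase sigmas distinct; case folding unifies them.
sigmas_differ_after_lower = "σ".lower() != "ς".lower()
sigmas_equal_after_fold = "σ".casefold() == "ς".casefold()
```

Under this folding, πελάτης and πελάτηΣ both reduce to the same key, so a term indexed from one form matches a query written in the other.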

Re: Accent insensitive search for greek characters

2017-10-24 Thread Chitra
Hi, ICUTransformFilter works fine for Greek characters alone, as required, but it breaks in one case (σ and ς are both lowercase forms of Σ, sigma). *Example:* I indexed the terms πελάτης (indexed as πελατης) & πελάτηΣ (indexed as πελατης). I get the expected search results

Re: Increasing segment maxDoc limitation

2017-10-24 Thread Adrien Grand
I don't think there are any short-term plans to remove this limitation. The current answer to this problem is to partition your index into multiple shards that can be searched independently. Then you can use TopDocs.merge to merge the results that come from your shards. In my opinion, if we were to al
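The sharding idea can be sketched without Lucene: each shard returns its own top hits, and the caller merges them into one global ranking. The code below shows only the concept behind TopDocs.merge and does not use its API. `merge_top_hits` and the sample hit lists are hypothetical, and each shard's hits are assumed to arrive sorted by descending score.

```python
import heapq
from itertools import islice

def merge_top_hits(shard_hits, k):
    # shard_hits: one list per shard of (score, local_doc_id) pairs,
    # each assumed sorted by descending score (ties by ascending doc id).
    tagged = (
        [(-score, shard, doc) for score, doc in hits]
        for shard, hits in enumerate(shard_hits)
    )
    # heapq.merge expects ascending inputs, hence the negated scores.
    merged = heapq.merge(*tagged)
    # A document is identified globally by its (shard, local_doc_id) pair.
    return [(shard, doc, -neg_score) for neg_score, shard, doc in islice(merged, k)]

# Assumed per-shard results, for illustration only.
shard0 = [(3.0, 7), (1.5, 2)]
shard1 = [(2.5, 0), (2.0, 9), (0.5, 4)]
top3 = merge_top_hits([shard0, shard1], 3)
```

Because every shard keeps its own local document IDs, no single index has to hold more documents than the per-index limit allows.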

Re: Issue with installing PyLucene 6.5.0

2017-10-24 Thread Bernd Fehling
Since it can't resolve the preprocessor macro: do you have the same versions of C++, make, Java 1.8, Ant, and python3 on both machines? Are ANT_HOME, JAVA_HOME, and JCC_JDK set and also added to the path? Does jcc/setup.py have the right path settings? Regards, Bernd On 24.10.2017 at 09:18, Amin Farajian wrote: >

Increasing segment maxDoc limitation

2017-10-24 Thread Itay Adler
Hey everyone, We have a use case for Solr+Lucene where we index a large number of small documents, so it performs quite well even when we reach the document number limitation in Lucene. I was wondering if there are any plans to increase the 2^31-1 doc limitation, and if not what are the thi

Re: Issue with installing PyLucene 6.5.0

2017-10-24 Thread Amin Farajian
Hi Bernd, unfortunately, that didn't work. I could install jcc3 without any problem on another machine that is connected to the internet, using conda-forge (see the command below). $ conda install -c conda-forge jcc But the machine that I have to run the experiments on does not have an internet