Re: Too many unique terms

2013-04-29 Thread Adrien Grand
On Sat, Apr 27, 2013 at 8:41 PM, Manuel Le Normand wrote:
> Hi, real thanks for the previous reply.
> For now I'm not able to make a separation between these useless words,
> whether they contain words or digits.
> I liked the idea of iterating with TermsEnum. Will it also delete the
> occurrences

Lucene Desktop Search Engine with JavaFX/Tika/Filesystem Crawler/HTML5

2013-04-29 Thread Mirko Sertic
Hi all, Lucene rocks, and based on some JavaFX/HTML5 hybrids I built a small Java search engine for your desktop! The prototype and the result can be seen here: http://www.mirkosertic.de/doku.php/javastuff/fxdesktopsearch I am using a multithreaded pipes-and-filters architecture with Tika as

Re: Too many unique terms

2013-04-29 Thread Manuel Le Normand
On Mon, Apr 29, 2013 at 1:22 PM, Adrien Grand wrote:
> On Sat, Apr 27, 2013 at 8:41 PM, Manuel Le Normand wrote:
> > Hi, real thanks for the previous reply.
> > For now I'm not able to make a separation between these useless words,
> > whether they contain words or digits.
> > I liked the idea

Re: Too many unique terms

2013-04-29 Thread Adrien Grand
Hi,

On Mon, Apr 29, 2013 at 10:38 PM, Manuel Le Normand wrote:
> I want to make sure: iterating with the TermsEnum will not delete all the
> terms occurring in the same doc that includes the single term, but only the
> single term, right?
> Going through the class TermEnum I cannot find any "delete