[ANN] SearchBlox announces UNLIMITED Edition

2009-08-17 Thread Lucene
The SearchBlox Team is pleased to announce the availability of the SearchBlox UNLIMITED Edition of its Lucene-based search software. The UNLIMITED Edition allows indexing of an unlimited number of documents and provides an unlimited number of development and deployment licenses to the licensee for an

Bizarre indexing issue where thousands of files get created

2009-08-17 Thread Micah Jaffe
The Problem: periodically we see thousands of files get created by an IndexWriter in a Java process in a very short period of time. Since we started trying to track this, we saw an index go from ~25 files to over 200K files in about half an hour. The Context: a hand-rolled, all-in-one Luc
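For anyone hitting the same symptom: on the Lucene 2.4-era API, the number of files an index produces on disk is largely governed by a few IndexWriter settings. A minimal sketch of the relevant knobs (the class name and path are illustrative; the poster's actual configuration is not shown above):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class WriterTuning {
        public static void main(String[] args) throws Exception {
            Directory dir = FSDirectory.getDirectory("/path/to/index"); // illustrative path
            IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(),
                    IndexWriter.MaxFieldLength.UNLIMITED);
            writer.setMergeFactor(10);        // high values leave many small segments unmerged
            writer.setUseCompoundFile(true);  // one .cfs per segment instead of ~8 separate files
            writer.setMaxBufferedDocs(1000);  // avoid flushing a tiny segment per handful of docs
            writer.close();
        }
    }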

Re: Problem doing backup using the SnapshotDeletionPolicy class

2009-08-17 Thread Shai Erera
The way I'd do the backups is by having a background thread that is scheduled to run every X hours/days (depending on how frequently you want to take backups); when it wakes, it: 1) creates a SnapshotDeletionPolicy and retrieves the file names, 2) lists the files in the backup folder, 3) copies the
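A minimal sketch of those steps against the Lucene 2.4-era API, assuming snapshot() returns an IndexCommit whose getFileNames() lists the files of the pinned commit; the paths and the commented-out copy helper are illustrative, and the policy must be the same instance the IndexWriter was opened with:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexCommit;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
    import org.apache.lucene.index.SnapshotDeletionPolicy;
    import org.apache.lucene.store.FSDirectory;

    public class BackupSketch {
        public static void main(String[] args) throws Exception {
            SnapshotDeletionPolicy policy =
                    new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
            IndexWriter writer = new IndexWriter(FSDirectory.getDirectory("/path/to/index"),
                    new StandardAnalyzer(), policy, IndexWriter.MaxFieldLength.UNLIMITED);
            try {
                IndexCommit commit = policy.snapshot(); // 1) pin the current commit point
                try {
                    for (Object o : commit.getFileNames()) {
                        String name = (String) o;
                        // 2)+3) copy only files missing from the backup folder, e.g.:
                        // copyFile(new File("/path/to/index", name), new File("/backup", name));
                    }
                } finally {
                    policy.release(); // let the deletion policy reclaim the files
                }
            } finally {
                writer.close();
            }
        }
    }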

RE: IndexSearcher.search Behavior

2009-08-17 Thread Chris Adams
Thanks, Mark. That info helped tremendously. The problem in my case was that parse(querystring) was using our custom analyzer when parsing the entire string, whereas the BooleanQuery object that was being put together wasn't using our custom analyzer for one of the terms when it should have been
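A minimal sketch of the mismatch described here (the field and term are made up): a QueryParser configured with an analyzer rewrites the text, while a hand-built TermQuery does not, so the two end up targeting different indexed tokens.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class AnalyzerMismatch {
        public static void main(String[] args) throws Exception {
            // QueryParser runs the text through the analyzer: "Widget" -> "widget"
            Query parsed = new QueryParser("name", new StandardAnalyzer()).parse("Widget");
            System.out.println(parsed); // name:widget

            // A hand-built TermQuery bypasses analysis entirely
            Query manual = new TermQuery(new Term("name", "Widget"));
            System.out.println(manual); // name:Widget -- finds nothing if the index
                                        // was built with a lower-casing analyzer
        }
    }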

Re: IndexSearcher.search Behavior

2009-08-17 Thread Mark Miller
Unfortunately, the toString output of many Query subclasses is not actually parsable by QueryParser (though some are). If you look at the Query object that gets built from the toString output, it's likely different from the BooleanQuery you are putting together. -- - Mark http://www.lucidimagination.
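To illustrate, a small sketch with a made-up field: a TermQuery whose term contains a space prints as name:Foo Bar, and reparsing that string produces a two-clause BooleanQuery rather than the original single-term query.

    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class ToStringRoundTrip {
        public static void main(String[] args) throws Exception {
            Query original = new TermQuery(new Term("name", "Foo Bar"));
            String printed = original.toString(); // "name:Foo Bar"

            Query reparsed = new QueryParser("name", new WhitespaceAnalyzer()).parse(printed);
            System.out.println(original.getClass().getSimpleName()); // TermQuery
            System.out.println(reparsed.getClass().getSimpleName()); // BooleanQuery (name:Foo name:Bar)
        }
    }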

IndexSearcher.search Behavior

2009-08-17 Thread Chris Adams
I'm not extremely familiar with Lucene, so I am confused about why the following behavior is happening: when I build up a BooleanQuery using the Lucene objects (a combination of RangeQueries, TermQuery, etc.), I get a different result than when I do a QueryParser.parse(queryString). The Boolea

Re: Problem doing backup using the SnapshotDeletionPolicy class

2009-08-17 Thread Lucas Nazário dos Santos
Thanks Mike. I'm using Windows XP with Java 1.6.0 and Lucene 2.4.1. I don't know if I'm using the right backup strategy. I have an indexing process that runs from time to time, and the index is getting bigger every day. Hence, copying the entire index every time as a backup strategy is becoming a

Re: filter special chars

2009-08-17 Thread Strubbl
YES, you're right. Thank you very much! :) Simon Willnauer wrote: > > hey there, > > did you try to pass it lower-cased ("ø")? I guess you have a > LowerCaseFilter / Tokenizer in your analysis chain. > > simon > > On Mon, Aug 17, 2009 at 10:32 AM, Strubbl wrote: >> >> hi, is there any way to fi

Re: filter special chars

2009-08-17 Thread Simon Willnauer
hey there, did you try to pass it lower-cased ("ø")? I guess you have a LowerCaseFilter / Tokenizer in your analysis chain. simon On Mon, Aug 17, 2009 at 10:32 AM, Strubbl wrote: > > hi, is there any way to filter special chars like this one: "Ø" out of my > fields I want to index? > until now I te
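A sketch of the chain Simon is guessing at (the class name is hypothetical): because LowerCaseFilter runs before StopFilter, the token reaching the stop list is already "ø", so a stop set containing only the upper-case "Ø" never matches. Note that StopFilter drops whole tokens equal to a stop entry; to fold the character inside longer tokens, a mapping filter such as ASCIIFoldingFilter (see the analyzer at the end of this digest) is the usual route.

    import java.io.Reader;
    import java.util.Set;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.StopFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.standard.StandardTokenizer;

    public class FoldingAnalyzer extends Analyzer {
        // the stop entry must be lower-cased, because LowerCaseFilter runs first
        private static final Set STOP = StopFilter.makeStopSet(new String[] { "ø" });

        public TokenStream tokenStream(String fieldName, Reader reader) {
            TokenStream stream = new StandardTokenizer(reader);
            stream = new LowerCaseFilter(stream);
            return new StopFilter(stream, STOP);
        }
    }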

Re: How to normalize Lucene score?

2009-08-17 Thread Christian Reuschling
Hi Prashant, we let the scores converge towards 1 (they never actually reach it), so the ratings stay correct even for higher Lucene scores, which are more or less open-ended: normalizedScore = 1 - [ 1 / (1 + luceneScore) ] best Christian On Sun, 16 Aug 2009 19:04:44 +0530 prashant ul
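Christian's formula as a one-liner in Java (the method name is made up); it maps the open-ended Lucene score into [0, 1): a score of 0 maps to 0, 1 maps to 0.5, 9 maps to 0.9, and so on.

    // 1 - 1/(1+s): monotonic, so relative rankings are preserved
    static float normalize(float luceneScore) {
        return 1f - (1f / (1f + luceneScore));
    }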

filter special chars

2009-08-17 Thread Strubbl
hi, is there any way to filter special chars like this one: "Ø" out of the fields I want to index? Until now I tested it with StopFilter("Ø"), but it doesn't work: when I search the index with Luke for this symbol, I still get results. Can anyone help please?

Lucene Tokenizer + Merge terms

2009-08-17 Thread joe_coder
I am using a custom analyzer:

    public TokenStream tokenStream(String fieldName, Reader reader) {
        StandardTokenizer tokenStream = new StandardTokenizer(reader);
        tokenStream.setMaxTokenLength(maxTokenLength);
        TokenStream result = new ASCIIFoldingFilter(tokenStream);
        return result;
    }