RE: Document clustering using lucene

2006-06-15 Thread John Hamilton
I've been thinking about a similar problem. However, it seems that the similarity score returned by a search is only relevant within those search results. You can't compare the similarity scores from two different searches. I think you will have to compute the similarities yourself using the t
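
For illustration only: the message above is cut off before it names a method, so the following is a minimal sketch of one common way to compute a document-to-document similarity that is comparable across pairs, namely cosine similarity over raw term-frequency vectors. The function names (`term_frequencies`, `cosine_similarity`) and the whitespace tokenizer are assumptions made for this example. They are not part of Lucene and not necessarily what the poster had in mind.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Count lowercase, whitespace-separated terms (a stand-in for a real analyzer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors, in [0, 1]."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty document has no direction, so treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc1 = term_frequencies("lucene scores are relative to a single search")
    doc2 = term_frequencies("scores from a single lucene search are relative")
    doc3 = term_frequencies("clustering groups similar documents together")
    print(cosine_similarity(doc1, doc2))
    print(cosine_similarity(doc1, doc3))
```

Because each value depends only on the two documents being compared, and not on a particular query, values computed this way can be compared across pairs, which is what a clustering step needs.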

RE: Newbie questions re: scoring

2006-05-04 Thread John Hamilton
> 2) independent of the scores being different, it is not safe to try and
> pick a score threshold, this is mentioned in the FAQ...
>
> http://wiki.apache.org/jakarta-lucene/LuceneFAQ#head-912c1f237bb00259185353182948e5935f0c2f03

That link appears to be referring to normalized scores (everythin

RE: search a subdirectory (New to Lucene)

2006-02-23 Thread John Hamilton
nse to parse the documents before handing them to Lucene such that you're creating a Lucene Document for each paragraph rather than for each entire file. Slicing the granularity of a domain into Documents is a fascinating topic :)

Erik

On Feb 22, 2006, at 1:00 PM, John Hami

search a subdirectory (New to Lucene)

2006-02-22 Thread John Hamilton
I'm new to Lucene and was wondering what is the best way to perform a search on a subdirectory or subdirectories within the index? My thought at this point is to build a query to first search for files in the required directory(ies) and then use that query to make a QueryFilter and use that Que