>
> I think that I'd move your deduping logic to after the search and set
> a limit on the number of hits that you check. That way you'd also get
> the best hit first.
>
>
> --
> Ian.
>
>
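Ian's suggestion above (run the search first, then dedupe while looking at only a bounded number of hits) could look roughly like the sketch below. It is plain Java with no Lucene dependency. The Hit class, the dedupe method, and the maxToCheck and numWanted parameters are all hypothetical names made up for illustration, and the input list is assumed to be sorted best-first already, as search results normally are.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PostSearchDedup {

    /** Hypothetical stand-in for a scored search hit; not a Lucene type. */
    public static final class Hit {
        public final String baseUrl;
        public final float score;

        public Hit(String baseUrl, float score) {
            this.baseUrl = baseUrl;
            this.score = score;
        }
    }

    /**
     * Walks hits that are already sorted best-first and keeps the first
     * hit seen for each base URL. Stops after examining maxToCheck hits
     * or collecting numWanted results, whichever comes first.
     */
    public static List<Hit> dedupe(List<Hit> rankedHits, int maxToCheck, int numWanted) {
        List<Hit> kept = new ArrayList<>();
        Set<String> seenBaseUrls = new HashSet<>();
        int limit = Math.min(maxToCheck, rankedHits.size());
        for (int i = 0; i < limit && kept.size() < numWanted; i++) {
            Hit hit = rankedHits.get(i);
            // Set.add returns false when this base URL was already seen.
            if (seenBaseUrls.add(hit.baseUrl)) {
                kept.add(hit);
            }
        }
        return kept;
    }
}
```

Because the hits arrive best-first, the hit kept for each base URL is also the highest-scoring one for that URL, which is the "best hit first" property Ian mentions. In a real application the list would be filled from the search results rather than built by hand, and maxToCheck caps how much work the dedupe pass can do.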
> On Thu, Feb 4, 2010 at 5:23 AM, mpolzin wrote:
>>
I changed one line below: I realized I had missed the ! (NOT). It is now
corrected in the original reply.
if ((hq.Size() < numHits || score >= minScore) &&
    !collectedBaseURLArray.Contains(doc.BaseURL))
{
mpolzin wrote:
>
>
>
Hi, thanks for the suggestion. I am relatively new to Lucene, so I have a few
more questions on this implementation. I looked at the source code for
Lucene and found the TopDocCollector class. It appears this class derives
from the HitCollector class, so I should be able to simply extend
TopDocColl