On 6/5/06, Erick Erickson <[EMAIL PROTECTED]> wrote:

A few thoughts...

1> are you sure you only indexed the document once? If you indexed the same
data multiple times, you'll have duplicate documents, each of which will
have a different Lucene ID (i.e. doc()).


Yes.. but I will make sure again.


2> have you examined your index with, say, Luke? I've found that a wonderful
tool for seeing if the data I *thought* was in my index was actually
there.



The database is huge.. I will need some time to go through it and check
whether I made any mistake while creating the indexes.

3> when you say "the same document", how do you know that? The internal
Lucene ID or some field you've put in the index? This is really another form
of "are you sure you indexed the data once?" because the internal Lucene id
is what you get back from hits.doc(). If you're getting multiple entries
like that, then I'm lost.



By "same document" I mean the same document path, i.e. the same URL showing
up multiple times. It's a good point, though.. I will check whether they all
have the same doc IDs.
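
For that check, a minimal sketch in plain Java: it assumes the internal doc IDs and the stored path of each hit have already been collected into two parallel arrays (how that is done depends on your Lucene version and is not shown). The class and method names are hypothetical, chosen only for illustration. A path that maps to more than one doc ID was most likely indexed more than once; a path that always maps to the same doc ID points at something else.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
public class DuplicateHitCheck {

    // Groups the distinct doc IDs seen for each path, in first-seen order.
    static Map<String, Set<Integer>> docIdsByPath(int[] docIds, String[] paths) {
        if (docIds.length != paths.length) {
            throw new IllegalArgumentException("docIds and paths must have the same length");
        }
        Map<String, Set<Integer>> byPath = new LinkedHashMap<>();
        for (int i = 0; i < paths.length; i++) {
            byPath.computeIfAbsent(paths[i], k -> new LinkedHashSet<>()).add(docIds[i]);
        }
        return byPath;
    }

    // Returns the paths that appear under more than one distinct doc ID.
    static List<String> indexedMoreThanOnce(int[] docIds, String[] paths) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<Integer>> e : docIdsByPath(docIds, paths).entrySet()) {
            if (e.getValue().size() > 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: two hits share a path but have different doc IDs.
        int[] ids = {3, 17, 42};
        String[] paths = {"/docs/a.html", "/docs/a.html", "/docs/b.html"};
        System.out.println("Indexed more than once: " + indexedMoreThanOnce(ids, paths));
    }
}
```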

Thanks for your suggestion.
