Thanks for the help, Michael; this solved my problem.
Regards
Harshvardhan Ojha
On Thu, Feb 13, 2014 at 8:51 PM, Michael McCandless <
[email protected]> wrote:
> The bloom filter is only used by the postings format wrapper, and
> we've had mixed results on whether it helps performance or not (seems
> to depend heavily on the exact usage).
The bloom filter is only used by the postings format wrapper, and
we've had mixed results on whether it helps performance or not (seems
to depend heavily on the exact usage).
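
For readers unfamiliar with the data structure, here is a minimal sketch of
the idea behind a Bloom filter. It is a toy and is not Lucene's FuzzySet; the
class name ToyBloomFilter, the methods add and mightContain, and the choice of
a seeded FNV-1a hash are all assumptions made for illustration. The point it
shows is that a lookup does not yield a probability for a given key: it
answers either "definitely absent" or "possibly present".

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Toy Bloom filter for illustration only; hypothetical, not Lucene code. */
final class ToyBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    ToyBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Seeded 32-bit FNV-1a hash (an arbitrary choice for this sketch).
    private static int hash(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    // Map the i-th hash of the term to a bit position in [0, numBits).
    private int index(byte[] data, int i) {
        return Math.floorMod(hash(data, i * 0x9E3779B9), numBits);
    }

    void add(String term) {
        byte[] data = term.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(data, i));
        }
    }

    /** false means definitely absent; true means possibly present. */
    boolean mightContain(String term) {
        byte[] data = term.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(data, i))) {
                return false; // one unset bit proves the term was never added
            }
        }
        return true; // all bits set: added earlier, or a false positive
    }

    public static void main(String[] args) {
        ToyBloomFilter filter = new ToyBloomFilter(1024, 3);
        filter.add("lucene");
        // An added term is always reported as possibly present.
        System.out.println(filter.mightContain("lucene"));
        // An absent term is usually reported absent, but may be a false positive.
        System.out.println(filter.mightContain("solr"));
    }
}
```

A "no" answer lets the caller skip a more expensive lookup, while a "maybe"
still requires the real lookup, which is one reason the benefit can depend so
much on the usage pattern.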
We have bit set / iterator abstractions (oal.util.Bits,
oal.search.DocIdSet/Iterator) to manage "sets" of documents, but mo
Hi Mike/Mikhail,
Don't you think that the contains(BytesRef value) method in
org.apache.lucene.codecs.bloom.FuzzySet returns the probability of
having a field, and that this is a place where hashing is used?
Is there any other place in the source which, when given a document id,
could determine by calcu
Lucene only assigns its int docID during indexing.
Retrieving a previously stored document is O(1), but that involves a
disk seek which can be very costly when the page is not in the OS's IO
cache. Lucene does not do any caching itself (relies on the OS
instead).
Have a look at the current def
Hi All,
I have a question regarding how Lucene retrieves documents.
I know Lucene uses many files on disk to store documents, each comprising
fields, and uses many IR algorithms and an inverted index to match
documents.
My question is:
1. How does Lucene store these documents in the file system?