I don't believe Lucene uses any b-tree strategy. I would say it is
segment-based: it indexes documents in memory, flushes them to disk as
segments according to your merge factor settings, and at the end you
can choose to merge the segments together via optimize(). I find its
structure similar to the one described in section 8.2 of "Modern
Information Retrieval" by Baeza-Yates et al., with a fair number of
improvements for storing terms, positions, frequency info, etc. in
compact form.
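To make the buffer, flush, and merge cycle a bit more concrete, here is
a rough sketch in Python. It is a toy model, not Lucene's actual code:
the class ToySegmentIndex and the knobs max_buffered_docs and
merge_factor are names made up for illustration, segments live in
memory rather than on disk, and the merge policy is simplified to
"merge everything once merge_factor segments exist", whereas the real
thing is more involved.

```python
from collections import defaultdict


class ToySegmentIndex:
    """Toy model of segment-based inverted indexing (not Lucene's code)."""

    def __init__(self, max_buffered_docs=2, merge_factor=3):
        # Both parameters are hypothetical knobs for this sketch.
        self.max_buffered_docs = max_buffered_docs
        self.merge_factor = merge_factor
        self.buffer = []       # (doc_id, text) pairs not yet flushed
        self.segments = []     # each segment: term -> {doc_id: [positions]}
        self.next_doc_id = 0

    def add_document(self, text):
        """Buffer a document; flush once the buffer is full."""
        self.buffer.append((self.next_doc_id, text))
        self.next_doc_id += 1
        if len(self.buffer) >= self.max_buffered_docs:
            self.flush()

    def flush(self):
        """Invert the buffered documents into one new segment."""
        if not self.buffer:
            return
        segment = defaultdict(dict)
        for doc_id, text in self.buffer:
            for pos, term in enumerate(text.lower().split()):
                segment[term].setdefault(doc_id, []).append(pos)
        self.buffer = []
        self.segments.append(dict(segment))
        # Simplified policy: merge all segments once there are enough.
        if len(self.segments) >= self.merge_factor:
            self._merge_all()

    def _merge_all(self):
        """Combine the postings of every segment into a single segment."""
        merged = defaultdict(dict)
        for segment in self.segments:
            for term, postings in segment.items():
                merged[term].update(postings)
        self.segments = [dict(merged)]

    def optimize(self):
        """Flush pending documents and merge down to one segment."""
        self.flush()
        if len(self.segments) > 1:
            self._merge_all()

    def search(self, term):
        """Return sorted ids of flushed documents containing the term."""
        hits = []
        for segment in self.segments:
            hits.extend(segment.get(term.lower(), {}))
        return sorted(hits)


index = ToySegmentIndex(max_buffered_docs=2, merge_factor=3)
for text in ["apache lucene", "lucene index", "segment merge",
             "lucene segment", "b-tree"]:
    index.add_document(text)
index.optimize()
print(len(index.segments), index.search("lucene"))
```

Note that a search here only sees flushed segments, which is why the
example calls optimize() before searching. No b-tree appears anywhere:
each segment is just a term-to-postings map that is written once and
later merged.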
References you might find useful:
http://lucene.apache.org/java/docs/fileformats.html
If you feel like helping with this part of the docs, I would love some
help on https://issues.apache.org/jira/browse/LUCENE-765
Probably the best way to know at this point is to trace through the
code.
-Grant
On Jan 26, 2007, at 5:11 AM, Sairaj Sunil wrote:
I went through that document. It says that Lucene's indexing algorithm
is incremental. So, can I say that it uses a combination of
segment-based and b-tree-based strategies? If I am wrong, please
correct me.
On 1/26/07, Damien McCarthy <[EMAIL PROTECTED]> wrote:
This document should contain the information you need:
http://lucene.sourceforge.net/talks/inktomi/
Damien.
-----Original Message-----
From: Sairaj Sunil [mailto:[EMAIL PROTECTED]
Sent: 26 January 2007 03:22
To: java-user@lucene.apache.org
Subject: Re: Lucene Indexing
Hi
I was asking what exactly the inverted indexing strategy used for
storing the index is. Is the index data structure batch-based,
b-tree-based, or segment-based?
On 1/25/07, Rajiv Roopan <[EMAIL PROTECTED]> wrote:
>
> http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
>
> On 1/24/07, Sairaj Sunil <[EMAIL PROTECTED]> wrote:
> >
> > Hi all,
> > Can you tell me the exact indexing algorithm used by Lucene, or give
> > some links to the documents that describe the algorithm used by
> > Lucene?
> > Thanks in advance
> > --
> > Sairaj Sunil
> >
> >
>
>
--
Sairaj Sunil
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
Sairaj Sunil
II Mtech(CS)
SSSIHL
Prashanthi Nilayam
--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org
Read the Lucene Java FAQ at
http://wiki.apache.org/jakarta-lucene/LuceneFAQ