I don't believe there is any b-tree strategy in Lucene. I would describe it as segment-based: it indexes documents in memory, flushes them to disk as segments based on your merge factors, and at the end you can choose to merge the segments together via optimize(). I find the structure similar to the one described in section 8.2 of "Modern Information Retrieval" by Baeza-Yates et al., with a fair number of improvements for storing terms, positions, frequency info, etc. in compact form.
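
To make the segment idea a bit more concrete, here is a toy sketch in Python. It is an illustration only, not Lucene's code: the class name ToySegmentedIndex, its parameters, and the "merge everything once there are merge_factor segments" policy are all assumptions made for the example, and the real merge logic is more involved. It buffers documents in memory, flushes them as small inverted-index segments, and collapses the segments into one on optimize().

```python
from collections import defaultdict


class ToySegmentedIndex:
    """Toy model of segment-based inverted indexing (hypothetical, not Lucene)."""

    def __init__(self, max_buffered_docs=2, merge_factor=3):
        self.max_buffered_docs = max_buffered_docs  # docs held in memory before a flush
        self.merge_factor = merge_factor            # segment count that triggers a merge
        self.buffer = []                            # (doc_id, text) pairs not yet flushed
        self.segments = []                          # each segment maps term -> doc ids
        self.next_doc_id = 0

    def add_document(self, text):
        """Buffer a document; flush once the buffer is full."""
        self.buffer.append((self.next_doc_id, text))
        self.next_doc_id += 1
        if len(self.buffer) >= self.max_buffered_docs:
            self.flush()

    def flush(self):
        """Turn the buffered documents into a new segment."""
        if not self.buffer:
            return
        postings = defaultdict(list)
        for doc_id, text in self.buffer:
            for term in set(text.lower().split()):
                postings[term].append(doc_id)
        self.segments.append(dict(postings))
        self.buffer = []
        # Simplified policy (an assumption): merge all segments at once.
        if len(self.segments) >= self.merge_factor:
            self._merge_all()

    def _merge_all(self):
        """Combine the posting lists of every segment into one segment."""
        merged = defaultdict(list)
        for segment in self.segments:
            for term, doc_ids in segment.items():
                merged[term].extend(doc_ids)
        self.segments = [{t: sorted(ids) for t, ids in merged.items()}] if merged else []

    def optimize(self):
        """Flush anything buffered, then merge down to a single segment."""
        self.flush()
        self._merge_all()

    def search(self, term):
        """Look the term up in every flushed segment (buffered docs are not visible)."""
        hits = []
        for segment in self.segments:
            hits.extend(segment.get(term.lower(), []))
        return sorted(hits)


index = ToySegmentedIndex(max_buffered_docs=2, merge_factor=3)
for text in ["apache lucene", "lucene index", "inverted index",
             "segment merge", "lucene segment"]:
    index.add_document(text)
index.optimize()
print(len(index.segments), index.search("lucene"))  # 1 [0, 1, 4]
```

Note that a search has to consult every segment, which is why merging down to one segment can help at query time, while adding a document never has to rewrite the existing segments.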

References you might find useful:
http://lucene.apache.org/java/docs/fileformats.html

If you feel like helping w/ this part of the docs, I would love some help on https://issues.apache.org/jira/browse/LUCENE-765

Probably the best way to know at this point is to trace through the code.

-Grant

On Jan 26, 2007, at 5:11 AM, Sairaj Sunil wrote:

I went through that document. It says that Lucene's indexing
algorithm is incremental. So, can I say that it uses a combination of segment-based and b-tree based strategies? If I am wrong,
please correct me.

On 1/26/07, Damien McCarthy <[EMAIL PROTECTED]> wrote:

This document should contain the information you need :

http://lucene.sourceforge.net/talks/inktomi/

Damien.
-----Original Message-----
From: Sairaj Sunil [mailto:[EMAIL PROTECTED]
Sent: 26 January 2007 03:22
To: java-user@lucene.apache.org
Subject: Re: Lucene Indexing

Hi
I was asking what exactly the inverted indexing strategy used for storing
the index is. Is the index data structure batch-based, b-tree based, or
segment-based?


On 1/25/07, Rajiv Roopan <[EMAIL PROTECTED]> wrote:
>
> http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
>
>
> On 1/24/07, Sairaj Sunil <[EMAIL PROTECTED]> wrote:
> >
> > Hi all,
> > Can you tell me the exact indexing algorithm used by Lucene, or give some
> > links to the documents that describe the algorithm used by Lucene?
> > Thanks in advance
> > --
> > Sairaj Sunil
> >
> >
>
>


--
Sairaj Sunil


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




--
Sairaj Sunil
II Mtech(CS)
SSSIHL
Prashanthi Nilayam

--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/LuceneFAQ


