Re: BTree

2006-01-12 Thread Kan Deng
Thanks, Yonik. TermInfosReader is exactly the class I am looking for.

Kan

--- Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 1/12/06, Kan Deng <[EMAIL PROTECTED]> wrote:
> > Many thanks, Doug.
> >
> > A quick question, which class implements the following
> > logic?
>
> It looks to me like

Re: BTree

2006-01-12 Thread Yonik Seeley
On 1/12/06, Kan Deng <[EMAIL PROTECTED]> wrote:
> Many thanks, Doug.
>
> A quick question, which class implements the following
> logic?

It looks to me like org.apache.lucene.index.TermInfosReader

-Yonik

Re: BTree

2006-01-12 Thread Kan Deng
Many thanks, Doug.

A quick question: which class implements the following logic? org.apache.lucene.search.IndexSearcher?

> For access, Lucene is equivalent to a B-Tree
> with all but the leaves cached in memory, so
> that accesses require only a single disk access.

Thanks,
Kan

--- Dou
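
The quoted claim can be pictured with a small sketch: keep every k-th term of a sorted term list in memory (standing in for the cached inner nodes), binary-search that sparse list, then read one block of the full list (standing in for the single disk access). This is only an illustration of the idea, not Lucene's actual code; the class name `SparseTermIndex`, the `interval` parameter, and the use of a Python list in place of an on-disk file are all assumptions made for the example.

```python
from bisect import bisect_right


class SparseTermIndex:
    """Hypothetical sketch: a sparse in-memory index over a sorted term list."""

    def __init__(self, sorted_terms, interval=4):
        # Stands in for the sorted term file on disk (assumption for the sketch).
        self._terms = list(sorted_terms)
        self._interval = interval
        # Held in memory: every interval-th term, like cached non-leaf nodes.
        self._index = self._terms[::interval]

    def lookup(self, term):
        """Return the position of term in the sorted list, or None if absent."""
        # Binary search in memory for the last indexed term <= the target.
        slot = bisect_right(self._index, term) - 1
        if slot < 0:
            return None  # the term sorts before the first indexed term
        start = slot * self._interval
        # One block read: this slice models the single disk access.
        block = self._terms[start:start + self._interval]
        for offset, candidate in enumerate(block):
            if candidate == term:
                return start + offset
        return None
```

With nine sorted terms and `interval=4`, only three terms are held in memory, and any lookup touches at most one four-term block of the full list.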

Re: BTree

2006-01-12 Thread Doug Cutting
B-Trees are best for random, incremental updates. They require log_b(N) disk accesses for inserts, deletes and accesses, where b is the number of entries per page and N is the total number of entries in the tree. But that's too slow for text indexing. Rather, Lucene uses a combination of fi
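
As a rough illustration of the log_b(N) figure above, the sketch below counts how many levels a tree with b entries per page needs to cover N entries, taking one disk access per level. The function name and the sample values of b and N are hypothetical, chosen only to make the arithmetic concrete.

```python
def btree_disk_accesses(entries_per_page: int, total_entries: int) -> int:
    """Approximate ceil(log_b(N)): levels needed to address N entries."""
    if entries_per_page < 2 or total_entries < 1:
        raise ValueError("need entries_per_page >= 2 and total_entries >= 1")
    # Integer arithmetic avoids floating-point rounding in math.log.
    levels = 0
    capacity = 1
    while capacity < total_entries:
        capacity *= entries_per_page
        levels += 1
    # Even a single-entry tree needs one page read.
    return max(levels, 1)
```

For example, `btree_disk_accesses(100, 1_000_000)` gives 3, so under these assumed values every insert, delete, or lookup would cost about three disk accesses.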

Re: BTree

2006-01-12 Thread Kan Deng
After reading into the source code, I think Lucene doesn't use a B+tree or other tree structure for its index. A possible reason is that, since Lucene aims at handling gigabytes, it has to be cautious about the index file's size. A B+tree may grow rapidly when the number of leaves grows. Hence, B+tre

Re: BTree

2006-01-12 Thread Daniel Naber
On Thursday 12 January 2006 05:47, shailesh kumar wrote:
> I had looked at the document you had listed as well as used a hex
> editor to look at the segment files. That is how I came to know about
> the lexicographic sorting. But was not sure if BTree is used. If I
> understand correctly a B

Re: BTree

2006-01-12 Thread Kan Deng
I have a similar question about the internal indexing data structure. According to Paolo Ferragina of the University of Pisa, a B+tree with cluster is best for sorting. However, referring to the implementation of org.apache.lucene.search.IndexSearcher, it looks like the implementation doesn't use a B+tree, let alone cluster

Re: BTree

2006-01-11 Thread shailesh kumar
I had looked at the document you had listed as well as used a hex editor to look at the segment files. That is how I came to know about the lexicographic sorting. But was not sure if BTree is used. If I understand correctly a Binary tree (i.e., each node has only 2 children) or a high order Ba

Re: BTree

2006-01-11 Thread Erik Hatcher
On Jan 11, 2006, at 7:23 AM, shailesh kumar wrote: Does Lucene use a BTree kind of structure for storing the index (at least in memory)? Or is it just a list? Based on the file format in the index directory (wherein the terms are lexicographically sorted in one of the files) I