That's no problem: I can regenerate my entire extra data structure on each periodic index optimization. That way the array size will stay close to the number of documents in the index. What I find more difficult is knowing the ID of the last added or removed document. I need it to update the in-memory structure on more fine-grained index changes. Any ideas?

TIA.
Cheers,
Carlos

On 5/24/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
Document IDs will be reused after, say, optimization. One consequence of this is that optimization will change the IDs of *existing* documents.

You're right that numDocs may well be smaller than maxDoc. That's what I get for reading quickly...

Best
Erick

On 5/24/07, Carlos Pita <[EMAIL PROTECTED]> wrote:
> > No. It will always be at least as large as the total documents. But that
> > will also count deleted documents.
>
> Do you mean that deleted document ids won't be reutilized, so the index
> maxDoc will grow more and more with time? Isn't there any way to compress
> the range? It seems strange to me, considering that an example in the book
> suggests to use the document id as an array index for an array of maxDoc
> elements.
>
> Cheers,
> Carlos
>
> > Why wouldn't numdocs serve?
> >
> > Best
> > Erick
> >
> > > The motivation of this question is that I want to associate some info to
> > > each document in the index, and in order to access this additional data in
> > > O(1) I would like to do this through an array indexing. But the array size
> > > shouldn't be a lot greater than the total number of documents. I see that
> > > something similar is done in the example of section 6.1 of Lucene in Action,
> > > but for sorting purposes, which is not my case.
> > >
> > > Related to this: how can I update my array of extra data when documents are
> > > added/removed to/from the index? Is there any feedback mechanism by means of
> > > callbacks or event handlers?
> > >
> > > Thank you in advance.
> > > Regards,
> > > Carlos
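
For readers following the thread, here is a minimal sketch of the idea being discussed: a per-document side array indexed by document ID, rebuilt once deleted slots are compacted away. This is a toy model in plain Java, not Lucene code. The class `SideArrayDemo` and its methods are hypothetical names; `maxDoc` and `numDocs` only mirror the terms used above. It also assumes that compaction keeps the surviving documents in their original relative order.

```java
import java.util.Arrays;
import java.util.BitSet;

// Toy model, NOT Lucene code: illustrates why a side array indexed by
// document ID must be rebuilt when deleted slots are compacted away.
class SideArrayDemo {

    // Number of live documents: maxDoc minus the deleted slots.
    static int numDocs(int maxDoc, BitSet deleted) {
        return maxDoc - deleted.cardinality();
    }

    // Returns oldId -> newId after compaction, or -1 for deleted documents.
    // Assumption: survivors keep their relative order.
    static int[] remapAfterCompaction(int maxDoc, BitSet deleted) {
        int[] newIds = new int[maxDoc];
        int next = 0;
        for (int oldId = 0; oldId < maxDoc; oldId++) {
            newIds[oldId] = deleted.get(oldId) ? -1 : next++;
        }
        return newIds;
    }

    // Rebuilds the side array so that it is indexed by the new IDs.
    static String[] rebuild(String[] extra, BitSet deleted) {
        int[] newIds = remapAfterCompaction(extra.length, deleted);
        String[] rebuilt = new String[numDocs(extra.length, deleted)];
        for (int oldId = 0; oldId < extra.length; oldId++) {
            if (newIds[oldId] >= 0) {
                rebuilt[newIds[oldId]] = extra[oldId];
            }
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        String[] extra = {"a", "b", "c", "d", "e"}; // one slot per ID, maxDoc = 5
        BitSet deleted = new BitSet();
        deleted.set(1);
        deleted.set(3);
        System.out.println(numDocs(extra.length, deleted));                               // 3
        System.out.println(Arrays.toString(remapAfterCompaction(extra.length, deleted))); // [0, -1, 1, -1, 2]
        System.out.println(Arrays.toString(rebuild(extra, deleted)));                     // [a, c, e]
    }
}
```

Before compaction the array needs `maxDoc` slots, two of them wasted on deleted documents. After compaction it shrinks to `numDocs` slots, but every surviving document after the first deleted one has a new ID, which is why the side structure has to be regenerated rather than patched.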