Re: Optimizing index takes too long

2007-11-12 Thread Barry Forrest
On Nov 12, 2007 1:15 PM, J.J. Larrea <[EMAIL PROTECTED]> wrote: > > 2. Since the full document and its longer bibliographic subfields are > being indexed but not stored, my guess is that the large size of the index > segments is due to the inverted index rather than the stored data fields. > But

Re: Optimizing index takes too long

2007-11-11 Thread Barry Forrest
is paid when optimizing > > anyway. You might as well amortize the cost with a lower merge factor. > > > > Grant seems to think the numbers are off anyway, so you may have > > more to do -- just a suggestion about the merge factor. How much RAM > > are you giving

Re: Optimizing index takes too long

2007-11-11 Thread Barry Forrest
e it would need to be split up that much, if at all, > depending on your hardware. > > The wiki has some excellent resources on improving both indexing and > search speed. > > -Grant > > > > On Nov 11, 2007, at 6:16 PM, Barry Forrest wrote: > > > Hi, > >

Optimizing index takes too long

2007-11-11 Thread Barry Forrest
Hi, Optimizing my index of 1.5 million documents takes days and days. I have a collection of 10 million documents that I am trying to index with Lucene. I've divided the collection into chunks of about 1.5 - 2 million documents each. Indexing 1.5 million documents is fast enough (about 12 hours), but t

Indexing time linear?

2007-08-23 Thread Barry Forrest
Hi list, I'm trying to estimate how long it will take to index 10 million documents. If I measure how long it takes to index say 10,000 documents, can I extrapolate? Will it take roughly 1000 times longer to do the whole set? Thanks Barry
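
The extrapolation asked about above can be sketched in a few lines of Python. This is only the naive estimate, and it assumes indexing time grows in direct proportion to document count, which is exactly the point the question leaves open. The function names and the timing figures are hypothetical and used for illustration only; they are not measurements from this thread. Comparing the per-document cost at two or more sample sizes is one rough way to see whether the linear assumption holds before trusting the estimate.

```python
def extrapolate_linear(sample_docs, sample_seconds, total_docs):
    """Naive estimate of total indexing time in seconds.

    Assumes time is proportional to document count (an assumption,
    not something the thread confirms).
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    return sample_seconds * total_docs / sample_docs


def seconds_per_doc(samples):
    """Per-document cost for each (docs, seconds) measurement.

    If these values rise noticeably as the sample grows, the linear
    estimate is likely to be too optimistic.
    """
    return [seconds / docs for docs, seconds in samples]


if __name__ == "__main__":
    # Hypothetical timing: 10,000 documents indexed in 120 seconds.
    estimate = extrapolate_linear(10_000, 120.0, 10_000_000)
    print("Estimated hours for 10 million documents:", estimate / 3600)

    # Hypothetical measurements at two sample sizes.
    print(seconds_per_doc([(10_000, 120.0), (100_000, 1_500.0)]))
```

With a 10,000-document sample and a 10-million-document collection, the scale factor is 1000, so the linear estimate is simply 1000 times the sample time.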