On Nov 12, 2007 1:15 PM, J.J. Larrea <[EMAIL PROTECTED]> wrote:
>
> 2. Since the full document and its longer bibliographic subfields are
> being indexed but not stored, my guess is that the large size of the index
> segments is due to the inverted index rather than the stored data fields.
> But [...]
>
> > [The merge cost] is paid when optimizing
> > anyway. You might as well amortize the cost with a lower merge factor.
> >
> > Grant seems to think the numbers are off anyway, so you may have
> > more to do -- just a suggestion about the merge factor. How much RAM
> > are you giving it?
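
For concreteness, both of those knobs live on the IndexWriter in the
Lucene 2.x API. A minimal sketch follows; the index path and the
specific values are purely illustrative, not recommendations:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    public class TuningSketch {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter(
                    FSDirectory.getDirectory("/path/to/index"),  // placeholder
                    new StandardAnalyzer(),
                    true);  // true = create a new index

            // A lower merge factor merges more aggressively while indexing,
            // leaving fewer segments for the final optimize to rewrite.
            writer.setMergeFactor(5);         // default is 10

            // Buffer more documents in RAM before flushing a new segment.
            writer.setMaxBufferedDocs(1000);  // default is 10

            writer.close();
        }
    }

The trade-off: a lower merge factor slows bulk indexing somewhat in
exchange for doing merge work up front rather than at optimize time.
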
> I don't see that it would need to be split up that much, if at all,
> depending on your hardware.
>
> The wiki has some excellent resources on improving both indexing and
> search speed.
>
> -Grant
>
> On Nov 11, 2007, at 6:16 PM, Barry Forrest wrote:
>
> > Hi,
> >
> > Optimizing my index of 1.5 million documents takes days and days.
> >
> > I have a collection of 10 million documents that I am trying to index
> > with Lucene. I've divided the collection into chunks of about 1.5 - 2
> > million documents each. Indexing 1.5 million documents is fast enough
> > (about 12 hours), but then optimizing takes days and days.
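
What Barry describes boils down to an add loop followed by optimize().
A minimal sketch against the Lucene 2.x API; the path, the "contents"
field name, and the document source are made up for illustration:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class ChunkIndexer {
        public static void main(String[] args) throws Exception {
            IndexWriter writer = new IndexWriter("/path/to/chunk1",
                    new StandardAnalyzer(), true);

            for (String text : loadChunk()) {
                Document doc = new Document();
                // Indexed but not stored: the terms go into the inverted
                // index and the raw text is discarded.
                doc.add(new Field("contents", text,
                        Field.Store.NO, Field.Index.TOKENIZED));
                writer.addDocument(doc);
            }

            // optimize() merges every segment into one by rewriting the
            // whole index, which is why it can dwarf the add phase.
            writer.optimize();
            writer.close();
        }

        // Stand-in for the real document source.
        private static String[] loadChunk() {
            return new String[] { "example document text" };
        }
    }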

Hi list,

I'm trying to estimate how long it will take to index 10 million
documents. If I measure how long it takes to index, say, 10,000
documents, can I extrapolate? Will it take roughly 1000 times longer
to do the whole set?
Thanks
Barry
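
One way to sanity-check that extrapolation is to time a sample and
scale it. A minimal sketch (Lucene 2.x API; the index path and the
sample-text helper are stand-ins), keeping in mind that segment merges
and any final optimize add work a small sample never triggers, so the
linear figure is better read as a lower bound:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class IndexTimeEstimate {
        public static void main(String[] args) throws Exception {
            final int SAMPLE_DOCS = 10000;
            final long TOTAL_DOCS = 10000000L;

            IndexWriter writer = new IndexWriter("/tmp/sample-index",
                    new StandardAnalyzer(), true);

            long start = System.currentTimeMillis();
            for (int i = 0; i < SAMPLE_DOCS; i++) {
                Document doc = new Document();
                doc.add(new Field("contents", sampleText(i),
                        Field.Store.NO, Field.Index.TOKENIZED));
                writer.addDocument(doc);
            }
            writer.close();
            long sampleMs = System.currentTimeMillis() - start;

            // 10,000,000 / 10,000 = 1000, so the naive estimate is
            // 1000x the measured sample time.
            long estimateMs = sampleMs * (TOTAL_DOCS / SAMPLE_DOCS);
            System.out.println("Sample: " + sampleMs + " ms; linear "
                    + "estimate for 10M docs: " + estimateMs + " ms");
        }

        // Stand-in for real document bodies.
        private static String sampleText(int i) {
            return "example document body number " + i;
        }
    }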