David Xiao wrote:
Consider reducing the size of each file. Splitting them into smaller
pieces will definitely help the indexer work faster.

A 50 MB plain-text file is a remarkable size; very few text files get
that big. You must have a very good reason if you have to keep all of
that information in one file.

What do you think?
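
(For concreteness, the quoted suggestion amounts to something like the
following sketch. It is hypothetical, written against a recent Lucene
API rather than the release this thread is about; indexInChunks, the
chunkChars parameter and the field names are made up for illustration.
Note the weakness discussed in the reply below: the cut points are
arbitrary, so a chunk boundary can land between two related terms.)

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;

public final class ChunkedIndexing {
    // Index one large text file as fixed-size chunks, each chunk
    // becoming its own Lucene document.
    public static void indexInChunks(IndexWriter writer, Path file, int chunkChars)
            throws IOException {
        char[] buf = new char[chunkChars];
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            int part = 0;
            int read;
            while ((read = reader.read(buf, 0, buf.length)) != -1) {
                Document doc = new Document();
                // Remember which file and which chunk this document came from.
                doc.add(new StringField("file", file.toString(), Field.Store.YES));
                doc.add(new IntPoint("part", part++));
                doc.add(new TextField("content", new String(buf, 0, read), Field.Store.NO));
                writer.addDocument(doc);
            }
        }
    }
}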

Not everyone using Lucene is writing a CMS. Some of us do have to deal with arbitrarily large data when it appears. How do you split an arbitrarily large text file? Will breaking it in two make some queries stop working, e.g. if the user enters +term1 +term2 and each term ends up on opposite sides of the split? etc.
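
(To make the failure mode concrete, here is a minimal, hypothetical
demonstration, again against a recent Lucene API; the field name
"content" and the sample text are made up. Both terms must occur in the
same document for the +term1 +term2 query to match, so splitting the
file loses the hit:)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public final class SplitQueryDemo {
    public static void main(String[] args) throws Exception {
        Directory dir = new ByteBuffersDirectory();
        try (IndexWriter w = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            // The whole file as one document: contains both terms.
            w.addDocument(doc("start term1 middle term2 end"));
            // The same text split in two: each half sees only one term.
            w.addDocument(doc("start term1 middle"));
            w.addDocument(doc("term2 end"));
        }

        // +term1 +term2: both clauses MUST match within a single document.
        BooleanQuery q = new BooleanQuery.Builder()
                .add(new TermQuery(new Term("content", "term1")), BooleanClause.Occur.MUST)
                .add(new TermQuery(new Term("content", "term2")), BooleanClause.Occur.MUST)
                .build();

        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Prints 1: only the unsplit document matches; neither half
            // satisfies both required clauses.
            System.out.println("matches: " + searcher.count(q));
        }
    }

    private static Document doc(String text) {
        Document d = new Document();
        d.add(new TextField("content", text, Field.Store.NO));
        return d;
    }
}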

That being said, I haven't had issues adding files of this size. But then, our application doesn't require the ability to read while another thread is writing (so our memory requirements are lower to begin with).

Daniel

--
Daniel Noll

Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia    Ph: +61 2 9280 0699
Web: http://nuix.com/                               Fax: +61 2 9212 6902

