David Xiao wrote:
Consider reducing the size of each file. Splitting them into smaller
pieces will definitely help the indexer work faster.
A 50 MB plain-text file is a remarkable size; very few text files ever
reach 50 MB. There must be a very good reason if you have to keep all
of that information in one big file.
What do you think?
Not everyone using Lucene is writing a CMS. Some of us do have to deal
with arbitrarily large data when it turns up. How do you split an
arbitrary large text file? Will breaking it in two stop some queries
from working, e.g. if the user enters +term1 +term2 and each term ends
up on the opposite side of the split? And so on.
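For what it's worth, here is a minimal sketch of that failure mode,
assuming the 2.x-era API (the RAMDirectory, the field name "contents",
and the two "halves" are all placeholders). Each half of the split file
becomes its own document, and the conjunction then matches nothing:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.store.RAMDirectory;

    public class SplitDemo {
        public static void main(String[] args) throws Exception {
            RAMDirectory dir = new RAMDirectory();
            StandardAnalyzer analyzer = new StandardAnalyzer();
            IndexWriter writer = new IndexWriter(dir, analyzer, true);

            // Pretend these are the two halves of one big file,
            // each indexed as its own document.
            String[] halves = { "... term1 ...", "... term2 ..." };
            for (int i = 0; i < halves.length; i++) {
                Document doc = new Document();
                doc.add(new Field("contents", halves[i],
                        Field.Store.NO, Field.Index.TOKENIZED));
                writer.addDocument(doc);
            }
            writer.close();

            // +term1 +term2 requires both terms in the *same*
            // document, so after the split it matches nothing.
            IndexSearcher searcher = new IndexSearcher(dir);
            Query q = new QueryParser("contents", analyzer)
                    .parse("+term1 +term2");
            Hits hits = searcher.search(q);
            System.out.println(hits.length());   // prints 0
            searcher.close();
        }
    }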
That being said, I haven't had issues adding files of this size. But
then, our application doesn't need to read the index while another
thread is writing to it, so our memory requirements are lower to begin
with.
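For the record, the trick that keeps memory manageable for files like
this is to hand Lucene a Reader instead of a String, so the content is
tokenized as a stream rather than held in memory all at once. A rough
sketch, again assuming the 2.x-era API (the class name, field names,
and lack of error handling are all illustrative):

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class AddBigFile {
        // Adds one large text file to the index without ever holding
        // its whole contents in memory as a single String.
        static void addFile(IndexWriter writer, File file)
                throws IOException {
            Document doc = new Document();
            // Keep the path so hits can be mapped back to the file.
            doc.add(new Field("path", file.getPath(),
                    Field.Store.YES, Field.Index.UN_TOKENIZED));
            // A Reader-valued field is tokenized as a stream and is
            // never stored, so the 50 MB is consumed incrementally.
            doc.add(new Field("contents",
                    new BufferedReader(new FileReader(file))));
            writer.addDocument(doc);
        }
    }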
Daniel
--
Daniel Noll
Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia
Ph: +61 2 9280 0699   Web: http://nuix.com/   Fax: +61 2 9212 6902