Re: best practice on too many files vs IO overhead

2009-11-27 Thread Michael McCandless
Phew :) Thanks for bringing closure! Mike. On Fri, Nov 27, 2009 at 6:02 AM, Michael McCandless wrote: > If in fact you are using CFS (it is the default), and your OS is letting you use 10240 descriptors, and you haven't changed the mergeFactor, then something is seriously wrong. I would tri…

Re: best practice on too many files vs IO overhead

2009-11-27 Thread Istvan Soos
You were right, my bad... I have an async reader closing on a scheduled basis (after the writer refreshes the index, so as not to interrupt the ongoing searches), but while I set up the scheduling for my first two indexes, I forgot it for my third... oh dear... Thanks anyway for the info, it was usefu…
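
[Editor's note] A minimal sketch of the kind of scheduled reader closing described above, assuming a Lucene 2.9-era API; the class name, refresh interval, and error handling are illustrative, not from the original thread:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.Directory;

    // Illustrative sketch: periodically reopen the reader and close the old
    // instance so its file descriptors are released. In production you would
    // typically reference-count or delay the close so in-flight searches on
    // the old reader can finish first.
    public class ScheduledReaderRefresher {
        private volatile IndexReader current;
        private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

        public ScheduledReaderRefresher(Directory dir) throws Exception {
            current = IndexReader.open(dir);
            scheduler.scheduleWithFixedDelay(new Runnable() {
                public void run() {
                    try {
                        IndexReader old = current;
                        IndexReader reopened = old.reopen();
                        if (reopened != old) {
                            current = reopened;
                            old.close(); // the step that was forgotten for the third index
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }, 30, 30, TimeUnit.SECONDS);
        }

        public IndexReader getReader() {
            return current;
        }
    }

Forgetting the old.close() call for even one index is enough to leak descriptors steadily, which matches the symptom reported in this thread.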

Re: best practice on too many files vs IO overhead

2009-11-27 Thread Michael McCandless
If in fact you are using CFS (it is the default), and your OS is letting you use 10240 descriptors, and you haven't changed the mergeFactor, then something is seriously wrong. I would triple check that all readers are being closed. Or... if you list the index directory, how many files do you see?
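
[Editor's note] For the "how many files do you see" check, a quick way to count the files in the index directory (the path is a placeholder):

    import java.io.File;

    // Count the files actually present in the index directory; with compound
    // file format and the default mergeFactor of 10 this number should stay
    // small (a handful of files per segment, not thousands).
    public class CountIndexFiles {
        public static void main(String[] args) {
            File indexDir = new File(args.length > 0 ? args[0] : "/path/to/index");
            String[] files = indexDir.list();
            System.out.println(indexDir + " contains "
                + (files == null ? 0 : files.length) + " files");
        }
    }

If the directory itself holds few files but the process still exhausts descriptors, the descriptors are being held by unclosed readers rather than by the index layout.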

Re: best practice on too many files vs IO overhead

2009-11-27 Thread Istvan Soos
On Fri, Nov 27, 2009 at 11:37 AM, Michael McCandless wrote: > Are you sure you're closing all readers that you're opening? Absolutely. :) (okay, never say that, but I had bugs because of this previously, so I'm pretty sure that one is ok). > It's surprising with normal usage of Lucene that you'd…

Re: best practice on too many files vs IO overhead

2009-11-27 Thread Michael McCandless
Are you sure you're closing all readers that you're opening? It's surprising that you'd run out of descriptors with normal usage of Lucene and its default mergeFactor (have you increased the mergeFactor?). You can also enable the compound file format, which uses far fewer file descriptors, at some cost to…
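
[Editor's note] A minimal sketch of the two settings mentioned above, against a Lucene 2.9-era IndexWriter; the directory path and analyzer choice are placeholders:

    import java.io.File;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class WriterSettings {
        public static void main(String[] args) throws Exception {
            Directory dir = FSDirectory.open(new File("/path/to/index"));
            IndexWriter writer = new IndexWriter(dir,
                new StandardAnalyzer(Version.LUCENE_29),
                IndexWriter.MaxFieldLength.UNLIMITED);

            // Compound file format packs each segment's files into a single
            // .cfs file, keeping the open-file count low at some indexing cost.
            writer.setUseCompoundFile(true);  // already the default

            // mergeFactor controls how many segments may accumulate before a
            // merge; the default of 10 keeps the file count modest, while
            // raising it speeds indexing but multiplies the open files.
            writer.setMergeFactor(10);

            writer.close();
        }
    }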

best practice on too many files vs IO overhead

2009-11-27 Thread Istvan Soos
Hi, I have a requirement that involves frequent, batched updates of my Lucene index. This is done with an in-memory queue and a process that periodically wakes and processes that queue into the Lucene index. If I do not optimize my index, I'll receive a "too many open files" exception (yeah, right, I can get th…
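
[Editor's note] A minimal sketch of the batched-update pattern described above (the queue, sleep interval, and class name are illustrative), assuming a Lucene 2.9-era API. As the rest of the thread suggests, a plain commit() per batch is normally enough; optimize() should not be needed just to keep the file count down when compound files and the default mergeFactor are in use:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;

    // Illustrative batch indexer: a background thread drains the in-memory
    // queue periodically and commits each batch in one go.
    public class BatchIndexer implements Runnable {
        private final BlockingQueue<Document> queue = new LinkedBlockingQueue<Document>();
        private final IndexWriter writer;

        public BatchIndexer(IndexWriter writer) {
            this.writer = writer;
        }

        public void enqueue(Document doc) {
            queue.add(doc);
        }

        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(5000);          // wake up periodically
                    Document doc;
                    while ((doc = queue.poll()) != null) {
                        writer.addDocument(doc); // drain the queue into the index
                    }
                    writer.commit();             // one commit per batch, no optimize()
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }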