Hi Jamie,

How fast are you indexing (number of documents per second)? We also ran into this when trying to perf test heavy query throughput while doing rapid indexing under exactly these conditions: calling getReader() every time a search is executed (so that it's "really real time").

The answer is that calling getReader() on every search request is currently not really a supported operation under heavy indexing load. You are advised to cache the reader you get from this call, and only refresh it once per some interval of time (determined by your need: if you need real-time up to every 5 seconds, refresh every 5s; if you need every second, refresh every second, etc.).

If you really want to be able to get a completely fresh view of the index on each search, look at the Zoie project ( http://zoie.googlecode.com ), an Apache-licensed realtime search system built on top of Lucene which we open-sourced out of LinkedIn a couple of years ago. Zoie allows for immediate reopen on request, and handles enormously heavy indexing and query load. (I recently demo'ed Zoie to the guys at Twitter at a tech-talk, and showed off indexing about 1500 tweets per second while slamming it with full-throttle query throughput, all while reopening for each request, and it didn't fall over. It could have done more, but it was on virtualized disk - EC2.)

If you want to try it out, drop me a line and I can help if you need any.

  -jake

On Tue, Jan 26, 2010 at 10:11 PM, Jamie <ja...@stimulussoft.com> wrote:

> Hi Jason
>
> I am calling it each time the search takes place. It is not only these
> files, there are more.
> In fact, the number of files increases quite frequently. I am seriously
> worried that we will
> run out of file handles after a period of time.
>
> I am calling getReader every time a search takes place. The writer stays
> open all the time.
> I am reluctant to think it's a reader issue, as this happens even if I do
> not execute any searches.
> We are using Lucene 2.9.1.
>
> Are these files not left over from a merge process? Is Lucene closing its
> file handles before
> deleting the files? Any further ideas?
>
> Jamie
>
> On 2010/01/27 02:32 AM, Jason Rutherglen wrote:
>
>> Jamie,
>>
>> How often are you calling getReader?
>> Is it only these files?
>>
>> Jason
>>
>> On Tue, Jan 26, 2010 at 12:58 PM, Jamie<ja...@stimulussoft.com> wrote:
>>
>>> Ok. I spoke too soon. The problem is not solved. I am still seeing these
>>> file handles lying around. Is this something I should be worried about?
>>> We are now closing the IndexReader but the IndexWriter remains open
>>> throughout the running of the program.
>>>
>>> problem is not solved
>>> s# lsof | grep index | awk '{n++}; END {print n+0}'
>>> 730
>>> java 17558 root 898r REG 8,1 1690991 246658 /var/index/vol201001/_5q1.cfs
>>> java 17558 root 899r REG 8,1   76354 246657 /var/index/vol201001/_5q1.nrm (deleted)
>>> java 17558 root 900r REG 8,1    4886 246661 /var/index/vol201001/_5q2.cfs (deleted)
>>> java 17558 root 901r REG 8,1   19859 246660 /var/index/vol201001/_5q3.cfs (deleted)
>>> java 17558 root 902r REG 8,1    3213 246662 /var/index/vol201001/_5q4.cfs (deleted)
>>> java 17558 root 903r REG 8,1    1294 246663 /var/index/vol201001/_5q5.cfs (deleted)
>>>
>>> On 2010/01/26 10:09 PM, Jamie wrote:
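
P.S. To make the "cache the reader and refresh once per interval" advice at the top of this reply concrete, here is a minimal sketch, assuming Java 8 or later. CachedView and all of its parameters are hypothetical names, not part of Lucene. With Lucene, the opener might wrap the getReader() call discussed in this thread and the closer might wrap the reader's close(), with IOException handling added around both. Note that the sketch closes the previous view as soon as a new one is opened, which is only safe if no search is still using it; a real setup would likely need some form of reference counting.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not a Lucene class): hands out one cached view and
 * replaces it at most once per refresh interval.
 */
final class CachedView<T> {
    private final Supplier<T> opener;  // assumption: opens a fresh view of the index
    private final Consumer<T> closer;  // assumption: releases a view and its file handles
    private final long maxAgeNanos;    // refresh interval, in nanoseconds
    private final LongSupplier clock;  // nanosecond clock, injectable for testing

    private T current;                 // the view currently handed out, or null
    private long openedAt;             // clock reading when 'current' was opened

    CachedView(Supplier<T> opener, Consumer<T> closer,
               long maxAge, TimeUnit unit, LongSupplier clock) {
        this.opener = opener;
        this.closer = closer;
        this.maxAgeNanos = unit.toNanos(maxAge);
        this.clock = clock;
    }

    /** Returns the cached view, reopening it first if it has reached the interval. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (current == null || now - openedAt >= maxAgeNanos) {
            T fresh = opener.get();    // open the new view before touching the old one
            T stale = current;
            current = fresh;
            openedAt = now;
            if (stale != null) {
                closer.accept(stale);  // unsafe if a search still holds the stale view
            }
        }
        return current;
    }
}
```

Constructed with an interval of 5 seconds and System::nanoTime as the clock, every search would call get() and see a view that is at most about five seconds old, while the underlying reopen happens at most once per interval regardless of query rate.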