I've been tracking this list for a year or more, and this is the first I've ever heard of such a thing. Which leads me to wonder what *else* changed besides your index size. Classpath? Jar files? Did some sysadmin modify your search box? Is the program throwing an exception that you're masking somewhere in the code? Is it possible that you're getting an OutOfMemoryError?
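To make the masked-exception question concrete, here is a minimal sketch in plain Java (no Lucene involved) of how a catch-all can turn a failure such as an OutOfMemoryError into what looks like "0 hits". The class and method names (MaskedFailureDemo, countHitsMasking, countHitsReporting) are made up for illustration and are not taken from your code; the search itself is simulated with a Callable.

```java
import java.util.concurrent.Callable;

public class MaskedFailureDemo {

    /** Anti-pattern: any failure, including an Error, silently becomes "0 hits". */
    public static int countHitsMasking(Callable<Integer> search) {
        try {
            return search.call();
        } catch (Throwable t) {
            // The caller cannot tell this apart from a genuinely empty result.
            return 0;
        }
    }

    /** Safer variant: report the failure, then let it propagate to the caller. */
    public static int countHitsReporting(Callable<Integer> search) throws Exception {
        try {
            return search.call();
        } catch (Exception | Error e) {
            System.err.println("search failed: " + e);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a search that runs out of memory.
        Callable<Integer> failing = () -> {
            throw new OutOfMemoryError("simulated");
        };

        // The masking variant reports zero hits and hides the real cause.
        System.out.println("masking variant hits: " + countHitsMasking(failing));

        // The reporting variant lets the error reach the caller.
        try {
            countHitsReporting(failing);
        } catch (OutOfMemoryError e) {
            System.out.println("surfaced: " + e.getMessage());
        }
    }
}
```

If anything in your search path looks like the first method, that would be the first place I'd add logging.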
Folks have routinely used MUCH larger indexes than 3G without anything like this happening. So, here's what I'd do.

1> Verify your index. Look at it in, say, Luke.
2> Log your queries. It's possible you're not querying what you think you are.
3> You should very easily be able to create a small, stupid program on your personal machine that will open this index and fire off the queries in question, and see if your problem is environmental or programmatic. If you run it in an IDE, you should be able to break on exceptions if there are any.
4> Assuming that <3> exhibits the problem, start paring back your code. Take out one thing at a time until you don't see the problem.
5> Really take a look at any code changes that are coincident with this anomaly. Are you totally sure that the only thing that's changed is the index?
6> Why are you bothering to make everything final? Are your code snippets part of a class that's instantiated for each query? Note that this is more curiosity than thinking that it's the source of your problem <G>.

Best
Erick

On Thu, Sep 4, 2008 at 5:46 PM, Justin Grunau <[EMAIL PROTECTED]> wrote:

> Sorry, I forgot to include the visibility filters:
>
>     final BooleanQuery visibilityFilter = new BooleanQuery();
>     visibilityFilter.add(new TermQuery(new Term("isPublic", "true")),
>             Occur.SHOULD);
>     visibilityFilter.add(new TermQuery(new Term("reader", user.getId())),
>             Occur.SHOULD);
>
> These visibility filters ensure that a user only sees files which he or she
> has access to see.
>
> I am pretty certain nobody else has modified the index in the meantime, but
> why is that important? We have several other servers -- whose only
> difference is a smaller data set -- with dozens of concurrent users, and the
> index on those servers gets modified and read concurrently all the time, but
> none of these other servers have ever exhibited this bug.
>
> ----- Original Message ----
> From: Leonid M.
<[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Thursday, September 4, 2008 5:35:47 PM
> Subject: Re: Problem with lucene search starting to return 0 hits when a few seconds earlier it was returning hundreds
>
> * And what about the visibility filter?
> * Are you sure no one else accesses the IndexReader and modifies the index? Check reader.maxDoc() to be confident.
>
> On Fri, Sep 5, 2008 at 12:19 AM, Justin Grunau <[EMAIL PROTECTED]> wrote:
>
> > We have some code that uses lucene which has been working perfectly well
> > for several months.
> >
> > Recently, a QA team in our organization has set up a server with a much
> > larger data set than we have ever tested with in the past: the resulting
> > lucene index is about 3G in size.
> >
> > On this particular server, the same lucene code which has been reliable in
> > the past is now exhibiting erratic behavior. The first time you do a
> > search, it returns the correct number of hits. The second time you do a
> > search, it may or may not return the correct set. By the third time you do
> > a search, it will return 0 hits even for a search that was returning
> > hundreds of hits only a few seconds earlier. All subsequent searches will
> > return 0 hits until you stop and restart the java process.
> >
> > A snippet of the relevant code follows:
> >
> >     // getReader() returns the singleton IndexReader object
> >     final IndexReader reader = getReader();
> >
> >     // ANALYZER is another singleton
> >     final QueryParser queryParser = new QueryParser("text", ANALYZER);
> >     queryParser.setDefaultOperator(spec.getDefaultOp());
> >     final Query query =
> >             queryParser.parse(spec.getSearchText()).rewrite(reader);
> >     final IndexSearcher searcher = new IndexSearcher(reader);
> >
> >     final Hits hits = searcher.search(query, new CachingWrapperFilter(
> >             new QueryWrapperFilter(visibilityFilter)));
> >     total = hits.length();
> >
> > I understand that Lucene should be able to handle very large datasets, so
> > I'd be surprised if this were an actual Lucene bug. I'm hoping it's just
> > that I'm doing something "wrong" which has gone unnoticed so far for several
> > months because we've never had an index this large.
> >
> > We're using lucene version 2.2.0.
> >
> > Thanks!
> >
> > Justin Grunau
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
>
> --
> Best regards,
> Leonid Maslov!
> Personal blog: http://leonardinius.blogspot.com/
>
> Random thought:
> Princess Margaret - "I have as much privacy as a goldfish in a bowl."