Re: optimized searching

2009-06-30 Thread Simon Willnauer
On Tue, Jun 30, 2009 at 3:21 PM, Ian Lea wrote:
> Have you read the javadocs? What does collector.getTotalHits() return?
> Does it return the same when you use new TopDocCollector(1000) and
> some other number? Are you asking basically the same questions in 2
> different threads at the same time?

Re: optimized searching

2009-06-30 Thread Erick Erickson
Are you willing to pay me to do your job for you? Sorry to be snarky, but please be aware that we're volunteers here; it's pretty presumptuous to ask for this. You still haven't answered what it is you're trying to do. Why are you collecting 1,000 titles? What's the purpose? Are you just expe

Re: optimized searching

2009-06-30 Thread Ian Lea
Have you read the javadocs? What does collector.getTotalHits() return? Does it return the same when you use new TopDocCollector(1000) and some other number? Are you asking basically the same questions in 2 different threads at the same time? You are still iterating over many hits and that will s

Re: optimized searching

2009-06-30 Thread m.harig
Thanks Erick. in Ian's link, particularly see the section "Don't iterate over more hits than necessary". A couple of other things: 1> Loading the entire document just to get a field or two isn't very efficient; think about lazy loading (see FieldSelector). I have done it, but have a couple of ques

Re: optimized searching

2009-06-30 Thread Erick Erickson
in Ian's link, particularly see the section "Don't iterate over more hits than necessary". A couple of other things: 1> Loading the entire document just to get a field or two isn't very efficient; think about lazy loading (see FieldSelector). 2> What do you mean when you say "not very good"? Us

Re: optimized searching

2009-06-30 Thread Ian Lea
What exactly is the problem? Are you concerned about the time that your code snippet takes to run, or how much memory it uses? If you have a query that matches many documents then iterating through all of them, as your code does, is inevitably going to take time. See http://wiki.apache.org/lucen
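
For readers skimming this thread, here is a minimal sketch of the idea behind "Don't iterate over more hits than necessary": count every match, but keep only the best N in memory. It is plain Java and does not use Lucene. ScoredHit and TopHitsSketch are hypothetical names made up for illustration, and getTotalHits() is named only to echo the method mentioned above; this is not a copy of the TopDocCollector implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical value type for illustration; not a Lucene class.
final class ScoredHit {
    final int docId;
    final float score;

    ScoredHit(int docId, float score) {
        this.docId = docId;
        this.score = score;
    }
}

// Hypothetical bounded collector: counts all matches, retains only the top `limit`.
final class TopHitsSketch {
    private final int limit;
    private int totalHits = 0;
    // Min-heap ordered by score, so the weakest retained hit is at the head.
    private final PriorityQueue<ScoredHit> heap;

    TopHitsSketch(int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.limit = limit;
        this.heap = new PriorityQueue<>(limit,
                Comparator.comparingDouble((ScoredHit h) -> h.score));
    }

    // Called once per matching document.
    void collect(int docId, float score) {
        totalHits++;
        if (heap.size() < limit) {
            heap.add(new ScoredHit(docId, score));
        } else if (score > heap.peek().score) {
            // New hit beats the weakest retained one: replace it.
            heap.poll();
            heap.add(new ScoredHit(docId, score));
        }
    }

    // Number of matches seen, independent of how many were retained.
    int getTotalHits() {
        return totalHits;
    }

    // Retained hits, best score first.
    List<ScoredHit> topHits() {
        List<ScoredHit> out = new ArrayList<>(heap);
        out.sort(Comparator.comparingDouble((ScoredHit h) -> h.score).reversed());
        return out;
    }
}
```

With a structure like this, the total match count is available without holding or loading every matching document, so the per-hit work (such as reading stored fields) only needs to happen for the N hits that will actually be shown.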