Re: performance question - number of documents

2011-10-27 Thread Felipe Hummel
Thanks again. > > > > > - Original Message - > From: Erick Erickson > To: java-user@lucene.apache.org; sol myr > Cc: > Sent: Sunday, October 23, 2011 7:18 PM > Subject: Re: performance question - number of documents > > "Why would it matter...top 5 mat

Re: performance question - number of documents

2011-10-24 Thread sol myr
Thanks again. - Original Message - From: Erick Erickson To: java-user@lucene.apache.org; sol myr Cc: Sent: Sunday, October 23, 2011 7:18 PM Subject: Re: performance question - number of documents "Why would it matter...top 5 matches" Because Lucene has to calculate the

Re: performance question - number of documents

2011-10-23 Thread Antony Sequeira
This may not be directly relevant to Lucene, but I wanted to learn: how does a web search engine do something like this? Do they also "score every matching document on every query", OR do they pick a subset first based on some static/offline ranking criteria and then do what Lucene does, OR do they sea

Re: performance question - number of documents

2011-10-23 Thread Erick Erickson
"Why would it matter...top 5 matches" Because Lucene has to calculate the score of every matching document in order to ensure that it returns the right 5 documents. What if the very last document scored was the most relevant? Best, Erick On Sun, Oct 23, 2011 at 3:06 PM, sol myr wrote: > Hi, > > We've noticed s

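Erick's point above, that the most relevant document might be the very last one scored, can be sketched in a few lines. This is only an illustration in Python, not Lucene's actual code: the function name `top_k`, the `(doc_id, score)` pairs, and the made-up scores are all assumptions chosen for the example. It keeps the best k candidates in a small min-heap, yet still has to look at every matching document once.

```python
import heapq


def top_k(scored_docs, k):
    """Return (best k (doc_id, score) pairs, number of candidates visited).

    Hypothetical helper for illustration only; not a Lucene API.
    A min-heap of size k holds the best candidates seen so far, but
    every candidate must still be examined once.
    """
    heap = []      # entries are (score, doc_id); lowest score sits at heap[0]
    visited = 0
    for doc_id, score in scored_docs:
        visited += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # New candidate beats the weakest of the current top k.
            heapq.heapreplace(heap, (score, doc_id))
    best = sorted(heap, reverse=True)  # highest score first
    return [(doc_id, score) for score, doc_id in best], visited


# Made-up data: 1000 weak matches, then one strong match at the very end.
candidates = [(doc_id, 0.1) for doc_id in range(1000)] + [(1000, 9.9)]
top, visited = top_k(candidates, 5)
print(top[0], visited)
```

Stopping after the first 5 candidates would have missed document 1000 entirely, which is why the work grows with the number of matching documents rather than with the number of results requested. A query that matches more documents (for example an OR of two terms compared with an AND of the same terms) therefore has more candidates to score.
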
performance question - number of documents

2011-10-23 Thread sol myr
Hi, We've noticed a Lucene performance phenomenon and would appreciate an explanation from anyone familiar with Lucene internals (I know Lucene as a user, but haven't looked under its hood). We have a Lucene index of about 30 million records. We ran 2 queries: "AND" and "OR" ("+john +doe" v