I forgot:
- How did you measure query time?
- Did you warm your index reader?
- Omitting term frequencies and norms is not needed for numeric fields; both are already omitted by default.
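A minimal sketch of what a warmed measurement could look like, assuming the search call is wrapped in a `LongSupplier` that returns the hit count. `QueryTiming`, `medianMillis`, and the run counts are hypothetical names chosen for illustration, not Lucene API. The first runs are discarded so that reader warm-up and JIT compilation do not inflate the reported time.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

// Hypothetical helper for illustration only; not part of Lucene.
final class QueryTiming {

    // Runs `search` warmupRuns times without timing it, then times
    // timedRuns further runs and returns the middle element of the
    // sorted timings in milliseconds (upper median for even counts).
    static double medianMillis(LongSupplier search, int warmupRuns, int timedRuns) {
        if (warmupRuns < 0 || timedRuns <= 0) {
            throw new IllegalArgumentException("warmupRuns must be >= 0 and timedRuns > 0");
        }
        long sink = 0; // accumulate results so the calls cannot be optimized away
        for (int i = 0; i < warmupRuns; i++) {
            sink += search.getAsLong();
        }
        long[] nanos = new long[timedRuns];
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            sink += search.getAsLong();
            nanos[i] = System.nanoTime() - start;
        }
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // keeps `sink` observably used
        }
        Arrays.sort(nanos);
        return nanos[timedRuns / 2] / 1_000_000.0;
    }
}
```

The supplier would wrap the real search and return its hit count, so a single slow first query is not mistaken for the steady-state query time.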
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Saturday, January 02, 2010 10:46 PM
> To: java-user@lucene.apache.org; kuma...@kumanan.com
> Subject: RE: NumericRangeQuery performance with 1/2 billion documents in the index
>
> The information you gave us is a little sparse.
> - What JVM do you use, what processor, ...
> - How many documents match the query? NRQ is very fast, but if your range
>   hits e.g. one third of all documents, collecting the 166 million hits
>   also takes a lot of time. 7 seconds is normal for this case. Even with
>   50 million docs in the result range, collection would take on the order
>   of seconds on most CPUs.
> - Why do you index and query with precision step 1? I would first try 6 or 4
>   with long fields. With too low a precision step, queries get slower because
>   you have a very, very large term index (64 terms per value!) and your query
>   has to reposition the term index very often.
> - Why do you index NULL values as an integer (not long!) field with value 0?
>   Those fields are useless for your query and will never match any range on
>   LONG values. So why not simply remove them? They also produce lots of terms
>   with precStep=1 (32 terms).
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -----Original Message-----
> > From: Kumanan [mailto:kuma...@gmail.com]
> > Sent: Saturday, January 02, 2010 8:03 PM
> > To: java-user@lucene.apache.org
> > Subject: NumericRangeQuery performance with 1/2 billion documents in the index
> >
> > Hi,
> >
> > We have an index with 500 million documents. Index size is 104 GB and
> > the search server has 4 GB RAM.
> >
> > When we try to do a NumericRangeQuery on the document_date field, it takes
> > around 7-10 seconds. Is this expected for an index of this size?
> >
> > Here is how I index that field.
> >
> > documentDateTimeField = new NumericField(DOCUMENT_DATE_TIME, 1,
> >         Field.Store.NO, true);
> > documentDateTimeField.setOmitNorms(true);
> > documentDateTimeField.setOmitTermFreqAndPositions(true);
> >
> > if (scoreDetails.getDocumentDate() != null) {
> >     documentDateTimeField.setLongValue(scoreDetails.getDocumentDate().getTime());
> > } else {
> >     documentDateTimeField.setIntValue(0);
> > }
> > doc.add(documentDateTimeField);
> >
> > Here is how I construct the range query.
> >
> > Long begin = esq.getBeginDate().getTime();
> > Long end = esq.getEndDate().getTime();
> >
> > NumericRangeQuery rangeQuery =
> >         NumericRangeQuery.newLongRange(WordSentenceDocumentFields.DOCUMENT_DATE_TIME,
> >                 1, begin, end,
> >                 esq.isBeginDateInclusive(),
> >                 esq.isEndDateInclusive());
> >
> > BooleanQuery bq = new BooleanQuery();
> > bq.add(query, BooleanClause.Occur.MUST);
> > bq.add(rangeQuery, BooleanClause.Occur.MUST);
> >
> > query = bq;
> >
> > Am I doing something wrong?
> >
> > Thanks
> > Kumanan
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
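
The term counts quoted in the thread (64 terms per long value and 32 per int value at precision step 1) can be reproduced with a little arithmetic. The sketch below assumes one term is indexed per shift 0, step, 2*step, ... below the value's bit width, which gives ceil(bits / precisionStep) terms per value. `PrecisionStepTerms` and `termsPerValue` are hypothetical names for illustration, not Lucene API.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class PrecisionStepTerms {

    // Number of terms indexed for one numeric value, assuming one term
    // per shift 0, step, 2*step, ... that is still below `bits`,
    // i.e. ceil(bits / precisionStep).
    static int termsPerValue(int bits, int precisionStep) {
        if (bits <= 0 || precisionStep <= 0) {
            throw new IllegalArgumentException("bits and precisionStep must be positive");
        }
        return (bits + precisionStep - 1) / precisionStep;
    }

    public static void main(String[] args) {
        // Compare the step used in the thread (1) with the suggested 4 and 6.
        for (int step : new int[] {1, 4, 6}) {
            System.out.printf("precisionStep=%d: long=%d terms, int=%d terms%n",
                    step, termsPerValue(64, step), termsPerValue(32, step));
        }
    }
}
```

Under this assumption, a long field drops from 64 terms per value at step 1 to 16 at step 4 and 11 at step 6, which is why a larger step keeps the term index much smaller.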