do you mean get tf of the hited documents when doing search? it's a difficult problem because only TermScorer has TermDocs and using tf in score() function. but this we can't know whether these doc is selected because we use a priorityQueue in TopScoreDocCollector
public void collect(int doc) throws IOException { float score = scorer.score(); // This collector cannot handle these scores: assert score != Float.NEGATIVE_INFINITY; assert !Float.isNaN(score); totalHits++; if (score <= pqTop.score) { // Since docs are returned in-order (i.e., increasing doc Id), a document // with equal score to pqTop.score cannot compete since HitQueue favors // documents with lower doc Ids. Therefore reject those docs too. return; } pqTop.doc = doc + docBase; pqTop.score = score; pqTop = (ScoreDoc) pq.updateTop(); } Because the scorer in BooleanScorer2 is ConjuctionScorer(DisjuctionScorer). it can not get tf. If you want to do this. you have to modify many codes 2010/12/25 Ranjit Kumar <ranjit.ku...@otssolutions.com> > Hi, > > Merry Christmas!! > > > > In case of Boolean query like *’sql AND server’ *. > > I am using parser to get correct document containing both *sql* and * > server*. Inside for loop in below code I get correct documented and to get > frequency I need to sum frequency of ‘sql’ and ‘server’ individually with > the help of *termDocs.read(). * > > As I am searching through millions of document. So, to calculate frequency > it takes about *160 Second* for *80k* document. > > > > Is there any way to get frequency of ‘*Boolean query’ *directly without > manipulation. As it takes lots of time. In case of single term and phrase > query, I got frequency for 80k document within 10 Seconds. > > > > *QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, field, > analyzer);* > > *parser.setDefaultOperator(QueryParser.AND_OPERATOR);* > > *Query query = parser.parse(“sql AND server”);* > > *TopDocs docs = searcher.search(query, null, n);* > > *TermDocs termDocs = reader.termDocs();* > > * termDocs.seek(new Term(field, > query.toString(field).split(" ")[0].toLowerCase()));* > > * int[] ints = new int[docs.totalHits];* > > * int[] ints1 = new int[docs.totalHits];* > > * termDocs.read(ints, ints1);* > > * List lsts = Arrays.asList(ArrayUtils.toObject(ints)); > * > > * List lsts1 = > Arrays.asList(ArrayUtils.toObject(ints1));* > > * * > > * termDocs.seek(new Term(field, > query.toString(field).split(" ")[1].toLowerCase()));* > > * int[] inta = new int[docs.totalHits];* > > * int[] inta1 = new int[docs.totalHits];* > > * termDocs.read(inta, inta1);* > > * List lsta = Arrays.asList(ArrayUtils.toObject(inta)); > * > > * List lsta1 = > Arrays.asList(ArrayUtils.toObject(inta1));* > > * * > > * int totalFreq = 0;* > > * int docId = -1;* > > * String a = null;* > > * int contId = 0;* > > * String path=null;* > > *for (int i = 0; i < docs2.scoreDocs.length; i++) { * > > * docId = docs2.scoreDocs[i].doc;* > > * path=reader.document(docId).get("path");* > > * a = path.substring(path.lastIndexOf("\\") + 1, > path.lastIndexOf("."));* > > * try {* > > * if(a.indexOf("e")>-1){* > > * contId = > Integer.parseInt(a.substring(0, a.length()-1));* > > * }else{* > > * contId = Integer.parseInt(a);* > > * }* > > * } catch (Exception e) {* > > *// e.printStackTrace();* > > * }* > > * * > > * if ((lsts.indexOf(docId) > -1 && > lsta.indexOf(docId) > -1) || (lsta.indexOf(docId) > -1) && > lsts.indexOf(docId) > -1) {* > > * totalFreq = (Integer) > lsts1.get(lsts.indexOf(docId)) + (Integer) lsta1.get(lsta.indexOf(docId)); > * > > * } else if (lsts.indexOf(docId) > -1) {* > > * totalFreq = (Integer) > lsts1.get(lsts.indexOf(docId));* > > * } else if (lsta.indexOf(docId) > -1) {* > > * totalFreq = (Integer) > lsta1.get(lsta.indexOf(docId));* > > * }* > > * * > > * > w.write(contId+"\t"+ID+"\t"+totalFreq+"\t"+reader.document(docId).get("path")+"\n"); > * > > * }* > > > > > > Thanks & Regards, > > *Ranjit Kumar *** > > Associate Software Engineer > > > > [image: cid:image002.jpg@01CB7089.C0069B40] > > > > *US:* +1 408.540.0001 > > *UK:* +44 208.099.1660 > > *India:* +91 124.474.8100 | +91 124.410.1350 > > *FAX:* +1 408.516.9050 > > http://www.otssolutions.com > > > =================================================================================================== > Private, Confidential and Privileged. This e-mail and any files and > attachments transmitted with it are confidential and/or privileged. They are > intended solely for the use of the intended recipient. The content of this > e-mail and any file or attachment transmitted with it may have been changed > or altered without the consent of the author. If you are not the intended > recipient, please note that any review, dissemination, disclosure, > alteration, printing, circulation or Transmission of this e-mail and/or any > file or attachment transmitted with it, is prohibited and may be unlawful. > If you have received this e-mail or any file or attachment transmitted with > it in error please notify OTS Solutions at > i...@otssolutions.com=================================================================================================== > >