Re: High frequency terms in results document....

2015-02-19 Thread Tomoko Uchida
There seems to be a very similar discussion on this topic that I had missed. It covers a number of approaches: http://mail-archives.apache.org/mod_mbox/lucene-java-user/201502.mbox/%3CCAON7oqQh4aXoKfWyn=7odzwc48h_vvjjaabpfadmqehstzz...@mail.gmail.com%3E > Looks like it goes thru every term and

Re: High frequency terms in results document....

2015-02-19 Thread Shouvik Bardhan
Thanks for your input, Uchida. I will try that out. I wonder what the magic sauce is in Luke's set of calls that lets it produce, say, the top 100 terms even from an index with 100 million docs (small docs in my case, though). Looks like it goes thru every term and puts them in a priority queue and takes t
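For readers following the thread, the priority-queue idea mentioned above can be sketched in plain Java. This is only an illustration of the general technique (a bounded min-heap that keeps the N largest counts while scanning every term once), not Luke's or Lucene's actual code. The class name TopTermsSketch, the method topN, and the use of an in-memory Map of term counts are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene or Luke.
public class TopTermsSketch {

    /** Returns the n terms with the highest counts, most frequent first. */
    public static List<String> topN(Map<String, Integer> termCounts, int n) {
        List<String> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // Min-heap ordered by count: the head is the weakest of the current top n.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(
                Comparator.comparingInt((Map.Entry<String, Integer> e) -> e.getValue()));
        for (Map.Entry<String, Integer> entry : termCounts.entrySet()) {
            if (heap.size() < n) {
                heap.add(entry);
            } else if (entry.getValue() > heap.peek().getValue()) {
                // The new term beats the weakest kept term, so replace it.
                heap.poll();
                heap.add(entry);
            }
        }
        // Polling yields ascending counts; reverse to get most frequent first.
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }
}
```

Because the heap never holds more than n entries, memory stays small no matter how many terms are scanned, which is one plausible reason a top-100 list remains feasible on a very large index.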

Re: High frequency terms in results document....

2015-02-18 Thread Tomoko Uchida
Hi, I'm afraid there is no easy or straightforward way to meet your requirement. I would try creating a tiny temporary index from the search results on the fly in memory, and getting the top N terms from it with HighFreqTerms. http://lucene.apache.org/core/4_10_3/misc/org/apache/lucene/misc/HighFreqTerms.html (The logic i
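As a rough stand-in for the "tiny temporary index" idea above, the sketch below counts, for a handful of search-result texts held in memory, how many of those results each term appears in. It deliberately does not use Lucene: the class name ResultTermCounts, the method docFreq, and the lowercase-and-split-on-whitespace tokenization are assumptions made for illustration, where a real setup would use an in-memory index and an Analyzer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
public class ResultTermCounts {

    /** Maps each term to the number of result texts that contain it. */
    public static Map<String, Integer> docFreq(List<String> hitTexts) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : hitTexts) {
            // Track terms already seen in this text so each text counts once per term.
            Set<String> seen = new HashSet<>();
            // Assumption: naive tokenization instead of a Lucene Analyzer.
            for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty() && seen.add(token)) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

The resulting map covers only the documents that matched the query, so ranking its entries by count gives the high-frequency terms of the result set and not of the whole index.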