OK, I've played with all these solutions and basically only one gave me satisfactory results. Using build() with a TermFreqPayload argument gave me horrible performance: it took more than 5 minutes to iterate through all the Terms in the index and filter them by doc id. I'm not sure whether this nested loop can be optimized further, but my index is only about 30 MB and has around 300K terms.
It turns out that Jack Krupansky's answer was the way to go. I build the AnalyzingSuggester from a LuceneDictionary, which is really fast, and then filter the suggestions further by issuing a query against the index for each one. Here's the code in case anyone is interested:

    // generate AnalyzingSuggestions
    // use existing analyzer
    this.as = new AnalyzingSuggester(analyzer);
    as.load(new FileInputStream(new File(suggsPath)));
    if (as.sizeInBytes() == 0) {
        // nothing was loaded, so build from the "contents" field and persist
        logger.info("Building analyzer suggester...");
        as.build(new LuceneDictionary(reader, "contents"));
        as.store(new FileOutputStream(new File(suggsPath)));
    }

--------------------------------------------------------

    // now, in servlet, for each suggestion fire a query
    List<LookupResult> suggs = as.lookup(q, false, 10); // do not pass true as a second param!
    logger.info("Found " + suggs.size() + " suggestions");
    List<LookupResult> filtered = new ArrayList<LookupResult>();
    for (LookupResult sug : suggs) {
        // keep only suggestions that match at least one document of this user
        if (searchSugg(sug.key.toString(), uid)) {
            filtered.add(sug);
        }
    }
    logger.info("Found " + filtered.size() + " filtered suggestions");

-----------------------------------------------------------------

    public boolean searchSugg(String q, long uid) {
        ...
        if (q == null) {
            logger.warn("Query is null");
            return false;
        }
        if (q.isEmpty()) {
            logger.warn("Query is empty");
            return false;
        }

        Date start = new Date();
        String qStr = q.trim();
        //Query query = parser.parse(qStr);

        // both clauses are required: the suggested term AND the user id
        BooleanQuery query = new BooleanQuery();
        query.add(new BooleanClause(new TermQuery(new Term("contents", qStr)), BooleanClause.Occur.MUST));

        // "userid" is a numeric (long) field, so encode the value in prefix-coded form
        BytesRef ref = new BytesRef();
        NumericUtils.longToPrefixCoded(uid, 0, ref);
        query.add(new BooleanClause(new TermQuery(new Term("userid", ref)), BooleanClause.Occur.MUST));

        logger.info("Searching for: " + query.toString("contents"));

        // one hit is enough, we only need to know whether anything matches
        TopDocs results = searcher.search(query, 1);
        ScoreDoc[] hits = results.scoreDocs;
        int numTotalHits = results.totalHits;
        logger.info(numTotalHits + " total matching documents");

        Date end = new Date();
        long qTime = end.getTime() - start.getTime();
        logger.info("Search took " + qTime + " ms");

        return numTotalHits > 0;
        ...

On Sat, Mar 16, 2013 at 8:54 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

> On Sat, Mar 16, 2013 at 7:47 AM, Bratislav Stojanovic
> <bratislav1...@gmail.com> wrote:
> > Hey Mike,
> >
> > Is this what I should be looking at?
> >
> > https://builds.apache.org/job/Lucene-Artifacts-trunk/javadoc/suggest/org/apache/lucene/search/suggest/analyzing/package-summary.html
> >
> > Not sure how to call build(), i.e. what to pass as a parameter...Any
> > examples?
> > Where to specify my payload (which is "id" long field from the index)?
>
> build() takes a TermFreqPayload iterator, which iterates over the
> weight/input text/payload that you provide.
>
> Have a look at AnalyzingSuggesterTest, eg testKeywordWithPayloads.
>
> Mike McCandless
>
> http://blog.mikemccandless.com

--
Bratislav Stojanovic, M.Sc.