Hi All, I've been using hit highlighting for some time for non-English search. I'm indexing the fields like this:

    Document doc = new Document();
    doc.add(new Field(contentField, pageContent, Field.Store.YES, Field.Index.TOKENIZED));
    doc.add(new Field(idField, pageId, Field.Store.YES, Field.Index.TOKENIZED));

For searching, bundled with hit highlighting, I form the query as a phrase query like this:

    PhraseQuery phrase = new PhraseQuery();
    String[] termArray = queryTerms.split(" ");
    System.out.println("array size " + termArray.length);
    for (int i = 0; i < termArray.length; i++) {
        System.out.println("adding " + termArray[i]);
        phrase.add(new Term("content", termArray[i]));
    }

Then I instantiate a searcher as follows, with a given trueIndexPath:

    String searchField = "content";
    IndexSearcher searcher = new IndexSearcher(trueIndexPath);
    QueryParser queryParser = null;
    try {
        queryParser = new QueryParser(searchField, new WhitespaceAnalyzer());
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    Hits hits = null;
    try {
        hits = searcher.search(phrase);
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    hitCount = hits.length();

Finally, the following does the hit highlighting. I put the field values of each hit in a HashMap called eachResult and add each of those maps to a bigger collection, resultList:

    SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<span class=\"highlight\">", "</span>");
    QueryScorer scorer = new QueryScorer(phrase);
    Highlighter highlighter = new Highlighter(formatter, scorer);
    for (int i = 0; i < hits.length(); i++) {
        Map eachResult = new HashMap();
        String content = hits.doc(i).get("content");
        TokenStream stream = new WhitespaceAnalyzer().tokenStream("content", new StringReader(content));
        String fragment = highlighter.getBestFragments(stream, content, 3, "...");
        System.out.println(fragment);
        eachResult.put("id", hits.doc(i).get("id"));
        eachResult.put("content", fragment);
        resultList.add(eachResult);
    }

Now, I'm not able to limit the search results to a certain number. Say we have 1000 results: we are not going to show them all, so we want to cap that at some lower value, say 30 or 50. Can someone let me know how to limit the search results while keeping everything else intact, i.e. the highlighting? I googled and found something called TopDocs but could not figure out how to plug it into the code fragment above; a good example would be helpful.

As of now I think it's the highlighter that takes the major part of the time consumed by a search, so we could restrict the whole thing to only the results we are going to show on the first page. Any ideas on this are very welcome.

Thank you.
--KK.
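
One way to read the "only the first page" idea above is to bound the highlighting loop by a page window instead of hits.length(). The following is a minimal sketch of that windowing logic only, using plain Java and no Lucene classes. PagedHighlighting, HitSource and page(...) are hypothetical names invented for this illustration; HitSource merely stands in for the Hits object and the Highlighter call from the post. Whether a TopDocs-based search is a better fit depends on the Lucene version in use and is not shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PagedHighlighting {

    /** Hypothetical stand-in for the ranked hits plus the highlighter. */
    interface HitSource {
        int length();               // plays the role of hits.length()
        String id(int rank);        // plays the role of hits.doc(rank).get("id")
        String highlight(int rank); // plays the role of the costly getBestFragments(...) call
    }

    /**
     * Builds result maps for a single zero-based page, so highlight()
     * is called at most pageSize times regardless of the total hit count.
     */
    static List<Map<String, String>> page(HitSource hits, int pageNumber, int pageSize) {
        List<Map<String, String>> results = new ArrayList<Map<String, String>>();
        if (pageNumber < 0 || pageSize <= 0) {
            return results; // nothing sensible to show
        }
        int total = hits.length();
        long start = (long) pageNumber * pageSize; // long avoids int overflow
        if (start >= total) {
            return results; // requested page lies past the last hit
        }
        int end = (int) Math.min(start + pageSize, (long) total);
        for (int rank = (int) start; rank < end; rank++) {
            Map<String, String> eachResult = new HashMap<String, String>();
            eachResult.put("id", hits.id(rank));
            eachResult.put("content", hits.highlight(rank));
            results.add(eachResult);
        }
        return results;
    }

    public static void main(String[] args) {
        // Fake source with 1000 hits, standing in for a real search result.
        HitSource fake = new HitSource() {
            public int length() { return 1000; }
            public String id(int rank) { return "doc" + rank; }
            public String highlight(int rank) { return "fragment for hit " + rank; }
        };
        List<Map<String, String>> firstPage = page(fake, 0, 30);
        System.out.println("results on first page: " + firstPage.size());
    }
}
```

Applied to the loop in the post, this amounts to replacing the bound `i < hits.length()` with `i < Math.min(hits.length(), 30)` for the first page, leaving the formatter, scorer and highlighter set-up unchanged. hitCount can still report the full total to the user.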