I’m having an issue searching for an exact phrase with Lucene 4.7.  As a test 
case, I loaded the Declaration of Independence into a Lucene index, one 
document per line.  When I search for “it becomes” I get two hits: one for 
“it, becomes” and another for a line that just has “becomes” at the end of 
the line.

Expected:

“When, in the course of human events, it becomes necessary for one people to 
dissolve the”

Not Expected:

“powers from the consent of the governed. That whenever any form of government 
becomes”
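
To pin down what "exact phrase" means here, a minimal plain-Java sketch follows. It does not use Lucene at all, and the class and method names (ExactPhraseCheck, tokenize, containsPhrase) are made up for illustration. It lowercases each line, splits on anything that is not a letter or digit, and reports a match only when the phrase words sit at adjacent positions in order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: states the expected matching rule.
public class ExactPhraseCheck {

    // Lowercase the text and split on any run of non-alphanumeric characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // True only if all phrase words occur consecutively, in order, in the line.
    static boolean containsPhrase(String line, String phrase) {
        List<String> words = tokenize(line);
        List<String> wanted = tokenize(phrase);
        if (wanted.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(words, wanted) >= 0;
    }

    public static void main(String[] args) {
        String expected = "When, in the course of human events, it becomes "
                + "necessary for one people to dissolve the";
        String notExpected = "powers from the consent of the governed. That "
                + "whenever any form of government becomes";
        System.out.println(containsPhrase(expected, "it becomes"));
        System.out.println(containsPhrase(notExpected, "it becomes"));
    }
}
```

Under this rule only the first line qualifies, since the second contains “becomes” but no “it” in front of it.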

Below is my load code and search code:

File idxLinesFile = new File("test lucene index");
Directory idxLinesDir = FSDirectory.open(idxLinesFile);
Analyzer analyzerLines = new StandardAnalyzer(Version.LUCENE_47);
IndexWriterConfig iwcLines = new IndexWriterConfig(Version.LUCENE_47, analyzerLines);
iwcLines.setOpenMode(idxLinesFile.exists()
        ? IndexWriterConfig.OpenMode.CREATE_OR_APPEND
        : IndexWriterConfig.OpenMode.CREATE);

IndexWriter writerLines = new IndexWriter(idxLinesDir, iwcLines);

// One document per line of text
for (int i = 0; i < arrayListOfLines.size(); i++)
{
     Document docLine = new Document();
     // Key of the form "pageNumber:lineNumber", both zero-padded to six digits
     docLine.add(new StringField("docIndex",
             String.format("%06d", pageNumber) + ":" + String.format("%06d", i),
             Field.Store.YES));
     docLine.add(new TextField("lineText", arrayListOfLines.get(i), Field.Store.YES));

     writerLines.addDocument(docLine);
}

// Search Code

Directory idxDir = FSDirectory.open(idxFile);
IndexReader reader = DirectoryReader.open(idxDir);
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_47);
QueryParser parser = new QueryParser(Version.LUCENE_47, "lineText", analyzer);
parser.setDefaultOperator(QueryParser.AND_OPERATOR);
parser.setPhraseSlop(0);
                
Query query = parser.createPhraseQuery("lineText", "it becomes");
 
TotalHitCountCollector collector = new TotalHitCountCollector();
searcher.search(query, collector);
TopDocs results = searcher.search(query, Math.max(1, collector.getTotalHits()));
ScoreDoc[] hits = results.scoreDocs;
