One simple hack, which may or may not meet your objectives:

1) Index each paragraph as if it were a document. (This would not allow Boolean queries across paragraphs, which could be a problem.)
2) Set the position increment gap to, say, 100, and then index each sentence within the paragraph as another value in a multivalued field. This prevents phrase matches across sentence boundaries, as long as the user is searching with a proximity of less than 100.

Another hack along the lines you mention would be to add an otherwise impossible token such as "SENTENCE" or "PARAGRAPH" at each boundary, and then wrap the user's query in a SpanNotQuery so that matches spanning that token are excluded. LUCENE-5205's SpanOnlyParser might be of use for this.

You may also want to look into the PostingsHighlighter's use of BreakIterator for ideas. It isn't immediately clear to me how that could be used for retrieval, but it does work for highlighting.

-----Original Message-----
From: Jigar Shah [mailto:jigaronl...@gmail.com]
Sent: Monday, April 07, 2014 3:47 AM
To: java-user@lucene.apache.org
Subject: Proximity Search for SENTENCE and PARAGRAPH

Hello all,

I need to implement 2 features in my application:

1. "Proximity for words and phrases within the same sentence"
2. "Proximity for words and phrases within the same paragraph"

Doing some research on the internet, I found the following. There is a "ProximityQueryNode" which has an enum for this, but there seems to be no support for it in the parser. There is no out-of-the-box support or contrib module for such a feature, except https://github.com/markrmiller/qsol, which is not maintained.

There are some suggested workarounds, such as marking sentence/paragraph boundaries and then searching with the SpanQuery API.

Please let me know if any work has been done on such features, or if there is a proven approach.

Thanks,
Jigar Shah
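
For what it's worth, here is a minimal sketch of the idea behind step 2 of the reply above, in plain Java with no Lucene dependency. It uses the JDK's java.text.BreakIterator to split a paragraph into sentences (the values you would add to the multivalued field) and then assigns token positions with a gap of 100 between values, to show why a phrase or proximity match with a distance under 100 cannot span two sentences. The class name, the whitespace tokenization, and the position bookkeeping are my own simplifications for illustration only; in a real index the analyzer's position increment gap would do this for you, and the exact offsets may differ.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
class SentenceGapSketch {

    // Assumed gap, matching the "say, 100" suggested in the reply.
    static final int POSITION_INCREMENT_GAP = 100;

    /** Splits a paragraph into sentences using the JDK's sentence BreakIterator. */
    static List<String> splitSentences(String paragraph) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(paragraph);
        List<String> sentences = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = paragraph.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    /**
     * Simplified model of token positions in a multivalued field:
     * tokens within one sentence get consecutive positions, and the
     * position counter jumps by the gap between sentences.
     */
    static List<int[]> assignPositions(List<String> sentences) {
        List<int[]> result = new ArrayList<>();
        int next = 0;
        for (String sentence : sentences) {
            String[] tokens = sentence.split("\\s+"); // naive whitespace tokenizer
            int[] positions = new int[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                positions[i] = next++;
            }
            result.add(positions);
            next += POSITION_INCREMENT_GAP; // gap before the next value
        }
        return result;
    }

    public static void main(String[] args) {
        String paragraph = "The cat sat. The dog barked. Birds flew away.";
        List<String> sentences = splitSentences(paragraph);
        List<int[]> positions = assignPositions(sentences);
        for (int i = 0; i < sentences.size(); i++) {
            System.out.println(sentences.get(i) + " -> " + Arrays.toString(positions.get(i)));
        }
    }
}
```

In this model, the last token of one sentence and the first token of the next are always at least 100 positions apart, so any proximity query with a smaller distance can only match inside a single sentence.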