Hi all,

I'm using Lucene to extract keywords from a text. The Lucene index is built over a set of predefined words (we call them keywords). A text is then used as the query against that index. The goal is to find out which keywords appear in the given text.
This works fine as long as the defined keywords are single words and not phrases. But what should I do if a keyword is a phrase that has to appear as a whole in the text (so that no part of the phrase may match on its own)? Searching for phrases in quotes would do the job, but I have no prior knowledge of the text being searched.

To make my problem clearer:

Indexed:
  Document1 fieldName:economy
Text to find keywords in:
  An economy consists of the economic system of a country or other area
Search processed:
  fieldName:economy, fieldName:consists, fieldName:economic ... etc
Result:
  Document1 -> keyword economy

But now:

Keyword phrase I want to index:
  Make peace not war
Text to find keywords in:
  Two or three eggs for breakfast make no difference

If indexed as before, this text matches "Make peace not war" because of the "make" in the text. This is what I'm trying to avoid.

Am I overlooking something, or is there a different approach for finding keywords and keyword phrases in a text?

Thanks for your help!
Christian
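
For illustration, the matching behaviour asked for above can be sketched in plain Java, independent of Lucene. This is only a minimal sketch of the requirement, not a Lucene solution: the class name PhraseKeywordMatcher and its methods are hypothetical, and the tokenization (lower-casing, splitting on non-alphanumeric characters) is an assumption that only roughly resembles what a Lucene analyzer would do. A keyword counts as found only if all of its tokens occur in the text as one contiguous run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: reports which keywords or
// keyword phrases occur in a text as a whole.
public class PhraseKeywordMatcher {

    // Assumed tokenization: lower-case, split on anything that is not a letter or digit.
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        for (String t : s.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // True only if 'phrase' occurs in 'text' as a contiguous run of tokens.
    static boolean containsPhrase(List<String> text, List<String> phrase) {
        if (phrase.isEmpty()) {
            return false;
        }
        for (int i = 0; i + phrase.size() <= text.size(); i++) {
            if (text.subList(i, i + phrase.size()).equals(phrase)) {
                return true;
            }
        }
        return false;
    }

    // Returns the keywords (single words or phrases) that appear in the text as a whole.
    static List<String> findKeywords(List<String> keywords, String text) {
        List<String> textTokens = tokenize(text);
        List<String> hits = new ArrayList<>();
        for (String keyword : keywords) {
            if (containsPhrase(textTokens, tokenize(keyword))) {
                hits.add(keyword);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("economy", "Make peace not war");
        // Single-word keyword is found.
        System.out.println(findKeywords(keywords,
                "An economy consists of the economic system of a country or other area"));
        // A lone "make" must not trigger the phrase keyword.
        System.out.println(findKeywords(keywords,
                "Two or three eggs for breakfast make no difference"));
    }
}
```

With the two example texts from above, the first call reports only "economy" and the second reports nothing, because "make" alone does not satisfy the phrase. A linear scan per keyword like this is only meant to pin down the expected results; it says nothing about how to achieve them efficiently with an index.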