I have a set of tags associated with content in my corpus. I also have normal text. Our system tries to figure out which "words" are tags and which are text, and falls back on text when tags fail. I'm wondering, is there anything in Lucene which might help disambiguate multi-word tags from text? Specific example:
Tags: "post office", "office" Search: post office mail In this case, I would like something that would indicate that the search could be one of (in scored order): "post office" + mail post + office + mail I realize it's a strange request and that I'm essentially asking Lucene to perform a combinatorically problematic operation. --Max