I know of no way of doing this with the standard analyzers, unless you do some fooling around.
I think you'd have to write your own analyzer/tokenizer, used at both indexing time and query-parsing time, that breaks the input stream up the way you want. In this case, A B would be a SINGLE token, A C likewise, and D would be a single token too. Your index would then contain what you want. You'd have to use the same analyzer when searching as when indexing.

Alternatively, you could substitute a special character that strings your input together (again, when reading the input for both the indexing process and the searching process), and then use normal analyzers. In this case, index A_B, A_C, and D. Searching for A_B, A_C and D should then be hits, while A would not. Now that I think of it, I like this quite a lot better than fooling around with a custom tokenizer.

You have to be a bit careful, though. If you use StandardAnalyzer in this case, I *think* it'll split the input on the underscore, so either use some other character that doesn't get broken up, or use a different analyzer, say the WhitespaceAnalyzer.

Oh, and be sure to get a copy of Luke to look at your initial tries at this, to see if what you actually index is what you *think* you're indexing. I've been confused by this more than once <G>....

Best
Erick

On 9/11/06, Leandro Saad <[EMAIL PROTECTED]> wrote:
Hi all,

I have a field called "location" on my index. For example, this string:

"A B" "A C" D

was stored on my index.

When I search for "location: ", these are the results that I'd like to retrieve:

1) location: D -- 1 hit
2) location: A -- no hits
3) location: "A B" -- 1 hit
4) location: "A C" -- 1 hit

Is there any way I can make this work?

--
Leandro Rodrigo Saad Cruz
software developer - certified scrum master
:: scrum.com.br
:: db.apache.org/ojb
:: guara-framework.sf.net
:: xingu.sf.net
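
For what it's worth, the special-character substitution suggested in the reply could be sketched roughly as below. This is plain Java with no Lucene calls: the class LocationTokens and its methods toTokens, normalize and isHit are hypothetical names made up for illustration, the underscore joiner is assumed never to occur in the real data, and isHit only imitates the exact whole-token matching you would hope to get from a whitespace-splitting analyzer. It is not a statement of how Lucene itself behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical pre-processing helper; not part of Lucene. */
final class LocationTokens {

    // Either a double-quoted phrase (group 1) or a bare run of non-space characters (group 2).
    private static final Pattern ENTRY = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    // Placeholder that glues the words of one entry together (assumed absent from real data).
    static final String JOINER = "_";

    /** Splits the raw text into one token per entry, e.g. "A B" becomes A_B. */
    static List<String> toTokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            String entry = m.group(1) != null ? m.group(1) : m.group(2);
            String joined = entry.trim().replaceAll("\\s+", JOINER);
            if (!joined.isEmpty()) {
                tokens.add(joined);
            }
        }
        return tokens;
    }

    /** Text to hand to a whitespace-splitting analyzer, at index time and at query time alike. */
    static String normalize(String input) {
        return String.join(" ", toTokens(input));
    }

    /** Imitates exact whole-token matching: every query token must be one of the field's tokens. */
    static boolean isHit(String fieldValue, String query) {
        List<String> queryTokens = toTokens(query);
        return !queryTokens.isEmpty() && toTokens(fieldValue).containsAll(queryTokens);
    }

    public static void main(String[] args) {
        String field = "\"A B\" \"A C\" D";
        System.out.println(normalize(field));
        for (String query : new String[] {"D", "A", "\"A B\"", "\"A C\""}) {
            System.out.println(query + " -> " + (isHit(field, query) ? "hit" : "no hit"));
        }
    }
}
```

The same normalize step would have to run on the query text before it reaches the query parser, otherwise the quoted phrase and the joined token will not line up. Checking the resulting index in Luke, as suggested in the reply, is still the way to confirm what actually got indexed.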