I know of no way to do this with the standard analyzers unless you do some
fooling around.

I think you'd have to write your own analyzer/tokenizer that breaks the input
stream up the way you want. In this case, A B would be a SINGLE token, A C
likewise, and D would be a single token too. Your index would then contain
what you want. You'd have to use the same analyzer at query parsing time as
at indexing time.
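
To make the splitting rule concrete, here's a rough sketch in plain Java of
what such a tokenizer would need to do. It is NOT a Lucene Tokenizer, just the
splitting logic you'd wrap in one; the class and method names (LocationTokens,
split) are made up for illustration, and I'm assuming phrases are always
delimited by double quotes as in your example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: shows only the splitting rule.
public class LocationTokens {
    // Either a double-quoted phrase (group 1) or a bare run of non-space characters (group 2).
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    /** Splits raw field text into tokens, keeping each quoted phrase as one token. */
    public static List<String> split(String raw) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(raw);
        while (m.find()) {
            String phrase = m.group(1);
            // Collapse whitespace inside a phrase so "A  B" and "A B" yield the same token.
            String token = (phrase != null)
                    ? phrase.trim().replaceAll("\\s+", " ")
                    : m.group(2);
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [A B, A C, D]
        System.out.println(split("\"A B\" \"A C\" D"));
    }
}
```

The same split() would have to run on the query text too, so that a search
for "A B" produces the single token A B and a search for A produces only A.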

Alternatively, you could join the words of each phrase with a special
character (again when reading the input for both the indexing process and the
searching process) and then use normal analyzers. In this case, you'd index
A_B, A_C, and D. Searching for A_B, A_C, and D should then be hits, while A
would not. I like this quite a lot better than fooling around with a custom
tokenizer now that I think of it.
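
Something along these lines is what I have in mind for the substitution step.
It's a minimal sketch in plain Java with no Lucene calls; PhraseJoiner and
join are names I made up, and it assumes phrases arrive wrapped in double
quotes. You'd run it over the text before handing it to the analyzer, both
when indexing and when building the query.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing helper, not part of Lucene.
public class PhraseJoiner {
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    /** Replaces each quoted phrase with its words joined by the given character. */
    public static String join(String raw, char joiner) {
        // Quote the joiner so characters such as '$' are taken literally by replaceAll.
        String safeJoiner = Matcher.quoteReplacement(String.valueOf(joiner));
        Matcher m = QUOTED.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String joined = m.group(1).trim().replaceAll("\\s+", safeJoiner);
            m.appendReplacement(out, Matcher.quoteReplacement(joined));
        }
        m.appendTail(out); // copy whatever follows the last quoted phrase
        return out.toString();
    }

    public static void main(String[] args) {
        // prints A_B A_C D
        System.out.println(join("\"A B\" \"A C\" D", '_'));
    }
}
```

A query for "A B" becomes A_B and matches the indexed token, while a query
for plain A is left alone and finds nothing, provided the analyzer keeps A_B
in one piece.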

You have to be a bit careful though. If you use StandardAnalyzer in this
case, I *think* it'll split the input on the underscore, so either use some
other character that doesn't get broken up, or use a different analyzer, say
the WhitespaceAnalyzer.
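
Here's a toy illustration of why the choice of joining character matters. The
two methods below are stand-ins I wrote for this mail, NOT the real analyzers:
one splits only on whitespace, the other on anything that isn't a letter or
digit. The actual StandardAnalyzer grammar is more involved, so treat this
only as a picture of the failure mode, not as a statement of what it does.

```java
import java.util.Arrays;
import java.util.List;

// Toy stand-ins for two tokenization strategies; neither is a Lucene class.
public class JoinerCheck {
    /** Splits on runs of whitespace only, so A_B stays in one piece. */
    static List<String> splitOnWhitespace(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    /** Splits on anything that is not a letter or digit, so A_B falls apart. */
    static List<String> splitOnNonAlphanumerics(String text) {
        return Arrays.asList(text.trim().split("[^\\p{L}\\p{N}]+"));
    }

    public static void main(String[] args) {
        String indexed = "A_B A_C D";
        System.out.println(splitOnWhitespace(indexed));       // prints [A_B, A_C, D]
        System.out.println(splitOnNonAlphanumerics(indexed)); // prints [A, B, A, C, D]
    }
}
```

With the second kind of splitting, A ends up in the index as its own token
and a search for A would hit, which is exactly what you're trying to avoid.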

Oh, and be sure to get a copy of Luke to look at your initial tries at this
to see if what you actually index is what you *think* you're indexing. I've
been confused by this more than once <G>....

Best
Erick

On 9/11/06, Leandro Saad <[EMAIL PROTECTED]> wrote:

Hi all,

I have a field called "location" on my index. For example, this string:
"A B" "A C" D was stored on my index.
When I search for "location: ", these are the results that I'd like to
retrieve:
1) location: D -- 1 hit
2) location: A -- no hits
3) location: "A B" -- 1 hit
4) location: "A C" -- 1 hit

Is there any way I can make this work?

--
Leandro Rodrigo Saad Cruz
software developer - certified scrum master
:: scrum.com.br
:: db.apache.org/ojb
:: guara-framework.sf.net
:: xingu.sf.net

