: Doc A has keywords "Car Dealer", "Car Repair"
: Doc B has keywords "Car Washing", "Car Clean"
:
: I have a "Optional Keywords" list that contains keywords like "Dealer".
:
: If my query is "Car Repair" should only match Doc A.
: If my query is "Car", should match "Car Dealer", because "Dealer" is an
: optional keyword, but if the query is only "Dealer", no documents should be
: matched.
You've provided a few examples of things you want to see happen, but with
odd use cases like this you really have to think hard about what *else* you
want to see happen, or not happen, in various situations.
For instance: in your example above, it sounds like you would expect "Car
Clean" to match DocB, correct? what about just "Clean"? ... if i'm
understanding you correctly you *don't* want the word "Clean" to match
either of those docs.
but what about "car clean" (lowercase) or "Car Clean " (extra
whitespace)? what should those match?
I suspect that a way to restate your goals is...
1) i want to use some basic analysis (eg: standard tokenizer
or whitespace tokenizer + lowercase filter + stemming + ...)
2) there are a set of words i want to completely ignore if they
appear in a query
3) except for #1 & #2 i want documents to match only if they
have a field value which contains all of the words in the
query and no other words.
In which case my suggestion would be:
a) set up the tokenizer & token filters that you want
b) add a StopFilterFactory to your analyzer chain containing all
of your words to ignore.
c) add a final TokenFilter that concats all of the tokens in the stream
together using a single whitespace delimiter
"c" is the only thing that Solr doesn't give you out of the box (i thought
we had something to do that, but i can't find it now) so you'd have to
write it as a custom plugin.
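To make the a/b/c chain concrete, here is a rough sketch in plain Python
(not Solr or Lucene code) that simulates it: whitespace tokenizing plus
lowercasing, dropping the ignored words, then joining what is left into a
single token so that a match means the whole analyzed value is equal. The
names analyze, search, IGNORED_WORDS and DOCS are made up for illustration,
stemming is left out, and returning nothing for a query that analyzes to an
empty string is an assumption about the behavior you want.

```python
# Plain-Python simulation of the a/b/c analysis chain; NOT Solr/Lucene code.
# All names here are hypothetical and exist only for this illustration.

IGNORED_WORDS = {"dealer"}  # stand-in for the "Optional Keywords" list


def analyze(text, ignored=IGNORED_WORDS):
    """Run one keyword (or query) through the simulated analyzer chain."""
    # (a) whitespace tokenizer + lowercase filter
    tokens = text.lower().split()
    # (b) stop filter: drop every word on the ignore list
    tokens = [t for t in tokens if t not in ignored]
    # (c) concatenate the surviving tokens with a single space
    return " ".join(tokens)


def search(query, docs):
    """Return the sorted names of docs with a keyword equal to the query
    after both sides have gone through analyze()."""
    analyzed_query = analyze(query)
    if not analyzed_query:
        # Assumption: a query made up only of ignored words matches nothing.
        return []
    return sorted(
        name
        for name, keywords in docs.items()
        if any(analyze(keyword) == analyzed_query for keyword in keywords)
    )


DOCS = {
    "A": ["Car Dealer", "Car Repair"],
    "B": ["Car Washing", "Car Clean"],
}
```

Under those assumptions, "Car Repair" and "Car" both match only Doc A (the
latter via "Car Dealer"), "Dealer" and "Clean" match nothing, and both
"car clean" and "Car Clean " match Doc B.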
-Hoss