Hi Steve,
If I understand you right, I could use something like the Keyword
analyzer to tokenize the entire stream as a single token and store that
in the index. I could definitely use the keyword analyzer while indexing
this particular field "categoryNames".
Now my question is how to search and boost this, since it is part
of a bigger boolean query in my case.
My typical query actually looks like:
+(+content:digit +content:camera) +entity:product +(title:"digit
camera"~2^40.0 ((title:digit title:camera)^10.0) content:"digit
camera"~2^20.0 (content:digit content:camera) categoryNames:"digit
camera"^80.0)
As you can see, I was trying to do a phrase query on the categoryNames
field and boosting it by 80.0.
Also, I am using the Porter stemming filter to stem while searching (I
do this while indexing as well). If I go with the KeywordAnalyzer
approach, I can index the categoryNames field using this analyzer.
Would I then use the QueryParser to create my query, passing it the
keyword analyzer when searching on categoryNames? (And then make
that query part of my global boolean query?)
-Thanks.
Steven Rowe wrote:
Mufaddal Khumri wrote:
Let's say I do this while indexing:
doc.add(Field.Text("categoryNames", categoryNames));
Now while searching categoryNames, I do a search for "digital
cameras". I only want to match documents that have exactly the phrase
"digital cameras" in the categoryNames field. I do not want documents
that have "digital camera batteries" in that field to be part of the
results.
What's the best way to accomplish this?
Hi Mufaddal,
One way to do this is to use the KeywordAnalyzer (in the Lucene
Subversion trunk, but not in v1.4.3; will be in forthcoming v1.9) for
the "categoryNames" field. This analyzer does not tokenize field
contents, so "digital cameras" would be a single token, and the only
thing that would match it would be the exact same single token. Be
careful when you search to construct the search tokens similarly.
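To make the difference concrete, here is a minimal plain-Java sketch of the idea. It does not use the Lucene API; the class and method names (KeywordTokenSketch, wordTokens, keywordTokens, phraseMatches) are hypothetical and only illustrate why a single-token field rejects "digital camera batteries" while a word-tokenized field would accept it as a phrase match. The sample values assume stemming has already turned "cameras" into "camera".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustration only: plain Java, not the Lucene API.
final class KeywordTokenSketch {

    // Word-level tokenization, roughly what a tokenizing analyzer produces.
    static List<String> wordTokens(String value) {
        return Arrays.asList(value.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Keyword-style tokenization: the whole field value is one token,
    // untouched (no lowercasing, no stemming).
    static List<String> keywordTokens(String value) {
        return Collections.singletonList(value);
    }

    // A phrase matches a tokenized field if its words appear consecutively.
    static boolean phraseMatches(List<String> fieldTokens, List<String> phrase) {
        return Collections.indexOfSubList(fieldTokens, phrase) >= 0;
    }

    public static void main(String[] args) {
        String field = "digital camera batteries";
        String query = "digital camera";

        // Tokenized field: the phrase is found inside the longer value.
        System.out.println(phraseMatches(wordTokens(field), wordTokens(query))); // true

        // Single-token field: only an identical value matches.
        System.out.println(keywordTokens(field).equals(keywordTokens(query)));   // false
        System.out.println(keywordTokens(query).equals(keywordTokens(query)));   // true
    }
}
```

Because the single token is compared as-is, the search side has to produce exactly the same string (same case, same stemming or lack of it) as was indexed.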
If you have other fields you want to search, and you want to tokenize
their contents when you index them, you could use the
PerFieldAnalyzerWrapper, so that the KeywordAnalyzer is only used for
the "categoryNames" field.
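The per-field idea can be sketched in plain Java as follows. This is not the Lucene PerFieldAnalyzerWrapper itself; PerFieldSketch and its methods are hypothetical names, and the two "analyzers" are simple stand-ins (a lowercasing word splitter as the assumed default, and a keyword-style function for "categoryNames").

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Illustration only: plain Java, not the Lucene API.
final class PerFieldSketch {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a field-specific analyzer; all other fields use the default.
    void addAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    List<String> analyze(String field, String value) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(value);
    }

    // Stand-in for a tokenizing analyzer: lowercase, split on whitespace.
    static List<String> words(String value) {
        return Arrays.asList(value.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Stand-in for a keyword analyzer: the whole value is one token.
    static List<String> keyword(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        PerFieldSketch analyzers = new PerFieldSketch(PerFieldSketch::words);
        analyzers.addAnalyzer("categoryNames", PerFieldSketch::keyword);

        System.out.println(analyzers.analyze("content", "Digital cameras"));       // [digital, cameras]
        System.out.println(analyzers.analyze("categoryNames", "digital cameras")); // [digital cameras]
    }
}
```

Using the same wrapper for both indexing and searching keeps the two sides consistent: "categoryNames" always yields one token, while every other field is tokenized as usual.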
Steve
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]