On Sat, Aug 20, 2011 at 7:00 PM, Robert Muir wrote:
> On Sat, Aug 20, 2011 at 3:34 AM, Trejkaz wrote:
>
>>
>> As an aside, Google's behaviour seems to follow the "old" way. For
>> instance, [[ 限定 ]] returns 640,000,000 hits and [[ 限 定 ]] returns
>> 772,000,000. (Interestingly, [[ "限定" ]] return
Hi Xiyang,
You should use KeywordAnalyzer, as it treats the entire string (a multi-word
phrase in your case) as a single token, without splitting it into its
constituent words.
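
For illustration only, here is a minimal plain-Java sketch of the two behaviours
under discussion. It does not use the Lucene API: the class PhraseSketch and the
methods keywordStyle and dictionaryStyle are hypothetical names made up for this
example, and the greedy longest-match strategy is an assumption about how
dictionary phrases could be merged, not something Lucene does out of the box.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of Lucene.
class PhraseSketch {

    // Keyword-style: the whole input comes back as exactly one token.
    static List<String> keywordStyle(String text) {
        List<String> out = new ArrayList<>();
        out.add(text);
        return out;
    }

    // Dictionary-style (assumed approach): split on whitespace, then greedily
    // merge the longest run of up to maxWords words found in the dictionary.
    // Matching is exact and case-sensitive; punctuation stays attached to words.
    static List<String> dictionaryStyle(String text, Set<String> phrases, int maxWords) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.split("\\s+");
        int i = 0;
        while (i < words.length) {
            String token = words[i];
            int matched = 1;
            // Try the longest candidate first, shrinking down to two words.
            for (int n = Math.min(maxWords, words.length - i); n > 1; n--) {
                String candidate = String.join(" ", Arrays.copyOfRange(words, i, i + n));
                if (phrases.contains(candidate)) {
                    token = candidate;
                    matched = n;
                    break;
                }
            }
            out.add(token);
            i += matched;
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> phrases = new HashSet<>(Arrays.asList("brown fox"));
        String sentence = "The quick brown fox jumps over the lazy dog.";
        System.out.println(keywordStyle("brown fox"));
        System.out.println(dictionaryStyle(sentence, phrases, 2));
    }
}
```

The keyword-style method suits fields whose entire value is one phrase; the
dictionary-style method is closer to picking phrases out of running text.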
Thanks,
Govind
On Mon, Aug 22, 2011 at 1:23 AM, Xiyang Chen wrote:
> Hi,
>
> I have a dictionary of multi-word phrases and I'd like to a
Hi,
I have a dictionary of multi-word phrases and I'd like to analyze documents
such that anything that appears in the dictionary will be treated as one single
token.
For example, if the dictionary contains "brown fox", then the sentence
The quick brown fox jumps over the lazy dog.
Will be tok
Closed! TeeSinkTokenFilter and CachingTokenFilter seem to provide the
functionality/code examples I was looking for.
Thanks, graham.
-- Forwarded message --
From: Graham Sugden
Date: Thu, Aug 18, 2011 at 5:23 PM
Subject: Multiple fields derived from same source text?
To: java-use