You still have the query parser's parsing before analysis to deal with, no
matter what magic you code in your analyzer.
-- Jack Krupansky
-Original Message-
From: Tom
Sent: Friday, December 21, 2012 2:24 PM
To: java-user@lucene.apache.org
Subject: Re: Which token filter can combine 2
: Unfortunately, no...I am not combining every two terms into one. I am
: combining a specific pair.
I'm confused ... you've already said that you expect you will need a
custom filter because your use case is very special -- and you haven't
given us many details about exactly when/why/how you want t
On Fri, Dec 21, 2012 at 9:16 AM, Jack Krupansky wrote:
> And to be more specific, most query parsers will have already separated
> the terms and will call the analyzer with only one term at a time, so no
> term recombination is possible for those parsed terms, at query time.
>
Most analyzers will
And to be more specific, most query parsers will have already separated the
terms and will call the analyzer with only one term at a time, so no term
recombination is possible for those parsed terms, at query time.
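A minimal sketch of the point above, in plain Java with no Lucene dependency. The analyze method is a hypothetical stand-in for an analyzer chain that joins "t2" followed by "t2a", and parseThenAnalyze imitates a query parser that splits on whitespace first and analyzes each term separately. Both names are illustrative assumptions, not Lucene APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerTermAnalysisSketch {

    // Hypothetical analyzer stand-in: split on whitespace, then join the
    // adjacent pair "t2" "t2a" into the single token "t2t2a".
    static List<String> analyze(String text) {
        List<String> in = Arrays.asList(text.trim().split("\\s+"));
        List<String> out = new ArrayList<>();
        for (int i = 0; i < in.size(); i++) {
            boolean pair = i + 1 < in.size()
                    && in.get(i).equals("t2")
                    && in.get(i + 1).equals("t2a");
            if (pair) {
                out.add("t2t2a");
                i++; // skip the second half of the pair
            } else {
                out.add(in.get(i));
            }
        }
        return out;
    }

    // Imitates a query parser that separates terms before analysis:
    // the analyzer only ever sees one term at a time.
    static List<String> parseThenAnalyze(String query) {
        List<String> out = new ArrayList<>();
        for (String term : query.trim().split("\\s+")) {
            out.addAll(analyze(term));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("t1 t2 t2a t3"));          // [t1, t2t2a, t3]
        System.out.println(parseThenAnalyze("t1 t2 t2a t3")); // [t1, t2, t2a, t3]
    }
}
```

When the whole text reaches the analyzer, the pair is joined; when the terms arrive one by one, the joining logic never sees two tokens together, so nothing is recombined.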
-- Jack Krupansky
-Original Message-
From: Erick Erickson
Sent: Friday
If it's a fixed list and not excessively long, would synonyms work?
But if there's some kind of logic you need to apply, I don't think you're
going to find anything OOB.
The problem is that by the time a token filter gets called, the tokens are
already split up, so you'll probably have to write a custom fil
On Thu, Dec 20, 2012 at 3:54 PM, Wu, Stephen T., Ph.D.
wrote:
>> If you stuff the end of the span into the payload you'd have to create
>> a custom variant of PhraseQuery to properly match based on the end
>> span.
>
> How different is this from the functionality already available through
> SpanQ
Unfortunately, no...I am not combining every two terms into one. I am
combining a specific pair.
E.g. the Token Stream: t1 t2 t2a t3
should be rewritten into t1 t2t2a t3
But the TS: t1 t2 t3 t2a
should not be rewritten, and it is already correct
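The rule in this example can be sketched in plain Java as a one-token lookahead over a list of tokens. This is only an illustration of the logic, not a Lucene TokenFilter; the class and method names (PairJoinSketch, joinPair) are hypothetical. A real filter would have to apply the same lookahead inside incrementToken(), buffering one token and fixing up offsets and position increments, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PairJoinSketch {

    // Joins `first` and `second` into one token only when `second`
    // immediately follows `first`; all other tokens pass through unchanged.
    static List<String> joinPair(List<String> tokens, String first, String second) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            String current = tokens.get(i);
            boolean hasNext = i + 1 < tokens.size();
            if (hasNext && current.equals(first) && tokens.get(i + 1).equals(second)) {
                out.add(first + second);
                i += 2; // consume both halves of the pair
            } else {
                out.add(current);
                i += 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Adjacent pair: rewritten.
        System.out.println(joinPair(Arrays.asList("t1", "t2", "t2a", "t3"), "t2", "t2a"));
        // Pair separated by another token: left as-is.
        System.out.println(joinPair(Arrays.asList("t1", "t2", "t3", "t2a"), "t2", "t2a"));
    }
}
```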
On Fri, Dec 21, 2012 at 5:00 PM, Alan Woodward <
ala
Have a look at ShingleFilter:
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/analysis/shingle/ShingleFilter.html
On 21 Dec 2012, at 08:42, Xi Shen wrote:
> I have to use the white space and word delimiter to process the input
> first. I tried many combinations, and it seems to me
I have to use the white space and word delimiter to process the input
first. I tried many combinations, and it seems to me that it is inevitable
the term will be split into two :(
I think developing my own filter is the only resolution...but I just cannot
find a guide to help me understand what I n
Easiest way would be to pre-process your input and join those 2 tokens
before splitting them by white space.
But from the given context I might be missing some details...still worth a shot.
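A minimal sketch of that pre-processing idea in plain Java, assuming the pair is known in advance and the raw input is available as a string before it reaches the whitespace tokenizer. The class and method names (PreJoinSketch, preJoin) are hypothetical. In Lucene something similar could probably be done with a CharFilter placed in front of the tokenizer, but that wiring is not shown.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PreJoinSketch {

    // Rewrites "first<whitespace>second" to "firstsecond" in the raw text,
    // matching only whole whitespace-delimited words.
    static String preJoin(String text, String first, String second) {
        Pattern pair = Pattern.compile(
                "(?<!\\S)" + Pattern.quote(first)   // `first` starts a word
                + "\\s+"
                + Pattern.quote(second) + "(?!\\S)"); // `second` ends a word
        return pair.matcher(text)
                .replaceAll(Matcher.quoteReplacement(first + second));
    }

    public static void main(String[] args) {
        System.out.println(preJoin("t1 t2 t2a t3", "t2", "t2a")); // t1 t2t2a t3
        System.out.println(preJoin("t1 t2 t3 t2a", "t2", "t2a")); // t1 t2 t3 t2a
    }
}
```

The lookarounds keep the match from firing inside longer words, so "xt2 t2a" is left alone. Note that this only helps at index time or wherever the full text is visible; it does not change how a query parser splits terms.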
On Fri, Dec 21, 2012 at 9:50 AM, Xi Shen wrote:
> Hi,
>
> I am looking for a token filter that can combine 2 terms