I want to implement an analyzer which uses WhitespaceAnalyzer first,
then my TokenFilter. But my filter needs global information about the
tokens, such as how many times each token occurs. So in the tokenStream
method, I iterate over the TokenStream to get everything I need, then
pass this information to my own TokenFilter.
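The two-pass approach described above can be sketched roughly as follows. This assumes the Lucene 3.x attribute-based TokenStream API (TermAttribute), and `MyCountAwareFilter` is a hypothetical stand-in for your own filter:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

public class TwoPassAnalyzer extends Analyzer {
  private final Analyzer delegate = new WhitespaceAnalyzer();

  @Override
  public TokenStream tokenStream(String fieldName, Reader reader) {
    String text = readFully(reader);

    // Pass 1: tokenize once and collect global term frequencies.
    Map<String, Integer> counts = new HashMap<String, Integer>();
    TokenStream first = delegate.tokenStream(fieldName, new StringReader(text));
    TermAttribute term = first.addAttribute(TermAttribute.class);
    try {
      while (first.incrementToken()) {
        String t = term.term();
        Integer c = counts.get(t);
        counts.put(t, c == null ? 1 : c + 1);
      }
      first.close();
    } catch (IOException e) {
      throw new RuntimeException(e);
    }

    // Pass 2: tokenize again, handing the global counts to your filter.
    TokenStream second = delegate.tokenStream(fieldName, new StringReader(text));
    return new MyCountAwareFilter(second, counts); // hypothetical filter
  }

  private static String readFully(Reader reader) {
    StringBuilder sb = new StringBuilder();
    char[] buf = new char[1024];
    try {
      for (int n; (n = reader.read(buf)) != -1; ) {
        sb.append(buf, 0, n);
      }
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
    return sb.toString();
  }
}
```

The cost is tokenizing the text twice; buffering the reader into a String is what makes the second pass possible.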
> Is there a fundamental difference between
>
> PhraseQuery query = new PhraseQuery();
> query.add(term1, 0);
> query.add(term2, 0);
>
> and
>
> MultiPhraseQuery query = new MultiPhraseQuery();
> query.add( new Term[] { term1, term2 } );
>
> The only difference I could think of is that MPQ som
> I'm having a problem with searching phrases using the Surround
> Query Parser, so
> let's look at input surround queries (test examples):
> 1. "yellow orange"
> 2. lemon 2n ("yellow orange") 4n banana
> where 2n, 4n are within connectors.
You don't need PhraseQuery when you already have SpanNearQuery.
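The second example query could be built directly with span queries. A sketch, assuming SpanNearQuery's (clauses, slop, inOrder) constructor; the field name is made up, and the exact mapping of the surround 2n/4n distances onto slop values is an approximation:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;

public class SurroundAsSpans {
  public static void main(String[] args) {
    String f = "body"; // hypothetical field

    // "yellow orange": an exact, in-order phrase (slop 0).
    SpanNearQuery phrase = new SpanNearQuery(new SpanQuery[] {
        new SpanTermQuery(new Term(f, "yellow")),
        new SpanTermQuery(new Term(f, "orange")) }, 0, true);

    // lemon 2n (...): unordered proximity, distance 2.
    SpanNearQuery lemonNear = new SpanNearQuery(new SpanQuery[] {
        new SpanTermQuery(new Term(f, "lemon")), phrase }, 2, false);

    // ... 4n banana: unordered proximity, distance 4.
    SpanNearQuery full = new SpanNearQuery(new SpanQuery[] {
        lemonNear, new SpanTermQuery(new Term(f, "banana")) }, 4, false);

    System.out.println(full);
  }
}
```

Nesting the inner phrase as a clause of the outer SpanNearQuery is what lets the quoted phrase participate in the proximity constraints.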
Also, are you indexing largish documents? Lucene must fully index the
doc, and then flush, so for such large docs it can easily use more
than the 50 MB buffer you allotted.
There were some recent memory leak fixes for such large documents, as
well, that you might be hitting. Which Lucene version are you using?
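If the buffer is the limiting factor, it can simply be raised. A minimal sketch, assuming the Lucene 3.x API where the setting lives on IndexWriter (it moved to IndexWriterConfig in later versions):

```java
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class BufferConfig {
  public static void main(String[] args) throws Exception {
    Directory dir = new RAMDirectory();
    IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(),
        IndexWriter.MaxFieldLength.UNLIMITED);
    // Give the writer more headroom than the 50 MB mentioned above.
    writer.setRAMBufferSizeMB(128.0);
    writer.close();
  }
}
```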
Hello,
I am a bit confused by the two.
Is there a fundamental difference between
PhraseQuery query = new PhraseQuery();
query.add(term1, 0);
query.add(term2, 0);
and
MultiPhraseQuery query = new MultiPhraseQuery();
query.add( new Term[] { term1, term2 } );
The only difference I could think of is that MPQ som
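As I understand the semantics, there is a real difference: a Term[] added to MultiPhraseQuery at one position is a disjunction (either term may appear there), while PhraseQuery with two terms both at position 0 requires both terms at the same position. A sketch with made-up field and term values:

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.MultiPhraseQuery;
import org.apache.lucene.search.PhraseQuery;

public class PhraseVsMultiPhrase {
  public static void main(String[] args) {
    // Hypothetical field and terms for illustration.
    Term term1 = new Term("body", "quick");
    Term term2 = new Term("body", "fast");

    // Both terms at position 0: a match needs term1 AND term2
    // occurring at the same document position.
    PhraseQuery pq = new PhraseQuery();
    pq.add(term1, 0);
    pq.add(term2, 0);

    // One Term[] at one position: a match needs term1 OR term2
    // at that position (useful for synonyms or stemming variants).
    MultiPhraseQuery mpq = new MultiPhraseQuery();
    mpq.add(new Term[] { term1, term2 });

    System.out.println(pq);
    System.out.println(mpq);
  }
}
```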
What is the problem you're seeing? Maybe a stack trace?
You haven't told us what the incorrect behavior is.
Best
Erick
On Fri, May 28, 2010 at 12:52 AM, Li Li wrote:
> I want to analyze a text twice so that I can get some statistical
> information from the text
>
Hello
I'm having a problem with searching phrases using the Surround Query Parser, so
let's look at input surround queries (test examples):
1. "yellow orange"
2. lemon 2n ("yellow orange") 4n banana
where 2n, 4n are within connectors.
You see, I surrounded "yellow orange" with quotes to let the par
It seems like there should be a formula for estimating the total
number of unique terms, given the unique term counts for each segment
and certain assumptions such as random document distribution across
segments.
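A rough sketch of what such a formula could look like: hard bounds that always hold, plus a capture-recapture (Lincoln-Petersen) estimate under the random-distribution assumption. The two-segment overlap count is something you would have to measure or sample; this is an illustration, not anything Lucene provides:

```java
public class UniqueTermEstimate {

  // The true union size is never below the largest segment's count...
  public static long lowerBound(long[] perSegmentUnique) {
    long max = 0;
    for (long u : perSegmentUnique) max = Math.max(max, u);
    return max;
  }

  // ...and never above the sum of all segments' counts.
  public static long upperBound(long[] perSegmentUnique) {
    long sum = 0;
    for (long u : perSegmentUnique) sum += u;
    return sum;
  }

  // Lincoln-Petersen capture-recapture estimate for two segments:
  // if each segment draws terms roughly at random from the same
  // vocabulary, totalUnique is approximately u1 * u2 / overlap.
  public static double lincolnPetersen(long u1, long u2, long overlap) {
    if (overlap <= 0) throw new IllegalArgumentException("need overlap > 0");
    return (double) u1 * u2 / overlap;
  }

  public static void main(String[] args) {
    long[] counts = { 1000, 1200, 800 };
    System.out.println(lowerBound(counts));               // 1200
    System.out.println(upperBound(counts));               // 3000
    System.out.println(lincolnPetersen(1000, 1200, 600)); // 2000.0
  }
}
```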
-Yonik
http://www.lucidimagination.com
On Thu, May 27, 2010 at 9:17