(I'll also test that BooleanQuery works as presumed in my case...)
Indeed it works beautifully with a BooleanQuery.
I've updated the patch to LUCENE-1380.
~mck
--
"If you have any trouble sounding condescending, find a Unix user to
show you how it's done." S
e...)
Would such a rewrite of the ShingleFilter patch be a substitute for the
custom Analyzer you talk about?
(I'm pushing to keep any patch restricted to the ShingleFilter, since my
gut feeling is still that that's where the change in behaviour belongs.)
~mck
--
"Between two evils, I always
he tokens' positionIncrements)
equals one. That's even tougher to achieve.
~mck
--
"Driving ambition is the last refuge of the failure." Oscar Wilde
| semb.wever.org | sesat.no | sesam.no |
signature.asc
Description: This is a digitally signed message part
that presumption correct?
~mck
--
"Great spirits have always encountered violent opposition from mediocre
minds. The mediocre mind is incapable of understanding the man who
refuses to bow blindly to conventional prejudices and chooses instead to
express his opinions courageously
> [snip] The option thus should be named something like
> "coterminalPositionIncrement". This seems like a reasonable addition,
> and a patch likely would be accepted, if it included unit tests.
Done.
https://issues.apache.org/jira/browse/LUCENE-1380
~mck
--
"The only
want.
(It returns one hit if TextField and zero hits if StrField, the same
behaviour I mentioned before).
~mck
--
"Traveller, there are no paths. Paths are made by walking." Australian
Aboriginal saying
ible and someone can say how this would be a real godsend.
Otherwise would a patch to ShingleFilter that offers an option
"unigramPositionIncrement" (that defaults to 1) likely be accepted into
trunk?
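
To make the question concrete, here is a rough Python sketch (not Lucene code) of how position increments turn into token positions. It assumes the usual shingle layout, where each unigram advances the position by its increment and the shingles starting at that unigram get an increment of 0. The parameter `unigram_position_increment` only models the option proposed above. Its exact semantics are an assumption here, as is the choice to leave the first token's increment at 1. The function names are hypothetical.

```python
def shingle_stream(text, max_shingle_size=2, unigram_position_increment=1):
    """Return (term, position_increment) pairs for unigrams and shingles."""
    tokens = text.split()
    stream = []
    for start in range(len(tokens)):
        for size in range(1, max_shingle_size + 1):
            if start + size > len(tokens):
                break
            term = " ".join(tokens[start:start + size])
            if size > 1:
                inc = 0  # a shingle stacks on the unigram it starts with
            elif start == 0:
                inc = 1  # assumption: the first token keeps its increment
            else:
                inc = unigram_position_increment  # hypothetical option
            stream.append((term, inc))
    return stream


def absolute_positions(stream):
    """A position is the running sum of increments, counted from -1."""
    position = -1
    placed = []
    for term, inc in stream:
        position += inc
        placed.append((term, position))
    return placed


for increment in (1, 0):
    stream = shingle_stream("abcd efgh ijkl", unigram_position_increment=increment)
    print(increment, absolute_positions(stream))
```

With an increment of 1 the unigrams land on positions 0, 1 and 2. With an increment of 0 every token lands on position 0.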
~mck
--
"Between two evils, I always pick the one I never tried bef
ts!)
When the index contains 9 entries:
"abcd efgh ijkl", "abcd efgh", "efgh ijkl", "abcd", "efgh", "ijkl", "ijkl
efgh", "efgh abcd", and "ijkl efgh abcd".
Does this MultiPhraseQuery actually require a matc
abcd efgh ijkl) (efgh efgh ijkl) ijkl"
I'm struggling to make sense of this.
How can the shingles be matched if they aren't quoted?
I would instead expect a Query like:
abcd "abcd efgh" "abcd efgh ijkl" efgh "efgh ijkl" ijkl
(This with the ShingleFilter disabled does indeed work perfectly).
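
As an illustration of the query I mean, here is a minimal Python sketch (not Lucene code; `phrased_shingle_query` is a hypothetical name) that builds it from the raw input. It emits every unigram as-is and wraps every multi-token shingle in quotes. The maximum shingle size of 3 is an assumption chosen to match the example.

```python
def phrased_shingle_query(text, max_shingle_size=3):
    """Return unigrams and quoted shingles, grouped by starting token."""
    tokens = text.split()
    clauses = []
    for start in range(len(tokens)):
        # Grow the shingle one token at a time, up to max_shingle_size.
        for size in range(1, max_shingle_size + 1):
            end = start + size
            if end > len(tokens):
                break
            shingle = " ".join(tokens[start:end])
            # Quote multi-token shingles so they read as phrases.
            clauses.append(shingle if size == 1 else f'"{shingle}"')
    return " ".join(clauses)


print(phrased_shingle_query("abcd efgh ijkl"))
```

For the input `abcd efgh ijkl` this prints the query written out above.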
Am I barking up the wrong tree?
Is there a way to get the shingles phrased?
Or, better yet, is there a way to get the shingles surrounded with ^ $
begin/end markers for exact matching?
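
To show what I mean by the markers, here is a small Python sketch (again not Lucene code) that treats a phrase match as a contiguous run of tokens and compares plain matching with marker-anchored matching. It uses the nine index entries listed earlier in the thread. Using the literal tokens `^` and `$` as markers is an assumption, as are all the function names.

```python
ENTRIES = [
    "abcd efgh ijkl", "abcd efgh", "efgh ijkl", "abcd", "efgh", "ijkl",
    "ijkl efgh", "efgh abcd", "ijkl efgh abcd",
]


def contains_phrase(tokens, phrase):
    """True if phrase occurs as a contiguous run inside tokens."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def with_markers(text):
    """Wrap the token list in hypothetical begin/end marker tokens."""
    return ["^"] + text.split() + ["$"]


def phrase_hits(query, anchored):
    """Return the entries matched by query, optionally anchored by markers."""
    if anchored:
        wanted = with_markers(query)
        return [e for e in ENTRIES if contains_phrase(with_markers(e), wanted)]
    wanted = query.split()
    return [e for e in ENTRIES if contains_phrase(e.split(), wanted)]


print(phrase_hits("abcd efgh", anchored=False))
print(phrase_hits("abcd efgh", anchored=True))
```

Without markers the phrase `abcd efgh` also hits the longer entry `abcd efgh ijkl`. With markers on both the indexed value and the query, only the exact entry `abcd efgh` is hit.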
~mck