Actually, ClassicTokenizer seems to do the job. Are there any side effects of using
ClassicTokenizer rather than StandardTokenizer?
Regards.
Hi,
Is there any exclusion list of characters which can be defined for
StandardTokenizer? In my case, I want to use StandardTokenizer (as it solves
many problems of when to tokenize across languages), but I don't want to
tokenize the stream on certain characters, for example '@'. Is there a wa
So when a query arrives, you know the query is only allowed to match
either module:1 (analyzed terms) or module:2 (not analyzed) but never
both? If so, you should be fine.
Though relevance will be sort of wonky, in case that matters, because
you are polluting the unique term space; you would get
Hmm I didn't realize there was that change in behavior between versions.
But, in 6.3.0, can't you look for a token of type SYNONYM whose
posInc=0 and then know that the previous (posInc>0) token had caused
that synonym? You just need a bit of caching, until all synonyms for
a given token have bee
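The caching idea above could be sketched roughly as follows. This is only an illustration in plain Java: the `Tok` class is a hypothetical stand-in for the term, type and position-increment attributes one would read from a real TokenStream, and `SynonymGrouper` is not a Lucene class. It also assumes the original token is emitted before its synonyms, which may not hold in every version.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one token's term, type and position increment.
final class Tok {
    final String term;
    final String type;
    final int posInc;

    Tok(String term, String type, int posInc) {
        this.term = term;
        this.type = type;
        this.posInc = posInc;
    }
}

// Hypothetical helper: attributes each SYNONYM token with posInc == 0
// to the most recent token that advanced the position (posInc > 0).
final class SynonymGrouper {
    static Map<String, List<String>> group(List<Tok> tokens) {
        Map<String, List<String>> bySource = new LinkedHashMap<>();
        List<String> current = null; // cache for the token currently "open"
        for (Tok t : tokens) {
            if (t.posInc > 0) {
                // A new position starts: this token becomes the source.
                current = bySource.computeIfAbsent(t.term, k -> new ArrayList<>());
            } else if (current != null && "SYNONYM".equals(t.type)) {
                // Same position and typed as a synonym: attach it to the source.
                current.add(t.term);
            }
        }
        return bySource;
    }
}
```

For example, feeding it `fast` (posInc=1), then `quick` and `rapid` (both SYNONYM, posInc=0), then `car` (posInc=1) would group `quick` and `rapid` under `fast` and leave `car` with no synonyms.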
I think you've summed up exactly the differences!
And, yes, it would be possible to emulate hierarchical facets on top
of flat facets, if the hierarchy is fixed depth like year/month/day.
But if it's variable depth, it's trickier (but I think still
possible). See e.g. the Committed Paths drill-d
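For the fixed-depth year/month/day case, one way to picture the emulation is to expand each date into one flat label per level and index each label as an ordinary flat facet value. The sketch below is plain Java; `FlatDatePaths` and the "/" separator are assumptions made for illustration and are not part of Lucene's facet module.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not a Lucene class.
final class FlatDatePaths {
    // Expands a year/month/day value into one flat label per hierarchy level:
    // the year, the year/month prefix, and the full year/month/day path.
    static List<String> expand(int year, int month, int day) {
        String y = String.format(Locale.ROOT, "%04d", year);
        String ym = y + "/" + String.format(Locale.ROOT, "%02d", month);
        String ymd = ym + "/" + String.format(Locale.ROOT, "%02d", day);
        List<String> labels = new ArrayList<>();
        labels.add(y);
        labels.add(ym);
        labels.add(ymd);
        return labels;
    }
}
```

Drilling down to a month would then mean filtering on the `year/month` label and counting the labels one level deeper that share that prefix. With variable depth the number of levels is not known up front, which is where it gets trickier.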
You can do this, Lucene will let you, but it's typically a bad idea
for search relevance because some documents will return only if you
search for precisely the same whole token, others if you search for an
analyzed token, giving the user a broken experience.
Mike McCandless
http://blog.mikemcca
In this work, we aim to improve the field weighting for structured
document retrieval. We first introduce the notion of field relevance as the
generalization of field weights, and discuss how it can be estimated using
relevant documents, which effectively implements relevance feedback for
f
Hi All,
Can anyone say whether it is advisable to have an index with both analyzed and
not_analyzed values in one field?
Use case: I have custom fields in my product which can be configured
differently (ANALYZED and NOT_ANALYZED) in different modules
--
Kumaran R
On Wed, Oct 26, 2016 at 12:0
Hi Nicholas,
Aha, I see that you are into field-based scoring, which is an unsolved problem.
Then, you might find BlendedTermQuery and SynonymQuery relevant.
Ahmet
On Friday, November 18, 2016 12:22 AM, Nicolás Lichtmaier
wrote:
That depends on what you want. In this case I want to use a
On 18.11.2016 at 08:58, Bernd Fehling wrote:
> Hi Mike,
>
> let me explain.
>
> First, after looking deeper inside I noticed that the Filters are used
> like a stack and called backwards. So the first incrementToken goes
> to the last filter in the chain. That one also uses incrementToken and
Hi Mike,
let me explain.
First, after looking deeper inside I noticed that the Filters are used
like a stack and called backwards. So the first incrementToken goes
to the last filter in the chain. That one also uses incrementToken and
calls its predecessor in the chain and so on.
So everythin
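The call order described above can be reproduced with a small stand-alone sketch. It uses plain Java rather than Lucene's TokenStream and TokenFilter classes: `SketchStream`, `SketchSource`, `SketchFilter` and `SketchChainDemo` are hypothetical names, and `next()` plays the role of incrementToken by returning null once the input is exhausted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a token stream; next() returns null at the end.
interface SketchStream {
    String next();
}

// The head of the chain: hands out raw tokens and records that it was called.
final class SketchSource implements SketchStream {
    private final Iterator<String> words;
    private final List<String> log;

    SketchSource(List<String> words, List<String> log) {
        this.words = words.iterator();
        this.log = log;
    }

    @Override
    public String next() {
        log.add("source");
        return words.hasNext() ? words.next() : null;
    }
}

// A filter: records the call, pulls one token from its predecessor,
// then transforms that token on the way back out.
final class SketchFilter implements SketchStream {
    private final String name;
    private final SketchStream input;
    private final UnaryOperator<String> fn;
    private final List<String> log;

    SketchFilter(String name, SketchStream input, UnaryOperator<String> fn, List<String> log) {
        this.name = name;
        this.input = input;
        this.fn = fn;
        this.log = log;
    }

    @Override
    public String next() {
        log.add(name);
        String token = input.next();
        return token == null ? null : fn.apply(token);
    }
}

final class SketchChainDemo {
    // Builds source -> trim -> lowercase and pulls one token from the last filter.
    static List<String> callOrder() {
        List<String> log = new ArrayList<>();
        SketchStream source = new SketchSource(Arrays.asList(" Hello ", "World"), log);
        SketchStream trim = new SketchFilter("trim", source, String::trim, log);
        SketchStream lower = new SketchFilter("lowercase", trim, s -> s.toLowerCase(Locale.ROOT), log);
        lower.next();
        return log;
    }
}
```

A single pull on the last filter records the calls in the order lowercase, trim, source: the consumer only ever talks to the end of the chain, and each stage pulls from the one before it.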