I assume you already know this but just to make sure what I meant was clear
- no tokenization but still indexing just means that the entire field's text
becomes a single unchanged token. I believe this is exactly what
SingleTokenTokenStream can buy you - a single token, for which you can
pre-set a payload.
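As a minimal sketch of that approach (assuming the Lucene 2.x contrib
analyzers jar, where SingleTokenTokenStream lives under
org.apache.lucene.analysis.miscellaneous; the field name and payload bytes
here are invented):

  import org.apache.lucene.analysis.Token;
  import org.apache.lucene.analysis.miscellaneous.SingleTokenTokenStream;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;
  import org.apache.lucene.index.Payload;

  // The single token that will stand in for the whole field,
  // with a payload attached up front.
  Token token = new Token("red", 0, 3);
  token.setPayload(new Payload(new byte[] { 42 }));

  // A TokenStream-valued Field bypasses field.stringValue(),
  // so the payload makes it into the index.
  Document doc = new Document();
  doc.add(new Field("color", new SingleTokenTokenStream(token)));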
Thanks for the info. But do you know where this is actually performed in
Lucene? I mean the method involved that calculates the value before storing
it into the index. I traced it to one method known as lengthNorm() in
DefaultSimilarity.java, but the value is different from what is stored in
the index.
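If it helps, that difference is most likely the one-byte norm encoding:
lengthNorm() produces the raw float, but before it is written the value is
squeezed into a single byte by Similarity.encodeNorm(), which is lossy. A
quick sketch of the round trip (Lucene 2.x static methods; the field name
and term count are arbitrary):

  import org.apache.lucene.search.DefaultSimilarity;
  import org.apache.lucene.search.Similarity;

  DefaultSimilarity sim = new DefaultSimilarity();

  float raw = sim.lengthNorm("contents", 10);    // 1/sqrt(10), about 0.3162
  byte stored = Similarity.encodeNorm(raw);      // lossy one-byte encoding, as written to disk
  float decoded = Similarity.decodeNorm(stored); // about 0.3125 - what tools like Luke show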
Norms information comes mainly from the lengths of documents - allowing the
search-time scoring to take into account the effect of document lengths
(actually, field length within a document). In practice, norms stored within
the index may include other information, such as index-time boosts for a
document or a field.
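To make the boost part concrete, a small sketch (Lucene 2.x field API; the
field name, text and boost values are invented) - both boosts get multiplied
into the same stored norm:

  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;

  Document doc = new Document();
  doc.setBoost(1.5f);                      // document-level index-time boost

  Field body = new Field("body", "quick brown fox",
                         Field.Store.NO, Field.Index.TOKENIZED);
  body.setBoost(2.0f);                     // field-level index-time boost
  doc.add(body);

  // What ends up in the index is roughly:
  //   encodeNorm(doc.getBoost() * body.getBoost() * lengthNorm("body", 3))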
Hi,
I am currently using Lucene for indexing. After I index a file, I use LUKE
to open it and check the index. There is one part that I am curious about:
in Luke, under the Document tab, I randomly select a document and display
it. At the bottom there are 4 columns: Field, ITSVopLBC, Norm, and Value.
Hi.
Yes, that method is in Lucene.
I'm sorry that I misunderstood your words.
I hope you find the way to do what you want.
Bye. :)
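In case it helps: isTokenChar() is declared on Lucene's CharTokenizer -
LetterTokenizer and friends each override it - so Nutch picks it up through
Lucene. A minimal sketch of a custom tokenizer, with an invented class name:

  import java.io.Reader;
  import org.apache.lucene.analysis.CharTokenizer;

  public class LetterOrDigitTokenizer extends CharTokenizer {
      public LetterOrDigitTokenizer(Reader in) {
          super(in);
      }

      // Characters for which this returns true are glued together
      // into tokens; everything else acts as a separator.
      protected boolean isTokenChar(char c) {
          return Character.isLetterOrDigit(c);
      }
  }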
2008/8/16, Mr Shore <[EMAIL PROTECTED]>:
>
> thanks, Jang
> but I didn't find the method isTokenChar
> maybe it's in Lucene, right?
> but I'm using Nutch this time
>
> Implementing payloads via Tokens explicitly prevents the use of payloads
> for untokenized fields, as they only support field.stringValue(). There
seems to be no way to override this.