On Fri, Nov 16, 2012 at 5:18 PM, Tom Burton-West wrote:
> Hi Otis,
>
> I hope this is not off-topic,
>
> Apparently in Lucene similarity does not have to be set at index time:
>
Actually in the general case it does. IndexWriter calls the Similarity's
computeNorm method at index-time.
Its just th
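To illustrate why the similarity matters at index time, here is a minimal sketch that does not use Lucene itself. `ToySimilarity`, `ToyIndex`, and both scoring formulas are hypothetical stand-ins, not Lucene classes. The only point is that a norm computed while indexing is stored with the document, so a different similarity at search time still reads the old norm.

```java
import java.util.HashMap;
import java.util.Map;

public class IndexTimeNormSketch {

    /** Hypothetical stand-in for a similarity: one index-time hook, one search-time hook. */
    interface ToySimilarity {
        float computeNorm(int numTerms);        // called once per document while indexing
        float score(int termFreq, float norm);  // called while searching
    }

    /** Length-normalizing toy model: norm = 1/sqrt(numTerms). */
    static final ToySimilarity LENGTH_NORMALIZED = new ToySimilarity() {
        public float computeNorm(int numTerms) { return (float) (1.0 / Math.sqrt(numTerms)); }
        public float score(int termFreq, float norm) { return (float) Math.sqrt(termFreq) * norm; }
    };

    /** Toy model that ignores document length at index time. */
    static final ToySimilarity NO_LENGTH_NORM = new ToySimilarity() {
        public float computeNorm(int numTerms) { return 1.0f; }
        public float score(int termFreq, float norm) { return termFreq * norm; }
    };

    /** Toy index: the norm is fixed when the document is added. */
    static final class ToyIndex {
        private final ToySimilarity indexTimeSimilarity;
        private final Map<String, String[]> docs = new HashMap<>();
        private final Map<String, Float> norms = new HashMap<>();

        ToyIndex(ToySimilarity indexTimeSimilarity) {
            this.indexTimeSimilarity = indexTimeSimilarity;
        }

        void addDocument(String id, String[] tokens) {
            docs.put(id, tokens);
            // The norm is computed here, at index time, and stored.
            norms.put(id, indexTimeSimilarity.computeNorm(tokens.length));
        }

        /** Scores one term with a search-time similarity, reusing the stored norm. */
        float score(String id, String term, ToySimilarity searchTimeSimilarity) {
            int termFreq = 0;
            for (String token : docs.get(id)) {
                if (token.equals(term)) termFreq++;
            }
            return searchTimeSimilarity.score(termFreq, norms.get(id));
        }
    }
}
```

In this toy setup, switching the search-time similarity without reindexing gives a different score than indexing and searching with the same model, because the stored norm came from the other model.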
Has anyone resolved this?
thx
--
View this message in context:
http://lucene.472066.n3.nabble.com/Retrieval-of-the-position-of-indexed-terms-tp4015079p4020835.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
Nope! This slang term only exists in the plural. The kind of prose with this
usage may not follow standard grammatical and spelling rules anyway.
Historically, text search has been funded mostly by the US intelligence
agencies because they want to analyze formal and technical prose. And, it is
but if "dogs" are feet (and I guess I fall into the not-perfect group
here)... and "feet" is the plural form of "foot", then shouldn't "dogs"
be stemmed to "dog" as a base, singular form?
On 11/16/2012 2:32 PM, Tom Burton-West wrote:
Hi Mike,
Honestly I've never heard of anyone using "dog
Hi Mike,
>>Honestly I've never heard of anyone using "dogs" to mean feet either, but
hey nobody's perfect.
This is really off topic but I couldn't resist. This usage of "dogs" to
mean feet occurs in old blues lyrics such as Blind Lemon Jefferson's "Hot
Dogs"
http://www.youtube.com/watch?v=v670qV
Hi Otis,
I hope this is not off-topic,
Apparently in Lucene similarity does not have to be set at index time:
See http://lucene.apache.org/core/4_0_0/changes/Changes.html under Lucene
2959
"All models default to the same index-time norm encoding as
DefaultSimilarity, so you can easily try these
Yes, this is possible using Lucene's grouping APIs.
It looks like index time grouping won't work, since you get the same
parent spread out across time, but you can use the two-pass grouping
instead ... run the FirstPassGroupingCollector on each shard, get the
top groups from each, merge those and
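The two-pass idea described above can be sketched in plain Java. This is not Lucene's grouping API: `Hit`, `firstPass`, `mergeTopGroups`, and `secondPass` are hypothetical names, and ranking groups by their best score is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TwoPassGroupingSketch {

    /** Hypothetical hit type: the group key a document belongs to, plus its score. */
    static final class Hit {
        final String group;
        final double score;
        Hit(String group, double score) { this.group = group; this.score = score; }
    }

    /** First pass on one shard: best score per group, keeping only the top N groups. */
    static Map<String, Double> firstPass(List<Hit> shard, int topN) {
        Map<String, Double> best = new HashMap<>();
        for (Hit hit : shard) {
            best.merge(hit.group, hit.score, Double::max);
        }
        return topByScore(best, topN);
    }

    /** Merge step: combine the per-shard top groups and keep the overall top N keys. */
    static List<String> mergeTopGroups(List<Map<String, Double>> perShard, int topN) {
        Map<String, Double> merged = new HashMap<>();
        for (Map<String, Double> shardGroups : perShard) {
            shardGroups.forEach((group, score) -> merged.merge(group, score, Double::max));
        }
        return new ArrayList<>(topByScore(merged, topN).keySet());
    }

    /** Second pass: for each merged top group, gather its best scores from every shard. */
    static Map<String, List<Double>> secondPass(
            List<List<Hit>> shards, List<String> groups, int docsPerGroup) {
        Map<String, List<Double>> result = new LinkedHashMap<>();
        for (String group : groups) {
            result.put(group, new ArrayList<>());
        }
        for (List<Hit> shard : shards) {
            for (Hit hit : shard) {
                List<Double> scores = result.get(hit.group);
                if (scores != null) scores.add(hit.score);  // skip groups that did not make the cut
            }
        }
        for (List<Double> scores : result.values()) {
            scores.sort(Comparator.reverseOrder());
            if (scores.size() > docsPerGroup) {
                scores.subList(docsPerGroup, scores.size()).clear();
            }
        }
        return result;
    }

    /** Returns the N highest-scoring entries, ordered by descending score. */
    private static Map<String, Double> topByScore(Map<String, Double> scores, int topN) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        Map<String, Double> top = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : entries.subList(0, Math.min(topN, entries.size()))) {
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }
}
```

Because the same parent can appear on several shards, the second pass has to visit every shard for each merged group, which is why a single index-time grouping pass would not be enough here.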
The format is unfortunately rather intricate ...
FST = finite state transducer (see e.g.
http://blog.mikemccandless.com/2010/12/using-finite-state-transducers-in.html
). We use that to hold the terms index (*.tip), which is loaded into
RAM.
The blocks are because we encode a block of between 25 -
I'm studying the index format in depth and
writing Java utilities to log all of it.
So far I have successfully logged .si, .fnm, .fdx, and .fdt,
but .tim and .tip are too complicated...
--
View this message in context:
http://lucene.472066.n3.nabble.com/Lucene-Index-File-Format-tp4011133p4020685.html
Me too!
Could you explain how you solved it?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Lucene-4-0-Get-All-Index-Terms-tp3686023p4020683.html