dawid, thanks for your input.
initially i was hoping to use an FST somehow in this process, but
my knowledge in this area is fairly limited.
i will give it a second thought anyway... ;-)
krj
*Jürgen Jakobitsch*
Innovation Director
Semantic Web Company GmbH
EU: +43-1-4021235-0
Mobile
michael, thanks for your input.
i already extended the default codec to return the
BlockTreeOrdsPostingsFormat for testing; this works nicely and i can
access terms via ordinal.
speed is not really the issue (some things simply take a while... ;-) ).
i also don't want to index shingles, becau
Hi Chitra,
It sounds like things work for you in 6.4.1 but not in 4.10.4? Why not
just upgrade to 6.4.x?
DrillDownQuery is final because the class is not meant to be subclassed (it
doesn't have any extension points) and is really just "sugar" for
rewriting to simpler queries.
Mike McCandless
Or you could encode those term/n-gram frequencies in one FST and then
reuse it. This would be memory-saving and fairly fast (roughly comparable to
a hash table).
Dawid
On Fri, Mar 10, 2017 at 11:41 AM, Michael McCandless
wrote:
> Yes, this is a reasonable way to use Lucene (to see term statistics across
Yes, this is a reasonable way to use Lucene (to see term statistics across
the corpus) but it may not be performant enough for your needs.
E.g. spending memory on a giant hash table for one-time or periodic
corpus analysis may be faster.
If you are looking for word n-gram stats, you could
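A minimal sketch of the hash-table approach mentioned above, in plain Java with no Lucene dependency. The class name NGramCounter, the method name count, and the space-joined string keys are assumptions made for illustration and do not come from the thread; a real corpus pass would likely tokenize with an analyzer and may need a more compact key representation to keep memory in check.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: counts word n-grams in an in-memory hash table.
final class NGramCounter {

    /** Returns a map from each n-gram (tokens joined by a space) to its count. */
    static Map<String, Integer> count(List<String> tokens, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        Map<String, Integer> counts = new HashMap<>();
        // Slide a window of n tokens over the input; stop when the window no longer fits.
        for (int i = 0; i + n <= tokens.size(); i++) {
            String gram = String.join(" ", tokens.subList(i, i + n));
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("to", "be", "or", "not", "to", "be");
        Map<String, Integer> bigrams = count(tokens, 2);
        System.out.println(bigrams.get("to be")); // prints 2
    }
}
```

The whole table lives on the heap, which is the memory-for-speed trade-off described above: lookups and updates are constant time on average, but the map grows with the number of distinct n-grams.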
Why don't we fix this in Lucene? It sounds like your fix (overriding
toQueryString for the range query nodes) is self-contained? Could you open an
issue and add a patch?
I agree it's silly to produce [ts:X ts:Y] syntax.
Mike McCandless
http://blog.mikemccandless.com
On Thu, Mar 9, 2017 at 8:59 PM,