You have to consume the TokenStream returned by the Analyzer in the
specified order, otherwise it will not work correctly and will behave
exactly as you describe:
reset()
while (incrementToken())
end()
close()
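E.g. a minimal consumer following that contract might look like this
(a rough sketch only; the field name and the printing are just for
illustration, and CharTermAttribute assumes Lucene 3.1+):

import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public final class TokenStreamDemo {
  // Prints every term the analyzer produces for the given text.
  static void printTokens(Analyzer analyzer, String text) throws IOException {
    TokenStream ts = analyzer.tokenStream("field", new StringReader(text));
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    try {
      ts.reset();                   // required before the first incrementToken()
      while (ts.incrementToken()) {
        System.out.println(term.toString());
      }
      ts.end();                     // records the final offset state
    } finally {
      ts.close();                   // releases resources so the stream can be reused
    }
  }
}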
You also have to call reset() before using the stream for the first
time! That's specified in the TokenStream javadocs. If you do
Hi Simon,
I'm trying to reuse a custom analyzer and it's not working unless I manually
call reset() on the TokenStream. Basically the analyzer will work on the
first string, but completely fail on any string after that. The weird part is
that this is only necessary when using the SynonymFilter.
I
I was hoping I didn't have to iterate through the short documents.
I have ~1M of them currently and this process needs to be very fast.
So I take it there is no such functionality available in Lucene.
Can't see how you could do it with standard queries, but you could
reverse the process and use a MemoryIndex.
Add the single target phrase to the memory index, then loop round all
docs, executing a search for each one. Maybe use PrefixQuery, although
I'd worry about performance. Try it and see.
Bu
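Something along these lines (a sketch only; the class, the field name,
and the pre-tokenized doc terms are my assumptions, and it uses the
lucene-memory module plus the pre-6.0 mutable PhraseQuery API):

import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.memory.MemoryIndex;
import org.apache.lucene.search.PhraseQuery;

class ContainedInPhrase {
  private final MemoryIndex target = new MemoryIndex();

  // Index the single target phrase once.
  ContainedInPhrase(String targetPhrase, Analyzer analyzer) {
    target.addField("text", targetPhrase, analyzer);
  }

  // Call this in a loop over all ~1M short docs: a score > 0 means the
  // doc's terms occur, in order, inside the target phrase.
  boolean matches(List<String> docTerms) {
    PhraseQuery q = new PhraseQuery();
    for (String term : docTerms) {
      q.add(new Term("text", term));
    }
    return target.search(q) > 0.0f;
  }
}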
Hmm, actually, we only warm newly merged (not newly flushed) segments
today. We don't warm flushed segments because, in an NRT setting,
warming is just added latency on turning around updates to the index
(vs. merging, which is purely replacing old segments with new ones).
But one hack you coul
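(For reference, the merge-warming hook is
IndexWriterConfig.setMergedSegmentWarmer; a rough sketch against the
Lucene 4.x API, with "body" as a stand-in for whatever field you want
kept hot:)

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;

class WarmedConfig {
  static IndexWriterConfig withMergeWarming(Analyzer analyzer) {
    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40, analyzer);
    // Runs against each newly merged segment before NRT readers see it,
    // so the first search after a merge doesn't pay the warm-up cost.
    iwc.setMergedSegmentWarmer(new IndexWriter.IndexReaderWarmer() {
      @Override
      public void warm(AtomicReader reader) throws IOException {
        reader.terms("body");   // e.g. touch the postings for a hot field
      }
    });
    return iwc;
  }
}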