Hi Sirish,
StandardTokenizer does not produce a token from '#', as you suspected.
For the delimiter, use something that fits the "word" definition but will never
occur in your documents - something like a1b2c3c2b1a .
Sentence boundary handling is clunky in L
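The advice above can be sketched outside Lucene. This is a minimal stdlib-only illustration (the class and method names are my own, not Lucene API): join sentences with a delimiter token that StandardTokenizer would keep as an ordinary word, so it occupies a term position between sentences and a small-slop proximity query cannot straddle the boundary.

```java
import java.util.List;

// Sketch, not Lucene code: '#' is punctuation and is dropped by
// StandardTokenizer, so it never reaches the index. An alphanumeric
// token like "a1b2c3c2b1a" survives tokenization and adds a position
// between sentences, which a low-slop proximity query cannot bridge.
public class SentenceDelimiter {
    static final String BOUNDARY = "a1b2c3c2b1a"; // assumed absent from real text

    static String joinWithBoundary(List<String> sentences) {
        return String.join(" " + BOUNDARY + " ", sentences);
    }

    public static void main(String[] args) {
        System.out.println(joinWithBoundary(
            List.of("Lucene is fast.", "Proximity search works.")));
    }
}
```

A SpanNearQuery or sloppy PhraseQuery with slop smaller than the extra position the boundary token contributes will then stop matching across sentences.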
Hi Steven,
I have implemented sentence-specific proximity search as suggested below, but
unfortunately it still doesn't respect the sentence boundaries in my searches.
I am using # as a delimiter between my sentences while indexing the content:
ArrayList sentencesList = senten
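The symptom above can be reproduced with a rough stand-in for StandardTokenizer (my own approximation, not the Lucene class): a standalone '#' yields no token at all, so the two sentences end up position-adjacent in the index and proximity queries match right across the intended boundary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashTokenDemo {
    // Crude approximation of StandardTokenizer: emit runs of
    // letters/digits only, discarding punctuation such as '#'.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("[\\p{L}\\p{N}]+").matcher(text);
        while (m.find()) tokens.add(m.group());
        return tokens;
    }

    public static void main(String[] args) {
        // '#' produces no token, so nothing marks the sentence break:
        System.out.println(tokenize("end of first. # start of second."));
    }
}
```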
Hi Yannis,
Thanks for your reply.
It fixed the problem.
Thanks again.
On Thu, Oct 7, 2010 at 10:52 AM, Yannis Pavlidis wrote:
> I would recommend you use NIOFSDirectory. We had similar issues and after
> we switched to NIOFSDirectory these issues disappeared (or were dramatically
> reduced).
>
> A
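For context on why NIOFSDirectory helps: it reads index files through `java.nio.channels.FileChannel` positional reads, which do not move a shared file pointer, so concurrent searcher threads need no lock around each read. A minimal stdlib sketch of that mechanism (the class and method names here are illustrative, not Lucene API):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadDemo {
    // FileChannel.read(buf, pos) reads at an explicit offset without
    // touching a shared file pointer, so many threads can read the same
    // channel concurrently without synchronizing a seek+read pair.
    static String readAt(Path file, long pos, int len) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(len);
            ch.read(buf, pos);
            return new String(buf.array());
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".bin");
        Files.write(tmp, "hello world".getBytes());
        System.out.println(readAt(tmp, 6, 5)); // prints "world"
        Files.delete(tmp);
    }
}
```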