On 15/09/13 11:41, Michael McCandless wrote:

Your understanding is correct: there are two ways to affect the
indexed position.

Thanks for the confirmation; it took me a while to figure that out :-)

Either approach would work, but if you do the single-field approach,
the challenge is in making a TokenFilter that knows when one chunk
ended so it could set the position increment.

Yes, I'd have to find a way to pass some metadata into the tokenizer before feeding it each chunk, which is kinda messy.

I think it'd be easier to just add multiple field instances?

Yes, that's the conclusion I came to. It's easy enough to do: I'm using JavaMail to recursively traverse the mail file so I can separate out each mail and handle multipart mails and attachments, which I'm then feeding into Tika.
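
To illustrate why the multiple-field-instances approach keeps chunks apart, here is a minimal plain-Java sketch. It is a simplified model, not Lucene code: the class PositionGapSketch, its methods, and the sample chunks are all hypothetical, and the tokenizing is just lowercase plus whitespace splitting. It assumes each token advances the position by 1 and that a fixed gap is added between chunks, which is the role played in Lucene by the value an Analyzer returns from getPositionIncrementGap(String fieldName). The exact position arithmetic inside Lucene may differ from this model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PositionGapSketch {

    /** A term together with the position this model assigns to it. */
    public static final class Posting {
        public final String term;
        public final int position;

        Posting(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /**
     * Assigns positions as if every chunk were a separate instance of the
     * same field. Each token advances the position by 1, and `gap` extra
     * positions are skipped between consecutive non-empty chunks.
     */
    public static List<Posting> assignPositions(List<String> chunks, int gap) {
        List<Posting> postings = new ArrayList<>();
        int position = -1;          // so the very first token lands at 0
        boolean firstChunk = true;
        for (String chunk : chunks) {
            String trimmed = chunk.trim();
            if (trimmed.isEmpty()) {
                continue;           // an empty chunk contributes no tokens
            }
            if (!firstChunk) {
                position += gap;    // the position increment gap between chunks
            }
            firstChunk = false;
            for (String term : trimmed.toLowerCase().split("\\s+")) {
                position += 1;
                postings.add(new Posting(term, position));
            }
        }
        return postings;
    }

    /** Exact two-term phrase check: `second` must sit directly after `first`. */
    public static boolean phraseMatches(List<Posting> postings, String first, String second) {
        for (Posting a : postings) {
            if (!a.term.equals(first)) {
                continue;
            }
            for (Posting b : postings) {
                if (b.term.equals(second) && b.position == a.position + 1) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> chunks = Arrays.asList("see you on monday", "lunch menu attached");
        // With no gap the phrase "monday lunch" matches across the chunk boundary.
        System.out.println(phraseMatches(assignPositions(chunks, 0), "monday", "lunch"));   // true
        // With a gap of 100 the two chunks are too far apart to match.
        System.out.println(phraseMatches(assignPositions(chunks, 100), "monday", "lunch")); // false
    }
}
```

Phrases that lie entirely inside one chunk, such as "lunch menu", still match with either gap, since the gap is only applied between chunks.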

Thank you for the information :-)

--
Alan Burlison
