Hi Erick,
Thanks for the response. I think I'm starting to get the hang of
this. That's a really good insight, but I'm wondering how to handle that
if a document can have multiple instances of the same field. So, instead
of Author, say, City names that are mentioned. But, as you said, I co
Our mails are crossing
Not that I know of. But why don't you just index (or maybe just store)
a separate field containing your offset information? Something like
title_offset with, say, a comma-separated pair denoting char position
and length that you then read in at search time and parse.
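A minimal sketch of that idea, in plain Python rather than any Lucene API. Only the field name title_offset and the "start,length" layout come from the suggestion above; the helper names encode_offset and decode_offset are made up for illustration, and positions are assumed to be zero-based character indices.

```python
# Hypothetical helpers: pack a (start, length) pair into the string that
# would be stored in a field such as title_offset, and unpack it again.

def encode_offset(start, length):
    """Return the comma-separated form, e.g. '19,17'."""
    return f"{start},{length}"


def decode_offset(value):
    """Parse 'start,length' back into a pair of ints."""
    start_text, length_text = value.split(",")
    return int(start_text), int(length_text)


text = "this is a book and HERE IS THE TITLE."
title = "HERE IS THE TITLE"

# At index time: find where the title sits and store the pair as a string.
title_offset = encode_offset(text.index(title), len(title))

# At search time: read the stored string back and slice the original text.
start, length = decode_offset(title_offset)
recovered = text[start:start + length]
```

With several instances of the same field (several city names, say), one such pair per instance could be stored, as long as the separator between pairs is something other than a comma.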
What is your analyzer doing? Let's assume you're trying
to index the title and that your entire text is
"this is a book and HERE IS THE TITLE."
I *think* your underlying analyzer should be returning
4 tokens with starts of 20 for HERE, 25 for IS,
28 for THE and 32 for TITLE, with appropriate en
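To check the arithmetic, here is a rough sketch in plain Python that splits the example text on word characters and reports each token's start and end. It is not a Lucene analyzer, and the function name token_offsets is invented for this example. It counts from zero with an exclusive end, so each start comes out one less than the one-based figures quoted above.

```python
import re


def token_offsets(text, region_start=0):
    """Return (token, start, end) for each word at or after region_start.

    Starts are zero-based character indices; end is exclusive.
    """
    return [
        (m.group(), m.start(), m.end())
        for m in re.finditer(r"\w+", text)
        if m.start() >= region_start
    ]


text = "this is a book and HERE IS THE TITLE."
title_start = text.index("HERE")

# Only the four tokens that make up the title are reported.
for token, start, end in token_offsets(text, title_start):
    print(token, start, end)
```

A real analyzer may lowercase, drop stop words, or split differently, so the tokens it emits need not match this one for one.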
OK, I think I understand what's going on - it looks like I am able to set
the token for the full author name (Say, "Steve Suppe") with the correct
offsets, but the analyzer takes it one step further and tokenizes 'Steve'
and 'Suppe' which is giving me a lot more generated offsets and is
confus
Hi all,
I'm trying to index documents so that a) I have all the documents indexed
'normally' (in that I can search for documents that match certain words),
and b) parts of the document that I consider important, such as author and
title, are ALSO stored in their own indexed fields.
I have (a)