I'd like to use Lucene to search text documents for the presence of a large list of search terms. I have a file containing thousands of entries, one word per line. I was thinking of writing a specialized analyzer that tokenizes each document, looks each token up in my word list, and emits terms only for the words that appear in my file. My hope is that with this approach the index will contain only the words from my list that actually occur in the documents, so once the index is built I can ask it for all of its terms and whatever comes back is exactly the list of items I'm interested in.
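Roughly, this is the kind of analyzer I had in mind (just a sketch, assuming a recent Lucene such as 8.x/9.x, where the built-in `KeepWordFilter` from the analysis-common module does the "keep only listed words" step; the class name and file path are placeholders):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.miscellaneous.KeepWordFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class KeepListAnalyzer extends Analyzer {

    private final CharArraySet keepWords;

    public KeepListAnalyzer(CharArraySet keepWords) {
        this.keepWords = keepWords;
    }

    /** Loads my one-word-per-line file into a case-insensitive set. */
    public static CharArraySet loadWordList(String path) throws IOException {
        return new CharArraySet(Files.readAllLines(Paths.get(path)), true);
    }

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer source = new StandardTokenizer();
        TokenStream stream = new LowerCaseFilter(source);
        // Drop every token that is not in the word list, so only
        // words from my list ever make it into the index.
        stream = new KeepWordFilter(stream, keepWords);
        return new TokenStreamComponents(source, stream);
    }
}
```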
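And for pulling the terms back out of the finished index, I was picturing something along these lines (again just a sketch against Lucene 8.x/9.x; "content" and "index" are placeholder field and directory names):

```java
import java.nio.file.Paths;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiTerms;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;

public class DumpMatchedTerms {
    public static void main(String[] args) throws Exception {
        try (IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get("index")))) {
            Terms terms = MultiTerms.getTerms(reader, "content");
            if (terms == null) {
                return; // field not present in the index
            }
            TermsEnum it = terms.iterator();
            for (BytesRef term = it.next(); term != null; term = it.next()) {
                // docFreq() = number of documents the word occurred in
                System.out.println(term.utf8ToString() + "\t" + it.docFreq());
            }
        }
    }
}
```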
I'm new to Lucene, so I'm not sure whether I'm going about this the right way. Lastly, I want to be able to run this process over thousands of documents and store the matches (and their offsets) in a database, so it needs to be fairly efficient. I appreciate any comments. TIA, FR