Don't interpret my responses as *recommending* a database, since I don't know
much about your problem space. It may or may not be the right choice.
Mostly, I was thinking that your particular use of Lucene as stated wasn't
playing to Lucene's strengths.
It may well be that Lucene is a fine choice
Thanks, Erick! I'll try using a LIKE query against the database.
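
For anyone weighing the two approaches in this thread, here is a rough sketch of the difference in plain Java. It assumes a LIKE '%...%' query behaves like a substring scan over the whole stored value, and it uses a crude lowercase-and-split step as a stand-in for a real analyzer. The class and method names (MatchSketch, substringMatch, tokenize, allTermsMatch) are made up for illustration; none of this is Lucene's or any database's actual API.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: plain Java, not Lucene's or a database's API.
class MatchSketch {

    // Roughly what LIKE '%needle%' does: a substring scan over the whole value.
    static boolean substringMatch(String text, String needle) {
        return text.contains(needle);
    }

    // Crude stand-in for an analyzer: lowercase, then split on non-alphanumerics.
    static Set<String> tokenize(String text) {
        Set<String> terms = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Term-level AND query: every query word must occur somewhere in the text.
    static boolean allTermsMatch(String text, String... words) {
        Set<String> terms = tokenize(text);
        for (String w : words) {
            if (!terms.contains(w.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String html = "<p>Lucene is a search library.</p><p>It indexes text.</p>";
        // The exact run of characters is present, so the substring scan finds it.
        System.out.println(substringMatch(html, "search library"));
        // Same words in another order: the substring scan misses them.
        System.out.println(substringMatch(html, "library search"));
        // Term-level matching does not care about order or surrounding markup.
        System.out.println(allTermsMatch(html, "library", "search"));
    }
}
```

The substring version only succeeds when the searcher reproduces the stored characters exactly, which is the limitation raised about the untokenized fields. Note that this naive tokenizer also turns tag names such as "p" into terms; a real setup would presumably strip the markup first.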
Sure, anything's possible. Whether Lucene is your best bet may be another
question. But in this example, you're not using Lucene to do anything
except store the strings. By storing all the data as UN_TOKENIZED, all
you're doing is a regex match on the entire HTML text of each document. You
might
My crawler indexes crawled pages with this code:
Document doc = new Document();
doc.add(new Field("body", page.getHtmlData(), Store.YES, Index.UN_TOKENIZED));
doc.add(new Field("url", page.getUrl(), Store.YES, Index.UN_TOKENIZED));
doc.add(new Field("title", page.getTitle(), Store.YES, Index.TO
I guess the thundering silence is rooted in the problem statement. I have a
hard time understanding how this index is used. By storing things this way,
you'll force the user to know the *exact* format of anything she's looking
for. That is, it's hard to search for and
get docs containing both an