Hi, I'm using Lucene 2.0.
To index a very long text I use Field.Index.TOKENIZED and Field.Store.NO.
How can I get its content back (actually, I only need the words near the
keywords, like Google's results: ... found this keyword here...) without
querying the database?
Someone told me using term vector to p
Hi all,
I wrote my own HTML parser because it does just what I need and does not
depend on any third-party library, and I'd like to share it (in the attachment).
This class provides some static methods for HTML <-> text conversion:
HtmlUtil.html2text(String html);
HtmlUtil.text2html(String text);
a
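For readers without the attachment, here is a rough sketch of how two static methods with these names could be written using only java.util.regex. This is not the attached code: the regular expressions, the short list of entities, and the handling of line breaks are all assumptions made for illustration, and a regex-based approach will not cope with badly malformed HTML.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the attached class; only the two method names come from the message.
final class HtmlUtil {
    // Assumption: script/style bodies and comments carry no readable text and are dropped.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b[^>]*>.*?</\\1\\s*>");
    private static final Pattern COMMENT = Pattern.compile("(?s)<!--.*?-->");
    // Assumption: these tags end a line of text.
    private static final Pattern LINE_BREAK =
            Pattern.compile("(?i)<br\\s*/?>|</(p|div|li|tr|h[1-6])\\s*>");
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    private HtmlUtil() {}

    static String html2text(String html) {
        String s = SCRIPT_OR_STYLE.matcher(html).replaceAll("");
        s = COMMENT.matcher(s).replaceAll("");
        s = LINE_BREAK.matcher(s).replaceAll("\n");
        s = TAG.matcher(s).replaceAll("");
        // Decode a handful of common entities; &amp; goes last so "&amp;lt;" becomes "&lt;".
        s = s.replace("&nbsp;", " ").replace("&lt;", "<").replace("&gt;", ">")
             .replace("&quot;", "\"").replace("&amp;", "&");
        // Collapse runs of spaces/tabs and of blank lines.
        s = s.replaceAll("[ \\t]+", " ").replaceAll(" ?\\n ?", "\n").replaceAll("\\n{2,}", "\n");
        return s.trim();
    }

    static String text2html(String text) {
        // Escape & first so the entities produced below are not escaped again.
        String s = text.replace("&", "&amp;").replace("<", "&lt;")
                       .replace(">", "&gt;").replace("\"", "&quot;");
        return s.replace("\r\n", "\n").replace("\n", "<br>\n");
    }
}
```

With this sketch, HtmlUtil.html2text("<p>Hello <b>world</b></p>") gives "Hello world", and HtmlUtil.text2html("a < b") gives "a &lt; b".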
Hi,
I wrote my own HTML parser to do html2text and it works well. I can send you
my code if it meets your requirements.
-----Original Message-----
From: John Wang [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 21, 2006 1:40 PM
To: java-user@lucene.apache.org
Subject: HTML text extraction
Can someo
Hi, I'm new to Lucene.
I want to add full-text search to my website to search articles, images
and BBS topics. I'm not sure whether to use a single index for all of these
types or to create three indexes, one per type.
If I use only one index, do I have to add a 'type' field to identify
document