Hi Maryam,

You can index the content of a specific field as UN_TOKENIZED (Field.Index.UN_TOKENIZED).
The whole field value is then indexed as a single term, so a search on that field matches only the complete phrase, not the individual words in it.
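
To make the difference concrete, here is a toy sketch in plain Java. It is not Lucene code: the class UntokenizedSketch and its methods are made up for illustration, and the whitespace "tokenizer" is only a stand-in for a real analyzer. It just shows which lookups can match when a value is kept whole versus split into words.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy inverted index (hypothetical, NOT the Lucene API). It mimics how the
// indexed terms differ when a field value is tokenized or kept whole.
class UntokenizedSketch {
    // term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Tokenized: split the value on whitespace and index each word as a term. */
    void addTokenized(int docId, String value) {
        for (String word : value.trim().split("\\s+")) {
            post(word, docId);
        }
    }

    /** Un-tokenized: index the whole value, unchanged, as one single term. */
    void addUntokenized(int docId, String value) {
        post(value, docId);
    }

    /** Exact term lookup; returns an empty set when the term is unknown. */
    Set<Integer> lookup(String term) {
        Set<Integer> ids = postings.get(term);
        return ids == null ? Collections.<Integer>emptySet() : ids;
    }

    private void post(String term, int docId) {
        Set<Integer> ids = postings.get(term);
        if (ids == null) {
            ids = new HashSet<>();
            postings.put(term, ids);
        }
        ids.add(docId);
    }

    public static void main(String[] args) {
        UntokenizedSketch whole = new UntokenizedSketch();
        whole.addUntokenized(1, "apache lucene");
        // Only the complete value is a term, so a single word finds nothing.
        System.out.println(whole.lookup("apache lucene"));
        System.out.println(whole.lookup("lucene"));

        UntokenizedSketch words = new UntokenizedSketch();
        words.addTokenized(2, "apache lucene");
        // Here each word is a term, and the complete value is not.
        System.out.println(words.lookup("lucene"));
        System.out.println(words.lookup("apache lucene"));
    }
}
```

In real Lucene the un-tokenized value is also not lowercased or otherwise analyzed, so the query has to match the stored value exactly.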
To index HTML pages you can use any HTML parser to extract the text first.
This one may be useful to you:
http://lucene.apache.org/java/docs/api/org/apache/lucene/demo/html/HTMLParser.html
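
If you only need the text of particular tags, a rough sketch along these lines may help. Note the assumptions: it uses the HTML parser that ships with the JDK (javax.swing.text.html.parser.ParserDelegator), not the demo HTMLParser linked above, and the names TagTextSketch and textOf are made up for this example. The strings it returns are what you would then put into the Lucene fields of your document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of Lucene: pulls out the text of one kind of tag.
class TagTextSketch {

    /** Returns the text inside each occurrence of the wanted tag, in document order. */
    static List<String> textOf(String html, final HTML.Tag wanted) {
        final List<String> found = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private final StringBuilder buffer = new StringBuilder();
            private int depth = 0; // greater than 0 while inside the wanted tag

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == wanted) {
                    if (depth == 0) {
                        buffer.setLength(0); // start collecting a new occurrence
                    }
                    depth++;
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (depth > 0) {
                    buffer.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == wanted && depth > 0) {
                    depth--;
                    if (depth == 0) {
                        found.add(buffer.toString().trim());
                    }
                }
            }
        };
        try {
            // true = ignore any charset declared in the page; we already have a String
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String page = "<html><head><title>Lucene FAQ</title></head>"
                + "<body><h1>Indexing</h1><p>Some body text.</p></body></html>";
        System.out.println(textOf(page, HTML.Tag.H1));
        System.out.println(textOf(page, HTML.Tag.P));
    }
}
```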

Thanks.
Bhavin pandya


----- Original Message ----- From: "Maryam" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Thursday, March 15, 2007 7:55 AM
Subject: Indexing HTML pages and phrases


Hi,

I am wondering if we can index a phrase (not a term) in
Lucene. Also, I am not sure if it can index HTML
pages. I need access to the text of some of the
tags, and I am not sure if this can be done in Lucene. I
would be very glad if you could help me with this.

Thanks





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



