Re: Indexing HTML pages and phrases

2007-03-16 Thread Doron Cohen
For phrase searches there is no need to "detect the phrases" at indexing time: the position of each "word" is saved in the index and then used at search time to match phrase queries (see also the 'query syntax document'). Lucene takes plain text as document input - extraction of content text and prope
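
To illustrate the point about positions, here is a minimal sketch of a positional index in plain Python. It is not Lucene's implementation or API; the names `tokenize`, `build_index` and `phrase_search` and the sample documents are made up for this example, and the whitespace tokenizer is a simplifying assumption. It only shows why storing each word's position is enough to answer a phrase query at search time.

```python
from collections import defaultdict


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def build_index(docs):
    # term -> doc_id -> list of positions where the term occurs
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    # Return the sorted ids of documents containing the terms of
    # `phrase` at consecutive positions.
    terms = tokenize(phrase)
    if not terms or any(t not in index for t in terms):
        return []

    # Only documents that contain every term can match.
    candidates = set(index[terms[0]])
    for term in terms[1:]:
        candidates &= set(index[term])

    hits = []
    for doc_id in sorted(candidates):
        later = [set(index[t][doc_id]) for t in terms[1:]]
        # The phrase matches if some occurrence of the first term is
        # followed by the remaining terms at offsets +1, +2, ...
        for start in index[terms[0]][doc_id]:
            if all(start + k + 1 in later[k] for k in range(len(later))):
                hits.append(doc_id)
                break
    return hits


# Hypothetical sample documents, already reduced to plain text.
docs = {
    1: "Lucene can index HTML pages",
    2: "pages of HTML that Lucene can index",
    3: "index the HTML",
}
index = build_index(docs)
print(phrase_search(index, "html pages"))
print(phrase_search(index, "lucene can index"))
```

Nothing phrase-specific happens at indexing time: the index is built once from single words, and the same positions can serve any phrase supplied later.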

Re: Indexing HTML pages and phrases

2007-03-14 Thread Bhavin Pandya
/apache/lucene/demo/html/HTMLParser.html Thanks. Bhavin Pandya - Original Message - From: "Maryam" <[EMAIL PROTECTED]> To: Sent: Thursday, March 15, 2007 7:55 AM Subject: Indexing HTML pages and phrases Hi, I am wondering if we can index a phrase (not term) in Lucene? Als

Re: Indexing HTML pages and phrases

2007-03-14 Thread Bhavin Pandya
- Original Message - From: "Maryam" <[EMAIL PROTECTED]> To: Sent: Thursday, March 15, 2007 7:55 AM Subject: Indexing HTML pages and phrases Hi, I am wondering if we can index a phrase (not term) in Lucene? Also, I am not sure if it can index HTML pages? I need t

Indexing HTML pages and phrases

2007-03-14 Thread Maryam
Hi, I am wondering if we can index a phrase (not a term) in Lucene? Also, I am not sure whether it can index HTML pages. I need to have access to the text of some of the tags, and I am not sure if this can be done in Lucene. I would be so glad if you could help me with this. Thanks