On Fri, Mar 11, 2011 at 5:06 PM, Li Li [via Lucene] <
ml-node+2664380-1940163870-376...@n3.nabble.com> wrote:

>   But I think the parser will mostly be used when crawling. So you can use
> these parsers when crawling and save only the parsed result.
>

Suppose we have offline HTML pages that were not parsed during crawling. What then?
Has anyone built a tokenizer for this?


How does Solr do it?


-- 
Regards
Shrinath.M


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Which-is-the-best-fast-HTML-parser-tokenizer-that-I-can-use-with-Lucene-for-indexing-HTML-content-to-tp2664316p2664411.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
