Zhou,

Lucene is a back-end library: it is very useful for developers, but it is not a complete site search engine.
Nutch is a Lucene-based site search engine, and it does crawl.
Solr provides similar functions, with a lot of thought given to flexible integration; rather than crawling, it typically acquires content through feeds or other import methods (see DIH, the DataImportHandler, for example).
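
To make the division of labour concrete, here is a rough sketch of the pipeline in plain Python. It is not Lucene's, Nutch's, or Solr's API: the `pages` dictionary stands in for whatever a crawler would have fetched, and `extract_text`, `build_index` and `search` are hypothetical helpers that only illustrate the "extract text, then index, then query" steps a real library would handle far more thoroughly.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from an HTML page, skipping script/style bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth of enclosing <script>/<style> tags

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def extract_text(html):
    """Parsing step: reduce an HTML document to its plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def build_index(pages):
    """Indexing step: map each lowercase term to the set of URLs containing it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for term in re.findall(r"[a-z0-9]+", extract_text(html).lower()):
            index[term].add(url)
    return index


def search(index, query):
    """Query step: return the URLs that contain every term of the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


# Stand-in for crawler output (assumed data): URL -> fetched HTML.
pages = {
    "http://example.com/a": "<html><body><h1>Lucene</h1>"
                            "<p>An indexing library.</p></body></html>",
    "http://example.com/b": "<html><body><p>Nutch is a crawler built on Lucene.</p>"
                            "<script>var x = 'library';</script></body></html>",
}

index = build_index(pages)
print(search(index, "lucene library"))
```

In this picture the crawler's only job is to produce something like `pages`; everything from `build_index` onward is the part a library such as Lucene covers, with proper analysis, scoring and on-disk storage.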

paul




On 08-Jan-10 at 08:08, <jyzhou...@yahoo.com> wrote:

Hi,

I am new to Lucene.

To build a web search function, I need a back-end indexing function. But before that, should I run a crawler? Lucene builds its index from HTML documents, while a crawler can turn the website's pages into HTML documents. Am I right?

If so, could anyone please suggest a crawler, such as Nutch?
Thanks
Zhou






