Zhou,
Lucene is a back-end library. It is very useful for developers, but it
is not a complete site search engine.
Nutch is a Lucene-based site search engine, and it does crawl.
Solr provides similar functionality, with a lot of thought given to
flexible integration; rather than crawling, it relies on feeds or other
acquisition methods (see DIH, the DataImportHandler, for example).
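To make the split between the two stages concrete, here is a minimal
sketch in plain Java. It is a toy stand-in and uses neither the Lucene
nor the Nutch API: the class name, the method names and the example URLs
are all made up for illustration. The first stage is what a crawler
hands over (page text), the second is roughly the job an indexing
library does (mapping terms to the pages that contain them).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only; hypothetical names, NOT the Lucene or Nutch API.
final class ToySearchPipeline {

    // Stage 1 (crawler side): reduce fetched HTML to plain text.
    // A crude regex strip, good enough for a sketch but not for real pages.
    static String htmlToText(String html) {
        return html.replaceAll("(?s)<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    // Stage 2 (indexing side): term -> sorted set of URLs containing it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Lower-case the text, split it on non-alphanumerics, record each term.
    void index(String url, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(url);
            }
        }
    }

    // Return the URLs whose text contained the given single term.
    List<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        ToySearchPipeline pipeline = new ToySearchPipeline();
        // Hypothetical pages; a real crawler would fetch these over HTTP.
        pipeline.index("http://example.org/a",
                htmlToText("<html><body><h1>Lucene</h1> is a library</body></html>"));
        pipeline.index("http://example.org/b",
                htmlToText("<p>Nutch is a crawler built on Lucene</p>"));
        System.out.println(pipeline.search("lucene"));
        System.out.println(pipeline.search("crawler"));
    }
}
```

In a real setup the crawler (Nutch, for instance) would replace stage 1
and Lucene would replace stage 2, with proper analysis and ranking.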
paul
On 8 Jan 2010, at 08:08, <jyzhou...@yahoo.com> wrote:
Hi,
I am new to Lucene.
To build a web search function, I need a back-end indexing function.
But before that, should I run a crawler? As I understand it, Lucene
builds its index from HTML documents, and a crawler turns a website's
pages into HTML documents. Am I right?
If so, could anyone suggest a crawler, such as Nutch?
Thanks
Zhou
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org