Hi Paul,

Thanks. 
Use Nutch to do crawling. and integrate Lucene to the web application, so that 
can do search online.

BTW, Nutch seems to have only Linux version, what my development is on Windows. 
Am i right?

Zhou

--- On Fri, 8/1/10, Paul Libbrecht <p...@activemath.org> wrote:

From: Paul Libbrecht <p...@activemath.org>
Subject: Re: a complete solution for building a website search with lucene
To: java-user@lucene.apache.org
Date: Friday, 8 January, 2010, 4:27 PM

Zhou,

Lucene is a back-end library, it's very useful for developer but it is not a 
complete site-search-engine.
A lucene-based site-search-engine is Nutch, it does crawl.
Solr also provides functions close to these with a large amount of thoughts on 
flexible integration; crawling methods are rather based on feeds or other 
acquisition methods (see DIH for example).

paul




Le 08-janv.-10 à 08:08, <jyzhou...@yahoo.com> a écrit :

> Hi ,
> 
> I am new in Lucene.
> 
> To build a web search function, it need to have a backendc indexing function. 
> But, before that, should run a Crawler? because Lucene index based on Html 
> documents, while Crawler can change the website pages to Html documents. Am i 
> right?
> 
> If so, please anyone suggest to me a Crawler? like Nutch?
> Thanks
> Zhou
> 
> 
> 
> 
>      New Email names for you!
> Get the Email name you've always wanted on the new @ymail and @rocketmail.
> Hurry before someone else does!
> http://mail.promotions.yahoo.com/newdomains/sg/


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org




      New Email names for you! 
Get the Email name you&#39;ve always wanted on the new @ymail and @rocketmail. 
Hurry before someone else does!
http://mail.promotions.yahoo.com/newdomains/sg/

Reply via email to