Hi Paul,

Thanks. So I would use Nutch to do the crawling and integrate Lucene into the web application, so that it can serve searches online.
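To make the split between crawling, indexing and searching concrete, here is a minimal sketch in plain Java. It is only a toy illustration of the idea and does not use the Lucene or Nutch APIs: the class `TinySiteSearch`, its methods, and the sample pages are all hypothetical names made up for this example, and the tag stripping is a naive regex standing in for what a real crawler and parser would do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: this is NOT the Lucene or Nutch API.
class TinySiteSearch {
    // Inverted index: term -> sorted set of page URLs containing that term.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Stand-in for the crawler/parser step: drop HTML tags, keep the text.
    // (Naive regex, good enough for a sketch, not for real-world HTML.)
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Indexing step: split the page text into lowercase terms
    // and record the page URL under each term.
    void addPage(String url, String html) {
        String text = stripTags(html).toLowerCase(Locale.ROOT);
        for (String term : text.split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(url);
            }
        }
    }

    // Search step: return the URLs that contain every term of the query.
    List<String> search(String query) {
        Set<String> hits = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<String> urls = index.getOrDefault(term, new TreeSet<>());
            if (hits == null) {
                hits = new TreeSet<>(urls);
            } else {
                hits.retainAll(urls);
            }
        }
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        TinySiteSearch site = new TinySiteSearch();
        // In a real setup these pages would come from the crawler.
        site.addPage("/a.html", "<h1>Lucene</h1><p>A search library</p>");
        site.addPage("/b.html", "<p>Nutch is a crawler</p>");
        System.out.println(site.search("search library"));
    }
}
```

In the real setup, as far as I understand it, Nutch would play the role of the page-fetching part, and Lucene would replace the hand-written map with a proper on-disk index that the web application queries.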
BTW, Nutch seems to have only a Linux version, while my development is on Windows. Am I right?

Zhou

--- On Fri, 8/1/10, Paul Libbrecht <p...@activemath.org> wrote:

From: Paul Libbrecht <p...@activemath.org>
Subject: Re: a complete solution for building a website search with lucene
To: java-user@lucene.apache.org
Date: Friday, 8 January, 2010, 4:27 PM

Zhou,

Lucene is a back-end library; it is very useful for developers, but it is not a complete site search engine.
A Lucene-based site search engine is Nutch, which does crawl.
Solr also provides functions close to these, with a lot of thought put into flexible integration; its crawling is based instead on feeds or other acquisition methods (see DIH, for example).

paul

On 08-Jan-10, at 08:08, <jyzhou...@yahoo.com> wrote:

> Hi,
>
> I am new to Lucene.
>
> To build a web search function, I need a back-end indexing function.
> But before that, should I run a crawler? Lucene indexes HTML
> documents, while a crawler can turn the website's pages into HTML documents. Am I
> right?
>
> If so, could anyone please suggest a crawler, like Nutch?
> Thanks
> Zhou