Nutch is written in Java, so Nutch itself *should* work on any non-Linux OS that the JVM supports. It does contain some shell scripts, though, as does Hadoop, which Nutch uses. I guess Windows people run it under Cygwin?

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch

----- Original Message ----
> From: "jyzhou...@yahoo.com" <jyzhou...@yahoo.com>
> To: java-user@lucene.apache.org
> Sent: Fri, January 8, 2010 5:03:41 AM
> Subject: Re: a complete solution for building a website search with lucene
>
> Hi Paul,
>
> Thanks.
> So: use Nutch to do the crawling, and integrate Lucene into the web
> application so that it can do search online.
>
> BTW, Nutch seems to have only a Linux version, but my development is on
> Windows. Am I right?
>
> Zhou
>
> --- On Fri, 8/1/10, Paul Libbrecht wrote:
>
> From: Paul Libbrecht
> Subject: Re: a complete solution for building a website search with lucene
> To: java-user@lucene.apache.org
> Date: Friday, 8 January, 2010, 4:27 PM
>
> Zhou,
>
> Lucene is a back-end library. It is very useful for developers, but it is
> not a complete site search engine.
> Nutch is a Lucene-based site search engine, and it does crawl.
> Solr also provides functions close to these, with a lot of thought put into
> flexible integration; its crawling is based on feeds or other
> acquisition methods (see DIH, for example).
>
> paul
>
> On 08-Jan-10, at 08:08, wrote:
>
> > Hi,
> >
> > I am new to Lucene.
> >
> > To build a web search function, I need a backend indexing function.
> > But before that, should I run a crawler? Lucene indexes HTML documents,
> > while a crawler can turn the website's pages into HTML documents.
> > Am I right?
> >
> > If so, can anyone suggest a crawler? Like Nutch?
> >
> > Thanks
> > Zhou
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org