Thanks. --- On Sat, 9/1/10, Simon Willnauer <simon.willna...@googlemail.com> wrote:
From: Simon Willnauer <simon.willna...@googlemail.com> Subject: Re: a complete solution for building a website search with lucene To: java-user@lucene.apache.org Date: Saturday, 9 January, 2010, 6:16 PM I don't know that much about nutch but hadoop shouldn't really run under windows in production. If you use windows for development this should not be a big issue. Oatis is right you should use cygwin together with hadoop. look at http://wiki.apache.org/hadoop/FAQ for initial info. simon On Sat, Jan 9, 2010 at 5:20 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote: > Nutch is written in Java, so Nutch itself *should* work on other non-Linux > OSs that the JVM supports. > But it does contain some shell scripts, as does Hadoop that Nutch uses. Oh, > I guess Windows people run it under Cygwin? > Otis > -- > Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch > > > > ----- Original Message ---- >> From: "jyzhou...@yahoo.com" <jyzhou...@yahoo.com> >> To: java-user@lucene.apache.org >> Sent: Fri, January 8, 2010 5:03:41 AM >> Subject: Re: a complete solution for building a website search with lucene >> >> Hi Paul, >> >> Thanks. >> Use Nutch to do crawling. and integrate Lucene to the web application, so >> that >> can do search online. >> >> BTW, Nutch seems to have only Linux version, what my development is on >> Windows. >> Am i right? >> >> Zhou >> >> --- On Fri, 8/1/10, Paul Libbrecht wrote: >> >> From: Paul Libbrecht >> Subject: Re: a complete solution for building a website search with lucene >> To: java-user@lucene.apache.org >> Date: Friday, 8 January, 2010, 4:27 PM >> >> Zhou, >> >> Lucene is a back-end library, it's very useful for developer but it is not a >> complete site-search-engine. >> A lucene-based site-search-engine is Nutch, it does crawl. >> Solr also provides functions close to these with a large amount of thoughts >> on >> flexible integration; crawling methods are rather based on feeds or other >> acquisition methods (see DIH for example). >> >> paul >> >> >> >> >> Le 08-janv.-10 à 08:08, a écrit : >> >> > Hi , >> > >> > I am new in Lucene. >> > >> > To build a web search function, it need to have a backendc indexing >> > function. >> But, before that, should run a Crawler? because Lucene index based on Html >> documents, while Crawler can change the website pages to Html documents. Am i >> right? >> > >> > If so, please anyone suggest to me a Crawler? like Nutch? >> > Thanks >> > Zhou >> > >> > >> > >> > >> > New Email names for you! >> > Get the Email name you've always wanted on the new @ymail and @rocketmail. >> > Hurry before someone else does! >> > http://mail.promotions.yahoo.com/newdomains/sg/ >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> >> >> >> New Email names for you! >> Get the Email name you've always wanted on the new @ymail and @rocketmail. >> Hurry before someone else does! >> http://mail.promotions.yahoo.com/newdomains/sg/ > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org