Thanks.

--- On Wed, 13/1/10, Otis Gospodnetic <[email protected]> wrote:

From: Otis Gospodnetic <[email protected]>
Subject: Re: how to follow intranet: configuration in nutch website
To: [email protected]
Date: Wednesday, 13 January, 2010, 12:07 PM

Zhou,

Your question will get more attention if you send it to the
[email protected] list instead.  This list is for Lucene Java.

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
> From: "[email protected]" <[email protected]>
> To: [email protected]
> Sent: Tue, January 12, 2010 10:51:59 PM
> Subject: how to follow intranet: configuration in nutch website
> 
> Hi,
> 
> I am trying to follow the instructions at
> http://lucene.apache.org/nutch/tutorial8.html
> .....
> Intranet: Configuration
> To configure things for intranet crawling you must:
> 1. Create a directory with a flat file of root urls.  For example, to
> crawl the nutch site you might start with a file named
> urls/nutch containing the url of just the Nutch home
> page.  All other Nutch pages should be reachable from this page.  The
> urls/nutch file would thus contain:
> http://lucene.apache.org/nutch/
> 
> ....
> 
> I do not understand this. Can anyone help me out?
> 
> Thanks.
> zhou
> 






      
