Hi, I am trying to follow the instructions from http://lucene.apache.org/nutch/tutorial8.html ..... Intranet: Configuration. To configure things for intranet crawling you must: 1. Create a directory with a flat file of root urls. For example, to crawl the nutch site you might start with a file named urls/nutch containing the url of just the Nutch home page. All other Nutch pages should be reachable from this page. The urls/nutch file would thus contain: http://lucene.apache.org/nutch/
.... I do not understand. Can anyone help me out? Thanks. zhou
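One possible reading of step 1 above: make a directory named urls, and inside it put a plain text file named nutch that lists the starting URL, one per line. A minimal sketch in Python, assuming that reading is right. The helper name write_seed_file and its parameters are made up for illustration and are not part of Nutch; only the urls/nutch path and the URL come from the quoted tutorial.

```python
from pathlib import Path


def write_seed_file(base_dir, name="nutch",
                    urls=("http://lucene.apache.org/nutch/",)):
    """Create <base_dir>/urls/<name> with one root URL per line.

    Hypothetical helper for illustration only; not a Nutch API.
    """
    seed_dir = Path(base_dir) / "urls"
    # Create the "urls" directory if it does not exist yet.
    seed_dir.mkdir(parents=True, exist_ok=True)
    seed_file = seed_dir / name
    # A flat file: plain text, one URL per line.
    seed_file.write_text("\n".join(urls) + "\n", encoding="utf-8")
    return seed_file


if __name__ == "__main__":
    path = write_seed_file(".")
    print(path.read_text(encoding="utf-8"), end="")
```

The same file can be made by hand with any text editor. The script only shows that the "flat file" is ordinary text with nothing in it except the URL.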