On Mon, May 31, 2010 at 3:16 PM, Anand Balachandran Pillai <abpil...@gmail.com> wrote:
> On Sun, May 30, 2010 at 9:56 PM, JAGANADH G <jagana...@gmail.com> wrote:
>
> > Dear All
> > I was trying to run Harvestman(A Python tool for web harvesting).
> > I got the following error
> > http://pastebin.com/uPzUs0Xw
> >
> > My configuration file is http://pastebin.com/dfhiy2Q6
> >
> > Can any body help me regarding this.
> >
> > I was trying to harvest my blog with a word filter 'Python'
>
> There is no word filter anymore. You hit upon a bug which seems to
> still apply the word-filter code :)
>
> For filtering based on words or regular expressions on the page content,
> you can implement a custom crawler. It is pretty easy and a sample
> already exists. Just modify the code to suit the keyword(s) you want
> to filter.
>
> Look for "searchingcrawler.py" inside apps/samples folder and
> modify the code.

Thanks Anand. I will try this

--
**********************************
JAGANADH G
http://jaganadhg.freeflux.net/blog
_______________________________________________
BangPypers mailing list
BangPypers@python.org
http://mail.python.org/mailman/listinfo/bangpypers
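
For anyone reading this thread later: below is a minimal, standalone sketch of the kind of keyword filtering on page content that Anand describes. It does not use HarvestMan's API at all, and it is not the code from "searchingcrawler.py"; that sample remains the place to look for how a custom crawler hooks into HarvestMan. The function names (compile_filters, page_matches, filter_pages) and the example URLs are hypothetical, chosen only for illustration. It assumes the page text has already been fetched into a dict of URL to content.

```python
import re


def compile_filters(keywords):
    """Compile one case-insensitive, whole-word pattern per keyword."""
    return [
        re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        for word in keywords
    ]


def page_matches(content, filters):
    """Return True if any compiled pattern occurs in the page content."""
    return any(pattern.search(content) for pattern in filters)


def filter_pages(pages, keywords):
    """Return the URLs whose content mentions at least one keyword.

    `pages` is a hypothetical mapping of URL -> page text; a real
    crawler would apply page_matches() as each page is downloaded.
    """
    filters = compile_filters(keywords)
    return [url for url, content in pages.items() if page_matches(content, filters)]


if __name__ == "__main__":
    # Hypothetical sample data standing in for crawled blog pages.
    sample_pages = {
        "http://example.org/post1": "Notes on Python web harvesting",
        "http://example.org/post2": "A recipe for lemon rice",
    }
    print(filter_pages(sample_pages, ["Python"]))
```

The same page_matches() check could be called from whatever per-page hook the custom crawler exposes; to filter on regular expressions instead of plain words, the patterns can be compiled directly without re.escape.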