That's interesting.
I've been working in Python recently, though not on crawling.
But, as ever, the more you get into it the more curious you get.
Did you come up with a solution to a node error?
Are you really talking about a broken link, or are you just saying the
bottom of the tree has been reached?
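
In case it helps pin down which of the two you mean, here's a rough sketch
of how a crawler could tell them apart. Everything in it is made up for
illustration: SITE is a hypothetical in-memory stand-in for the college
site, and fetch_links and crawl are names I picked, not anything from your
app. A real crawler would fetch pages over HTTP and parse the links out of
the HTML instead of looking them up in a dict.

```python
from collections import deque

# Hypothetical in-memory "site": each URL maps to the links found on that page.
# A URL that is linked to but missing from the dict stands in for a broken link.
SITE = {
    "/": ["/depts", "/about"],
    "/depts": ["/depts/cs", "/depts/math"],
    "/depts/cs": [],            # page exists but links nowhere: a leaf
    "/depts/math": ["/gone"],   # links to a page that does not exist
    "/about": [],
}


def fetch_links(url):
    """Return the links on a page, or raise LookupError if the page is missing."""
    if url not in SITE:
        raise LookupError(url)
    return SITE[url]


def crawl(start):
    """Breadth-first crawl that reports leaves and broken links separately."""
    leaves, broken = [], []
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        try:
            links = fetch_links(url)
        except LookupError:
            broken.append(url)   # the fetch itself failed: a broken link
            continue
        if not links:
            leaves.append(url)   # fetch worked, nothing below: bottom of the tree
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return leaves, broken
```

Calling crawl("/") on the toy site gives two separate lists, one of pages
with nothing below them and one of links that could not be fetched, so a
"node error" only gets reported for the second kind.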
On Wed, Mar 4, 2009 at 4:41 PM, Grant Ingersoll wrote:
> You might have a look at Droids (http://incubator.apache.org/droids/) or
> Nutch (http://lucene.apache.org/nutch) and their communities. They are much
> more focused on crawling (not to say there aren't people here who crawl,
> just saying
You might have a look at Droids (http://incubator.apache.org/droids/)
or Nutch (http://lucene.apache.org/nutch) and their communities. They
are much more focused on crawling (not to say there aren't people here
who crawl, just saying those projects are (mostly) about crawling)
On Mar 4, 2
Hi...
Sorry that this is a bit off track. Ok, maybe way off track!
But I don't have anyone to bounce this off of.
I'm working on a crawling project, crawling a college website, to extract
course/class information. I've built a quick test app in Python to crawl the
site. I crawl at the top level