Has anyone successfully integrated a crawler with Lucene? I cannot use Nutch, since 60% of our searchable content is stored in a database. I need a hybrid of database indexing and website crawling. I would be crawling just one domain, limited to a given set of directories.

I found this list of crawlers, but nothing on it quite fits my needs. A couple of the libraries might work, but they use a GNU license, which is a problem for us.
http://www.manageability.org/blog/stuff/open-source-web-crawlers-java/view

Thanks.


