If it has an API that let's you get the content that needs to be indexed, then, sure, you can index from the spider. If it doesn't have an API, presumably, you would need to somehow extract the docs from the files it builds. This is, of course, assuming it stores the crawled files in some proprietary format. If it just downloads them as the original files on your disk, then it is just like any indexing problem of extracting the content from the files.

-Grant

On Jun 24, 2008, at 11:35 PM, yugana wrote:


Hi Otis,

Thanks for the reply. So you mean it is not possible to use Lucene to index
the fetched (Verity Spider Content) content.

Yug


Otis Gospodnetic wrote:

It sounds like you want to check out Nutch - fetched, indexer, searcher,
etc. in one lovely package.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message ----
From: yugana <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Tuesday, June 24, 2008 3:25:03 AM
Subject: Indexing the spider content


Hi All,

I am new to Lucene Search. Can you let me know if it is possible to index the "Verity Spider" content. If possible please let me know how to create
a
index form it and search on it. Also share some code snippets on how to proceed on indexing and searching. I donot have much time to go throught
the
Lucene API. Your help is appreciated.

Thanks,
Yug
--
View this message in context:
http://www.nabble.com/Indexing-the-spider-content-tp18085388p18085388.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




--
View this message in context: 
http://www.nabble.com/Indexing-the-spider-content-tp18085388p18104348.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ








---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to