Hi All,

I have to develop a protoype of a search/indexation system with the following characteristics, 1) High volume of data indexation but only with add and delete functionality (approximatively 10 PDF) => scalable architecture HDFS seems good.
2) Specific analysis chain and a given set of meta-data indexation.
3) Language Recognition
4) No graphical interface for searching is needed, no crawling is needed, Indexation and Search are performed with HTTP Request to a Servlet

What is the best starting choice for this : Lucene or Nutch ?

As far as I know Lucene is a good choice for 2 and 4, Nutch is a better choice for 1 and 3.

Is Nutch as configurable as Lucene regarding the indexation and search process and is it possible to write plug-in for specific analysis ?

Bruno

        

        
                
___________________________________________________________________________ Nouveau : téléphonez moins cher avec Yahoo! Messenger ! Découvez les tarifs exceptionnels pour appeler la France et l'international.
Téléchargez sur http://fr.messenger.yahoo.com

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to