On 4/5/06, Bruno Grilheres <[EMAIL PROTECTED]> wrote:
> 1) High volume of data indexation but only with add and delete
> functionality (approximately 10 PDF) => scalable architecture HDFS
> seems good.
> 2) Specific analysis chain and a given set of meta-data indexation.
> 3) Language Recognition
> 4) No graphical interface for searching is needed, no crawling is
> needed, Indexation and Search are performed with HTTP Request to a Servlet
>
> What is the best starting choice for this: Lucene or Nutch?
>
> As far as I know Lucene is a good choice for 2 and 4, Nutch is a better
> choice for 1 and 3.

Solr would also be good for 2 and 4.

As for 1, what kind of scalability requirements are we talking about
(number of documents, size of documents, etc.)?

-Yonik
http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server