Re: indexing unsupported mime types using Lucene

2008-06-20 Thread Otis Gospodnetic
> To: java-user@lucene.apache.org
> Sent: Friday, June 20, 2008 2:42:55 AM
> Subject: Re: indexing unsupported mime types using Lucene
>
> hi Otis
>
> I haven't tried Tika.
> Is it open source?
>
> Have you heard about LIUS before, or is it tal

Re: indexing unsupported mime types using Lucene

2008-06-19 Thread Gaurav Sharma
Hi Otis,

I haven't tried Tika. Is it open source? Have you heard about LIUS before, or is it talked about in the industry? And what about Solr? It seems you have worked on Solr and Nutch.

Otis Gospodnetic wrote:
> Gaurav, have you tried Tika? (sub-project of Apache Lucene)
>
> Otis
> --
> Sematext -- ht

Re: indexing unsupported mime types using Lucene

2008-06-19 Thread Otis Gospodnetic
Gaurav, have you tried Tika? (sub-project of Apache Lucene)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
> From: Gaurav Sharma <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Wednesday, June 18, 2008 10:07:22 AM
> Subject: indexing

RE: indexing unsupported mime types using Lucene

2008-06-19 Thread Steven A Rowe
Hi Gaurav, To which mime types are you referring? I can't think of a tool designed for this, but one thing you might try is checking whether the input is compressed/packed, and if so first decompressing/unpacking it, and then using the "strings" program (available on Linux and Cygwin) to extra
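
The approach Steven describes (check for compression, unpack, then pull out printable text) could be sketched roughly as below. This is only an illustration in Python, not anything from the thread: the function names `maybe_decompress` and `extract_strings` are made up here, only gzip is detected, and the regex only approximates what the "strings" program does (runs of at least 4 printable ASCII characters). The extracted text could then be handed to whatever indexing code you already have.

```python
import gzip
import re
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return the gzip-decompressed bytes if data looks like gzip, else data unchanged."""
    if data.startswith(GZIP_MAGIC):
        try:
            return gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            # Looked like gzip but was not valid; fall back to the raw bytes.
            return data
    return data


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Collect runs of printable ASCII at least min_len long (a rough 'strings' stand-in)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    raw = b"\x00\x01hello world\x00ab\x00Lucene\xff"
    packed = gzip.compress(raw)
    for text in extract_strings(maybe_decompress(packed)):
        print(text)
```

Other container formats (zip, bzip2, tar) would need their own checks; the sketch leaves those out to stay short.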