RE: indexing pdfs

2007-03-09 Thread Kainth, Sachin
7 02:48 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Hi Sachin, the link you gave me has only a zip file and an exe file for download, and the zip file contains no class files. But wouldn't we need a jar file or class file? On 3/8/07, Kainth, Sachin <[EMAIL PROTECT

Re: indexing pdfs

2007-03-08 Thread ashwin kumar
---Original Message- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 13:07 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Hi again, do we have to download any jar files to run this program? If so, can you give me the link, please? ashwin On 3/8/07, Kainth, Sachin <[E

RE: indexing pdfs

2007-03-08 Thread Kainth, Sachin
Hi, here it is: http://www.seekafile.org/ -----Original Message----- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 13:07 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Hi again, do we have to download any jar files to run this program? If so, can you give me the

Re: indexing pdfs

2007-03-08 Thread ashwin kumar
t in-memory. The only other way I have heard of is to use IFilters. I believe SeekAFile does indexing of PDFs. Sachin -----Original Message----- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 11:35 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Is the only way to index

RE: indexing pdfs

2007-03-08 Thread Kainth, Sachin
kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 11:35 To: java-user@lucene.apache.org Subject: Re: indexing pdfs Is the only way to index PDFs to convert them to text first and then index the text? On 3/8/07, Kainth, Sachin <[EMAIL PROTECTED]> wrote: > > Hi Aswin, > > You can tr

Re: indexing pdfs

2007-03-08 Thread ashwin kumar
ipper = new PDFTextStripper();
    // get text from doc using stripper
    return stripper.getText(doc);
}
Sachin -----Original Message----- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 09:37 To: java-user@lucene.apache.org Subject: indexing pdfs hi can some one

Re: indexing pdfs

2007-03-08 Thread Ulf Dittmer
For DOC files you can use the Jakarta POI library. Text extraction is outlined here: http://jakarta.apache.org/poi/hwpf/quick-guide.html Ulf On 08.03.2007, at 10:37, ashwin kumar wrote: hi can some one help me by giving any sample programs for indexing pdfs and .doc files

RE: indexing pdfs

2007-03-08 Thread Kainth, Sachin
ssage- From: ashwin kumar [mailto:[EMAIL PROTECTED] Sent: 08 March 2007 09:37 To: java-user@lucene.apache.org Subject: indexing pdfs hi can some one help me by giving any sample programs for indexing pdfs and .doc files thanks regards ashwin

indexing pdfs

2007-03-08 Thread ashwin kumar
Hi, can someone help me by giving any sample programs for indexing PDFs and .doc files? Thanks, regards, ashwin
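
For readers skimming the thread: the approach the replies converge on is to extract plain text from the file first and then index that text. Below is a rough, stdlib-only Java sketch of the "index the extracted text" half. It is not Lucene and not anyone's code from the thread; the class name TextIndexSketch and its methods are made up for illustration. In a real setup the text would presumably come from something like PDFBox's PDFTextStripper.getText(doc) (as in the reply above) or POI for .doc files, and the index would be a Lucene index rather than a HashMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical, simplified stand-in for a search index: maps each
 * lower-cased token to the set of document ids that contain it.
 * Assumes the text has already been extracted from the PDF or .doc file.
 */
final class TextIndexSketch {
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Tokenizes already-extracted text and records which document it came from. */
    void add(String docId, String extractedText) {
        String[] tokens = extractedText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first token when text starts with punctuation
            }
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Returns the ids of documents containing the term (case-insensitive), sorted. */
    Set<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(hits);
    }

    public static void main(String[] args) {
        TextIndexSketch index = new TextIndexSketch();
        // The strings below stand in for text pulled out of real files.
        index.add("report.pdf", "Lucene indexes plain text, not PDF bytes.");
        index.add("notes.doc", "Extract the text first, then index it.");
        System.out.println(index.search("text"));
        System.out.println(index.search("lucene"));
    }
}
```

The point of the sketch is only the separation of steps: the indexer never sees the file format, so the same add() call works whether the text came from a PDF or a .doc file.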