RE: Lucene search in attachments

2015-02-10 Thread Uwe Schindler
Hi, > -Original Message- > From: sreedevi s [mailto:sreedevi.payik...@gmail.com] > Sent: Tuesday, February 10, 2015 10:46 AM > To: java-user@lucene.apache.org > Subject: Re: Lucene search in attachments > > Hi Uwe, > Thank you for the info update. I will remove the

Re: Lucene search in attachments

2015-02-10 Thread sreedevi s
Uwe > > - > Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: sreedevi s [mailto:sreedevi.payik...@gmail.com] > > Sent: Tuesday, February 10, 2015

RE: Lucene search in attachments

2015-02-10 Thread Uwe Schindler
.payik...@gmail.com] > Sent: Tuesday, February 10, 2015 9:53 AM > To: java-user@lucene.apache.org > Subject: Re: Lucene search in attachments > > Thank you David. Yes, it has a restriction of characters to 1. > But for large files, what could be done in that case? > > Best Reg

Re: Lucene search in attachments

2015-02-10 Thread sreedevi s
No, David. I can increase the value, or set it to -1 to make it unlimited, but I still cannot be sure that my whole text will be searchable. That remains a problem with large files, because only the part that is indexed will be searchable. I was looking for some alternatives. Best Regards, Sreedevi S

Re: Lucene search in attachments

2015-02-10 Thread David Pilato
I don’t understand. If you don’t raise this restriction to a higher value (or set it to -1), not all of the text will be extracted, so only a subset of the text will be indexed. Non-indexed parts of the text won’t be searchable. Did I misunderstand your question? -- David Pilato | Technical Advocate | Elasti
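
To make the point about the limit concrete, here is a minimal sketch in plain Java. It does not call Tika or Lucene; the class IndexedCharsSketch and its methods extract and searchable are hypothetical names invented for this illustration, and the only behaviour taken from the thread is that a limit truncates the extracted text and that -1 means unlimited.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for "extract up to N characters, then index them".
class IndexedCharsSketch {

    // Reads at most maxChars characters from the reader.
    // A negative maxChars (for example -1) means "no limit".
    public static String extract(Reader reader, int maxChars) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[4096];
        while (maxChars < 0 || out.length() < maxChars) {
            int wanted = maxChars < 0
                    ? buffer.length
                    : Math.min(buffer.length, maxChars - out.length());
            int read = reader.read(buffer, 0, wanted);
            if (read == -1) {
                break; // end of input reached before the limit
            }
            out.append(buffer, 0, read);
        }
        return out.toString();
    }

    // A term can only be found if it falls inside the extracted part.
    public static boolean searchable(String text, int maxChars, String term) {
        try {
            return extract(new StringReader(text), maxChars).contains(term);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "alpha beta gamma";
        System.out.println(searchable(text, 10, "alpha")); // true: inside the first 10 characters
        System.out.println(searchable(text, 10, "gamma")); // false: cut off by the limit
        System.out.println(searchable(text, -1, "gamma")); // true: no limit applied
    }
}
```

With a limit of 10, only "alpha beta" is kept, so "gamma" cannot match. With -1, the whole text is kept, at the cost of holding all of it in memory.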

Re: Lucene search in attachments

2015-02-10 Thread sreedevi s
Thank you David. Yes, it has a restriction of characters to 1. But for large files, what could be done in that case? Best Regards, Sreedevi S On Tue, Feb 10, 2015 at 2:04 PM, David Pilato wrote: > If you don’t index content, you won’t be able to search for it I guess. > That said, Tika can

Re: Lucene search in attachments

2015-02-10 Thread David Pilato
If you don’t index content, you won’t be able to search for it, I guess. That said, Tika can apply a limit to the number of extracted characters. See indexedChars below: tika().parseToString(new BytesStreamInput(content, false), metadata, indexedChars); [1] https://github.com/elasticsearch/elasticsearch-mappe

Lucene search in attachments

2015-02-10 Thread sreedevi s
Hi, What is the best way to search in attachments with Lucene? I am new to Lucene and I am using version 4.10.2. Using Tika, I know I can convert files to text and then index that text as another field. But for large files that will not be the ideal solution. I believe the maximum charact