I don’t understand. If you don’t raise this limit to a higher value (or set it to -1, which removes the limit), not all of the text will be extracted, so only a subset of it will be indexed. The non-indexed parts of the text won’t be searchable.
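To make the effect concrete, here is a rough, stdlib-only Java sketch. It is not Tika's or the plugin's code: the class `IndexedCharsDemo`, its methods, and the sample text are all made up for illustration, and "search" is reduced to a plain substring match. It only assumes what is said above: text past the limit is dropped, and -1 means no limit.

```java
import java.util.Arrays;
import java.util.Locale;

// Illustrative only: mimics the effect of an extracted-characters limit.
// None of these names come from Tika or Elasticsearch.
final class IndexedCharsDemo {

    /** Returns the part of the text kept for a given limit; a negative limit (e.g. -1) keeps everything. */
    static String extract(String fullText, int indexedChars) {
        if (indexedChars < 0 || indexedChars >= fullText.length()) {
            return fullText;
        }
        return fullText.substring(0, indexedChars);
    }

    /** Naive stand-in for a search: case-insensitive substring match on the kept text only. */
    static boolean isSearchable(String fullText, int indexedChars, String term) {
        String kept = extract(fullText, indexedChars).toLowerCase(Locale.ROOT);
        return kept.contains(term.toLowerCase(Locale.ROOT));
    }

    /** Hypothetical large document: "alpha" at the start, "omega" after 20,000 filler characters. */
    static String sampleText() {
        char[] filler = new char[20000];
        Arrays.fill(filler, 'x');
        return "alpha " + new String(filler) + " omega";
    }

    public static void main(String[] args) {
        String text = sampleText();
        // With a 10,000-character limit only the beginning of the document is kept.
        System.out.println("limit 10000, alpha: " + isSearchable(text, 10000, "alpha"));
        System.out.println("limit 10000, omega: " + isSearchable(text, 10000, "omega"));
        // With -1 the whole document is kept, so the term at the end is found too.
        System.out.println("limit -1,    omega: " + isSearchable(text, -1, "omega"));
    }
}
```

With the limit at 10000, a term that only appears near the end of a large file is never kept, so it cannot match. With -1 the whole text is kept.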
Did I misunderstand your question?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr> | @scrutmydocs <https://twitter.com/scrutmydocs>

> On 10 Feb 2015, at 09:52, sreedevi s <sreedevi.payik...@gmail.com> wrote:
>
> Thank you David. Yes, it has a restriction of characters to 10000.
> But for large files, what could be done in that case?
>
> Best Regards,
> Sreedevi S
>
> On Tue, Feb 10, 2015 at 2:04 PM, David Pilato <da...@pilato.fr> wrote:
>
>> If you don’t index content, you won’t be able to search for it I guess.
>> That said, Tika can have this extracted characters limit. See indexedChars
>> below:
>>
>> tika().parseToString(new BytesStreamInput(content, false), metadata, indexedChars);
>>
>> [1] https://github.com/elasticsearch/elasticsearch-mapper-attachments/blob/master/src/main/java/org/elasticsearch/index/mapper/attachment/AttachmentMapper.java#L456
>>
>> --
>> David Pilato | Technical Advocate | Elasticsearch.com
>> @dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr <https://twitter.com/elasticsearchfr> | @scrutmydocs <https://twitter.com/scrutmydocs>
>>
>>> On 10 Feb 2015, at 09:24, sreedevi s <sreedevi.payik...@gmail.com> wrote:
>>>
>>> Hi,
>>> Which is the best method to search in attachments in Lucene? I am new
>>> to Lucene and I am using version 4.10.2. By making use of Tika, I know I
>>> can convert files to text and then index it as another field. But for large
>>> files that will not be the ideal solution. I believe the maximum characters
>>> per field is 10,000. So, what would be the ideal method to search attachments
>>> then?
>>>
>>> Best Regards,
>>> Sreedevi S