Re: Regarding Compression Tool

2013-09-18 Thread Jebarlin Robertson
Hi, thanks Mark Miller for your advice. I had missed part of it, which is why I could not get the proper value. I should use getBinaryValue() instead of get() for compressed content. I tested all the scenarios and I have some doubts: 1. I observed that while searching with highlighter tool, it

Re: Regarding Compression Tool

2013-09-17 Thread Jebarlin Robertson
Thanks Mark. I know all these scenarios about battery and space, but at the same time I am just checking the feasibility. Actually, I started this thread to ask how to use the CompressionTool to compress the data and store it in the index. I observed the things below and tried it this way: * Field field =

Re: Regarding Compression Tool

2013-09-16 Thread Mark Miller
Have you considered storing your indexes server-side? I haven't used compression but usually the trade-off of compression is CPU usage which will also be a drain on battery life. Or maybe consider how important the highlighter is to your users - is it worth the trade-off of either disk space or bat

Re: Regarding Compression Tool

2013-09-16 Thread Jebarlin Robertson
I am using Apache Lucene on Android. I have around 1 GB of text documents (logs). When I index these text documents using *new Field(ContentIndex.KEY_TEXTCONTENT, contents, Field.Store.YES, Field.Index.ANALYZED, TermVector.WITH_POSITIONS_OFFSETS)*, the index directory is consuming 1.59GB memory

Re: Regarding Compression Tool

2013-09-14 Thread Erick Erickson
bq: I thought that I can use the CompressionTool to minimize the memory size. This doesn't make a lot of sense. Highlighting needs the raw data to figure out what to highlight, so I don't see how the CompressionTool will help you there. And unless you have a huge document and only a very few of t

Re: Regarding Compression Tool

2013-09-14 Thread Jebarlin Robertson
Thank you very much Erick. Actually I was using the Highlighter tool, which needs the entire data to be stored to get the relevant searched sentence. But when I use that, it was consuming more memory (indexed data size + Store.YES - the entire content) than the actual document size. I thought that I c

Re: Regarding Compression Tool

2013-09-13 Thread Erick Erickson
Compression is for the _stored_ data, which is not searched. Ignore the compression and ensure that you index the data. The compressing/decompressing for looking at stored values is, I believe, done at a very low level that you don't need to care about at all. If you index the data in the field,

Re: Regarding Compression Tool

2013-09-13 Thread Ian Lea
Are you talking about CompressionTools as in http://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/document/CompressionTools.html? They've long been superseded by a completely different, low-level, transparent compression method. Anyway, use them to compress stored fields, not fields you
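
To make the stored-field round trip discussed in this thread concrete, here is a minimal sketch using only java.util.zip. It is an assumption that this resembles what CompressionTools does internally; the class and method names below are hypothetical and are not part of Lucene. The idea is that the compressed bytes would go into a stored binary field, and the reader would have to fetch the bytes and decompress them before handing text to the highlighter.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not a Lucene class: compresses text before it is
// stored and restores it after it is read back as bytes.
class StoredFieldCompressionSketch {

    // Deflate the UTF-8 bytes of the text at the highest compression level.
    static byte[] compress(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    // Inflate bytes produced by compress() and decode them as UTF-8.
    static String decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid compressed data");
                }
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Repetitive log-like text, similar in spirit to the logs in this thread.
        StringBuilder log = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            log.append("2013-09-18 INFO request handled in 12 ms\n");
        }
        String original = log.toString();
        byte[] stored = compress(original);
        String restored = decompress(stored);
        System.out.println(original.length() + " chars -> " + stored.length + " bytes");
        System.out.println("round trip ok: " + original.equals(restored));
    }
}
```

The sketch also shows the trade-off raised earlier in the thread: the stored bytes are smaller, but every read pays CPU time for decompress(), and only the stored copy shrinks, not the indexed terms or term vectors.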