Re: Compression algorithm for posting lists

2016-04-03 Thread Vishwas Jain
increase significantly. On Thu, 31 Mar 2016 at 14:08, Vishwas Jain wrote: > Hi Adrien, > Thanks for the help, actually we are trying to compress the > actual posting lists. Our main aim here is to save the disk space as much > as possible occupied by the index crea

Re: Compression algorithm for posting lists

2016-03-31 Thread Vishwas Jain
options? Yours, Vishwas Jain 13CS10053 Computer Science and Engineering IIT Kharagpur Contact - +91 9800168231 On Tue, Mar 29, 2016 at 1:41 PM, Adrien Grand wrote: > BlockTreeTermsWriter.TermsWriter.finish writes a FST that serves as an > index of the terms dictionary. It will be used at

Re: Compression algorithm for posting lists

2016-03-28 Thread Vishwas Jain
BlockTreeTermsWriter.java file do? I have come to understand that the posting lists (document ID, frequency, etc.) are mainly written using the WriteBlock method in the ForUtil.java file... Thanks.. On Mon, Mar 28, 2016 at 5:31 PM, Vishwas Jain wrote: > Thanks for the reply and informat
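
For readers following the thread, the general idea behind block-based posting-list encoding can be sketched as below: sorted document IDs are turned into gaps (deltas), and each block of gaps is packed using only as many bits as its largest value needs. This is a minimal illustration under stated assumptions, not the code in ForUtil.java; the class DeltaBitPacking and all of its method names are hypothetical and exist only for this example.

```java
import java.util.Arrays;

// Hypothetical illustration only: not Lucene's ForUtil, just the general
// delta + fixed-width bit-packing idea applied to one block of doc IDs.
final class DeltaBitPacking {

    // Convert sorted, non-negative doc IDs into gaps between neighbours.
    static int[] toDeltas(int[] sortedDocIds) {
        int[] deltas = new int[sortedDocIds.length];
        int prev = 0;
        for (int i = 0; i < sortedDocIds.length; i++) {
            deltas[i] = sortedDocIds[i] - prev;
            prev = sortedDocIds[i];
        }
        return deltas;
    }

    // Rebuild the original doc IDs with a running sum over the gaps.
    static int[] fromDeltas(int[] deltas) {
        int[] docIds = new int[deltas.length];
        int sum = 0;
        for (int i = 0; i < deltas.length; i++) {
            sum += deltas[i];
            docIds[i] = sum;
        }
        return docIds;
    }

    // Number of bits needed for the largest value in the block (at least 1).
    static int bitsRequired(int[] values) {
        int or = 0;
        for (int v : values) {
            or |= v; // OR keeps the highest set bit of any value
        }
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(or));
    }

    // Pack each non-negative value into exactly `bits` bits, back to back.
    static long[] pack(int[] values, int bits) {
        long[] out = new long[(values.length * bits + 63) / 64];
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * bits;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            out[word] |= ((long) values[i]) << shift;
            if (shift + bits > 64) {
                // Value straddles two words: spill the high bits over.
                out[word + 1] |= ((long) values[i]) >>> (64 - shift);
            }
        }
        return out;
    }

    // Inverse of pack: read `count` values of `bits` bits each.
    static int[] unpack(long[] packed, int count, int bits) {
        long mask = (1L << bits) - 1;
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            long bitPos = (long) i * bits;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = packed[word] >>> shift;
            if (shift + bits > 64) {
                v |= packed[word + 1] << (64 - shift);
            }
            values[i] = (int) (v & mask);
        }
        return values;
    }

    public static void main(String[] args) {
        int[] docIds = {3, 7, 8, 15, 1000, 1003};
        int[] deltas = toDeltas(docIds);
        int bits = bitsRequired(deltas);
        long[] packed = pack(deltas, bits);
        int[] restored = fromDeltas(unpack(packed, deltas.length, bits));
        System.out.println("bits per value: " + bits);
        System.out.println("round trip ok: " + Arrays.equals(docIds, restored));
    }
}
```

Because the gaps in a dense posting list are small, the per-value bit width is usually far below 32, which is one reason a general-purpose byte-oriented compressor layered on top may not gain much further.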

Re: Compression algorithm for posting lists

2016-03-28 Thread Vishwas Jain
(and one of my personal projects is aimed at doing exactly > this), but I doubt LZ4 compressing the posting list would help all that > much. > > Hope this helps > > On Mon, Mar 28, 2016, at 10:51 AM, Vishwas Jain wrote: > > Hello , > > > > We are try

Compression algorithm for posting lists

2016-03-28 Thread Vishwas Jain
Hello, We are trying to implement better compression techniques in the lucene54 codec of Apache Lucene. Currently there is no such compression for posting lists in the lucene54 codec, but the LZ4 compression technique is used for stored fields. Does anyone know why there is no compression technique

Re: Single string automaton causes NPE on Terms.intersect( CompiledAutomaton, BytesRef term )

2016-03-26 Thread Vishwas Jain
Hello, We are trying to implement better compression techniques in the lucene54 codec of Apache Lucene. Currently there is no such compression for posting lists in the lucene54 codec, but the LZ4 compression technique is used for stored fields. Does anyone know why there is no compression technique f