I am exploring alternative field compression schemes designed to perform
more effectively on small documents.  In particular, they aim to compress
stored fields that show repetitive structure across fields, but not
necessarily within a field.  I have been working with a significant index
from the ShopStyle search/shopping engine, and have achieved compression
rates exceeding those of gzip.  The work related to this project is at
https://github.com/gtoubassi/femtozip.
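
To illustrate the general idea of exploiting structure shared across small
documents, here is a minimal sketch using the preset-dictionary feature of
Python's zlib module.  This is not femtozip's API or algorithm; the sample
documents, the naive "concatenate the samples" dictionary, and the helper
names compress_with_dict/decompress_with_dict are all assumptions made for
illustration only.

```python
import zlib

# Hypothetical sample of small stored fields with a shared structure.
sample_docs = [
    b'{"brand": "Acme", "category": "shoes", "price": 59.99, "in_stock": true}',
    b'{"brand": "Zenith", "category": "bags", "price": 120.00, "in_stock": false}',
    b'{"brand": "Orbit", "category": "shoes", "price": 75.50, "in_stock": true}',
]

# Naive shared dictionary: just the concatenated samples (an assumption;
# a real tool would select the most useful substrings instead).
shared_dict = b"".join(sample_docs)


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Deflate `data`, letting it back-reference content in `zdict`."""
    compressor = zlib.compressobj(level=9, zdict=zdict)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Inverse of compress_with_dict; requires the same dictionary."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


# A new small document that repeats the structure of the samples.
doc = b'{"brand": "Acme", "category": "bags", "price": 42.00, "in_stock": true}'

plain = zlib.compress(doc, 9)                      # no shared context
with_dict = compress_with_dict(doc, shared_dict)   # shared context

print(len(doc), len(plain), len(with_dict))
```

On a document this small, plain deflate has almost no repetition to work
with inside the document itself, while the dictionary version can refer
back to the field names and values seen in the samples.  The same
dictionary must be available at decompression time.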

I'd like to connect with Lucene users who would benefit from more effective
compression of their stored fields, to assess the applicability of femtozip.
Note that as part of the project I have a simple tool that can analyze an
index to compare the compression rates of femtozip and gzip
(https://github.com/gtoubassi/femtozip/wiki/Indexanalyzer).

gt

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
