I am exploring alternative field compression schemes designed to perform more effectively on small documents. In particular, the goal is to better compress stored fields that show repetitive structure across fields, but not necessarily within a single field. I have been working with a significant index from the ShopStyle search/shopping engine, and have achieved compression rates exceeding those of gzip. The work related to this project is at https://github.com/gtoubassi/femtozip.
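To illustrate the general idea of exploiting structure shared across small documents, here is a minimal sketch using the JDK's preset-dictionary support in java.util.zip. This is not femtozip's API or algorithm. The class name, the sample document, and the hand-written dictionary are all made up for the example; a real dictionary would be derived from sample documents rather than typed in by hand.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class: compares DEFLATE with and without a shared preset dictionary.
class SharedDictionaryDemo {

    // Assumed sample data: a dictionary holding the structure common to all documents,
    // and one small document that follows that structure.
    static final byte[] DICTIONARY =
        "{\"brand\":\"\",\"category\":\"\",\"price\":,\"currency\":\"USD\",\"inStock\":true}"
            .getBytes(StandardCharsets.UTF_8);
    static final byte[] DOCUMENT =
        "{\"brand\":\"Acme\",\"category\":\"shoes\",\"price\":59.99,\"currency\":\"USD\",\"inStock\":true}"
            .getBytes(StandardCharsets.UTF_8);

    // Compress data; if dictionary is non-null, prime the compressor with it first.
    static byte[] compress(byte[] data, byte[] dictionary) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        if (dictionary != null) {
            deflater.setDictionary(dictionary);
        }
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress into a buffer of known length, supplying the dictionary when the stream asks for it.
    static byte[] decompress(byte[] compressed, byte[] dictionary, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] result = new byte[originalLength];
        int off = 0;
        try {
            while (off < result.length && !inflater.finished()) {
                int n = inflater.inflate(result, off, result.length - off);
                if (n == 0) {
                    if (inflater.needsDictionary()) {
                        inflater.setDictionary(dictionary);
                    } else if (inflater.needsInput()) {
                        break; // truncated input; nothing more to read
                    }
                }
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] plain = compress(DOCUMENT, null);
        byte[] primed = compress(DOCUMENT, DICTIONARY);
        System.out.println("original bytes:          " + DOCUMENT.length);
        System.out.println("deflate, no dictionary:  " + plain.length);
        System.out.println("deflate, with dictionary: " + primed.length);
    }
}
```

The document has almost no repetition within itself, so plain DEFLATE gains little; most of the savings come from matching the field names and punctuation against the shared dictionary. The same dictionary must be available at decompression time.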
I'd like to connect with Lucene users who would benefit from more effective compression of their stored fields, to assess whether femtozip is applicable. As part of the project I have a simple tool that analyzes an index and compares the compression rates of femtozip and gzip (https://github.com/gtoubassi/femtozip/wiki/Indexanalyzer).

gt