I haven't heard from any other core contributors, but this sounds like a
worthy project to me.  Someone from the ZFS team should follow through
to create the project on os.org[1].

It sounds like Domingos and Roland might constitute the initial
"project team".

- Eric

[1] http://www.opensolaris.org/os/community/ogb/policies/project-instantiation.txt

On Sun, Oct 07, 2007 at 03:56:04PM -0300, Domingos Soares wrote:
> Hi,
> 
> No news. I received some very good suggestions, but unfortunately I
> didn't get as much discussion as I had hoped. I'm sending the
> project proposal again. I think that there are a lot of interesting
> things to research and develop regarding the subject and I hope this
> time we discuss it a bit more. I would like to point out
> Adam Leventhal's suggestion of an adaptive compression scheme: I think
> it would be a challenging and interesting direction to take. Besides,
> there are some new results about BWT that I'm sure would be of
> interest in this context.
> 
> 
> Kind Regards,
> 
> Domingos.
> 
> The text of my original proposal follows:
> 
> -----------------------------------------------------------------------------------------------------------------
> 
> Below follows a proposal for a new OpenSolaris project. Of course,
> this is open to change, since I just wrote down some ideas I had months
> ago while researching the topic as a graduate student in Computer
> Science, and since I'm not an OpenSolaris/ZFS expert at all. I would
> really appreciate any suggestions or comments.
> 
> PROJECT PROPOSAL: ZFS Compression Algorithms.
> 
> The main purpose of this project is the development of new
> compression schemes for the ZFS file system. We plan to start with
> the development of a fast implementation of a Burrows-Wheeler
> Transform (BWT) based algorithm. BWT is an outstanding tool,
> and the currently known lossless compression algorithms based on it
> achieve better compression ratios than algorithms derived from the
> well-known Ziv-Lempel algorithm, while being a little more expensive
> in time and space. There is room for improvement, however: recent
> results show that the running time and space needs of such algorithms
> can be significantly reduced, and the same results suggest that BWT is
> likely to become the new standard in compression
> algorithms[1]. Suffix sorting (i.e. the problem of sorting the suffixes
> of a given string) is the main bottleneck of BWT, and significant
> progress has been made in this area since the first algorithms of
> Manber and Myers[2] and Larsson and Sadakane[3], notably the new
> linear-time algorithms of Kärkkäinen and Sanders[4]; Kim, Sim, Park and
> Park[5]; and Ko and Aluru[6]; and also the promising O(n log n)
> algorithm of Kärkkäinen and Burkhardt[7].
> 
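To make the role of suffix sorting concrete, here is a minimal sketch in
Python. It is not taken from the proposal: it uses a naive comparison sort
rather than any of the algorithms cited in [2]-[7], and the names bwt,
inverse_bwt and the 0x00 sentinel byte are assumptions made for illustration.

```python
def suffix_array(s: bytes) -> list:
    """Naive suffix sort, for illustration only (quadratic or worse).

    The linear-time constructions cited in the proposal would replace this step.
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


def bwt(data: bytes, sentinel: bytes = b"\x00") -> bytes:
    """Burrows-Wheeler transform of data followed by a unique sentinel byte.

    Assumption: the sentinel does not occur in data and sorts before every
    other byte, so sorting suffixes is equivalent to sorting rotations.
    """
    if sentinel in data:
        raise ValueError("sentinel byte must not occur in the input")
    s = data + sentinel
    # For each suffix in sorted order, emit the byte that precedes it
    # (index -1 wraps around to the sentinel for the suffix at position 0).
    return bytes(s[i - 1] for i in suffix_array(s))


def inverse_bwt(last: bytes, sentinel: bytes = b"\x00") -> bytes:
    """Invert bwt() by following the mapping between first and last columns."""
    # A stable sort of positions by byte value pairs the k-th byte of the
    # (implicit) sorted first column with its position in the last column.
    order = sorted(range(len(last)), key=lambda i: last[i])
    row = last.index(sentinel)  # the row holding the original string
    out = bytearray()
    for _ in range(len(last)):
        row = order[row]
        out.append(last[row])
    return bytes(out[:-1])  # drop the trailing sentinel


if __name__ == "__main__":
    transformed = bwt(b"banana")
    print(transformed)
    print(inverse_bwt(transformed))
```

The transform itself is a single pass over the suffix array, which is why
the cost of building that array dominates. A real compressor would follow
the transform with further stages (for example move-to-front and an
entropy coder), which this sketch leaves out.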
> As a conjecture, we believe that some intrinsic properties of ZFS and
> file systems in general (e.g. sparseness and data entropy in blocks)
> could be exploited in order to produce brand new and really efficient
> compression algorithms, as well as to adapt existing ones to
> the task. The study might be extended to the analysis of data in
> specific applications (e.g. web servers, mail servers and others) in
> order to develop compression schemes for specific environments and/or
> modify the existing Ziv-Lempel based scheme to deal better with such
> environments.
> 
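As a rough illustration of the block properties mentioned above, the
following Python sketch measures the byte entropy and the fraction of zero
bytes in a block. It is only an assumption about what such an analysis
could look like: the function names and the 7.5 bits/byte threshold are
hypothetical and not part of ZFS or of the proposal.

```python
import math
from collections import Counter


def byte_entropy(block: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in Counter(block).values())


def zero_fraction(block: bytes) -> float:
    """Fraction of bytes equal to zero, as a crude measure of sparseness."""
    return block.count(0) / len(block) if block else 0.0


def worth_compressing(block: bytes, max_entropy: float = 7.5) -> bool:
    """Hypothetical policy: skip blocks whose bytes already look close to random.

    The threshold is an arbitrary placeholder, not a measured value.
    """
    return byte_entropy(block) < max_entropy


if __name__ == "__main__":
    sparse = bytes(4096)              # an all-zero block
    uniform = bytes(range(256)) * 16  # every byte value equally frequent
    for name, block in (("sparse", sparse), ("uniform", uniform)):
        print(name, byte_entropy(block), zero_fraction(block),
              worth_compressing(block))
```

Byte-level entropy ignores repeated substrings, so it is only a first
approximation of how well a Ziv-Lempel or BWT based scheme would do on a
given block.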
> [1] "The Burrows-Wheeler Transform: Theory and Practice". Manzini,
> Giovanni. Proc. 24th Int. Symposium on Mathematical Foundations of
> Computer Science
>
> [2] "Suffix Arrays: A New Method for On-Line String Searches". Manber,
> Udi and Myers, Eugene W. SIAM Journal on Computing, Vol. 22 Issue 5, 1990
>
> [3] "Faster suffix sorting". Larsson, N. Jasper and Sadakane,
> Kunihiko. Technical report, Department of Computer Science, Lund
> University, 1999
>
> [4] "Simple Linear Work Suffix Array Construction". Kärkkäinen, Juha
> and Sanders, Peter. Proc. 13th International Conference on Automata,
> Languages and Programming, 2003
>
> [5] "Linear-time construction of suffix arrays". Kim, D.K., Sim, J.S.,
> Park, H. and Park, K. CPM, LNCS, Vol. 2676, 2003
>
> [6] "Space efficient linear time construction of suffix arrays". Ko, P.
> and Aluru, S. CPM 2003
>
> [7] "Fast Lightweight Suffix Array Construction and
> Checking". Burkhardt, Stefan and Kärkkäinen, Juha. 14th Annual
> Symposium, CPM 2003
> 
> 
> Domingos Soares Neto
> University of Sao Paulo
> Institute of Mathematics and Statistics
> 
> and
> 
> IBM Software Group.
> 
> __________________________________________________________________________
> 
> 
> On 10/7/07, roland <[EMAIL PROTECTED]> wrote:
> > any news on additional compression-schemes for zfs ?
> >
> > this is interesting research-topic, imho :)
> >
> > so, some more real-world tests with zfs-fuse + lzo patch :
> >
> > -LZO------------------------------------------------
> > zfs set compression=lzo mypool
> >
> > time cp /vmware/vserver1/vserver1.vmdk /mypool
> >
> > real    7m8.540s
> > user    0m0.708s
> > sys     0m24.839s
> >
> > zfs get compressratio mypool
> > NAME    PROPERTY       VALUE   SOURCE
> > mypool  compressratio  1.74x   -
> >
> > 1.7G    vserver1.vmdk  compressed
> > 3.0G    vserver1.vmdk  uncompressed
> >
> > -LZJB------------------------------------------------
> > zfs set compression=lzjb mypool
> >
> > time cp /vmware/vserver1/vserver1.vmdk /mypool
> >
> > real    7m16.392s
> > user    0m0.709s
> > sys     0m25.107s
> >
> > zfs get compressratio mypool
> > NAME    PROPERTY       VALUE   SOURCE
> > mypool  compressratio  1.47x   -
> >
> > 2.0G    vserver1.vmdk compressed
> > 3.0G    vserver1.vmdk uncompressed
> >
> > -GZIP------------------------------------------------
> > zfs set compression=gzip mypool
> >
> > time cp /vmware/vserver1/vserver1.vmdk /mypool/
> >
> > real    12m54.183s
> > user    0m0.653s
> > sys     0m24.933s
> >
> > zfs get compressratio
> > NAME    PROPERTY       VALUE   SOURCE
> > mypool  compressratio  2.02x   -
> >
> > 1.5G    vserver1.vmdk    compressed
> > 3.0G    vserver1.vmdk    uncompressed
> >
> >
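For easier comparison, the three runs above can be reduced to an
end-to-end copy rate with a few lines of Python. This is only a sketch:
the timings and ratios are copied from the output above, "3.0G" is assumed
to mean GiB, and the helper name seconds is made up for this example.

```python
# Wall-clock times and compressratio values copied from the runs above.
RUNS = {
    "lzo": {"real": "7m8.540s", "ratio": 1.74},
    "lzjb": {"real": "7m16.392s", "ratio": 1.47},
    "gzip": {"real": "12m54.183s", "ratio": 2.02},
}

# Uncompressed size of vserver1.vmdk, assuming "3.0G" means 3.0 GiB.
SIZE_MIB = 3.0 * 1024


def seconds(real: str) -> float:
    """Convert the 'real' field of time(1) output, e.g. '7m8.540s', to seconds."""
    minutes, rest = real.rstrip("s").split("m")
    return int(minutes) * 60 + float(rest)


for name, run in RUNS.items():
    elapsed = seconds(run["real"])
    rate = SIZE_MIB / elapsed
    print(f"{name:5s} {elapsed:7.1f} s  {rate:5.2f} MiB/s  ratio {run['ratio']:.2f}x")
```

On these figures LZO and LZJB both copy at roughly 7 MiB/s and gzip at
roughly 4 MiB/s. The rate covers the whole cp, including reading the
source file, so it is not a measure of compressor speed alone.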
> > btw - lzo-patch for zfs-fuse (does apply to latest zfs-fuse sources) is at 
> > http://groups.google.com/group/zfs-fuse/attach/a489f630aa4aa189/zfs-lzo.diff.bz2?part=4
> >
> >
> > This message posted from opensolaris.org
> > _______________________________________________
> > zfs-discuss mailing list
> > zfs-discuss@opensolaris.org
> > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> >

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock