I haven't heard from any other core contributors, but this sounds like a worthy project to me. Someone from the ZFS team should follow through to create the project on os.org[1].
It sounds like Domingos and Roland might constitute the initial
"project team".

- Eric

[1] http://www.opensolaris.org/os/community/ogb/policies/project-instantiation.txt

On Sun, Oct 07, 2007 at 03:56:04PM -0300, Domingos Soares wrote:
> Hi,
>
> No news. I received some very good suggestions, but unfortunately I
> didn't get as much discussion as I had hoped. I'm sending the
> project proposal again. I think that there are a lot of interesting
> things to research and develop regarding the subject and I hope this
> time we discuss it a bit more. I would like to point out
> Adam Leventhal's suggestion of an adaptive compression scheme: I think
> it would be a challenging and interesting direction to take. Besides,
> there are some new results about BWT that I'm sure would be of
> interest in this context.
>
>
> Kind Regards,
>
> Domingos.
>
> The text of my original proposal follows:
>
> ----------------------------------------------------------------------
>
> Below is a proposal for a new opensolaris project. Of course,
> this is open to change since I just wrote down some ideas I had months
> ago, while researching the topic as a graduate student in Computer
> Science, and since I'm not an opensolaris/ZFS expert at all. I would
> really appreciate any suggestions or comments.
>
> PROJECT PROPOSAL: ZFS Compression Algorithms.
>
> The main purpose of this project is the development of new
> compression schemes for the ZFS file system. We plan to start with
> the development of a fast implementation of a Burrows Wheeler
> Transform based algorithm (BWT). BWT is an outstanding tool
> and the currently known lossless compression algorithms
> based on it achieve better compression ratios than algorithms derived
> from the well-known Ziv-Lempel algorithm, while being somewhat more
> expensive in time and space.
Therefore, there is room for improvement: recent results
> show that the running time and space needs of such algorithms can be
> significantly reduced and the same results suggest that BWT is
> likely to become the new standard in compression
> algorithms[1]. Suffix sorting (i.e. the problem of sorting the suffixes
> of a given string) is the main bottleneck of BWT, and significant
> progress has been made in this area since the first algorithms of
> Manber and Myers[2] and Larsson and Sadakane[3], notably the new
> linear time algorithms of Karkkainen and Sanders[4]; Kim, Sim, Park and
> Park[5]; and Ko and Aluru[6], and also the promising O(n log n)
> algorithm of Karkkainen and Burkhardt[7].
>
> As a conjecture, we believe that some intrinsic properties of ZFS and
> file systems in general (e.g. sparseness and data entropy in blocks)
> could be exploited in order to produce brand new and really efficient
> compression algorithms, as well as to adapt existing ones to
> the task. The study might be extended to the analysis of data in
> specific applications (e.g. web servers, mail servers and others) in
> order to develop compression schemes for specific environments and/or
> modify the existing Ziv-Lempel based scheme to deal better with such
> environments.
>
> [1] "The Burrows-Wheeler Transform: Theory and Practice". Manzini,
> Giovanni. Proc. 24th Int. Symposium on Mathematical Foundations of
> Computer Science
>
> [2] "Suffix Arrays: A New Method for
> On-Line String Searches". Manber, Udi and Myers, Eugene W. SIAM
> Journal on Computing, Vol. 22 Issue 5. 1993
>
> [3] "Faster suffix sorting". Larsson, N Jasper and Sadakane,
> Kunihiko. TECHREPORT, Department of Computer Science, Lund University,
> 1999
>
> [4] "Simple Linear Work Suffix Array Construction". Karkkainen, Juha
> and Sanders, Peter. Proc. 30th International Colloquium on Automata,
> Languages and Programming, 2003
>
> [5] "Linear-time construction of suffix arrays". D.K. Kim, J.S. Sim,
> H.
Park, K. Park, CPM, LNCS, Vol. 2676, 2003
>
> [6] "Space efficient linear time construction of suffix arrays". P. Ko and
> S. Aluru, CPM 2003.
>
> [7] "Fast Lightweight Suffix Array Construction and
> Checking". Burkhardt, Stefan and Kärkkäinen, Juha. 14th Annual
> Symposium, CPM 2003
>
>
> Domingos Soares Neto
> University of Sao Paulo
> Institute of Mathematics and Statistics
>
> and
>
> IBM Software Group.
>
> ______________________________________________________________________
>
>
> On 10/7/07, roland <[EMAIL PROTECTED]> wrote:
> > any news on additional compression-schemes for zfs ?
> >
> > this is interesting research-topic, imho :)
> >
> > so, some more real-world tests with zfs-fuse + lzo patch :
> >
> > -LZO------------------------------------------------
> > zfs set compression=lzo mypool
> >
> > time cp /vmware/vserver1/vserver1.vmdk /mypool
> >
> > real    7m8.540s
> > user    0m0.708s
> > sys     0m24.839s
> >
> > zfs get compressratio mypool
> > NAME    PROPERTY       VALUE  SOURCE
> > mypool  compressratio  1.74x  -
> >
> > 1.7G vserver1.vmdk compressed
> > 3.0G vserver1.vmdk uncompressed
> >
> > -LZJB------------------------------------------------
> > zfs set compression=lzjb mypool
> >
> > time cp /vmware/vserver1/vserver1.vmdk /mypool
> >
> > real    7m16.392s
> > user    0m0.709s
> > sys     0m25.107s
> >
> > zfs get compressratio mypool
> > NAME    PROPERTY       VALUE  SOURCE
> > mypool  compressratio  1.47x  -
> >
> > 2.0G vserver1.vmdk compressed
> > 3.0G vserver1.vmdk uncompressed
> >
> > -GZIP------------------------------------------------
> > zfs set compression=gzip mypool
> >
> > time cp /vmware/vserver1/vserver1.vmdk /mypool/
> >
> > real    12m54.183s
> > user    0m0.653s
> > sys     0m24.933s
> >
> > zfs get compressratio
> > NAME    PROPERTY       VALUE  SOURCE
> > mypool  compressratio  2.02x  -
> >
> > 1.5G vserver1.vmdk compressed
> > 3.0G vserver1.vmdk uncompressed
> >
> >
> > btw - lzo-patch for zfs-fuse (does apply to latest zfs-fuse sources) is at
> >
http://groups.google.com/group/zfs-fuse/attach/a489f630aa4aa189/zfs-lzo.diff.bz2?part=4
> >
> >
> > This message posted from opensolaris.org
> > _______________________________________________
> > zfs-discuss mailing list
> > zfs-discuss@opensolaris.org
> > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> >

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
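
For readers following the proposal's point that suffix sorting is the main bottleneck of BWT, a minimal Python sketch may help. It is an illustration only and makes several assumptions that are not in the thread: the function names bwt and inverse_bwt are hypothetical, a NUL character is used as the end-of-string sentinel, and the suffixes are sorted with a naive comparison sort rather than any of the linear-time algorithms cited in [4]-[6]. It is not ZFS code and not the proposal authors' implementation.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Burrows-Wheeler Transform using a naive suffix sort (illustration only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Suffix sorting: this is the expensive step. Sorting by slices costs
    # O(n^2 log n) character comparisons in the worst case, which is what
    # the specialised suffix-array algorithms are designed to avoid.
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # The transform is the character that precedes each sorted suffix
    # (index -1 wraps around to the sentinel for the suffix starting at 0).
    return "".join(s[i - 1] for i in suffix_array)


def inverse_bwt(transformed: str, sentinel: str = "\0") -> str:
    """Invert bwt() by following the first-column/last-column mapping."""
    n = len(transformed)
    # A stable sort of positions by character gives, for each row of the
    # sorted first column, the row whose last character is that same
    # occurrence. Python's sorted() is stable, which this relies on.
    order = sorted(range(n), key=lambda i: transformed[i])
    row = transformed.index(sentinel)  # row holding the original string
    out = []
    for _ in range(n):
        row = order[row]
        out.append(transformed[row])
    # The last character recovered is the sentinel itself; drop it.
    return "".join(out[:-1])


if __name__ == "__main__":
    encoded = bwt("banana")
    print(repr(encoded))
    print(inverse_bwt(encoded))
```

The transform only reorders the input so that equal characters tend to cluster; an actual compressor would follow it with further stages (for example move-to-front and an entropy coder), which are outside the scope of this sketch.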