Hi, I have a software product that is built a few times a day (continuous-integration style). The end product is an installable tar.gz containing many Java jars.
Since the contents of the tar.gz files are mostly the same from build to build, I want to use a filesystem that would dedupe the duplicated content. As I see it, it's a FUSE filesystem that:

1. When a file with a .tar.gz extension is stored, untars it and stores the contents in a folder (keeping the file order in a list).
2. When the file is read again, tars and gzips the underlying folder and returns the gzip'd result.
3. Keeps a list of file hashes, and replaces a file with a symlink to an identical file when possible.
4. Bonus: does the same for jars. Java is linked at runtime, so if a .java file didn't change, neither does its class.

Is there anything like that available? Is there a smarter solution?

(It is theoretically possible to save a folder instead of a tar.gz and dedupe at a higher level, but it's much easier to use a tar.gz, since it plays well with existing Java software, e.g. nexus/artifactory, maven, etc.)
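To make steps 1 to 3 concrete, here is a rough Python sketch of the storage side only, without the FUSE layer. It is an illustration under assumptions, not an existing tool: the names store_archive and rebuild_archive, the "blobs" directory, and the manifest layout are all made up for this example. It also points each manifest entry at a blob named by its SHA-256 hash instead of using symlinks, and it handles only regular files and directories.

```python
import hashlib
import os
import tarfile


def store_archive(archive_path, store_dir):
    """Unpack a .tar.gz into a content-addressed store and return a manifest.

    Hypothetical helper: each regular file is saved once under
    store_dir/blobs/<sha256>, so identical files from different builds
    share a single copy on disk.
    """
    blobs = os.path.join(store_dir, "blobs")
    os.makedirs(blobs, exist_ok=True)
    manifest = []  # keeps the original member order (step 1)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if not (member.isfile() or member.isdir()):
                continue  # symlinks, devices etc. are skipped in this sketch
            entry = {
                "name": member.name,
                "mode": member.mode,
                "mtime": member.mtime,
                "is_dir": member.isdir(),
            }
            if member.isfile():
                data = tar.extractfile(member).read()
                digest = hashlib.sha256(data).hexdigest()
                blob_path = os.path.join(blobs, digest)
                if not os.path.exists(blob_path):  # dedupe by hash (step 3)
                    with open(blob_path, "wb") as out:
                        out.write(data)
                entry["sha256"] = digest
            manifest.append(entry)
    return manifest


def rebuild_archive(manifest, store_dir, out_path):
    """Recreate a .tar.gz from a manifest and the blob store (step 2)."""
    blobs = os.path.join(store_dir, "blobs")
    with tarfile.open(out_path, "w:gz") as tar:
        for entry in manifest:
            info = tarfile.TarInfo(entry["name"])
            info.mode = entry["mode"]
            info.mtime = entry["mtime"]
            if entry["is_dir"]:
                info.type = tarfile.DIRTYPE
                tar.addfile(info)
            else:
                blob_path = os.path.join(blobs, entry["sha256"])
                info.size = os.path.getsize(blob_path)
                with open(blob_path, "rb") as src:
                    tar.addfile(info, src)
```

Note that the rebuilt archive has the same members in the same order, but it is not guaranteed to be byte-for-byte identical to the original tar.gz, since owner fields are dropped here and the gzip output depends on compression settings and header timestamps. If checksums of the tar.gz itself matter (as they may for a repository manager), that would need extra care.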