Hi, third summary of the proposal

 1. The new field Files-Excluded in debian/copyright contains a
    space-separated list of globs (as used by find and for specifying
    file lists in machine-readable debian/control files). The deletion
    process will loop over every expression and run

       rm -rf ${MAIN_SOURCE_DIR}/<expression>

    An example copyright file would look like this:

       Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
       Source: http://susy.oddbird.net/
        Repackaged, excluding non-DFSG licensed fonts and source-less JavaScript
       Files-Excluded: docs/source/fonts/*
        docs/source/javascripts/jquery-1.7.1.min.js
        docs/source/javascripts/modernizr-2.5.3.min.js

 2. If the source tarball contains matching files it will be
    repackaged, unless the option --no-exclusion is given on the uscan
    command line or USCAN_NO_EXCLUSION is set in /etc/devscripts.conf
    or ~/.devscripts.

 3. If the tarball does not contain anything matching the globs in
    debian/copyright::Files-Excluded it should be left untouched
    (unless repackaging is needed anyway because of the compression
    method or because the user forces --repack).

 4. If something was removed, '+dfsg' will be appended to the version
    string to express the fact that the content of the original source
    was changed.

This discussion brought up additional wishlist features for uscan:

 a) Configurable option when repacking (this is somehow related to the
    suggestion above but I would like to split this off into a
    different task).

 b) Uscan should be able to download from VCS repositories, and once
    it does, deletion of files should be possible using the same
    method as above. (This is an interesting feature in principle, but
    once uscan is able to delete files it can do so for any download
    method.)

 c) The suggested repackaging method was also requested for non-uscan
    based downloads (for instance from VCS). This might influence the
    final implementation towards a separate tool which could simply be
    called by uscan (and others).

 d) Enable configuration of the compression method. I'd consider this
    an unrelated feature which could also be useful for --repack. I
    admit that once we are repackaging anyway it might be reasonable
    to be able to influence the compression method, but I would also
    like to split this off into a different task.

Regarding the implementation there was some uncertainty about the
actual Perl module to use. In the attached example script I decided to
stick to Dpkg::Control (Dpkg::Control::Hash, to be precise) and left
the code for Parse::DebControl as a comment which could pretty easily
replace the other parser.

The code works for me. However, there might be some remaining empty
directories, which I'm tempted to delete as well via an "educated"

   find tmp -type d -empty -delete

By "educated" I mean that I would take care to delete only those
directories that became empty through the previous removal process and
not those which were originally empty in the tarball. On the other
hand we might simply live with those empty dirs, which do no harm in
the end.

Any further hints / remarks?

Kind regards

       Andreas.

-- 
http://fam-tille.de

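
As a rough sketch of the "educated" cleanup mentioned above (not part of the attached script): one could record which directories are already empty before the exclusion step and afterwards remove only the empty directories that are not on that list. The function names record_empty_dirs and prune_new_empty_dirs are made up for illustration, and the sketch assumes a find that supports -mindepth and -empty (GNU or BSD find).

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the directories below $1 that are empty right now, one per line.
record_empty_dirs() {
    find "$1" -mindepth 1 -type d -empty
}

# $1: source tree
# $2: file written by record_empty_dirs *before* files were excluded
#
# -depth makes find visit children before their parent, so a parent whose
# last remaining entry was just removed is seen as empty when find gets
# to it. Directories listed in $2 were empty upstream and are kept.
prune_new_empty_dirs() {
    find "$1" -mindepth 1 -depth -type d -empty -exec sh -c '
        grep -qxF -- "$1" "$2" || rmdir -- "$1"
    ' sh {} "$2" \;
}
```

Usage would be along the lines of: record_empty_dirs tmp > empty-before.list, then run the exclusion, then prune_new_empty_dirs tmp empty-before.list. The same tree argument has to be passed to both calls so that the recorded paths compare equal.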
#!/usr/bin/perl

use strict;
use warnings;

my $parsefile = 'debian/copyright';

# Dpkg::Control::Hash preferred by James McCoy (who did the last three
# uscan.pl edits using a debian.org e-mail address)
use Dpkg::Control::Hash;
my $data = Dpkg::Control::Hash->new();
$data->load($parsefile);

# Parse::DebControl suggested by Jonas Smedegaard
# use Parse::DebControl;
# my $parser = new Parse::DebControl(1);
# my $data = $parser->parse_file($parsefile, {discardCase=>1,singleBlock=>1,});

my $okformat = qr'http://www.debian.org/doc/packaging-manuals/copyright-format/1.0';

my $main_source_dir = $ARGV[0]
    or die "Usage: $0 <main_source_dir>\n";
die "Unexpected Format field in $parsefile\n"
    unless (($data->{'format'} // '') =~ m{^$okformat/?$});

if ( $data->{'files-excluded'} ) {
    my $nfiles_before = `find $main_source_dir | wc -l`;
    # Expressions containing a slash are matched against the full path
    foreach (grep {/\//} split /\s+/, $data->{"files-excluded"}) {
        `find $main_source_dir -path "$main_source_dir/$_" -delete`;
    };
    # Bare names match files of that name anywhere in the tree; the
    # pattern is quoted so the shell does not expand it before find sees it
    foreach (grep {/^[^\/]+$/} split /\s+/, $data->{"files-excluded"}) {
        `find $main_source_dir -type f -name "$_" -delete`;
    };
    my $nfiles_after = `find $main_source_dir | wc -l`;
    if ( $nfiles_before == $nfiles_after ) {
        print "Source tree remains identical - no need for repacking.\n"
    }
}
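
For comparison, here is a minimal shell sketch of steps 1, 3 and 4 of the summary, following the "rm -rf ${MAIN_SOURCE_DIR}/<expression>" wording rather than the find calls in the script. It is not how uscan does or will do it. The function names files_excluded and exclude_and_version are invented for this example, only the first paragraph of the copyright file is read, and the globs are expanded by the shell, which differs slightly from find (for instance "*" does not match dot files).

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the globs of the Files-Excluded field found in the first
# paragraph of the copyright file $1, one per line.
files_excluded() {
    awk '
        /^[ \t]*$/       { exit }          # header paragraph ends here
        /^[^ \t#][^:]*:/ { infield = 0 }   # any new field ends the old one
        tolower($0) ~ /^files-excluded:/ { sub(/^[^:]*:/, ""); infield = 1 }
        infield          { for (i = 1; i <= NF; i++) print $i }
    ' "$1"
}

# $1: main source dir, $2: copyright file, $3: upstream version
# Removes everything matching Files-Excluded and prints the version to
# use: "+dfsg" is appended only if the tree actually changed.
exclude_and_version() {
    srcdir=$1 copyright=$2 version=$3
    before=$(( $(find "$srcdir" | wc -l) ))
    files_excluded "$copyright" | while IFS= read -r pattern; do
        case $pattern in
            /*|..|../*|*/..|*/../*)
                echo "skipping unsafe pattern: $pattern" >&2
                continue ;;
        esac
        # $pattern is left unquoted on purpose so the shell expands it
        # relative to the source dir.
        ( cd "$srcdir" && rm -rf -- $pattern )
    done
    after=$(( $(find "$srcdir" | wc -l) ))
    if [ "$before" -ne "$after" ]; then
        printf '%s+dfsg\n' "$version"
    else
        printf '%s\n' "$version"
    fi
}
```

A caller would then repack only if the printed version differs from the one passed in, which corresponds to leaving the tarball untouched in step 3.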