Hi,

this all started when I learned that an architecture wildcard like "any-armhf" is in fact invalid and, contrary to my earlier belief, does not match the Debian architecture "armhf":
$ dpkg-architecture -iany-armhf -aarmhf && echo yes || echo no
no

Since I thought it would be nice if lintian could warn about the use of such
invalid architecture wildcards in package metadata, I started writing a script
that checks the magnitude of the problem.

In case this is useful to anybody, please find attached a script which, given
a Sources file, prints:

 - invalid architecture wildcards in build-depends and conflicts (prefix ID),
   the Architecture field (prefix IA) and generated binary packages as listed
   in the Package-List field (prefix IB)
 - superfluous wildcards (lists of wildcards that match any architecture more
   than once) in build-depends and conflicts (prefix DD), the Architecture
   field (prefix DA) and generated binary packages as listed in the
   Package-List field (prefix DB)

Some (maybe) interesting results:

Most common invalid architectures:

$ ./findarchwildcardproblems.pl Sources | egrep '(ID|IA|IB)' | cut -d ' ' -f 3 | sort | uniq -c | sort -n
      1 avr
      1 darwin-any
      1 disabled
      1 freebsd-any
      1 knetbsd-any
      1 kopensolaris-amd64
      1 kopensolaris-any
      1 linux-ia64
      1 linux-ppc64el
      1 netbsd-any
      1 openbsd-any
      1 sh3eb
      1 sh4eb
      1 solaris-amd64
      1 solaris-any
      1 solaris-i386
      2 any-armeb
      2 any-armel
      2 any-armhf
      2 any-avr32
      2 any-m32r
      2 any-s390
      2 any-sh3
      2 any-sh3eb
      2 any-sh4eb
      2 m32r
      2 netbsd-alpha
      2 sh3
      6 netbsd-i386
      9 kfreebsd-alpha
      9 knetbsd-alpha
     10 knetbsd-i386
     10 kopensolaris-i386
     10 or1k
     15 any-ia64
     17 any-ppc64el
     17 hurd-amd64
     23 avr32
     31 armeb
     32 hurd-alpha
     37 lpia
     74 ppc64el
    123 arm
    434 mips64
    434 mips64el
    441 mipsn32
    441 mipsn32el
    634 ia64
    676 s390

Packages with superfluous wildcards:

$ ./findarchwildcardproblems.pl Sources | egrep '(DD|DA|DB)'
DD: ettercap arm64 libluajit-5.1-dev
DD: gcc-3.3 hurd-i386 locales
DB: gcc-defaults s390x
DD: gcc-snapshot sh4 gnat-4.9
DA: hyperestraier amd64
DA: hyperestraier i386
DA: kfreebsd-10 kfreebsd-amd64
DA: kfreebsd-10 kfreebsd-i386
DA: kfreebsd-10 kfreebsd-amd64
DA: kfreebsd-10 kfreebsd-i386
DA: kfreebsd-9 kfreebsd-amd64
DA: kfreebsd-9 kfreebsd-i386

So in summary, invalid wildcards mostly come from old architectures that are
no longer in the archive, from new architectures that are not in the archive
yet, or from other creative cases like "any-armhf" or "disabled". Superfluous
wildcards seem to be a much smaller problem and only concern seven source
packages.

In my code I counted as "valid" all Debian architectures that are listed on
packages.debian.net. Is there a better way to retrieve "valid" architectures
in this context?

Does it make sense to file bug reports for some of these problems, or do you
think that would be a wasted effort?

cheers, josch
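
To illustrate why "any-armhf" matches nothing: as far as I understand it, a wildcard is compared against the operating system and CPU components of an architecture rather than against its name, and the CPU component of armhf is "arm". The following is only a minimal sketch of that idea. The %ARCH table and the function wildcard_matches are hypothetical stand-ins written for this example; the real logic and data live in dpkg (Dpkg::Arch::debarch_is), and this sketch only handles "any", "<os>-any", "any-<cpu>" and plain architecture names.

```perl
use strict;
use warnings;

# Hypothetical, hand-written excerpt: Debian architecture => [os, cpu].
# dpkg keeps the authoritative data in its own tables.
my %ARCH = (
    'amd64'          => ['linux',    'amd64'],
    'i386'           => ['linux',    'i386'],
    'armel'          => ['linux',    'arm'],
    'armhf'          => ['linux',    'arm'],
    'kfreebsd-amd64' => ['kfreebsd', 'amd64'],
    'hurd-i386'      => ['hurd',     'i386'],
);

# Simplified stand-in for debarch_is(): returns 1 if $wildcard matches
# $arch, 0 otherwise (including when $arch is not in the table).
sub wildcard_matches {
    my ($arch, $wildcard) = @_;
    my $tuple = $ARCH{$arch} or return 0;
    return 1 if $wildcard eq 'any';
    my ($os, $cpu) = split /-/, $wildcard, 2;
    # anything without an "any" component is treated as a plain
    # architecture name and compared literally
    return ($wildcard eq $arch ? 1 : 0)
        unless defined $cpu and ($os eq 'any' or $cpu eq 'any');
    return 0 if $os  ne 'any' and $os  ne $tuple->[0];
    return 0 if $cpu ne 'any' and $cpu ne $tuple->[1];
    return 1;
}

foreach my $wildcard ('any-armhf', 'any-arm', 'linux-any', 'armhf') {
    printf "%-10s matches armhf: %s\n", $wildcard,
        wildcard_matches('armhf', $wildcard) ? 'yes' : 'no';
}
```

Under this model "any-arm" is the wildcard that covers armhf (and armel), while "any-armhf" and "any-armel" name a CPU that no architecture in the table has.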
#!/usr/bin/perl

use strict;
use warnings;

# print invalid architecture wildcards (wildcards that don't match any existing
# architecture) and duplicate wildcards (an architecture is matched by more
# than one wildcard) in build dependencies, conflicts, the Architecture field
# and in binary packages listed in the Package-List field

use Dpkg::Control;
use Dpkg::Compression::FileHandle;
use Dpkg::Deps;
use List::MoreUtils qw{any};
use List::Util qw{first};
use Dpkg::Arch qw(debarch_is);

my $desc = $ARGV[0]; # /home/josch/gsoc2012/bootstrap/tests/sid-sources-20140101T000000Z
if (not defined($desc)) {
    die "need filename";
}
my $fh = Dpkg::Compression::FileHandle->new(filename => $desc);

# architectures that count as "valid"
my @debarches = ("amd64", "armel", "armhf", "hurd-i386", "i386",
    "kfreebsd-amd64", "kfreebsd-i386", "mips", "mipsel", "powerpc", "s390x",
    "sparc", "alpha", "arm64", "hppa", "m68k", "powerpcspe", "ppc64", "sh4",
    "sparc64", "x32");

while (1) {
    my $cdata = Dpkg::Control->new(type => CTRL_INDEX_SRC);
    last if not $cdata->parse($fh, $desc);
    my $pkgname = $cdata->{"Package"};
    next if not defined($pkgname);

    my @depfields = ('Build-Depends', 'Build-Depends-Indep',
        'Build-Depends-Arch', 'Build-Conflicts', 'Build-Conflicts-Indep',
        'Build-Conflicts-Arch');

    # search for invalid arches in the dependency and conflict fields
    foreach my $depfield (@depfields) {
        my $dep_line = $cdata->{$depfield};
        next if not defined($dep_line);
        foreach my $dep_and (split(/\s*,\s*/m, $dep_line)) {
            foreach my $dep_or (split(/\s*\|\s*/m, $dep_and)) {
                my $dep_simple = Dpkg::Deps::Simple->new($dep_or);
                my $depname = $dep_simple->{package};
                next if not defined($depname);
                my $arches = $dep_simple->{arches};
                next if not defined($arches);

                # find wildcards that do not match any existing architecture
                foreach my $arch (@{$arches}) {
                    $arch =~ s/^!//; # strip the negation of an arch restriction
                    next if (any {debarch_is($_,$arch)} @debarches);
                    print "ID: $pkgname $arch $depname\n";
                }

                # search for duplicate arches in arch restrictions
                # set match frequency to zero for all arches
                my %matchfreq = ();
                foreach my $arch (@debarches) {
                    $matchfreq{$arch} = 0;
                }
                # find duplicates
                foreach my $arch (@{$arches}) {
                    $arch =~ s/^!//;
                    foreach my $debarch (@debarches) {
                        if (debarch_is($debarch, $arch)) {
                            $matchfreq{$debarch} += 1;
                        }
                    }
                }
                # print duplicate matches
                foreach my $arch (@debarches) {
                    if ($matchfreq{$arch} > 1) {
                        print "DD: $pkgname $arch $depname\n";
                    }
                }
            }
        }
    }

    # search for invalid arches in Architecture field
    my $architecture = $cdata->{"Architecture"};
    if (defined($architecture)) {
        # find wildcards that do not match any existing architecture
        foreach my $arch (split(/\s+/m, $architecture)) {
            next if ($arch eq "all");
            next if (any {debarch_is($_,$arch)} @debarches);
            print "IA: $pkgname $arch\n";
        }

        # search for duplicate arches in Architecture field
        # set match frequency to zero for all arches
        my %matchfreq = ();
        foreach my $arch (@debarches) {
            $matchfreq{$arch} = 0;
        }
        # find duplicates
        foreach my $arch (split(/\s+/m, $architecture)) {
            next if ($arch eq "all");
            foreach my $debarch (@debarches) {
                if (debarch_is($debarch, $arch)) {
                    $matchfreq{$debarch} += 1;
                }
            }
        }
        # print duplicate matches
        foreach my $arch (@debarches) {
            if ($matchfreq{$arch} > 1) {
                print "DA: $pkgname $arch\n";
            }
        }
    }

    # gather the architectures of the generated binary packages
    my $packagelist = $cdata->{"Package-List"};
    if (defined($packagelist)) {
        foreach my $line (split(/\n/m, $packagelist)) {
            my $architecture = first { /^arch=/ } split(/\s+/m, $line);
            next if (not defined($architecture));
            $architecture =~ s/^arch=//;

            # find wildcards that do not match any existing architecture
            foreach my $arch (split(/,/m, $architecture)) {
                next if ($arch eq "all");
                next if (any {debarch_is($_,$arch)} @debarches);
                print "IB: $pkgname $arch\n";
            }

            # search for duplicate arches in the Package-List entry
            # set match frequency to zero for all arches
            my %matchfreq = ();
            foreach my $arch (@debarches) {
                $matchfreq{$arch} = 0;
            }
            # find duplicates
            foreach my $arch (split(/,/m, $architecture)) {
                next if ($arch eq "all");
                foreach my $debarch (@debarches) {
                    if (debarch_is($debarch, $arch)) {
                        $matchfreq{$debarch} += 1;
                    }
                }
            }
            # print duplicate matches
            foreach my $arch (@debarches) {
                if ($matchfreq{$arch} > 1) {
                    print "DB: $pkgname $arch\n";
                }
            }
        }
    }
}
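
The script repeats the same "match frequency" logic for dependency fields, the Architecture field and Package-List entries. As a rough sketch, that logic could be pulled into one helper. The names classify_wildcards and $toy_matches below are hypothetical and not part of the script or of dpkg. The toy matcher is only there to make the example run without dpkg installed; with dpkg available, passing \&Dpkg::Arch::debarch_is should presumably work in its place. Unlike the script, this sketch does not strip a leading "!" or skip "all".

```perl
use strict;
use warnings;

# Classify a list of wildcards against a list of known architectures.
# $matches is a code ref called as $matches->($arch, $wildcard).
# Returns two array refs: wildcards that match no architecture, and
# architectures that are matched by more than one wildcard.
sub classify_wildcards {
    my ($matches, $debarches, $wildcards) = @_;
    my %matchfreq = map { $_ => 0 } @{$debarches};
    my @invalid;
    foreach my $wildcard (@{$wildcards}) {
        my @hits = grep { $matches->($_, $wildcard) } @{$debarches};
        push @invalid, $wildcard if not @hits;
        $matchfreq{$_}++ foreach @hits;
    }
    my @superfluous = grep { $matchfreq{$_} > 1 } @{$debarches};
    return (\@invalid, \@superfluous);
}

# Toy matcher for this demo only (an assumption, not dpkg's logic):
# "any", exact names, and "<os>-any" where <os> is a prefix of the name.
my $toy_matches = sub {
    my ($arch, $wildcard) = @_;
    return 1 if $wildcard eq 'any' or $wildcard eq $arch;
    return ($wildcard =~ /^(.+)-any$/ and index($arch, "$1-") == 0) ? 1 : 0;
};

my ($invalid, $superfluous) = classify_wildcards(
    $toy_matches,
    [qw(amd64 i386 kfreebsd-amd64 kfreebsd-i386)],
    [qw(kfreebsd-any kfreebsd-amd64 any-armhf)],
);
print "invalid: @{$invalid}\n";
print "superfluous: @{$superfluous}\n";
```

Passing the matcher in as a code ref keeps the counting logic independent of how a wildcard is actually resolved, so the same helper could serve all three places in the script.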