Package: dpkg-dev
Version: 1.14.8
Severity: normal
Tags: patch

dpkg-shlibdeps's latest incarnation (as of 1.14.8 and its experimental
predecessors) introduces a performance regression: it runs dpkg
--search once per executable or library being examined, rather than
caching its results in any fashion.  As each call requires scanning every
package's contents AFAICT, the resulting procedure can take a LONG
time on systems with many packages installed.  (It would be great if
dpkg-query could itself run faster, but that's a separate issue.)
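
For illustration only, the general caching idea can be sketched in Perl
as below.  This is a minimal sketch and not the attached patch:
slow_search, cached_search, and the sample ownership data are
hypothetical stand-ins, with slow_search playing the role of one
expensive dpkg --search run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the expensive lookup; in dpkg-shlibdeps this
# would correspond to a single "dpkg --search" run over the given files.
my $lookups = 0;
sub slow_search {
    my @files = @_;
    $lookups++;
    # Fake ownership data, for illustration only.
    my %owner = ('/lib/libc.so.6' => ['libc6']);
    return { map { $_ => $owner{$_} } grep { exists $owner{$_} } @files };
}

my %cache;

# Return a hash ref mapping each known file to its owning packages,
# querying only for files that have not been looked up before.
sub cached_search {
    my @todo = grep { !exists $cache{$_} } @_;
    if (@todo) {
        my $found = slow_search(@todo);
        # Record misses as empty lists so that they are not retried.
        $cache{$_} = $found->{$_} || [] foreach @todo;
    }
    return { map { $_ => $cache{$_} } grep { @{$cache{$_}} } @_ };
}

cached_search('/lib/libc.so.6', '/no/such/file');
cached_search('/lib/libc.so.6', '/no/such/file');
print "lookups: $lookups\n";
```

The second call is answered entirely from %cache, so the expensive
lookup runs only once even though both files are requested twice.
Misses are cached too, since files that no package owns would otherwise
trigger a fresh search every time.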

I previously reported the same high-level problem as #421290, but I'm
opening a new bug because the underlying code base is so different.
As before, I have put together a patch that I believe DTRT.  This
time, I've even tested it against complicated cases such as libkcal2b,
to avoid a repeat of #425641, for which I do sincerely apologize.  (At
any rate, the rewrite at least resulted in much clearer and more
readily patched logic.)

Could you please review and apply the attached patch (against 1.14.10)
when you get a chance?

Thanks!

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.22
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
--- dpkg-shlibdeps.1.14.10      2007-11-23 13:47:46.000000000 -0500
+++ dpkg-shlibdeps.optimized    2007-11-23 13:53:20.000000000 -0500
@@ -483,9 +483,22 @@
     return undef;
 }
 
+my %cached_pkgmatch = ();
+
 sub find_packages {
-    my @files = (@_);
+    my @files;
     my $pkgmatch = {};
+
+    foreach (@_) {
+       if (exists $cached_pkgmatch{$_}) {
+           $pkgmatch->{$_} = $cached_pkgmatch{$_};
+       } else {
+           push @files, $_;
+           $cached_pkgmatch{$_} = [""]; # placeholder to cache misses too.
+       }
+    }
+    return $pkgmatch unless @files;
+
     my $pid = open(DPKG, "-|");
     syserr(_g("cannot fork for dpkg --search")) unless defined($pid);
     if (!$pid) {
@@ -503,7 +516,7 @@
            print(STDERR " $_\n")
                || syserr(_g("write diversion info to stderr"));
        } elsif (m/^([^:]+): (\S+)$/) {
-           $pkgmatch->{$2} = [ split(/, /, $1) ];
+           $cached_pkgmatch{$2} = $pkgmatch->{$2} = [ split(/, /, $1) ];
        } else {
            warning(_g("unknown output from dpkg --search: '%s'"), $_);
        }
