Package: debhelper
Version: 7.0.15
Tags: patch

Hi,

I've recently encountered a weird situation.  Routine rebuilding
of Debian packages during daily development started taking
enormous amounts of time compared to past behaviour.

Debugging narrowed the slowdown down to calling `rm -f' from
within dh_movefiles via xargs.

(Now, please do not question usage of dh_movefiles instead of
dh_install, that's a separate topic.)

The part of dh_movefiles that is responsible for removal of
files after copying them does that in a very inefficient way.

The approach traces back to a patch submitted by Yann Dirson
with the bug report #233226.  The goal was noble, the solution
was... inferior (other expletives suppressed).

    tr \\n \\0 < movelist | xargs -0 -i rm -f '{}'

A separate `rm' process gets spawned per individual file to be
removed (xargs executes the command directly, once per input
item).  The `tr' is redundant, but insignificant.
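
To illustrate the difference in process counts, here is a small
sketch (not part of the patch).  It assumes GNU xargs, and uses
`echo' as a stand-in for `rm -f' so that every invocation prints
exactly one line; the 1300-entry list is made up to mirror the
case described here.  `-I{}' is the non-deprecated spelling of
`-i'.

```shell
#!/bin/sh
# Count how many times xargs runs its command in each mode.
# `echo` stands in for `rm -f`: one output line per invocation.
list=$(mktemp)
seq 1 1300 > "$list"

# Old style: one command per input item.
per_item=$(tr '\n' '\0' < "$list" | xargs -0 -I{} echo '{}' | wc -l)

# New style: as many arguments per command as fit on a command line.
batched=$(xargs -r -d '\n' echo < "$list" | wc -l)

echo "per-item invocations: $per_item"
echo "batched invocations:  $batched"
rm -f "$list"
```

With a list this small, the batched form should need a single
invocation, against 1300 for the per-item form.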

I've researched the problem in a Lenny environment, but the same
code is present in the latest debhelper 8.1.6 (wheezy).  It may
have been going unnoticed for many years, perhaps adding to the
worldwide resistance to Moore's law.  Indeed, fast modern
computers may have masked the inefficiency of this code.  In my
case, a certain set of circumstances had to align:

    - ~1300 files under dh_movefiles attention in one package;
    - building within a networked AFS filespace;
    - other processes taking much CPU in parallel (it was this
      factor that had to grow over a certain threshold to
      avalanche the problem).

As a result, the removal of ~1300 files was taking up to ~150
seconds.  Removal of the same files on a local disk took up to
~10 seconds under otherwise identical conditions, which is still
far too long for such a task.  Hence, the AFS aspect of the
environment wasn't decisive; it merely added a factor of about
15 in the worst case (much less slowdown normally).

Here's a trivial fix to dh_movefiles:

--- /usr/bin/dh_movefiles~      2011-05-24 14:17:37.000000000 +1200
+++ /usr/bin/dh_movefiles       2011-05-24 14:20:06.000000000 +1200
@@ -145,7 +145,7 @@
                complex_doit("(cd $sourcedir >/dev/null ; tar --create --files-from=$pwd/debian/movelist --file -) | (cd $tmp >/dev/null ;tar xpf -)");
                # --remove-files is not used above because tar then doesn't
                # preserve hard links
-               complex_doit("(cd $sourcedir >/dev/null ; tr '\\n' '\\0' < $pwd/debian/movelist | xargs -0  -i rm -f '{}')");
+               complex_doit("(cd $sourcedir >/dev/null ; < $pwd/debian/movelist xargs -rd'\\n' rm -f)");
                doit("rm","-f","debian/movelist");
        }
 }

Only a minimal number of processes is spawned.  The effect is
removal of the same ~1300 files within ~0.6 seconds (speedup
factor of 250) from AFS filespace and within ~0.15 seconds from
local disk (speedup factor of 67) under otherwise identical
conditions.
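
For anyone wishing to reproduce the comparison, a rough sketch
follows.  It is an assumption-laden stand-in for my real setup:
it works on a scratch directory of empty files with made-up
names, and relies on GNU xargs (`-d') and GNU date (`%N').  The
timings will differ from the figures quoted in this report.

```shell
#!/bin/sh
# Rough timing of both removal strategies on a scratch directory.
# Assumes GNU xargs (-d) and GNU date (%N); results vary by machine.
n=1300
dir=$(mktemp -d)
list=$(mktemp)

# Create $n empty files and a newline-separated list of their names.
populate() {
    i=1
    : > "$list"
    while [ "$i" -le "$n" ]; do
        : > "$dir/file$i"
        echo "file$i" >> "$list"
        i=$((i + 1))
    done
}

# Current time in milliseconds.
now_ms() { echo $(( $(date +%s%N) / 1000000 )); }

populate
t0=$(now_ms)
(cd "$dir" && tr '\n' '\0' < "$list" | xargs -0 -I{} rm -f '{}')
t1=$(now_ms)
left_slow=$(ls "$dir" | wc -l)

populate
t2=$(now_ms)
(cd "$dir" && xargs -r -d '\n' rm -f < "$list")
t3=$(now_ms)
left_fast=$(ls "$dir" | wc -l)

echo "one rm per file: $((t1 - t0)) ms, $left_slow files left"
echo "batched rm:      $((t3 - t2)) ms, $left_fast files left"
rm -rf "$dir" "$list"
```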

Below is another trivial patch, to be applied on top of the one above.

--- /usr/bin/dh_movefiles~      2011-05-24 15:18:50.000000000 +1200
+++ /usr/bin/dh_movefiles       2011-05-24 15:32:19.000000000 +1200
@@ -142,10 +142,10 @@
                }
                my $pwd=`pwd`;
                chomp $pwd;
-               complex_doit("(cd $sourcedir >/dev/null ; tar --create --files-from=$pwd/debian/movelist --file -) | (cd $tmp >/dev/null ;tar xpf -)");
+               complex_doit("tar c -C $sourcedir -T $pwd/debian/movelist | tar xp -C $tmp");
                # --remove-files is not used above because tar then doesn't
                # preserve hard links
-               complex_doit("(cd $sourcedir >/dev/null ; < $pwd/debian/movelist xargs -rd'\\n' rm -f)");
+               complex_doit("cd $sourcedir && < $pwd/debian/movelist xargs -rd'\\n' rm -f");
                doit("rm","-f","debian/movelist");
        }
 }

It doesn't have to do with the main problem I'm reporting, but
makes further enhancements to the same area of dh_movefiles:

    - spawns no redundant shell copies;
    - removes useless `tar' options and abridges useful ones;
    - does away with useless suppression of stdout of `cd'
      builtin (it doesn't output anything to stdout);
    - doesn't proceed to copy or remove any files if changing to
      the prescribed directory fails for any reason (currently
      it's possible to litter the filespace or remove important
      files inadvertently).
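
The last point can be demonstrated with a short sketch.  It is
only an illustration: the file name `important' and the missing
directory are made up, and `sh -c' is used on the assumption
that complex_doit hands its string to a shell in much the same
way.

```shell
#!/bin/sh
# Compare `cd DIR ; cmd` with `cd DIR && cmd` when DIR is missing.
work=$(mktemp -d)
cd "$work" || exit 1

# With `;` the failed cd is ignored and rm runs in the current directory.
: > important
sh -c 'cd ./no-such-dir 2>/dev/null ; rm -f important'
[ -e important ] && unsafe=kept || unsafe=removed

# With `&&` the failed cd stops the command before rm runs.
: > important
sh -c 'cd ./no-such-dir 2>/dev/null && rm -f important' || :
[ -e important ] && safe=kept || safe=removed

echo "with ';': $unsafe, with '&&': $safe"
cd / && rm -rf "$work"
```

This should print "with ';': removed, with '&&': kept".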

Feel free to use just the first one or both.

Cheers,

-- 
/Dennis Vshivkov <wal...@amur.ru>


