-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

I run svnserve under Linux under inetd, with low memory allowances for TCP send buffers. A checkout on the client side causes svnserve to burn CPU cycles, apparently endlessly, without making any progress.

Here are the gory details:

My Setup
========

I'm running a Debian Lenny box, an i386 installation, with inetd from the Debian package openbsd-inetd 0.20080125-2. This inetd starts svnserve from the Debian package subversion 1.5.1dfsg1-4, listening (I don't think it matters much) on a non-standard port (I have another repository on the standard port), through the inetd.conf line

> 3691 stream tcp nowait svn /usr/bin/svnserve svnserve -i -r /path-to-repository

The repository is plain fsfs; /path-to-repository/format contains a single "5", followed by a line feed.

I'm the only person who ever uses this particular repository. (It's mostly for backup and for transferring certain files from work to home and back.) While the problem occurs, there is no other concurrent access to the repository besides the failing svn co.

There is virtualization at work here. The entire system is one slice of an OpenVZ installation. I try to get by with fairly little memory devoted to this particular slice.

Thus far, I've been describing the guest. The host also runs Debian Lenny, namely, the current linux-image-2.6.26-2-openvz-686 kernel.

I don't think this matters much, but, for the record (and to make slightly more interesting reading), there is another level of virtualization at work here: my OpenVZ host happens to be a Xen guest. So virtualization is stacked on top of virtualization. I do not control and have no access to the Xen host. I'm using Debian's libc6-xen version 2.7-18lenny2 on both my vz host and my vz slice.

The problem
===========

I'm doing a plain vanilla "svn co" on the client. It starts to show some checkout activity, adding some files, then stops and seems to hang. There is no reaction for as long as I care to wait.

In the meantime, I obtain a shell prompt on the vz server slice, look around and see: there is this svnserve process that uses whatever CPU cycles it can get. I can kill it, then, back on the client, remove the directory with the half-baked checkout, and reproduce the problem at will.

For good luck, I try "svnadmin verify" on the server. Oops - shouldn't have done that as root. After fixing the permissions, the problem reappears.

Killing the svn process on the client seems to need "kill -9" and does not help; the CPU waste on the server continues.

Looking at /proc/user_beancounters on the server, I see failure counts at both tcpsndbuf and (probably unrelated) tcprcvbuf. The failure count on tcpsndbuf increases each time I exercise the problem, not by 1, but by 3. In this particular situation, the barrier on tcpsndbuf is set to 319488 and the limit to 524288; the maxheld seen is 320320.

Becoming curious, I take an strace of the svnserve. I see a more or less endless loop:

First, svnserve keeps reopening /path-to-repository/db/revs/0/3.

For the record: that entire repository boasts 31 revisions, but /0/3, an 8162358-byte file, seems to be the largest. Most revisions add new files. There are relatively few changes to existing material in this particular repository. Revision 3 added a whole lot of new files, with no other changes. The largest individual file added in that revision was 807993 bytes long.

This revision file .../db/revs/0/3 gets opened and closed several times (weird in itself). I see some seek and read activity on that file, and also some writes to file descriptor 1. All writes to file descriptor 1 write the full number of bytes intended (up to the last one, which was interrupted by me killing the process). I also see several poll timeouts

  poll([{fd=0, events=POLLIN}], 1, 0) = 0 (Timeout)

and several successful brk calls, before it finally all starts over again. More or less over again, that is.

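The tcpsndbuf figures quoted above (maxheld, barrier, limit, failure count) can be read out programmatically while reproducing the problem. The following Python sketch is not part of the original investigation. It assumes the usual /proc/user_beancounters column layout (uid, resource, held, maxheld, barrier, limit, failcnt) as seen from inside a single container; the names Counter and parse_beancounters are made up for illustration.

```python
import os
from typing import Dict, NamedTuple


class Counter(NamedTuple):
    """One resource row of /proc/user_beancounters (hypothetical helper type)."""
    held: int
    maxheld: int
    barrier: int
    limit: int
    failcnt: int


def parse_beancounters(text: str) -> Dict[str, Counter]:
    """Parse beancounter text into {resource name: Counter}.

    Assumes a single container is visible, as is the case inside a slice.
    """
    counters: Dict[str, Counter] = {}
    for line in text.splitlines():
        fields = line.split()
        # The first resource row of a container carries a "<uid>:" prefix.
        if fields and fields[0].endswith(":") and fields[0][:-1].isdigit():
            fields = fields[1:]
        # A data row is a resource name followed by five numbers.
        if len(fields) != 6:
            continue  # skips the "Version:" line and the column header
        name, *values = fields
        try:
            counters[name] = Counter(*(int(v) for v in values))
        except ValueError:
            continue  # not a numeric row
    return counters


if __name__ == "__main__":
    path = "/proc/user_beancounters"
    if os.path.exists(path):
        with open(path) as fh:  # usually readable by root only
            snd = parse_beancounters(fh.read()).get("tcpsndbuf")
        print("tcpsndbuf:", snd)
    else:
        print(path, "not found; this does not look like an OpenVZ system")
```

Running this before and after a failing checkout and comparing the failcnt field of tcpsndbuf would show the per-attempt increase described above.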
There seems to be an _llseek into that revision file that seeks to a different offset each time. The precise numbers do not grow, they fluctuate.

I finally increase this vz slice's tcpsndbuf allowance (barrier:limit) to 614400:921600 and, voila, that solves the problem. After removing the client-side partial checkout directory one more time and starting over, the checkout goes through this time.

For the record: afterwards, on the server, tcpsndbuf shows a new maxheld of 416640.

I set the tcpsndbuf barrier and limit back to the old, smaller values and replace openbsd-inetd with rlinetd and then with inetutils-inetd. With these inetd implementations, I can also reproduce the CPU waste behavior of svnserve. So it doesn't seem to be a bug in the particular inetd implementation.

What's next?
============

I'd be happy to open a bug over at http://subversion.tigris.org/issue-tracker.html if that's what would help solve this issue. I could of course also reproduce the problem and investigate further, if that'd help.

Regards, and thank you all for providing fine software,

Andreas

- -- 
Dr. Andreas Krüger, Consultant, DV-RATIO NORDWEST GmbH
andreas.krue...@dv-ratio.com
GPG/PGP Fingerprint 8063 4A9B 362D 4220 A546 14C1 EA19 AADC FD44 5EB7

DV-RATIO NORDWEST GmbH
Tel: +49 (0)211 / 577 996-0
Fax: +49 (0)211 / 577 996-26
http://www.dv-ratio.com
Registered office: Habsburgerstraße 12, 40547 Düsseldorf
Commercial register: Düsseldorf HRB 34330
VAT ID: DE811321837
Tax no.: 809/44031
Managing director: Günter Gerstmann
Authorized signatories: Trudbert Vetter, Uwe Wolfram
DV-RATIO - "Competence and reliability since 1980"
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkuFUbgACgkQ6hmq3P1EXrccagCgmlvhg8av4eFEjmx0EtzSYtbZ
+iEAnj46GlmRvAK3W3IXgx9/2/J/Neu6
=ix+a
-----END PGP SIGNATURE-----