Hello,

some may have noticed that our linux-32 buildbot fails quite often. [1]

Here is an analysis (tl;dr: jump to Solutions):
* it always fails in the first buildbot step: svn updating
* a failed step takes around 6 minutes, a successful step takes ~37 minutes to complete
* the commands in the step take a long time and a timeout often triggers

The commands and their timeouts (seconds) are:
1) svn --version (1200)
2) rm -rf /home/buildslave20/slave20/openoffice-linux32-nightly/build (120)
3) chmod -Rf u+rwx /home/buildslave20/slave20/openoffice-linux32-nightly/build (120) ah, why?
4) rm -rf /home/buildslave20/slave20/openoffice-linux32-nightly/build (120) huh, again?
5) svn info --xml --non-interactive --no-auth-cache (1200)
6) svn update --non-interactive --no-auth-cache (1200)
7) cp -R -P -p -v /home/buildslave20/slave20/openoffice-linux32-nightly/source /home/buildslave20/slave20/openoffice-linux32-nightly/build (120)
8) svn info --xml (1200)

Their results:
1) Always finishes in ~15 seconds.
2) No output, almost always fails with "command timed out: 120 seconds without output, attempting to kill".
3) No output, almost always fails with "command timed out: 120 seconds without output, attempting to kill".
4) No output, finishes sometimes. *If we fail here the build process is stopped, and this is the reason for the frequent failures.*
5) Local command, completes in a second.
6) Can take a while depending on the source changes. Gives tons of output, so the timeout never triggers.
7) Takes *very* long (over 20 minutes) but never triggers the timeout, as the '-v' output spams the log.
8) Local command again, takes a second.

Conclusions:
*file operations don't have enough time to finish*

Solutions:
Edit the 'svn updating' buildstep
a) Remove the rm and chmod commands and replace cp with
   'rsync -q -t -p -r --delete /home/buildslave20/slave20/openoffice-linux32-nightly/source /home/buildslave20/slave20/openoffice-linux32-nightly/build'
   This is much faster as very few copies are needed and its delete is faster than the rm command. But increase the timeout anyway, just in case. (*preferred* solution, but needs rsync on the box)
b) Increase the timeouts and shut up cp by removing '-v'.
c) Remove unversioned files when updating and build in this folder.
d) Make rm and chmod verbose by adding '-v' (or '-c' for chmod).
   Spam the log even more, but the timeouts won't trigger. Who doesn't like 50MB logfiles? Yes, the log for this step of every successful build is over 50MB currently!
   Starting build #127 [1] (before this build there was only a build folder but no source
   Not a serious solution!

*I suggest we fix this soon because the huge log files will blow up a server sooner or later.*

Regards
Jochen

[1] https://ci.apache.org/builders/openoffice-linux32-nightly

note: on the linux64 buildbot the file operations are *much* faster. cp takes 90 secs there; it isn't verbose but stays within the 120 sec timeout limit.
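
For illustration, solution a) could be sketched as a small shell function. This is only a sketch under assumptions, not the actual buildstep configuration: mirror_tree is a made-up helper name, and the '-l' flag (keep symlinks as symlinks, like 'cp -P') and the trailing slashes are additions to the command quoted above. Without the trailing slash on the source, rsync would create build/source instead of mirroring the contents into build. The fallback branch is simply the current chmod/rm/cp sequence without '-v', for a box that has no rsync.

```shell
#!/bin/sh
# Sketch: mirror SRC into DST quietly, preferring rsync.
# mirror_tree is a hypothetical helper name used for illustration only.
mirror_tree() {
    src=$1
    dst=$2
    if command -v rsync >/dev/null 2>&1; then
        # -r recurse, -l keep symlinks as symlinks (like cp -P),
        # -p / -t preserve permissions and modification times, -q no output,
        # --delete removes files in DST that no longer exist in SRC.
        # The trailing slashes copy the *contents* of SRC into DST.
        mkdir -p "$dst" &&
        rsync -q -r -l -p -t --delete "$src/" "$dst/"
    else
        # No rsync available: same commands as the current step, minus -v.
        if [ -d "$dst" ]; then
            chmod -R u+rwx "$dst"
        fi
        rm -rf "$dst" &&
        cp -R -P -p "$src" "$dst"
    fi
}
```

It would be called with the two directories from the step, e.g. mirror_tree .../openoffice-linux32-nightly/source .../openoffice-linux32-nightly/build. Since neither branch prints anything, the buildbot timeout for this command still has to be raised, as in solution b).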