Package: dash
Version: 0.5.8-2.1ubuntu2
Severity: important
Dear Maintainer,
* What led up to the situation?
[VZ]I use a shell script to supervise processes in a docker/kubernetes
container. In the absence of any external stimuli (dry run), the
cgroup's CPU utilization grew steadily from 15 to 35 millicores within
17 days. Whenever the container was restarted, the pattern started over
from the base level of 15 millicores.
* What exactly did you do (or not do) that was effective (or
ineffective)?
[VZ]I reproduced the issue in local docker, sampled the cgroup's threads
with perf, and ran perf diff on the reports.
The CPU cost turned out to grow in dash's freejob() and in the kernel's
copy_page() under page_fault().
These findings are consistent with dash growing a data structure,
possibly the one indexed by nprocs in freejob(), on every iteration of
the loop in the shell script. That data structure would have to be
cloned at each fork() in the loop, hence the kernel page faults.
To verify my suspicion I reduced the sleep interval in the loop and
monitored the Resident Set Size (RSS).
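The perf comparison described above could be repeated roughly as in the sketch below. This is not the exact invocation used for this report: the function names, the cpu-clock event, the cgroup name and the output file names are all assumptions chosen for illustration, and perf's -G option only works in system-wide (-a) mode.

```shell
#!/bin/sh
# Hypothetical helpers; names and the cpu-clock event are illustrative only.

# profile_cgroup CGROUP OUTFILE SECONDS
# Sample all CPUs, restricted to one perf_event cgroup, for SECONDS seconds.
profile_cgroup() {
    cgroup=$1
    out=$2
    secs=$3
    perf record -e cpu-clock -a -G "$cgroup" -o "$out" -- sleep "$secs"
}

# compare_profiles BASELINE LATER
# Show per-symbol differences between an early and a late profile.
compare_profiles() {
    perf diff "$1" "$2"
}

# Example (placeholder cgroup name, run as root):
#   profile_cgroup docker/CONTAINER_ID early.data 60
#   ... wait for the growth to build up ...
#   profile_cgroup docker/CONTAINER_ID late.data 60
#   compare_profiles early.data late.data
```

Symbols whose share rises between the two profiles, such as freejob() and copy_page() here, are the ones worth a closer look.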
Reproduction steps without containers. The shell script:
-----------------------------------------------------------------
#!/bin/sh
cd /tmp
while true
do
[ "`pgrep rsyslogd`" ] || service rsyslog start
[ "`pgrep cron`" ] || service cron start
[ "`pgrep sshd`" ] || service ssh start
env >/root/env
sleep .5
done
-----------------------------------------------------------------
Save it as a file called woof, make it executable with chmod, and run it as
# nohup ./woof &
Monitoring with ps reveals unbounded growth of the RSS, e.g. (note the
first figure):
1628 wait S ? 0:00 /bin/sh ./woof
3456 wait S ? 1:02 /bin/sh ./woof
5188 wait S ? 2:59 /bin/sh ./woof
6584 wait S ? 5:19 /bin/sh ./woof
7076 wait S ? 6:17 /bin/sh ./woof
9300 wait S ? 23:54 /bin/sh ./woof
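Samples like the ones above can be collected with a small loop around ps. The sketch below is one possible way to do it, not the command used for this report; sample_rss is a hypothetical helper name, and it assumes a procps-style ps where "-o rss=" prints the resident set size (documented as kilobytes) without a header.

```shell
#!/bin/sh
# sample_rss PID COUNT INTERVAL
# Print COUNT lines of "<epoch seconds> <rss>" for process PID,
# waiting INTERVAL seconds between samples.
sample_rss() {
    pid=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # "rss=" suppresses the header; ps fails if PID no longer exists.
        rss=$(ps -o rss= -p "$pid") || return 1
        # $((rss)) strips the padding that ps puts around the number.
        echo "$(date +%s) $((rss))"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: one sample per minute for an hour from the running script.
#   sample_rss "$(pgrep -o -f ./woof)" 60 60
```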
* What was the outcome of this action?
[VZ]I had to switch to a #!/bin/bash shebang, which gave the expected
behaviour:
2684 wait S ? 0:00 /bin/bash ./woof
3896 wait S ? 1:32 /bin/bash ./woof
3896 wait S ? 3:10 /bin/bash ./woof
3896 wait S ? 3:45 /bin/bash ./woof
3896 pipe_w S ? 15:20 /bin/bash ./woof
After the initial sample, RSS remains constant at 3896.
* What outcome did you expect instead?
[VZ]I expect the RSS of dash to remain bounded in an infinite shell
loop, the same way the RSS of bash does.
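To make "bounded" concrete, a series of RSS samples can be reduced to first, last, minimum, maximum and net growth. The helper below is only an illustrative sketch (rss_summary is a hypothetical name); it reads one RSS value per line on stdin.

```shell
#!/bin/sh
# rss_summary: read one RSS value per line and print a one-line summary.
rss_summary() {
    awk '
        NR == 1 { first = $1; min = $1; max = $1 }
        {
            if ($1 < min) min = $1
            if ($1 > max) max = $1
            last = $1
        }
        END {
            if (NR == 0) { print "no samples"; exit 1 }
            printf "first=%d last=%d min=%d max=%d growth=%d\n",
                   first, last, min, max, last - first
        }
    '
}

# Example with the dash figures from this report:
#   printf '1628\n3456\n5188\n6584\n7076\n9300\n' | rss_summary
```

For a shell that behaves as expected, the growth figure should stop increasing once the process has warmed up.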
-- System Information:
Debian Release: stretch/sid
APT prefers xenial-updates
APT policy: (500, 'xenial-updates'), (500, 'xenial-security'), (500, 'xenial'), (100, 'xenial-backports')
Architecture: amd64 (x86_64)
Kernel: Linux 4.4.0-1074-aws (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=ANSI_X3.4-1968) (ignored: LC_ALL set to C)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages dash depends on:
ii debianutils 4.7
ii dpkg 1.18.4ubuntu1.4
ii libc6 2.23-0ubuntu10
dash recommends no packages.
dash suggests no packages.
-- debconf-show failed