cf-natali opened a new pull request #388:
URL: https://github.com/apache/mesos/pull/388
When its future is discarded, the cgroup task killer stops immediately.
This can leave the cgroup frozen, and therefore leave all of its processes in
an uninterruptible state, which is quite bad - for example the tasks would be
unkillable and the cgroup couldn't be destroyed.
Instead, when the future is discarded, wait a bit to give the task killer a
chance to finish cleanly, killing all the tasks and thawing the cgroup.
I started seeing this after updating my kernel - dozens of tests would fail
randomly when trying to destroy cgroups, e.g.:
```
[ RUN ] SlaveRecoveryTest/0.ExecutorDanglingLatestSymlink
I0517 22:39:10.577574 573547 exec.cpp:164] Version: 1.12.0
I0517 22:39:10.591667 573550 exec.cpp:237] Executor registered on agent
a7c3a499-2968-4eb6-91dd-1020fedc1522-S0
I0517 22:39:10.594743 573552 executor.cpp:190] Received SUBSCRIBED event
I0517 22:39:10.595999 573552 executor.cpp:194] Subscribed executor on
thinkpad
I0517 22:39:10.596259 573552 executor.cpp:190] Received LAUNCH event
I0517 22:39:10.597656 573552 executor.cpp:722] Starting task
39b1539d-9c30-41b0-b643-3ece3e63bea5
I0517 22:39:10.617466 573552 executor.cpp:740] Forked command at 573554
../../src/tests/mesos.cpp:782: Failure
Failed to wait 15secs for cgroups::destroy(hierarchy, cgroup)
*** Aborted at 1621287565 (unix time) try "date -d @1621287565" if you are
using GNU date ***
PC: @ 0x56129f5437e7 testing::UnitTest::AddTestPartResult()
*** SIGSEGV (@0x0) received by PID 573473 (TID 0x7fb79b027b00) from PID 0;
stack trace: ***
@ 0x7fb79c287140 (unknown)
@ 0x56129f5437e7 testing::UnitTest::AddTestPartResult()
@ 0x56129f5366f3 testing::internal::AssertHelper::operator=()
@ 0x56129e60421b
mesos::internal::tests::ContainerizerTest<>::TearDown()
@ 0x56129f5633cb
testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x56129f55d634
testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x56129f53d38d testing::Test::Run()
@ 0x56129f53dbb2 testing::TestInfo::Run()
@ 0x56129f53e1f9 testing::TestCase::Run()
@ 0x56129f544ce0 testing::internal::UnitTestImpl::RunAllTests()
@ 0x56129f564349
testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x56129f55e24a
testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x56129f543a0a testing::UnitTest::Run()
@ 0x56129e0f6aba RUN_ALL_TESTS()
@ 0x56129e0f6505 main
@ 0x7fb79c0d4d0a __libc_start_main
@ 0x56129d23545a _start
Erreur de segmentation (core dumped)
```
Comparing a successful run:
```
I0516 23:58:25.918263 466660 cgroups.cpp:2934] Freezing cgroup
/sys/fs/cgroup/freezer/mesos_test_8851f242-b14a-4580-88b8-5d3dd552bfde/e2bf97c3-14a1-4b2b-b399-12dbd0f1c99d
I0516 23:58:25.918447 466657 cgroups.cpp:1323] Successfully froze cgroup
/sys/fs/cgroup/freezer/mesos_test_8851f242-b14a-4580-88b8-5d3dd552bfde/e2bf97c3-14a1-4b2b-b399-12dbd0f1c99d
after 157952ns
I0516 23:58:25.918900 466660 cgroups.cpp:2952] Thawing cgroup
/sys/fs/cgroup/freezer/mesos_test_8851f242-b14a-4580-88b8-5d3dd552bfde/e2bf97c3-14a1-4b2b-b399-12dbd0f1c99d
I0516 23:58:25.919064 466656 cgroups.cpp:1352] Successfully thawed cgroup
/sys/fs/cgroup/freezer/mesos_test_8851f242-b14a-4580-88b8-5d3dd552bfde/e2bf97c3-14a1-4b2b-b399-12dbd0f1c99d
after 134912ns
```
To an unsuccessful one:
```
I0516 23:58:42.638830 466897 linux_launcher.cpp:606] Destroying cgroup
'/sys/fs/cgroup/freezer/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558'
I0516 23:58:42.639083 466898 composing.cpp:343] Finished recovering all
containerizers
I0516 23:58:42.639549 466898 cgroups.cpp:2934] Freezing cgroup
/sys/fs/cgroup/freezer/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
I0516 23:58:42.714794 466899 cgroups.cpp:2952] Thawing cgroup
/sys/fs/cgroup/freezer/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
W0516 23:58:42.745649 466903 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:42.746997 466902 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:42.749405 466898 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:42.754009 466900 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:42.762802 466901 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:42.779531 466904 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:42.812022 466897 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:42.877116 466903 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:43.006693 466898 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:43.264366 466899 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:43.777858 466901 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:44.803258 466897 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
W0516 23:58:46.851841 466901 cgroups.cpp:294] Removal of cgroup
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
failed with EBUSY, will try again
```
We can see that when the problem occurs, it's because the cgroup doesn't
freeze quickly. Since the tests destroy the slave/containerizer soon after
the task finishes, the cgroup task killer gets interrupted after the cgroup
has started to be frozen, but before it has been thawed (after killing the
tasks) - https://github.com/apache/mesos/blob/master/src/linux/cgroups.cpp#L1445
```
void killTasks() {
  // Chain together the steps needed to kill all tasks in the cgroup.
  chain = freeze()                      // Freeze the cgroup.
    .then(defer(self(), &Self::kill))   // Send kill signal.
    .then(defer(self(), &Self::thaw))   // Thaw cgroup to deliver signal.
    .then(defer(self(), &Self::reap));  // Wait until all pids are reaped.

  chain.onAny(defer(self(), &Self::finished, lambda::_1));
}
```
The problem is that the process sets up a discard callback which immediately
terminates the process -
https://github.com/apache/mesos/blob/master/src/linux/cgroups.cpp#L1407
```
void initialize() override
{
  // Stop when no one cares.
  promise.future().onDiscard(lambda::bind(
      static_cast<void (*)(const UPID&, bool)>(terminate), self(), true));

  killTasks();
}
```
This means the task killer can be interrupted after it has frozen the
cgroup, but before it has killed the tasks and thawed it. The cgroup then
stays frozen, the tasks are stuck in an uninterruptible state and can't be
killed, and the cgroup can't be destroyed, as can be seen above.
I'm not adding a specific test because this bug already causes dozens of
existing tests to fail.
I'm not quite sure why this only started happening on recent kernels, but my
guess is that freezing sometimes takes longer than before, which makes it much
more likely that the killer is interrupted in the middle of its work.
In any case it's a long-standing bug with potentially bad consequences, so it's
worth fixing.
@asekretenko
@qianzhangxa
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]