cf-natali opened a new pull request #388:
URL: https://github.com/apache/mesos/pull/388


   When its future is discarded, the cgroup task killer stops immediately. 
This can leave the cgroup frozen, with all of its processes stuck in 
uninterruptible state: the tasks cannot be killed and the cgroup cannot be 
destroyed.
   Instead, when discarded, wait a bit to give the task killer a chance to 
finish cleanly, killing all the tasks and thawing the cgroup.
   
   I started seeing this after updating my kernel - dozens of tests would fail 
randomly when trying to destroy cgroups, e.g.:
   ```
   [ RUN      ] SlaveRecoveryTest/0.ExecutorDanglingLatestSymlink
   I0517 22:39:10.577574 573547 exec.cpp:164] Version: 1.12.0
   I0517 22:39:10.591667 573550 exec.cpp:237] Executor registered on agent 
a7c3a499-2968-4eb6-91dd-1020fedc1522-S0
   I0517 22:39:10.594743 573552 executor.cpp:190] Received SUBSCRIBED event
   I0517 22:39:10.595999 573552 executor.cpp:194] Subscribed executor on 
thinkpad
   I0517 22:39:10.596259 573552 executor.cpp:190] Received LAUNCH event
   I0517 22:39:10.597656 573552 executor.cpp:722] Starting task 
39b1539d-9c30-41b0-b643-3ece3e63bea5
   I0517 22:39:10.617466 573552 executor.cpp:740] Forked command at 573554
   ../../src/tests/mesos.cpp:782: Failure
   Failed to wait 15secs for cgroups::destroy(hierarchy, cgroup)
   *** Aborted at 1621287565 (unix time) try "date -d @1621287565" if you are 
using GNU date ***
   PC: @     0x56129f5437e7 testing::UnitTest::AddTestPartResult()
   *** SIGSEGV (@0x0) received by PID 573473 (TID 0x7fb79b027b00) from PID 0; 
stack trace: ***
       @     0x7fb79c287140 (unknown)
       @     0x56129f5437e7 testing::UnitTest::AddTestPartResult()
       @     0x56129f5366f3 testing::internal::AssertHelper::operator=()
       @     0x56129e60421b 
mesos::internal::tests::ContainerizerTest<>::TearDown()
       @     0x56129f5633cb 
testing::internal::HandleSehExceptionsInMethodIfSupported<>()
       @     0x56129f55d634 
testing::internal::HandleExceptionsInMethodIfSupported<>()
       @     0x56129f53d38d testing::Test::Run()
       @     0x56129f53dbb2 testing::TestInfo::Run()
       @     0x56129f53e1f9 testing::TestCase::Run()
       @     0x56129f544ce0 testing::internal::UnitTestImpl::RunAllTests()
       @     0x56129f564349 
testing::internal::HandleSehExceptionsInMethodIfSupported<>()
       @     0x56129f55e24a 
testing::internal::HandleExceptionsInMethodIfSupported<>()
       @     0x56129f543a0a testing::UnitTest::Run()
       @     0x56129e0f6aba RUN_ALL_TESTS()
       @     0x56129e0f6505 main
       @     0x7fb79c0d4d0a __libc_start_main
       @     0x56129d23545a _start
   Segmentation fault (core dumped)
   ```
   
   Comparing a successful run:
   ```
   I0516 23:58:25.918263 466660 cgroups.cpp:2934] Freezing cgroup 
/sys/fs/cgroup/freezer/mesos_test_8851f242-b14a-4580-88b8-5d3dd552bfde/e2bf97c3-14a1-4b2b-b399-12dbd0f1c99d
   I0516 23:58:25.918447 466657 cgroups.cpp:1323] Successfully froze cgroup 
/sys/fs/cgroup/freezer/mesos_test_8851f242-b14a-4580-88b8-5d3dd552bfde/e2bf97c3-14a1-4b2b-b399-12dbd0f1c99d
 after 157952ns
   I0516 23:58:25.918900 466660 cgroups.cpp:2952] Thawing cgroup 
/sys/fs/cgroup/freezer/mesos_test_8851f242-b14a-4580-88b8-5d3dd552bfde/e2bf97c3-14a1-4b2b-b399-12dbd0f1c99d
   I0516 23:58:25.919064 466656 cgroups.cpp:1352] Successfully thawed cgroup 
/sys/fs/cgroup/freezer/mesos_test_8851f242-b14a-4580-88b8-5d3dd552bfde/e2bf97c3-14a1-4b2b-b399-12dbd0f1c99d
 after 134912ns
   ```
   
   To an unsuccessful one:
   ```
   I0516 23:58:42.638830 466897 linux_launcher.cpp:606] Destroying cgroup 
'/sys/fs/cgroup/freezer/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558'
   I0516 23:58:42.639083 466898 composing.cpp:343] Finished recovering all 
containerizers
   I0516 23:58:42.639549 466898 cgroups.cpp:2934] Freezing cgroup 
/sys/fs/cgroup/freezer/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
   I0516 23:58:42.714794 466899 cgroups.cpp:2952] Thawing cgroup 
/sys/fs/cgroup/freezer/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
   W0516 23:58:42.745649 466903 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:42.746997 466902 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:42.749405 466898 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:42.754009 466900 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:42.762802 466901 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:42.779531 466904 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:42.812022 466897 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:42.877116 466903 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:43.006693 466898 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:43.264366 466899 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:43.777858 466901 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:44.803258 466897 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   W0516 23:58:46.851841 466901 cgroups.cpp:294] Removal of cgroup 
/sys/fs/cgroup/memory/mesos_test_98d2814b-03ad-4c14-a3f9-832ab688e386/3139257f-4e7a-4828-9af4-75c867846558
 failed with EBUSY, will try again
   ```
   
   We can see that when the problem occurs, it's because the cgroup doesn't 
freeze quickly. Since the tests quickly destroy the slave/containerizer after 
the task has finished, the cgroups task killer gets interrupted after the cgroup 
has started to freeze but before it has been thawed (after killing the tasks) 
- https://github.com/apache/mesos/blob/master/src/linux/cgroups.cpp#L1445
   
   ```
     void killTasks() {
       // Chain together the steps needed to kill all tasks in the cgroup.
       chain = freeze()                     // Freeze the cgroup.
         .then(defer(self(), &Self::kill))  // Send kill signal.
         .then(defer(self(), &Self::thaw))  // Thaw cgroup to deliver signal.
         .then(defer(self(), &Self::reap)); // Wait until all pids are reaped.
   
       chain.onAny(defer(self(), &Self::finished, lambda::_1));
     }
   ```
   
   The problem is that the process sets up a discard callback which immediately 
terminates the process - 
https://github.com/apache/mesos/blob/master/src/linux/cgroups.cpp#L1407
   ```
     void initialize() override
     {
       // Stop when no one cares.
       promise.future().onDiscard(lambda::bind(
           static_cast<void (*)(const UPID&, bool)>(terminate), self(), true));
   
       killTasks();
     }
   ```
   
   This means that the task killer can be interrupted after it has frozen the 
cgroup, but before killing the tasks and thawing.
   As a result, the cgroup stays frozen and the tasks are stuck in 
uninterruptible state: they can't be killed and the cgroup can't be destroyed, 
as can be seen above.
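
To make the interleaving concrete, here is a minimal, single-threaded sketch of the race. It is a toy model in plain C++, not the libprocess code in `cgroups.cpp`: `Cgroup`, `ToyTaskKiller`, `discardImmediately`, `discardGracefully`, `destroyable` and the `maxSteps` bound are all hypothetical names introduced for illustration, and "wait a bit" is approximated by allowing a bounded number of remaining steps to run.

```cpp
#include <cassert>

// Toy cgroup: tracks only the freezer state and the tasks still alive.
struct Cgroup {
  bool frozen = false;
  bool killPending = false;  // A signal sent to frozen tasks stays pending.
  int tasks = 3;
};

// Hypothetical stand-in for the task killer; this is NOT the Mesos class.
class ToyTaskKiller {
public:
  explicit ToyTaskKiller(Cgroup& cgroup) : cgroup_(cgroup) {}

  // Runs the next step of the freeze -> kill -> thaw chain.
  // Returns false once the chain has finished or has been abandoned.
  bool step() {
    switch (next_) {
      case Next::Freeze:
        cgroup_.frozen = true;
        next_ = Next::Kill;
        return true;
      case Next::Kill:
        assert(cgroup_.frozen);      // Signals are only sent once frozen.
        cgroup_.killPending = true;  // Delivered when the cgroup is thawed.
        next_ = Next::Thaw;
        return true;
      case Next::Thaw:
        cgroup_.frozen = false;
        if (cgroup_.killPending) {
          cgroup_.tasks = 0;
          cgroup_.killPending = false;
        }
        next_ = Next::Done;
        return true;
      case Next::Done:
        return false;
    }
    return false;
  }

  // Models the behavior described above: stop as soon as we are discarded.
  void discardImmediately() { next_ = Next::Done; }

  // Models the proposed behavior: let the chain run for a bounded number
  // of further steps (an assumed stand-in for "wait a bit"), then stop.
  void discardGracefully(int maxSteps) {
    while (maxSteps-- > 0 && step()) {}
    next_ = Next::Done;
  }

private:
  enum class Next { Freeze, Kill, Thaw, Done };

  Cgroup& cgroup_;
  Next next_ = Next::Freeze;
};

// A cgroup can only be destroyed once it is thawed and empty.
bool destroyable(const Cgroup& cgroup) {
  return !cgroup.frozen && cgroup.tasks == 0;
}
```

In this model, calling `step()` once and then `discardImmediately()` leaves `frozen == true` with all tasks still present, so `destroyable()` is false. Calling `step()` once and then `discardGracefully(2)` lets the kill and thaw steps run, after which the cgroup is thawed and empty.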
   
   I'm not adding a specific test because this bug already causes dozens of 
existing tests to fail.
   
   I'm not quite sure why this only started happening on recent kernels, but my 
guess is that, for whatever reason, freezing sometimes takes longer than 
before, which means we're much more likely to interrupt the killer in the 
middle of its work.
   In any case it's a long-standing bug with potentially bad consequences, 
so it is good to fix.
   
   @asekretenko 
   @qianzhangxa 



