-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56895/#review166284
-----------------------------------------------------------
A few superficial suggestions attached -- will take a closer look shortly.
I see this intermittently:
```
libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: recursive_mutex lock failed: Invalid argument
*** Aborted at 1487726754 (unix time) try "date -d @1487726754" if you are using GNU date ***
PC: @ 0x7fff9271cdd6 __pthread_kill
*** SIGABRT (@0x7fff9271cdd6) received by PID 90593 (TID 0x70000362d000) stack trace: ***
@ 0x7fff927fbbba _sigtramp
@ 0x41221cd4c (unknown)
@ 0x7fff92682420 abort
@ 0x7fff911dd85a abort_message
@ 0x7fff91202c37 default_terminate_handler()
@ 0x7fff91d0cf33 _objc_terminate()
@ 0x7fff911ffd69 std::__terminate()
@ 0x7fff911ff7de __cxa_throw
@ 0x7fff911cd441 std::__1::__throw_system_error()
@ 0x112631a49 _ZZ11synchronizeINSt3__115recursive_mutexEE12SynchronizedIT_EPS3_ENKUlPS1_E_clES6_
@ 0x112631a28 _ZZ11synchronizeINSt3__115recursive_mutexEE12SynchronizedIT_EPS3_ENUlPS1_E_8__invokeES6_
@ 0x112631af9 Synchronized<>::Synchronized()
@ 0x1126319fd Synchronized<>::Synchronized()
@ 0x1125fe49a synchronize<>()
@ 0x1155690f2 process::ProcessBase::enqueue()
@ 0x115583aad process::ProcessManager::deliver()
@ 0x115583696 process::ProcessManager::deliver()
@ 0x115591aa2 process::internal::dispatch()
@ 0x11386f147 process::dispatch<>()
@ 0x11386f04f _ZZN7process5delayIN5mesos8internal5slave5SlaveEEENS_5TimerERK8DurationRKNS_3PIDIT_EEMSA_FvvEENKUlvE_clEv
@ 0x11386f00d _ZNSt3__128__invoke_void_return_wrapperIvE6__callIJRZN7process5delayIN5mesos8internal5slave5SlaveEEENS3_5TimerERK8DurationRKNS3_3PIDIT_EEMSE_FvvEEUlvE_EEEvDpOT_
@ 0x11386edc9 _ZNSt3__110__function6__funcIZN7process5delayIN5mesos8internal5slave5SlaveEEENS2_5TimerERK8DurationRKNS2_3PIDIT_EEMSD_FvvEEUlvE_NS_9allocatorISJ_EEFvvEEclEv
@ 0x11223765e std::__1::function<>::operator()()
@ 0x11554e489 process::Timer::operator()()
@ 0x11554e1c9 process::timedout()
@ 0x11560df89 _ZNSt3__128__invoke_void_return_wrapperIvE6__callIJRNS_6__bindIPFvRKNS_4listIN7process5TimerENS_9allocatorIS6_EEEEEJRNS_12placeholders4__phILi1EEEEEESB_EEEvDpOT_
@ 0x11560dc79 _ZNSt3__110__function6__funcINS_6__bindIPFvRKNS_4listIN7process5TimerENS_9allocatorIS5_EEEEEJRNS_12placeholders4__phILi1EEEEEENS6_ISH_EESB_EclESA_
@ 0x112233591 std::__1::function<>::operator()()
@ 0x115210067 process::clock::tick()
@ 0x11521774f _ZNSt3__128__invoke_void_return_wrapperIvE6__callIJRNS_6__bindIRFvRKN7process4TimeEEJS7_EEEEEEvDpOT_
@ 0x1152175c9 _ZNSt3__110__function6__funcINS_6__bindIRFvRKN7process4TimeEEJS6_EEENS_9allocatorIS9_EEFvvEEclEv
@ 0x11223765e std::__1::function<>::operator()()
```
Seems like the review bot ran into a similar problem.
src/tests/slave_recovery_tests.cpp (line 2310)
<https://reviews.apache.org/r/56895/#comment238213>
Can we rename `_ack` to something that makes clear we're waiting for the
_agent_ to see the status update acknowledgment?
src/tests/slave_recovery_tests.cpp (line 2401)
<https://reviews.apache.org/r/56895/#comment238206>
Words in variable names should not be separated with underscores; use camelCase instead.
src/tests/slave_recovery_tests.cpp (line 2402)
<https://reviews.apache.org/r/56895/#comment238211>
Seems like we should look up
`state.frameworks[frameworkId].executors[executorId]` once and then reuse it.
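A rough sketch of what I mean, not the actual test code. The `ExecutorState`, `FrameworkState` and `State` structs, the `id` and `runs` fields, and the helper name below are all made-up stand-ins for illustration; the real types in `src/slave/state.hpp` differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-ins for the recovered agent state.
struct ExecutorState
{
  std::string id;
  int runs;
};

struct FrameworkState
{
  std::map<std::string, ExecutorState> executors;
};

struct State
{
  std::map<std::string, FrameworkState> frameworks;
};

// Hypothetical helper: checks that the executor was recovered under the
// expected ID with no runs recorded.
bool executorRecoveredWithoutRuns(
    const State& state,
    const std::string& frameworkId,
    const std::string& executorId)
{
  assert(!frameworkId.empty() && !executorId.empty());

  // Do the nested lookup once and bind the result to a const reference.
  // `at()` throws `std::out_of_range` on a missing key, whereas
  // `operator[]` would silently insert a default-constructed entry.
  const ExecutorState& executor =
    state.frameworks.at(frameworkId).executors.at(executorId);

  // Every later check reuses `executor` instead of repeating the lookup.
  return executor.id == executorId && executor.runs == 0;
}
```

Besides being shorter, binding the reference once means each subsequent check reads against the same named object, so the expectations stay easy to scan.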
src/tests/slave_recovery_tests.cpp (line 2405)
<https://reviews.apache.org/r/56895/#comment238205>
Should probably be `EXPECT`, here and below.
src/tests/slave_recovery_tests.cpp (line 2420)
<https://reviews.apache.org/r/56895/#comment238204>
Indentation.
- Neil Conway
On Feb. 21, 2017, 9:44 p.m., Megha Sharma wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56895/
> -----------------------------------------------------------
>
> (Updated Feb. 21, 2017, 9:44 p.m.)
>
>
> Review request for mesos, Neil Conway and Jiang Yan Xu.
>
>
> Bugs: MESOS-6223
> https://issues.apache.org/jira/browse/MESOS-6223
>
>
> Repository: mesos
>
>
> Description
> -------
>
> With partition awareness, agents are now allowed to re-register
> after they have been marked Unreachable. The executors are terminated
> on the agent when it reboots anyway, so there is no harm in
> letting the agent keep its SlaveID, re-register with the master,
> and reconcile the lost executors. This is a prerequisite for
> supporting persistent/restartable tasks in Mesos.
>
>
> Diffs
> -----
>
> src/slave/slave.hpp 3b0aea4e3e9a17501077beccbccaab4abbe11af2
> src/slave/slave.cpp 7564e8d39530794131dbbc928fcbc59fb65ef471
> src/slave/state.hpp a497ce1f58fb8dc7718ee5bb10bc62dd7479efa5
> src/slave/state.cpp f8e7cdd4df0a3c5d62d89edd11844527084f2baa
> src/tests/slave_recovery_tests.cpp 0e295915fea0a7314e173857249bd8726eeccd76
>
> Diff: https://reviews.apache.org/r/56895/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Megha Sharma
>
>