> On April 2, 2020, 6:27 a.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp
> > Lines 715 (patched)
> > <https://reviews.apache.org/r/72305/diff/2/?file=2216777#file2216777line715>
> >
> > We already get max memory usage at L671, so I think we should directly
> > use it rather than getting usage here.

I'm concerned that using the max usage will result in many false positives, where we send REASON_CONTAINER_MEMORY_REQUEST_EXCEEDED when it's not correct. A container may exceed its memory request at one point in time, leading to 'max_usage > soft_limit', but that doesn't mean it was using that much memory at the time it was OOM-killed.

My rationale for using 'usage_in_bytes' is that, while reading it after the OOM kill is racy and so leaves some uncertainty in the value, I prefer that race to the false positives which would be caused by relying on 'max_usage_in_bytes'.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72305/#review220183
-----------------------------------------------------------


On April 2, 2020, 5:10 p.m., Greg Mann wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72305/
> -----------------------------------------------------------
>
> (Updated April 2, 2020, 5:10 p.m.)
>
>
> Review request for mesos and Qian Zhang.
>
>
> Repository: mesos
>
>
> Description
> -------
>
> When a container is OOM-killed and its memory usage is over its
> soft memory limit but below its hard memory limit, then we send
> schedulers REASON_CONTAINER_MEMORY_REQUEST_EXCEEDED to indicate
> that the scheduler's task was preferentially OOM-killed because
> it had exceeded its memory request.
>
>
> Diffs
> -----
>
>   src/common/protobuf_utils.cpp 723d85a8656e61f77ab99e5e63f844ec95303ff0
>   src/slave/containerizer/mesos/isolators/cgroups/subsystems/memory.cpp 15f87ba8c0a1b44fb3380beb0e739af566ab08fc
>
>
> Diff: https://reviews.apache.org/r/72305/diff/3/
>
>
> Testing
> -------
>
> `make check`
>
>
> Thanks,
>
> Greg Mann
>
>
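
To make the false-positive concern concrete, here is a minimal sketch of the "over the soft limit but below the hard limit" check from the description, evaluated once with a peak value and once with a value read at OOM time. The helper `exceededRequest`, the constant names, and all byte values are hypothetical and chosen only for illustration; this is not the code in the patch.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): true when the observed usage is
// above the soft limit (the memory request) but below the hard limit.
constexpr bool exceededRequest(
    std::uint64_t observed,
    std::uint64_t softLimit,
    std::uint64_t hardLimit)
{
  return observed > softLimit && observed < hardLimit;
}

constexpr std::uint64_t MiB = 1024 * 1024;

// Made-up values: the container peaked at 900 MiB earlier in its life, but
// was using only 400 MiB when it was OOM-killed.
constexpr std::uint64_t softLimit = 512 * MiB;   // assumed soft limit
constexpr std::uint64_t hardLimit = 1024 * MiB;  // assumed hard limit
constexpr std::uint64_t maxUsage = 900 * MiB;    // stands in for 'max_usage_in_bytes'
constexpr std::uint64_t usageAtOom = 400 * MiB;  // stands in for 'usage_in_bytes'

// Based on the historical peak, the check reports the request as exceeded.
constexpr bool fromMaxUsage =
  exceededRequest(maxUsage, softLimit, hardLimit);

// Based on the usage observed at OOM time, it does not.
constexpr bool fromUsage =
  exceededRequest(usageAtOom, softLimit, hardLimit);
```

With these numbers the peak-based check would attribute the kill to an exceeded request even though the container was under its request when it was killed, which is the false positive described above. The usage-based check can still be wrong if the value changes between the kill and the read, but only within that window.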
