Would you mind filing a Mesos ticket for this?  Seems like a pretty major
wart in the integration.

-=Bill

On Thu, Apr 9, 2015 at 10:45 AM, Hussein Elgridly <
huss...@broadinstitute.org> wrote:

> I believe the root of the Docker filesystem for any given container goes in
> /var/lib/docker/something/container_id/... on the host filesystem.
>
> The gc executor cleans out the sandbox directory, but anything written
> anywhere else will stick around on the host filesystem until docker rm is
> called, which Mesos does after DOCKER_REMOVE_DELAY (6h). [1]
>
> [1]
>
> https://github.com/apache/mesos/blob/2985ae05634038b70f974bbfed6b52fe47231418/src/slave/constants.cpp#L52
>
> I think the takeaway here is to use /mnt/mesos/sandbox as your scratch
> space, since that's all that Thermos watches. Mesos is not merciful if you
> anger it.
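>
> For example (a minimal sketch; the MESOS_SANDBOX fallback below is an
> assumption about how the path is exposed inside the container), scratch
> writes can be pointed at the sandbox mount like this:
>
>     import os
>     import tempfile
>
>     # Inside the Docker container the sandbox is mounted at /mnt/mesos/sandbox.
>     sandbox = os.environ.get("MESOS_SANDBOX", "/mnt/mesos/sandbox")
>
>     # Anything created here is counted by Thermos' disk accounting and is
>     # cleaned up with the sandbox, instead of lingering in the container's
>     # writable layer until docker rm.
>     scratch = tempfile.mkdtemp(prefix="scratch-", dir=sandbox)
>     with open(os.path.join(scratch, "work.tmp"), "wb") as f:
>         f.write(b"intermediate data")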
>
> Hussein Elgridly
> Senior Software Engineer, DSDE
> The Broad Institute of MIT and Harvard
>
>
> On 9 April 2015 at 12:25, Bill Farner <wfar...@apache.org> wrote:
>
> > I'm fairly ignorant of some of the practicalities here - if you don't
> > write to /mnt/mesos/sandbox, where do files land?  Some other ephemeral
> > directory that dies with the container?
> >
> > -=Bill
> >
> > On Thu, Apr 9, 2015 at 7:11 AM, Hussein Elgridly <
> > huss...@broadinstitute.org> wrote:
> >
> > > Thanks, that's helpful. I've also just discovered that Thermos only
> > > monitors disk usage in the sandbox location, so if we launch a Docker job
> > > and write anywhere that's not /mnt/mesos/sandbox, we can exceed our disk
> > > quota. I can work around this by turning our scratch space directories
> > > into symlinks that point into the sandbox, though.
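> > >
> > > As a minimal sketch of that workaround (the scratch paths here are
> > > hypothetical; the real ones depend on the image), something like this
> > > could run before the main process starts:
> > >
> > >     import os
> > >
> > >     SANDBOX = "/mnt/mesos/sandbox"            # what Thermos meters
> > >     SCRATCH_DIRS = ["/scratch", "/data/tmp"]  # hypothetical app paths
> > >
> > >     for path in SCRATCH_DIRS:
> > >         target = os.path.join(SANDBOX, os.path.basename(path))
> > >         os.makedirs(target, exist_ok=True)
> > >         if not os.path.islink(path):
> > >             # Assumes the scratch dir is empty at container start; replace
> > >             # it with a symlink into the sandbox so writes there count
> > >             # against the task's disk quota.
> > >             if os.path.isdir(path):
> > >                 os.rmdir(path)
> > >             os.symlink(target, path)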
> > >
> > > Hussein Elgridly
> > > Senior Software Engineer, DSDE
> > > The Broad Institute of MIT and Harvard
> > >
> > >
> > > On 8 April 2015 at 19:43, Zameer Manji <zma...@apache.org> wrote:
> > >
> > > > Hey,
> > > >
> > > > The deletion of sandbox directories is done by the Mesos slave, not the
> > > > GC executor. You will have to ask the Mesos devs about the relationship
> > > > between low disk and sandbox deletion.
> > > >
> > > > The executor enforces disk usage by running `du` in the background
> > > > periodically. I suspect that in your case the process fails before the
> > > > executor notices the disk usage has been exceeded and marks the task as
> > > > failed. This explains why the disk usage message is not there.
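> > > >
> > > > To illustrate the race (a simplified sketch, not the actual Thermos
> > > > code; the limit, path, and polling interval below are made up):
> > > >
> > > >     import subprocess
> > > >     import time
> > > >
> > > >     SANDBOX = "/mnt/mesos/sandbox"
> > > >     DISK_LIMIT_BYTES = 1 * 1024**3   # hypothetical per-task quota
> > > >     POLL_INTERVAL_SECS = 60          # hypothetical check frequency
> > > >
> > > >     def sandbox_usage_bytes():
> > > >         # Roughly what the executor does: shell out to du and parse it.
> > > >         out = subprocess.check_output(["du", "-sb", SANDBOX])
> > > >         return int(out.split()[0])
> > > >
> > > >     while sandbox_usage_bytes() <= DISK_LIMIT_BYTES:
> > > >         # If the process fills the disk between two polls, it dies with
> > > >         # ENOSPC first and the limit check never gets a chance to fire.
> > > >         time.sleep(POLL_INTERVAL_SECS)
> > > >
> > > >     print("disk limit exceeded; the task would be marked FAILED")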
> > > >
> > > > I'm not sure why the finalizers are not running, but you should note that
> > > > they are best-effort by the executor. The executor won't be able to run
> > > > them if Mesos tears down the container from underneath it, for example.
> > > >
> > > > On Mon, Apr 6, 2015 at 10:30 AM, Hussein Elgridly <
> > > > huss...@broadinstitute.org> wrote:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > I've just had my first task fail due to exceeding disk capacity, and
> > > > > I've run into some strange behaviour.
> > > > >
> > > > > It's a Java process that's running inside a Docker container specified
> > > > > in the task config. The Java process is failing with
> > > > > java.io.IOException: No space left on device when attempting to write
> > > > > a file.
> > > > >
> > > > > Three things are (or aren't) then happening which I think are just
> > > > > plain wrong:
> > > > >
> > > > > 1. The task is being marked as failed (good!) but isn't reporting that
> > > > > it exceeded disk limits (bad). I was expecting to see the "Disk limit
> > > > > exceeded.  Reserved X bytes vs used Y bytes." message, but neither the
> > > > > Mesos nor Aurora web interfaces are telling me this.
> > > > > 2. The task's sandbox directory is being nuked. All of it, immediately.
> > > > > It's there while the job is running and vanishes as soon as the task
> > > > > fails (I happened to be watching it live). This makes debugging
> > > > > difficult, and the Aurora/Thermos web UI clearly has trouble because it
> > > > > reports the resource requests as all zero when they most definitely
> > > > > weren't.
> > > > > 3. Finalizers aren't running. No finalizers = no error log = no
> > > > > debugging = sadface. :(
> > > > >
> > > > > I think what's actually happening here is that the process is running
> > > > > out of disk on the machine itself and that the IOException is
> > > > > propagating up from the kernel, rather than Mesos killing the process
> > > > > based on its disk usage monitoring.
> > > > >
> > > > > As such, we're going to try configuring the Mesos slaves with
> > > > > --resources='disk:some_smaller_value' to leave a little overhead, in
> > > > > the hope that the Mesos disk monitor catches the overuse before the
> > > > > process attempts to claim the last free block on disk.
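> > > > >
> > > > > As a rough sketch of what we mean (the 10% headroom and the mount
> > > > > point below are placeholders, not measured values):
> > > > >
> > > > >     import shutil
> > > > >
> > > > >     # Advertise less disk than the volume actually has so the du-based
> > > > >     # monitor can fail the task before the kernel returns ENOSPC.
> > > > >     total_mb = shutil.disk_usage("/var/lib/mesos").total // (1024 * 1024)
> > > > >     advertised_mb = int(total_mb * 0.9)
> > > > >     print("--resources='disk:%d'" % advertised_mb)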
> > > > >
> > > > > I don't know why it'd be nuking the sandbox, though. And is the GC
> > > > > executor more aggressive about cleaning out old sandbox directories if
> > > > > the disk is low on free space?
> > > > >
> > > > > If it helps, we're on Aurora commit
> > > > > 2bf03dc5eae89b1e40bfd47683c54c185c78a9d3.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Hussein Elgridly
> > > > > Senior Software Engineer, DSDE
> > > > > The Broad Institute of MIT and Harvard
> > > > >
> > > >
> > >
> >
>
