On Tue, 2013-09-24 at 21:51 +0100, Christian Seiler wrote: 
> Hi there,
> 
> >> Yep, we discussed this at Plumbers and I think it's really the way to
> >> go, basically remove all of that fs pinning code and just do a
> >> bind-mount of the rootfs on itself in the container's mountns before
> >> starting it.
> >
> >> That way if the container decides to remount / ro at any point, it'll
> >> succeed and will give the user a read-only / but without affecting the
> >> outside world.
> >
> > Ideally, I think that's the way to go and I used to do that manually when
> > setting up my containers but I was thinking there was some breakage
> > between that and the way we were working around the pivot_root problem
> > introduced by systemd (Fedora, SUSE, Arch, et al).  If we can verify
> > that works with all the init flavors without breaking, that could be
> > part of the general cleanup of the mount tables in the containers as
> > well, maybe...

> Just a short comment about what I found out when looking at the
> auto-mount stuff I just sent to the list when it comes to
> bind-mounts and remounting ro:

> Take the following example:

> mount --bind /foo /bar
> mount -o remount,ro /bar

> In kernels up to at least 3.2 (but not much later) this would make the
> mount /bar read-only, but keep /foo read-write.

> But: in kernels as of 3.8 (possibly earlier), this would actually
> remount the entire filesystem read-only or give a busy message. There
> was apparently some kind of change here.

No.  There's a change there, all right, and thank you for reminding me
of that, but (afaik) it's NOT in the kernel itself.  It's a mount
option.  It's that bloody MS_SHARED option and, to a lesser extent, the
MS_SLAVE option that are behind how those things are propagated.
MS_SHARED will propagate mount and unmount events between a mount and
its peers, IIRC, while MS_SLAVE only receives them in one direction and
MS_PRIVATE blocks propagation entirely.  I think the troublemaker is
MS_SHARED and that's what caused all the "pivot_root" calls to face
plant when systemd started mounting everything with MS_SHARED in the
host system.  I was using bind mounts to avoid some of these problems
but then they changed systemd and its default mount options and broke a
number of things I had running.
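
For anyone who wants to check what propagation a given mount point
actually has, here is a rough sketch that reads the optional fields in
/proc/self/mountinfo (shared:N, master:N, unbindable; see proc(5)).  The
function name propagation_of and its optional second argument are made
up for illustration, and this is only a sketch, not something lxc does
today:

```shell
#!/bin/sh
# propagation_of MOUNTPOINT [MOUNTINFO_FILE]
# Hypothetical helper: prints the propagation type of MOUNTPOINT
# (shared, slave, shared,slave, unbindable or private).
# Returns non-zero if MOUNTPOINT is not listed.
propagation_of() {
    awk -v mp="$1" '
        $5 == mp {
            shared = 0; slave = 0; unbindable = 0
            # Optional fields start at field 7 and end at a lone "-".
            for (i = 7; i <= NF && $i != "-"; i++) {
                if ($i ~ /^shared:/)      shared = 1
                else if ($i ~ /^master:/) slave = 1
                else if ($i == "unbindable") unbindable = 1
            }
            if (shared && slave) result = "shared,slave"
            else if (shared)     result = "shared"
            else if (slave)      result = "slave"
            else if (unbindable) result = "unbindable"
            else                 result = "private"
        }
        # The last matching line is the topmost mount on that path.
        END { if (result != "") { print result } else { exit 1 } }
    ' "${2:-/proc/self/mountinfo}"
}
```

On a systemd host I would expect "propagation_of /" to print "shared",
but check it on your own box.  Reasonably recent util-linux can show the
same thing with "findmnt -o TARGET,PROPAGATION".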

> In order to properly remount bind-mounts read-only in newer kernels,
> you have to do the following:

> mount -o remount,bind,ro /bar
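
Spelled out as a small helper, the sequence Christian describes looks
roughly like this.  The bind_ro name and the DRY_RUN switch are made up
for illustration; by default it only prints the commands, since actually
running them needs root:

```shell
#!/bin/sh
# bind_ro SRC DST
# Hypothetical helper: bind-mount SRC on DST, then make only that
# bind mount read-only by passing "bind" along with "remount,ro".
# DRY_RUN defaults to 1 (print the commands); set DRY_RUN=0 to run them.
bind_ro() {
    src=$1
    dst=$2
    if [ "${DRY_RUN:-1}" = 1 ]; then run=echo; else run=; fi
    # Without "bind" in the second call, the remount targets the
    # underlying filesystem rather than just this mount point.
    $run mount --bind "$src" "$dst" &&
    $run mount -o remount,bind,ro "$dst"
}
```

Calling "bind_ro /foo /bar" with the default dry run just echoes the two
mount commands, which makes it easy to compare against what a given
distro's shutdown scripts actually issue.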

Check your mount point options and read the man page for mount and
"shared subtrees options".  Some of the distros have been changing the
defaults.  I don't believe it's a kernel default issue but I could be
wrong.

> This will also work in older kernels (I could only test 2.6.32, not
> earlier), so in that sense it's portable.
> 
> BUT: the typical bind-mount trick one could use to keep the container
> from remounting / ro at shutdown will apparently, as far as I can
> tell, not work anymore in 3.8, possibly earlier, since typical
> shutdown will do the equivalent of remount,ro and not add the bind
> option there.

> So unfortunately, I think we'll have to stick with pinning... :(

Actually, there, I think I agree with you, unfortunately.  I think we're
stuck with it due to ill behavior in some distros and their defaults, in
particular with regard to systemd-based distros.  We need to do things
in a way that does not break on any distro running the host and in a way
that doesn't allow an arbitrary distro running in a container to
propagate random acts of terrorism to the host or other containers.  But
that's probably a good paradigm for us, anyways.

> -- Christian

Regards,
Mike
-- 
Michael H. Warfield (AI4NB) | (770) 985-6132 |  m...@wittsend.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!
