Hi,

On Tue, Nov 27, 2018 at 7:13 PM Dan Prince <dpri...@redhat.com> wrote:
> On Tue, 2018-11-27 at 16:24 +0100, Bogdan Dobrelya wrote:
> > Changing the topic to follow the subject.
> >
> > [tl;dr] it's time to rearchitect container images to stop including
> > config-time only (puppet et al) bits, which are not needed at runtime
> > and pose security issues, like CVEs, to maintain daily.
>
> I think your assertion that we need to rearchitect the config images to
> contain the puppet bits is incorrect here.
>
> After reviewing the patches you linked to below it appears that you are
> proposing we use --volumes-from to bind mount application binaries from
> one container into another. I don't believe this is a good pattern for
> containers. On baremetal, if we followed the same pattern it would be
> like using an NFS share to obtain access to binaries across the
> network to optimize local storage. Now... some people do this (like
> maybe high performance computing would launch an MPI job like this),
> but I don't think we should consider it best practice for our
> containers in TripleO.
>
> Each container should contain its own binaries and libraries as much
> as possible. And while I do think we should be using --volumes-from
> more often in TripleO, it would be for sharing *data* between
> containers, not binaries.
>
> > Background:
> > 1) For the Distributed Compute Node edge case, there are potentially
> > tens of thousands of single-compute-node remote edge sites connected
> > over WAN to a single control plane, which has high latency, like
> > 100ms or so, and limited bandwidth. Reducing the base layer size
> > becomes a decent goal there. See the security background below.
>
> The reason we put Puppet into the base layer was in fact to prevent it
> from being downloaded multiple times. If we were to re-architect the
> image layers such that the child layers all contained their own copies
> of Puppet, for example, there would actually be a net increase in
> bandwidth and disk usage. So I would argue we are already addressing
> the goal of optimizing network and disk space.
>
> Moving it out of the base layer so that you can patch it more often
> without disrupting other services is a valid concern. But addressing
> this concern while also preserving our definition of a container (see
> above, a container should contain all of its binaries) is going to cost
> you something, namely disk and network space, because Puppet would need
> to be duplicated in each child container.
>
> As Puppet is used to configure the majority of the services in TripleO,
> having it in the base container makes the most sense. And yes, if there
> are security patches for Puppet/Ruby those might result in a bunch of
> containers getting pushed. But let Docker layers take care of this, I
> think... Don't try to solve things by constructing your own custom
> mounts and volumes to work around the issue.
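For clarity, the bind-mount pattern being debated is roughly the
following, a minimal sketch only; the "puppet-config" image, container
names and paths here are made up for illustration and are not existing
TripleO tooling:

  # A side-car container that exists solely to expose puppet modules as
  # a volume (hypothetical image name, never actually started as a service).
  docker create --name puppet-config \
      -v /usr/share/openstack-puppet/modules \
      puppet-config-image /bin/true

  # A service's config step then mounts those bits from the side car
  # instead of shipping puppet in its own image.
  docker run --rm --volumes-from puppet-config service-image \
      ls /usr/share/openstack-puppet/modules

The trade-off Dan describes is exactly this: the service image gets
smaller, at the cost of a run-time coupling between containers.
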
> > 3) TripleO CI updates (including puppet*) packages in containers, not
> > in a common base layer of those. So each CI job has to update puppet*
> > and its dependencies - ruby/systemd as well. Reducing the number of
> > packages to update for each container makes sense for CI as well.
> >
> > Implementation related:
> >
> > WIP patches [0],[1] for early review use a config "pod" approach and
> > do not require maintaining two sets of config vs runtime images.
> > Future work: a) cronie requires systemd, we'd want to fix that also
> > off the base layer; b) rework to podman pods for docker-puppet.py
> > instead of --volumes-from a side car container (can't be backported
> > for Queens then, which would still be nice to have for the Edge DCN
> > case, at least downstream only perhaps).
> >
> > Some questions raised on IRC:
> >
> > Q: does having a service be able to configure itself really need to
> > involve a separate pod?
> > A: Highly likely yes; removing not-runtime things is a good idea and
> > pods are an established PaaS paradigm already. That will require some
> > changes in the architecture though (see the topic with WIP patches).
>
> I'm a little confused on this one. Are you suggesting that we have 2
> containers for each service? One with Puppet and one without?
>
> That is certainly possible, but to pull it off would likely require you
> to have things built like this:
>
> |base container| --> |service container| --> |service container w/
> Puppet installed|
>
> The end result would be Puppet being duplicated in a layer for each
> service's "config image". Very inefficient.
>
> Again, I'm answering this assuming we aren't violating our container
> constraints and best practices where each container has the binaries
> it needs to do its own configuration.
>
> > Q: isn't that (fetching a config container) actually more data than
> > we'd download otherwise?
> > A: It's not, if you think of Day 2, when the base layer and top
> > layers have to be re-fetched because some CVEs unrelated to OpenStack
> > got fixed there for ruby/puppet/systemd. Avoiding the need to restart
> > service containers because of those minor pushed updates is also a
> > nice thing.
>
> Puppet is used only for configuration in TripleO. While security issues
> do need to be addressed at any layer, I'm not sure there would be an
> urgency to re-deploy your cluster simply for a Puppet security fix
> alone. Smart change management would help eliminate blindly deploying
> new containers in the case where they provide very little security
> benefit.
>
> I think the focus on Puppet and Ruby here is perhaps a bad example as
> they are config-time only. Rather than just think about them we should
> also consider the rest of the things in our base container images as
> well. This is always going to be a "balancing act". There are pros and
> cons of having things in the base layer vs. the child/leaf layers.

It's interesting: puppet is required at config time only, yet it is kept
in every image for the image's whole life. There is a side-car pattern in
Kubernetes where a side container configures whatever the main container
needs and then dies.
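To illustrate that pattern in podman terms, a rough sketch; the pod,
volume and image names below are invented, and this is not what
docker-puppet.py does today:

  # Hypothetical config "pod": a short-lived puppet container writes the
  # generated config into a shared volume, then the runtime container
  # starts with that volume mounted read-only and without puppet on board.
  podman pod create --name keystone
  podman run --rm --pod keystone --name keystone-config \
      -v keystone-config-data:/var/lib/config-data \
      puppet-config-image \
      puppet apply /etc/puppet/manifests/keystone.pp
  podman run -d --pod keystone --name keystone-api \
      -v keystone-config-data:/var/lib/config-data:ro \
      keystone-image
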
> > Q: the best solution here would be using packages on the host,
> > generating the config files on the host, and then having an
> > all-in-one container for all the services which lets them run in an
> > isolated manner.
> > A: I think for Edge cases that's a no-go, as we might want to
> > consider tiny low-footprint OS distros like the formerly-known
> > Container Linux or Atomic. Also, an all-in-one container looks like
> > an anti-pattern from the world of VMs.
>
> This was suggested on IRC because it likely gives you the smallest
> network/storage footprint for each edge node. The container would get
> used for everything: running all the services, and configuring all the
> services. Sort of a golden image approach. It may be an anti-pattern
> but initially I thought you were looking to optimize these things.

It is an anti-pattern indeed. The smaller the container, the better: less
chance of security issues, less data to transfer over the network, less
storage. In programming there are a lot of patterns for reusing code (OOP
is one example), and the same approach should be applied to containers
rather than blindly copying data into every container.

> I think a better solution might be to have container registries, or
> container mirrors (reverse proxies or whatever) that allow you to cache
> things as you deploy to the edge and thus optimize the network traffic.

This solution is a good addition, but the containers should be tiny, not
fat.

> > [0] https://review.openstack.org/#/q/topic:base-container-reduction
> > [1] https://review.rdoproject.org/r/#/q/topic:base-container-reduction
> >
> > > Here is a related bug [0] and implementation [1] for that. PTAL
> > > folks!
> > >
> > > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > > [1] https://review.openstack.org/#/q/topic:base-container-reduction
> > >
> > > > Let's also think of removing puppet-tripleo from the base
> > > > container. It really brings the world in (and yum updates in CI!)
> > > > for each job and each container!
> > > > So if we did so, we should then either install puppet-tripleo and
> > > > co on the host and bind-mount it for the docker-puppet deployment
> > > > task steps (bad idea IMO), OR use the magical --volumes-from
> > > > <a-side-car-container> option to mount volumes from some
> > > > "puppet-config" sidecar container inside each of the containers
> > > > being launched by the docker-puppet tooling.
> > >
> > > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
> > > redhat.com> wrote:
> > > > We add this to all images:
> > > >
> > > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
> > > >
> > > > /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2
> > > > python socat sudo which openstack-tripleo-common-container-base
> > > > rsync cronie crudini openstack-selinux ansible python-shade
> > > > puppet-tripleo python2-kubernetes && yum clean all &&
> > > > rm -rf /var/cache/yum     276 MB
> > > >
> > > > Is the additional 276 MB reasonable here?
> > > > openstack-selinux <- This package runs relabeling; does that kind
> > > > of touching of the filesystem impact the size due to docker
> > > > layers?
> > > >
> > > > Also: python2-kubernetes is a fairly large package (18007990
> > > > bytes); do we use that in every image? I don't see any
> > > > tripleo-related repos importing from it when searching on Hound.
> > > > The original commit message [1] adding it states it is for future
> > > > convenience.
> > > >
> > > > On my undercloud we have 101 images; if we are downloading an
> > > > extra 18 MB per image that's almost 1.8 GB for a package we don't
> > > > use. (I hope it's not like this? With docker layers, we only
> > > > download that 276 MB transaction once? Or?)
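As a sanity check for the layer-sharing question, something along these
lines answers it empirically (the image name is only an example; any two
images built from the same base will do):

  # Show the layers an image is built from and their sizes; layers that
  # share an ID with another image are stored and pulled only once.
  docker history centos-binary-nova-api:current-tripleo

  # Summarize shared vs. unique space across all local images.
  docker system df -v
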
> > > > [1] https://review.openstack.org/527927
> > >
> > > --
> > > Best regards,
> > > Bogdan Dobrelya,
> > > Irc #bogdando

--
Best Regards,
Sergii Golovatiuk