[TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

[TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Bogdan Dobrelya-2
Changing the topic to follow the subject.

[tl;dr] it's time to rearchitect container images to stop including
config-time-only bits (puppet et al.), which are not needed at runtime and
pose security issues, like CVEs, to maintain daily.

Background:
1) For the Distributed Compute Node edge case, there are potentially tens
of thousands of single-compute-node remote edge sites connected over
WAN to a single control plane, with high latency (100ms or so) and
limited bandwidth. Reducing the base layer size becomes a decent goal
there. See the security background below.
2) For a generic security (Day 2, maintenance) case, when
puppet/ruby/systemd/name-it gets a CVE fixed, the base layer has to be
updated, all layers on top have to be rebuilt, all of those layers have
to be re-fetched by cloud hosts, and all containers have to be
restarted... and all of that because of some fixes that have nothing to
do with OpenStack. The remote edge sites have to do this as well -
remember "tens of thousands", high latency and limited bandwidth?
3) TripleO CI updates packages (including puppet*) in containers, not in
a common base layer of those. So each CI job has to update puppet* and
its dependencies - ruby/systemd as well. Reducing the number of packages
to update for each container makes sense for CI as well.

Implementation related:

WIP patches [0],[1] for early review use a config "pod" approach and do
not require maintaining two sets of config vs. runtime images. Future
work: a) cronie requires systemd; we'd want to move that off the base
layer as well. b) rework docker-puppet.py to use podman pods instead of
--volumes-from a sidecar container (that can't be backported to Queens
then, though Queens support would still be nice to have for the Edge
DCN case, at least downstream perhaps).
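
For illustration, a rough sketch of the sidecar idea under discussion,
with the podman pod rework as a possible follow-up. All image and path
names below are made up; this is not taken from the actual WIP patches:

  # Hypothetical sketch: a "puppet-config" sidecar image carries the
  # puppet/ruby bits and declares VOLUMEs for them, so service images can
  # drop those packages from their layers.
  docker run -d --name puppet-config example/puppet-config:latest sleep infinity

  # Config-time run of a service container: mount the sidecar's volumes
  # read-only and write generated configs to a shared config-data path.
  docker run --rm \
    --volumes-from puppet-config:ro \
    -v /var/lib/config-data/nova:/var/lib/config-data/nova \
    example/nova-base:latest \
    puppet apply --modulepath /usr/share/openstack-puppet/modules /etc/config.pp

The podman rework mentioned above would instead group the config and
service containers into one pod and share a volume, rather than relying
on --volumes-from.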

Some questions raised on IRC:

Q: does having a service be able to configure itself really need to
involve a separate pod?
A: Highly likely yes; removing non-runtime things is a good idea and
pods are an established PaaS paradigm already. That will require some
changes in the architecture though (see the topic with the WIP patches).

Q: that (fetching a config container) is actually more data than would
  be downloaded otherwise
A: It's not, if you think of Day 2, when the base layer and top layers
have to be re-fetched because some CVEs unrelated to OpenStack got fixed
there for ruby/puppet/systemd. Avoiding the need to restart service
containers because of those minor updates being pushed is also a nice
thing.

Q: the best solution here would be using packages on the host,
generating the config files on the host, and then having an all-in-one
container for all the services which lets them run in an isolated manner.
A: I think for Edge cases that's a no-go, as we might want to consider
tiny low-footprint OS distros like the formerly-known Container Linux or
Atomic. Also, an all-in-one container looks like an anti-pattern from
the world of VMs.

[0] https://review.openstack.org/#/q/topic:base-container-reduction
[1] https://review.rdoproject.org/r/#/q/topic:base-container-reduction

> Here is a related bug [1] and implementation [1] for that. PTAL folks!
>
> [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> [1] https://review.openstack.org/#/q/topic:base-container-reduction
>
>> Let's also think of removing puppet-tripleo from the base container.
>> It really brings the world-in (and yum updates in CI!) each job and each
>> container!
>> So if we did so, we should then either install puppet-tripleo and co on
>> the host and bind-mount it for the docker-puppet deployment task steps
>> (bad idea IMO), OR use the magical --volumes-from <a-side-car-container>
>> option to mount volumes from some "puppet-config" sidecar container
>> inside each of the containers being launched by docker-puppet tooling.
>
> On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at redhat.com>
> wrote:
>> We add this to all images:
>>
>> https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
>>
>> /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2 python
>> socat sudo which openstack-tripleo-common-container-base rsync cronie
>> crudini openstack-selinux ansible python-shade puppet-tripleo python2-
>> kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
>>
>> Is the additional 276 MB reasonable here?
>> openstack-selinux <- This package run relabling, does that kind of
>> touching the filesystem impact the size due to docker layers?
>>
>> Also: python2-kubernetes is a fairly large package (18007990) do we use
>> that in every image? I don't see any tripleo related repos importing
>> from that when searching on Hound? The original commit message[1]
>> adding it states it is for future convenience.
>>
>> On my undercloud we have 101 images, if we are downloading every 18 MB
>> per image thats almost 1.8 GB for a package we don't use? (I hope it's
>> not like this? With docker layers, we only download that 276 MB
>> transaction once? Or?)
>>
>>
>> [1] https://review.openstack.org/527927
>
>
>
> --
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando


--
Best regards,
Bogdan Dobrelya,
Irc #bogdando


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Dan Prince
On Tue, 2018-11-27 at 16:24 +0100, Bogdan Dobrelya wrote:
> Changing the topic to follow the subject.
>
> [tl;dr] it's time to rearchitect container images to stop incluiding
> config-time only (puppet et al) bits, which are not needed runtime
> and
> pose security issues, like CVEs, to maintain daily.

I think your assertion that we need to rearchitect the config images to
contain the puppet bits is incorrect here.

After reviewing the patches you linked to below it appears that you are
proposing we use --volumes-from to bind mount application binaries from
one container into another. I don't believe this is a good pattern for
containers. On baremetal if we followed the same pattern it would be
like using an NFS share to obtain access to binaries across the
network to optimize local storage. Now... some people do this (like
maybe high performance computing would launch an MPI job like this) but
I don't think we should consider it best practice for our containers in
TripleO.

Each container should contain its own binaries and libraries as much
as possible. And while I do think we should be using --volumes-from
more often in TripleO it would be for sharing *data* between
containers, not binaries.


>
> Background:
> 1) For the Distributed Compute Node edge case, there is potentially
> tens
> of thousands of a single-compute-node remote edge sites connected
> over
> WAN to a single control plane, which is having high latency, like a
> 100ms or so, and limited bandwith. Reducing the base layer size
> becomes
> a decent goal there. See the security background below.

The reason we put Puppet into the base layer was in fact to prevent it
from being downloaded multiple times. If we were to re-architect the
image layers such that the child layers all contained their own copies
of Puppet for example there would actually be a net increase in
bandwidth and disk usage. So I would argue we are already addressing
the goal of optimizing network and disk space.

Moving it out of the base layer so that you can patch it more often
without disrupting other services is a valid concern. But addressing
this concern while also preserving our definition of a container (see
above, a container should contain all of its binaries) is going to cost
you something, namely disk and network space because Puppet would need
to be duplicated in each child container.

As Puppet is used to configure a majority of the services in TripleO
having it in the base container makes most sense. And yes, if there are
security patches for Puppet/Ruby those might result in a bunch of
containers getting pushed. But let Docker layers take care of this I
think... Don't try to solve things by constructing your own custom
mounts and volumes to work around the issue.


> 2) For a generic security (Day 2, maintenance) case, when
> puppet/ruby/systemd/name-it gets a CVE fixed, the base layer has to
> be
> updated and all layers on top - to be rebuild, and all of those
> layers,
> to be re-fetched for cloud hosts and all containers to be
> restarted...
> And all of that because of some fixes that have nothing to OpenStack.
> By
> the remote edge sites as well, remember of "tens of thousands", high
> latency and limited bandwith?..
> 3) TripleO CI updates (including puppet*) packages in containers, not
> in
> a common base layer of those. So each a CI job has to update puppet*
> and
> its dependencies - ruby/systemd as well. Reducing numbers of packages
> to
> update for each container makes sense for CI as well.
>
> Implementation related:
>
> WIP patches [0],[1] for early review, uses a config "pod" approach,
> does
> not require to maintain a two sets of config vs runtime images.
> Future
> work: a) cronie requires systemd, we'd want to fix that also off the
> base layer. b) rework to podman pods for docker-puppet.py instead of
> --volumes-from a side car container (can't be backported for Queens
> then, which is still nice to have a support for the Edge DCN case,
> at
> least downstream only perhaps).
>
> Some questions raised on IRC:
>
> Q: is having a service be able to configure itself really need to
> involve a separate pod?
> A: Highly likely yes, removing not-runtime things is a good idea and
> pods is an established PaaS paradigm already. That will require some
> changes in the architecture though (see the topic with WIP patches).

I'm a little confused on this one. Are you suggesting that we have 2
containers for each service? One with Puppet and one without?

That is certainly possible, but to pull it off would likely require you
to have things built like this:

 |base container| --> |service container| --> |service container w/
Puppet installed|

The end result would be Puppet being duplicated in a layer for each
service's "config image". Very inefficient.

Again, I'm answering this assuming we aren't violating our container
constraints and best practices, where each container has the binaries
it needs to do its own configuration.

>
>  Q: that's (fetching a config container) actually more data that
> about to
>   download otherwise
> A: It's not, if thinking of Day 2, when have to re-fetch the base
> layer
> and top layers, when some unrelated to openstack CVEs got fixed
> there
> for ruby/puppet/systemd. Avoid the need to restart service
> containers
> because of those minor updates puched is also a nice thing.

Puppet is used only for configuration in TripleO. While security issues
do need to be addressed at any layer I'm not sure there would be an
urgency to re-deploy your cluster simply for a Puppet security fix
alone. Smart change management would help eliminate blindly deploying
new containers in the case where they provide very little security
benefit.

I think the focus on Puppet and Ruby here is perhaps a bad example, as
they are config-time only. Rather than just think about them we should
also consider the rest of the things in our base container images as
well. This is always going to be a "balancing act". There are pros and
cons of having things in the base layer vs. the child/leaf layers.


>
> Q: the best solution here would be using packages on the host,
> generating the config files on the host. And then having an all-in-
> one
> container for all the services which lets them run in an isolated
> mannner.
> A: I think for Edge cases, that's a no go as we might want to
> consider
> tiny low footprint OS distros like former known Container Linux or
> Atomic. Also, an all-in-one container looks like an anti-pattern
> from
> the world of VMs.

This was suggested on IRC because it likely gives you the smallest
network/storage footprint for each edge node. The container would get
used for everything: running all the services, and configuring all the
services. Sort of a golden image approach. It may be an anti-pattern
but initially I thought you were looking to optimize these things.

I think a better solution might be to have container registries, or
container mirrors (reverse proxies or whatever) that allow you to cache
things as you deploy to the edge and thus optimize the network traffic.


>
> [0] https://review.openstack.org/#/q/topic:base-container-reduction
> [1]
> https://review.rdoproject.org/r/#/q/topic:base-container-reduction
>
> > Here is a related bug [1] and implementation [1] for that. PTAL
> > folks!
> >
> > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > [1] https://review.openstack.org/#/q/topic:base-container-reduction
> >
> > > Let's also think of removing puppet-tripleo from the base
> > > container.
> > > It really brings the world-in (and yum updates in CI!) each job
> > > and each
> > > container!
> > > So if we did so, we should then either install puppet-tripleo and
> > > co on
> > > the host and bind-mount it for the docker-puppet deployment task
> > > steps
> > > (bad idea IMO), OR use the magical --volumes-from <a-side-car-
> > > container>
> > > option to mount volumes from some "puppet-config" sidecar
> > > container
> > > inside each of the containers being launched by docker-puppet
> > > tooling.
> >
> > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
> > redhat.com>
> > wrote:
> > > We add this to all images:
> > >
> > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
> > >
> > > /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2
> > > python
> > > socat sudo which openstack-tripleo-common-container-base rsync
> > > cronie
> > > crudini openstack-selinux ansible python-shade puppet-tripleo
> > > python2-
> > > kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
> > >
> > > Is the additional 276 MB reasonable here?
> > > openstack-selinux <- This package run relabling, does that kind
> > > of
> > > touching the filesystem impact the size due to docker layers?
> > >
> > > Also: python2-kubernetes is a fairly large package (18007990) do
> > > we use
> > > that in every image? I don't see any tripleo related repos
> > > importing
> > > from that when searching on Hound? The original commit message[1]
> > > adding it states it is for future convenience.
> > >
> > > On my undercloud we have 101 images, if we are downloading every
> > > 18 MB
> > > per image thats almost 1.8 GB for a package we don't use? (I hope
> > > it's
> > > not like this? With docker layers, we only download that 276 MB
> > > transaction once? Or?)
> > >
> > >
> > > [1] https://review.openstack.org/527927
> >
> >
> > --
> > Best regards,
> > Bogdan Dobrelya,
> > Irc #bogdando
>
>



Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Fox, Kevin M
The pod concept allows you to have one tool per container that does one thing and does it well.

You can have a container for generating config, and another container for consuming it.

In a Kubernetes pod, if you still wanted to do puppet,
you could have a pod that:
1. had an init container that ran puppet and dumped the resulting config to an emptyDir volume.
2. had your main container pull its config from the emptyDir volume.

Then each container would have no dependency on each other.

In a full-blown Kubernetes cluster you might have puppet generate a ConfigMap and ship it to your main container directly. That's another matter though. I think the pod example above is still usable without k8s?
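
For what it's worth, something close to that pattern can be sketched
without k8s, e.g. with podman pods and a shared volume. The image names
below are made up for illustration, not an existing implementation:

  # Hypothetical sketch: an ephemeral "config" container writes into a
  # shared volume, then the service container consumes it - roughly the
  # initContainer + emptyDir pattern, minus Kubernetes.
  podman pod create --name nova-api-pod
  podman volume create nova-api-config

  # "init" step: run puppet once, dump the generated config, then exit.
  podman run --rm --pod nova-api-pod \
    -v nova-api-config:/etc/nova \
    example/nova-config:latest puppet apply /etc/config.pp

  # main container: consume the generated config read-only.
  podman run -d --pod nova-api-pod \
    -v nova-api-config:/etc/nova:ro \
    example/nova-api:latest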

Thanks,
Kevin
________________________________________
From: Dan Prince [[hidden email]]
Sent: Tuesday, November 27, 2018 10:10 AM
To: OpenStack Development Mailing List (not for usage questions); [hidden email]
Subject: Re: [openstack-dev] [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Bogdan Dobrelya-2
In reply to this post by Bogdan Dobrelya-2
To follow up and explain the patches for code review:

The "header" patch https://review.openstack.org/620310 -> (requires)
https://review.rdoproject.org/r/#/c/17534/, and also
https://review.openstack.org/620061 -> (which in turn requires)
https://review.openstack.org/619744 -> (Kolla change, the 1st to go)
https://review.openstack.org/619736

Please also read the commit messages; I tried to explain all the "whys"
very carefully. Just to sum it up here as well:

The current self-contained (config and runtime bits) architecture of
containers badly affects:

* the size of the base layer and of all container images, by an
   additional 300MB (an extra 30% of size); see the sketch after this
   list for one way to verify the numbers locally.
* Edge cases, where container images have to be distributed, at least
   once to hit local registries, over high-latency, limited-bandwidth,
   highly unreliable WAN connections.
* the number of packages to update in CI for all containers for all
   services (CI jobs do not rebuild containers, so each container gets
   updated for those 300MB of extra size).
* security and the attack surface, by introducing systemd et al. as
   additional subjects of CVE fixes to maintain for all containers.
* service uptime, through additional restarts of services for security
   maintenance of components irrelevant to OpenStack, sitting as dead
   weight in container images forever.
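
One way to double-check those size numbers on a local image (the image
name below is only an example of the kolla-built images TripleO consumes):

  # Per-layer sizes: shows how much the shared base layer (and the big
  # yum install transaction in it) contributes to every service image.
  docker history --no-trunc \
    --format 'table {{.Size}}\t{{.CreatedBy}}' \
    tripleomaster/centos-binary-nova-api:current-tripleo

  # Compare total sizes across a few service images.
  docker images --format 'table {{.Repository}}\t{{.Size}}' | grep centos-binary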

On 11/27/18 4:08 PM, Bogdan Dobrelya wrote:

> Changing the topic to follow the subject.
>
> [tl;dr] it's time to rearchitect container images to stop incluiding
> config-time only (puppet et al) bits, which are not needed runtime and
> pose security issues, like CVEs, to maintain daily.
>
> Background: 1) For the Distributed Compute Node edge case, there is
> potentially tens of thousands of a single-compute-node remote edge sites
> connected over WAN to a single control plane, which is having high
> latency, like a 100ms or so, and limited bandwith.
> 2) For a generic security case,
> 3) TripleO CI updates all
>
> Challenge:
>
>> Here is a related bug [1] and implementation [1] for that. PTAL folks!
>>
>> [0] https://bugs.launchpad.net/tripleo/+bug/1804822
>> [1] https://review.openstack.org/#/q/topic:base-container-reduction
>>
>>> Let's also think of removing puppet-tripleo from the base container.
>>> It really brings the world-in (and yum updates in CI!) each job and
>>> each container!
>>> So if we did so, we should then either install puppet-tripleo and co
>>> on the host and bind-mount it for the docker-puppet deployment task
>>> steps (bad idea IMO), OR use the magical --volumes-from
>>> <a-side-car-container> option to mount volumes from some
>>> "puppet-config" sidecar container inside each of the containers being
>>> launched by docker-puppet tooling.
>>
>> On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at redhat.com>
>> wrote:
>>> We add this to all images:
>>>
>>> https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35 
>>>
>>>
>>> /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2 python
>>> socat sudo which openstack-tripleo-common-container-base rsync cronie
>>> crudini openstack-selinux ansible python-shade puppet-tripleo python2-
>>> kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
>>> Is the additional 276 MB reasonable here?
>>> openstack-selinux <- This package run relabling, does that kind of
>>> touching the filesystem impact the size due to docker layers?
>>>
>>> Also: python2-kubernetes is a fairly large package (18007990) do we use
>>> that in every image? I don't see any tripleo related repos importing
>>> from that when searching on Hound? The original commit message[1]
>>> adding it states it is for future convenience.
>>>
>>> On my undercloud we have 101 images, if we are downloading every 18 MB
>>> per image thats almost 1.8 GB for a package we don't use? (I hope it's
>>> not like this? With docker layers, we only download that 276 MB
>>> transaction once? Or?)
>>>
>>>
>>> [1] https://review.openstack.org/527927
>>
>>
>>
>> --
>> Best regards,
>> Bogdan Dobrelya,
>> Irc #bogdando
>
>


--
Best regards,
Bogdan Dobrelya,
Irc #bogdando


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Dan Prince
In reply to this post by Fox, Kevin M
On Wed, 2018-11-28 at 00:31 +0000, Fox, Kevin M wrote:

> The pod concept allows you to have one tool per container do one
> thing and do it well.
>
> You can have a container for generating config, and another container
> for consuming it.
>
> In a Kubernetes pod, if you still wanted to do puppet,
> you could have a pod that:
> 1. had an init container that ran puppet and dumped the resulting
> config to an emptyDir volume.
> 2. had your main container pull its config from the emptyDir volume.

We have basically implemented the same workflow in TripleO today. First
we execute Puppet in an "init container" (really just an ephemeral
container that generates the config files and then goes away). Then we
bind mount those configs into the service container.

One improvement we could make (which we aren't doing yet) is to use a
data container/volume to store the config files instead of using the
host. Sharing *data* within a 'pod' (set of containers, etc.) is
certainly a valid use of container volumes.
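
For illustration, a minimal sketch of that improvement using a dedicated
data container; the names are hypothetical, not the current TripleO
tooling:

  # Hypothetical sketch: a data container owns the generated configs, and
  # only *data* (not binaries) is shared into the service container.
  docker create --name keystone-config-data -v /var/lib/config-data \
    example/base:latest /bin/true

  # Ephemeral "init" container: run puppet, write configs into that volume.
  docker run --rm --volumes-from keystone-config-data \
    example/keystone-config:latest puppet apply /etc/config.pp

  # Service container: mount the same volume read-only at runtime.
  docker run -d --name keystone --volumes-from keystone-config-data:ro \
    example/keystone:latest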

None of this is what we are really talking about in this thread though.
Most of the suggestions and patches are about making our base
container(s) smaller in size. And the means by which the patches do
that is to share binaries/applications across containers with custom
mounts/volumes. I don't think it is a good idea at all as it violates
encapsulation of the containers in general, regardless of whether we
use pods or not.

Dan


>
> Then each container would have no dependency on each other.
>
> In full blown Kubernetes cluster you might have puppet generate a
> configmap though and ship it to your main container directly. Thats
> another matter though. I think the example pod example above is still
> usable without k8s?
>
> Thanks,
> Kevin



Re: [TripleO][Edge][Kolla] Reduce base layer of containers for security and size of images (maintenance) sakes

Bogdan Dobrelya-2
In reply to this post by Bogdan Dobrelya-2
Added the Kolla tag as we all together might want to do something about
systemd being included in containers via *multiple* package dependencies,
like [0]. Ideally, that would mean properly packaging all/some of the
places having it as a dependency (like those names listed in [1]), to
stop doing that now that it's Containers Time?.. As a temporary security
band-aid I was thinking of removing systemd via footers [1] as an extra
layer added on top, but I'm not sure that buys anything good long-term.

[0] https://pastebin.com/RSaRsYgZ
[1]
https://review.openstack.org/#/c/620310/2/container-images/tripleo_kolla_template_overrides.j2@680
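
In case it helps the packaging discussion, a quick way (sketch only, run
inside a base container) to enumerate what pulls systemd in:

  # Installed packages that directly require systemd.
  rpm -q --whatrequires systemd systemd-libs

  # The same question answered from the repo metadata (yum-utils).
  repoquery --installed --whatrequires systemd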

On 11/28/18 12:45 PM, Bogdan Dobrelya wrote:

> To follow up and explain the patches for code review:
>
> The "header" patch https://review.openstack.org/620310 -> (requires)
> https://review.rdoproject.org/r/#/c/17534/, and also
> https://review.openstack.org/620061 -> (which in turn requires)
> https://review.openstack.org/619744 -> (Kolla change, the 1st to go)
> https://review.openstack.org/619736
>
> Please also read the commit messages, I tried to explain all "Whys" very
> carefully. Just to sum up it here as well:
>
> The current self-containing (config and runtime bits) architecture of
> containers badly affects:
>
> * the size of the base layer and all containers images as an
>    additional 300MB (adds an extra 30% of size).
> * Edge cases, where we have containers images to be distributed, at
>    least once to hit local registries, over high-latency and limited
>    bandwith, highly unreliable WAN connections.
> * numbers of packages to update in CI for all containers for all
>    services (CI jobs do not rebuild containers so each container gets
>    updated for those 300MB of extra size).
> * security and the surface of attacks, by introducing systemd et al as
>    additional subjects for CVE fixes to maintain for all containers.
> * services uptime, by additional restarts of services related to
>    security maintanence of irrelevant to openstack components sitting
>    as a dead weight in containers images for ever.
>
> On 11/27/18 4:08 PM, Bogdan Dobrelya wrote:
>> Changing the topic to follow the subject.
>>
>> [tl;dr] it's time to rearchitect container images to stop incluiding
>> config-time only (puppet et al) bits, which are not needed runtime and
>> pose security issues, like CVEs, to maintain daily.
>>
>> Background: 1) For the Distributed Compute Node edge case, there is
>> potentially tens of thousands of a single-compute-node remote edge
>> sites connected over WAN to a single control plane, which is having
>> high latency, like a 100ms or so, and limited bandwith.
>> 2) For a generic security case,
>> 3) TripleO CI updates all
>>
>> Challenge:
>>
>>> Here is a related bug [1] and implementation [1] for that. PTAL folks!
>>>
>>> [0] https://bugs.launchpad.net/tripleo/+bug/1804822
>>> [1] https://review.openstack.org/#/q/topic:base-container-reduction
>>>
>>>> Let's also think of removing puppet-tripleo from the base container.
>>>> It really brings the world-in (and yum updates in CI!) each job and
>>>> each container!
>>>> So if we did so, we should then either install puppet-tripleo and co
>>>> on the host and bind-mount it for the docker-puppet deployment task
>>>> steps (bad idea IMO), OR use the magical --volumes-from
>>>> <a-side-car-container> option to mount volumes from some
>>>> "puppet-config" sidecar container inside each of the containers
>>>> being launched by docker-puppet tooling.
>>>
>>> On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
>>> redhat.com> wrote:
>>>> We add this to all images:
>>>>
>>>> https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35 
>>>>
>>>>
>>>> /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2 python
>>>> socat sudo which openstack-tripleo-common-container-base rsync cronie
>>>> crudini openstack-selinux ansible python-shade puppet-tripleo python2-
>>>> kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
>>>> Is the additional 276 MB reasonable here?
>>>> openstack-selinux <- This package run relabling, does that kind of
>>>> touching the filesystem impact the size due to docker layers?
>>>>
>>>> Also: python2-kubernetes is a fairly large package (18007990) do we use
>>>> that in every image? I don't see any tripleo related repos importing
>>>> from that when searching on Hound? The original commit message[1]
>>>> adding it states it is for future convenience.
>>>>
>>>> On my undercloud we have 101 images, if we are downloading every 18 MB
>>>> per image thats almost 1.8 GB for a package we don't use? (I hope it's
>>>> not like this? With docker layers, we only download that 276 MB
>>>> transaction once? Or?)
>>>>
>>>>
>>>> [1] https://review.openstack.org/527927
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Bogdan Dobrelya,
>>> Irc #bogdando
>>
>>
>
>


--
Best regards,
Bogdan Dobrelya,
Irc #bogdando


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Dan Prince
In reply to this post by Bogdan Dobrelya-2
On Wed, 2018-11-28 at 12:45 +0100, Bogdan Dobrelya wrote:
> To follow up and explain the patches for code review:
>
> The "header" patch https://review.openstack.org/620310 -> (requires)
> https://review.rdoproject.org/r/#/c/17534/, and also
> https://review.openstack.org/620061 -> (which in turn requires)
> https://review.openstack.org/619744 -> (Kolla change, the 1st to go)
> https://review.openstack.org/619736

This email was cross-posted to multiple lists and I think we may have
lost some of the context in the process as the subject was changed.

Most of the suggestions and patches are about making our base
container(s) smaller in size. And the means by which the patches do
that is to share binaries/applications across containers with custom
mounts/volumes. I've -2'd most of them. What concerns me however is
that some of the TripleO cores seemed open to this idea yesterday on
IRC. Perhaps I've misread things, but what you appear to be doing here
is quite drastic, and I think we need to consider all of this carefully
before proceeding with any of it.


>
> Please also read the commit messages, I tried to explain all "Whys"
> very
> carefully. Just to sum up it here as well:
>
> The current self-containing (config and runtime bits) architecture
> of
> containers badly affects:
>
> * the size of the base layer and all containers images as an
>    additional 300MB (adds an extra 30% of size).

You are accomplishing this by removing Puppet from the base container,
but you are also creating another container in the process. This would
still be required on all nodes as Puppet is our config tool. So you
would still be downloading some of this data anyway. I understand your
reason for doing this: it avoids rebuilding all containers when there is
a change to any of these packages in the base container. What you are
missing however is: how often is it the case that Puppet is updated but
something else in the base container isn't?

I would wager that it is rarer than you'd think. Perhaps looking at the
history of an OpenStack distribution would be a valid way to assess this
more critically. Without that data to back up the numbers I'm afraid
what you are doing here falls into "pre-optimization" territory for me,
and I don't think the means used in the patches warrant the benefits you
mention here.


> * Edge cases, where we have containers images to be distributed, at
>    least once to hit local registries, over high-latency and limited
>    bandwith, highly unreliable WAN connections.
> * numbers of packages to update in CI for all containers for all
>    services (CI jobs do not rebuild containers so each container gets
>    updated for those 300MB of extra size).

It would seem to me there are other ways to solve the CI container
update problem. Rebuilding the base layer more often would solve this,
right? If we always build our service containers off of a recent base
layer, there should be no updates to the system/puppet packages there in
our CI pipelines.
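
As a sketch of that alternative (build contexts and image names here are
illustrative, not the actual kolla/tripleo build tooling):

  # Hypothetical sketch: refresh the base image (and its package-update
  # layer) once per CI run...
  docker build -t tripleo-base:latest ./base

  # ...then rebuild the service images whose Dockerfiles start with
  # "FROM tripleo-base:latest"; they inherit the already-updated packages,
  # so no per-container yum update of puppet/ruby/systemd is needed.
  docker build -t tripleo-nova-api:latest ./nova-api

  # Layer dedup means the shared base layers are pushed/pulled only once.
  docker history tripleo-nova-api:latest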

> * security and the surface of attacks, by introducing systemd et al
> as
>    additional subjects for CVE fixes to maintain for all containers.

We aren't actually using systemd within our containers. I think those
packages are getting pulled in by an RPM dependency elsewhere. So rather
than using 'rpm -ev --nodeps' to remove it, we could create a sub-package
for containers in those cases and install that instead. In short, rather
than hacking this to remove them, why not pursue a proper packaging fix?

In general I am a fan of getting things out of the base container we
don't need... so yeah, let's do this. But let's do it properly.

> * services uptime, by additional restarts of services related to
>    security maintanence of irrelevant to openstack components sitting
>    as a dead weight in containers images for ever.

Like I said above, how often do these packages actually change when
something else in the base container doesn't? Perhaps we should get more
data here before blindly implementing a solution we aren't sure really
helps out in the real world.

>
> On 11/27/18 4:08 PM, Bogdan Dobrelya wrote:
> > Changing the topic to follow the subject.
> >
> > [tl;dr] it's time to rearchitect container images to stop
> > incluiding
> > config-time only (puppet et al) bits, which are not needed runtime
> > and
> > pose security issues, like CVEs, to maintain daily.
> >
> > Background: 1) For the Distributed Compute Node edge case, there
> > is
> > potentially tens of thousands of a single-compute-node remote edge
> > sites
> > connected over WAN to a single control plane, which is having high
> > latency, like a 100ms or so, and limited bandwith.
> > 2) For a generic security case,
> > 3) TripleO CI updates all
> >
> > Challenge:
> >
> > > Here is a related bug [1] and implementation [1] for that. PTAL
> > > folks!
> > >
> > > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > > [1]
> > > https://review.openstack.org/#/q/topic:base-container-reduction
> > >
> > > > Let's also think of removing puppet-tripleo from the base
> > > > container.
> > > > It really brings the world-in (and yum updates in CI!) each job
> > > > and
> > > > each container!
> > > > So if we did so, we should then either install puppet-tripleo
> > > > and co
> > > > on the host and bind-mount it for the docker-puppet deployment
> > > > task
> > > > steps (bad idea IMO), OR use the magical --volumes-from
> > > > <a-side-car-container> option to mount volumes from some
> > > > "puppet-config" sidecar container inside each of the containers
> > > > being
> > > > launched by docker-puppet tooling.
> > >
> > > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
> > > redhat.com>
> > > wrote:
> > > > We add this to all images:
> > > >
> > > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35 
> > > >
> > > >
> > > > /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2
> > > > python
> > > > socat sudo which openstack-tripleo-common-container-base rsync
> > > > cronie
> > > > crudini openstack-selinux ansible python-shade puppet-tripleo
> > > > python2-
> > > > kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
> > > > Is the additional 276 MB reasonable here?
> > > > openstack-selinux <- This package run relabling, does that kind
> > > > of
> > > > touching the filesystem impact the size due to docker layers?
> > > >
> > > > Also: python2-kubernetes is a fairly large package (18007990)
> > > > do we use
> > > > that in every image? I don't see any tripleo related repos
> > > > importing
> > > > from that when searching on Hound? The original commit
> > > > message[1]
> > > > adding it states it is for future convenience.
> > > >
> > > > On my undercloud we have 101 images, if we are downloading
> > > > every 18 MB
> > > > per image thats almost 1.8 GB for a package we don't use? (I
> > > > hope it's
> > > > not like this? With docker layers, we only download that 276 MB
> > > > transaction once? Or?)
> > > >
> > > >
> > > > [1] https://review.openstack.org/527927
> > >
> > >
> > > --
> > > Best regards,
> > > Bogdan Dobrelya,
> > > Irc #bogdando
>
>



Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Bogdan Dobrelya-2
On 11/28/18 2:58 PM, Dan Prince wrote:

> On Wed, 2018-11-28 at 12:45 +0100, Bogdan Dobrelya wrote:
>> To follow up and explain the patches for code review:
>>
>> The "header" patch https://review.openstack.org/620310 -> (requires)
>> https://review.rdoproject.org/r/#/c/17534/, and also
>> https://review.openstack.org/620061 -> (which in turn requires)
>> https://review.openstack.org/619744 -> (Kolla change, the 1st to go)
>> https://review.openstack.org/619736
>
> This email was cross-posted to multiple lists and I think we may have
> lost some of the context in the process as the subject was changed.
>
> Most of the suggestions and patches are about making our base
> container(s) smaller in size. And the means by which the patches do
> that is to share binaries/applications across containers with custom
> mounts/volumes. I've -2'd most of them. What concerns me however is
> that some of the TripleO cores seemed open to this idea yesterday on
> IRC. Perhaps I've misread things but what you appear to be doing here
> is quite drastic I think we need to consider any of this carefully
> before proceeding with any of it.
>
>
>>
>> Please also read the commit messages, I tried to explain all "Whys"
>> very
>> carefully. Just to sum up it here as well:
>>
>> The current self-containing (config and runtime bits) architecture
>> of
>> containers badly affects:
>>
>> * the size of the base layer and all containers images as an
>>     additional 300MB (adds an extra 30% of size).
>
> You are accomplishing this by removing Puppet from the base container,
> but you are also creating another container in the process. This would
> still be required on all nodes as Puppet is our config tool. So you
> would still be downloading some of this data anyways. Understood your
> reasons for doing this are that it avoids rebuilding all containers
> when there is a change to any of these packages in the base container.
> What you are missing however is how often is it the case that Puppet is
> updated that something else in the base container isn't?

For CI jobs updating all containers, it is quite common to have changes
in openstack/tripleo puppet modules to pull in. IIUC, that automatically
picks up any updates for all of their dependencies, and for the
dependencies of those dependencies, all of it multiplied by a hundred of
total containers to get updated. That is a *pain* we are used to these
days, with CI jobs quite often timing out because of it... Of course,
the main cause is delayed promotions though.

For real deployments, I have no data on the cadence of minor updates in
puppet and the tripleo & openstack modules for it, so let's ask operators
(as we now happen to be on the merged openstack-discuss list)? For its
dependencies though, like systemd and ruby, I'm pretty sure CVEs get
fixed there quite often. So I expect that delivering "in the field"
security fixes for those might bring some unwanted hassle for the
long-term maintenance of LTS releases. As Tengu noted on IRC:
"well, between systemd, puppet and ruby, there are many security
concerns, almost every month... and also, what's the point of keeping
them in runtime containers when they are useless?"

>
> I would wager that it is more rare than you'd think. Perhaps looking at
> the history of an OpenStack distribution would be a valid way to assess
> this more critically. Without this data to backup the numbers I'm
> afraid what you are doing here falls into "pre-optimization" territory
> for me and I don't think the means used in the patches warrent the
> benefits you mention here.
>
>
>> * Edge cases, where we have containers images to be distributed, at
>>     least once to hit local registries, over high-latency and limited
>>     bandwith, highly unreliable WAN connections.
>> * numbers of packages to update in CI for all containers for all
>>     services (CI jobs do not rebuild containers so each container gets
>>     updated for those 300MB of extra size).
>
> It would seem to me there are other ways to solve the CI containers
> update problems. Rebuilding the base layer more often would solve this
> right? If we always build our service containers off of a base layer
> that is recent there should be no updates to the system/puppet packages
> there in our CI pipelines.
>
>> * security and the surface of attacks, by introducing systemd et al
>> as
>>     additional subjects for CVE fixes to maintain for all containers.
>
> We aren't actually using systemd within our containers. I think those
> packages are getting pulled in by an RPM dependency elsewhere. So
> rather than using 'rpm -ev --nodeps' to remove it we could create a
> sub-package for containers in those cases and install it instead. In
> short rather than hack this to remove them why not pursue a proper
> packaging fix?
>
> In general I am a fan of getting things out of the base container we
> don't need... so yeah lets do this. But lets do it properly.
>
>> * services uptime, by additional restarts of services related to
>>     security maintanence of irrelevant to openstack components sitting
>>     as a dead weight in containers images for ever.
>
> Like I said above how often is it that these packages actually change
> where something else in the base container doesn't? Perhaps we should
> get more data here before blindly implementing a solution we aren't
> sure really helps out in the real world.
>
>>
>> On 11/27/18 4:08 PM, Bogdan Dobrelya wrote:
>>> Changing the topic to follow the subject.
>>>
>>> [tl;dr] it's time to rearchitect container images to stop
>>> incluiding
>>> config-time only (puppet et al) bits, which are not needed runtime
>>> and
>>> pose security issues, like CVEs, to maintain daily.
>>>
>>> Background: 1) For the Distributed Compute Node edge case, there
>>> is
>>> potentially tens of thousands of a single-compute-node remote edge
>>> sites
>>> connected over WAN to a single control plane, which is having high
>>> latency, like a 100ms or so, and limited bandwith.
>>> 2) For a generic security case,
>>> 3) TripleO CI updates all
>>>
>>> Challenge:
>>>
>>>> Here is a related bug [1] and implementation [1] for that. PTAL
>>>> folks!
>>>>
>>>> [0] https://bugs.launchpad.net/tripleo/+bug/1804822
>>>> [1]
>>>> https://review.openstack.org/#/q/topic:base-container-reduction
>>>>
>>>>> Let's also think of removing puppet-tripleo from the base
>>>>> container.
>>>>> It really brings the world-in (and yum updates in CI!) each job
>>>>> and
>>>>> each container!
>>>>> So if we did so, we should then either install puppet-tripleo
>>>>> and co
>>>>> on the host and bind-mount it for the docker-puppet deployment
>>>>> task
>>>>> steps (bad idea IMO), OR use the magical --volumes-from
>>>>> <a-side-car-container> option to mount volumes from some
>>>>> "puppet-config" sidecar container inside each of the containers
>>>>> being
>>>>> launched by docker-puppet tooling.
>>>>
>>>> On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
>>>> redhat.com>
>>>> wrote:
>>>>> We add this to all images:
>>>>>
>>>>> https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
>>>>>
>>>>>
>>>>> /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2
>>>>> python
>>>>> socat sudo which openstack-tripleo-common-container-base rsync
>>>>> cronie
>>>>> crudini openstack-selinux ansible python-shade puppet-tripleo
>>>>> python2-
>>>>> kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
>>>>> Is the additional 276 MB reasonable here?
>>>>> openstack-selinux <- This package run relabling, does that kind
>>>>> of
>>>>> touching the filesystem impact the size due to docker layers?
>>>>>
>>>>> Also: python2-kubernetes is a fairly large package (18007990)
>>>>> do we use
>>>>> that in every image? I don't see any tripleo related repos
>>>>> importing
>>>>> from that when searching on Hound? The original commit
>>>>> message[1]
>>>>> adding it states it is for future convenience.
>>>>>
>>>>> On my undercloud we have 101 images, if we are downloading
>>>>> every 18 MB
>>>>> per image thats almost 1.8 GB for a package we don't use? (I
>>>>> hope it's
>>>>> not like this? With docker layers, we only download that 276 MB
>>>>> transaction once? Or?)
>>>>>
>>>>>
>>>>> [1] https://review.openstack.org/527927
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Bogdan Dobrelya,
>>>> Irc #bogdando
>>
>>
>


--
Best regards,
Bogdan Dobrelya,
Irc #bogdando


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Dan Prince
On Wed, 2018-11-28 at 15:12 +0100, Bogdan Dobrelya wrote:

> On 11/28/18 2:58 PM, Dan Prince wrote:
> > On Wed, 2018-11-28 at 12:45 +0100, Bogdan Dobrelya wrote:
> > > To follow up and explain the patches for code review:
> > >
> > > The "header" patch https://review.openstack.org/620310 ->
> > > (requires)
> > > https://review.rdoproject.org/r/#/c/17534/, and also
> > > https://review.openstack.org/620061 -> (which in turn requires)
> > > https://review.openstack.org/619744 -> (Kolla change, the 1st to
> > > go)
> > > https://review.openstack.org/619736
> >
> > This email was cross-posted to multiple lists and I think we may
> > have
> > lost some of the context in the process as the subject was changed.
> >
> > Most of the suggestions and patches are about making our base
> > container(s) smaller in size. And the means by which the patches do
> > that is to share binaries/applications across containers with
> > custom
> > mounts/volumes. I've -2'd most of them. What concerns me however is
> > that some of the TripleO cores seemed open to this idea yesterday
> > on
> > IRC. Perhaps I've misread things but what you appear to be doing
> > here
> > is quite drastic I think we need to consider any of this carefully
> > before proceeding with any of it.
> >
> >
> > > Please also read the commit messages, I tried to explain all
> > > "Whys"
> > > very
> > > carefully. Just to sum up it here as well:
> > >
> > > The current self-containing (config and runtime bits)
> > > architecture
> > > of
> > > containers badly affects:
> > >
> > > * the size of the base layer and all containers images as an
> > >     additional 300MB (adds an extra 30% of size).
> >
> > You are accomplishing this by removing Puppet from the base
> > container,
> > but you are also creating another container in the process. This
> > would
> > still be required on all nodes as Puppet is our config tool. So you
> > would still be downloading some of this data anyways. Understood
> > your
> > reasons for doing this are that it avoids rebuilding all containers
> > when there is a change to any of these packages in the base
> > container.
> > What you are missing however is how often is it the case that
> > Puppet is
> > updated that something else in the base container isn't?
>
> For CI jobs updating all containers, its quite an often to have
> changes
> in openstack/tripleo puppet modules to pull in. IIUC, that
> automatically
> picks up any updates for all of its dependencies and for the
> dependencies of dependencies, and all that multiplied by a hundred
> of
> total containers to get it updated. That is a *pain* we're used to
> have
> these day for quite often timing out CI jobs... Ofc, the main cause
> is
> delayed promotions though.

Regarding CI, I made a separate suggestion on that below: rebuilding
the base layer more often could be a good solution here. I don't think
the puppet-tripleo package is that large, however, so we could also just
live with it.

>
> For real deployments, I have no data for the cadence of minor updates
> in
> puppet and tripleo & openstack modules for it, let's ask operators
> (as
> we're happened to be in the merged openstack-discuss list)? For its
> dependencies though, like systemd and ruby, I'm pretty sure it's
> quite
> often to have CVEs fixed there. So I expect what "in the fields"
> security fixes delivering for those might bring some unwanted hassle
> for
> long-term maintenance of LTS releases. As Tengu noted on IRC:
> "well, between systemd, puppet and ruby, there are many security
> concernes, almost every month... and also, what's the point keeping
> them
> in runtime containers when they are useless?"

Reiterating previous points:

- I'd be fine removing systemd. But let's do it properly and not via
  'rpm -ev --nodeps'.
- Puppet and Ruby *are* required for configuration. We can certainly put
  them in a separate container outside of the runtime service containers,
  but doing so would actually cost you much more space/bandwidth for each
  service container. As both of these have to get downloaded to each node
  anyway in order to generate config files with our current mechanisms,
  I'm not sure this buys you anything.

We are going in circles here I think....

Dan

>
> > I would wager that it is more rare than you'd think. Perhaps
> > looking at
> > the history of an OpenStack distribution would be a valid way to
> > assess
> > this more critically. Without this data to backup the numbers I'm
> > afraid what you are doing here falls into "pre-optimization"
> > territory
> > for me and I don't think the means used in the patches warrent the
> > benefits you mention here.
> >
> >
> > > * Edge cases, where we have containers images to be distributed,
> > > at
> > >     least once to hit local registries, over high-latency and
> > > limited
> > >     bandwith, highly unreliable WAN connections.
> > > * numbers of packages to update in CI for all containers for all
> > >     services (CI jobs do not rebuild containers so each container
> > > gets
> > >     updated for those 300MB of extra size).
> >
> > It would seem to me there are other ways to solve the CI containers
> > update problems. Rebuilding the base layer more often would solve
> > this
> > right? If we always build our service containers off of a base
> > layer
> > that is recent there should be no updates to the system/puppet
> > packages
> > there in our CI pipelines.
> >
> > > * security and the surface of attacks, by introducing systemd et
> > > al
> > > as
> > >     additional subjects for CVE fixes to maintain for all
> > > containers.
> >
> > We aren't actually using systemd within our containers. I think
> > those
> > packages are getting pulled in by an RPM dependency elsewhere. So
> > rather than using 'rpm -ev --nodeps' to remove it we could create a
> > sub-package for containers in those cases and install it instead.
> > In
> > short rather than hack this to remove them why not pursue a proper
> > packaging fix?
> >
> > In general I am a fan of getting things out of the base container
> > we
> > don't need... so yeah lets do this. But lets do it properly.
> >
> > > * services uptime, by additional restarts of services related to
> > >     security maintanence of irrelevant to openstack components
> > > sitting
> > >     as a dead weight in containers images for ever.
> >
> > Like I said above how often is it that these packages actually
> > change
> > where something else in the base container doesn't? Perhaps we
> > should
> > get more data here before blindly implementing a solution we aren't
> > sure really helps out in the real world.
> >
> > > On 11/27/18 4:08 PM, Bogdan Dobrelya wrote:
> > > > Changing the topic to follow the subject.
> > > >
> > > > [tl;dr] it's time to rearchitect container images to stop
> > > > incluiding
> > > > config-time only (puppet et al) bits, which are not needed
> > > > runtime
> > > > and
> > > > pose security issues, like CVEs, to maintain daily.
> > > >
> > > > Background: 1) For the Distributed Compute Node edge case,
> > > > there
> > > > is
> > > > potentially tens of thousands of a single-compute-node remote
> > > > edge
> > > > sites
> > > > connected over WAN to a single control plane, which is having
> > > > high
> > > > latency, like a 100ms or so, and limited bandwith.
> > > > 2) For a generic security case,
> > > > 3) TripleO CI updates all
> > > >
> > > > Challenge:
> > > >
> > > > > Here is a related bug [1] and implementation [1] for that.
> > > > > PTAL
> > > > > folks!
> > > > >
> > > > > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > > > > [1]
> > > > > https://review.openstack.org/#/q/topic:base-container-reduction
> > > > >
> > > > > > Let's also think of removing puppet-tripleo from the base
> > > > > > container.
> > > > > > It really brings the world-in (and yum updates in CI!) each
> > > > > > job
> > > > > > and
> > > > > > each container!
> > > > > > So if we did so, we should then either install puppet-
> > > > > > tripleo
> > > > > > and co
> > > > > > on the host and bind-mount it for the docker-puppet
> > > > > > deployment
> > > > > > task
> > > > > > steps (bad idea IMO), OR use the magical --volumes-from
> > > > > > <a-side-car-container> option to mount volumes from some
> > > > > > "puppet-config" sidecar container inside each of the
> > > > > > containers
> > > > > > being
> > > > > > launched by docker-puppet tooling.
> > > > >
> > > > > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
> > > > > redhat.com>
> > > > > wrote:
> > > > > > We add this to all images:
> > > > > >
> > > > > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
> > > > > >
> > > > > >
> > > > > > /bin/sh -c yum -y install iproute iscsi-initiator-utils
> > > > > > lvm2
> > > > > > python
> > > > > > socat sudo which openstack-tripleo-common-container-base
> > > > > > rsync
> > > > > > cronie
> > > > > > crudini openstack-selinux ansible python-shade puppet-
> > > > > > tripleo
> > > > > > python2-
> > > > > > kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
> > > > > > Is the additional 276 MB reasonable here?
> > > > > > openstack-selinux <- This package run relabling, does that
> > > > > > kind
> > > > > > of
> > > > > > touching the filesystem impact the size due to docker
> > > > > > layers?
> > > > > >
> > > > > > Also: python2-kubernetes is a fairly large package
> > > > > > (18007990)
> > > > > > do we use
> > > > > > that in every image? I don't see any tripleo related repos
> > > > > > importing
> > > > > > from that when searching on Hound? The original commit
> > > > > > message[1]
> > > > > > adding it states it is for future convenience.
> > > > > >
> > > > > > On my undercloud we have 101 images, if we are downloading
> > > > > > every 18 MB
> > > > > > per image thats almost 1.8 GB for a package we don't use?
> > > > > > (I
> > > > > > hope it's
> > > > > > not like this? With docker layers, we only download that
> > > > > > 276 MB
> > > > > > transaction once? Or?)
> > > > > >
> > > > > >
> > > > > > [1] https://review.openstack.org/527927
> > > > >
> > > > > --
> > > > > Best regards,
> > > > > Bogdan Dobrelya,
> > > > > Irc #bogdando
>
>



Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Sergii Golovatiuk-2
In reply to this post by Dan Prince
Hi,
On Tue, Nov 27, 2018 at 7:13 PM Dan Prince <[hidden email]> wrote:
On Tue, 2018-11-27 at 16:24 +0100, Bogdan Dobrelya wrote:
> Changing the topic to follow the subject.
>
> [tl;dr] it's time to rearchitect container images to stop incluiding
> config-time only (puppet et al) bits, which are not needed runtime
> and
> pose security issues, like CVEs, to maintain daily.

I think your assertion that we need to rearchitect the config images to
contain the puppet bits is incorrect here.

After reviewing the patches you linked to below it appears that you are
proposing we use --volumes-from to bind mount application binaries from
one container into another. I don't believe this is a good pattern for
containers. On baremetal if we followed the same pattern it would be
like using an /nfs share to obtain access to binaries across the
network to optimize local storage. Now... some people do this (like
maybe high performance computing would launch an MPI job like this) but
I don't think we should consider it best practice for our containers in
TripleO.

Each container should contain its own binaries and libraries as much
as possible. And while I do think we should be using --volumes-from
more often in TripleO it would be for sharing *data* between
containers, not binaries.


>
> Background:
> 1) For the Distributed Compute Node edge case, there is potentially
> tens
> of thousands of a single-compute-node remote edge sites connected
> over
> WAN to a single control plane, which is having high latency, like a
> 100ms or so, and limited bandwith. Reducing the base layer size
> becomes
> a decent goal there. See the security background below.

The reason we put Puppet into the base layer was in fact to prevent it
from being downloaded multiple times. If we were to re-architect the
image layers such that the child layers all contained their own copies
of Puppet for example there would actually be a net increase in
bandwidth and disk usage. So I would argue we are already addressing
the goal of optimizing network and disk space.

Moving it out of the base layer so that you can patch it more often
without disrupting other services is a valid concern. But addressing
this concern while also preserving our definition of a container (see
above, a container should contain all of its binaries) is going to cost
you something, namely disk and network space because Puppet would need
to be duplicated in each child container.

As Puppet is used to configure a majority of the services in TripleO
having it in the base container makes most sense. And yes, if there are
security patches for Puppet/Ruby those might result in a bunch of
containers getting pushed. But let Docker layers take care of this I
think... Don't try to solve things by constructing your own custom
mounts and volumes to work around the issue.


> 2) For a generic security (Day 2, maintenance) case, when
> puppet/ruby/systemd/name-it gets a CVE fixed, the base layer has to
> be
> updated and all layers on top - to be rebuild, and all of those
> layers,
> to be re-fetched for cloud hosts and all containers to be
> restarted...
> And all of that because of some fixes that have nothing to OpenStack.
> By
> the remote edge sites as well, remember of "tens of thousands", high
> latency and limited bandwith?..
> 3) TripleO CI updates (including puppet*) packages in containers, not
> in
> a common base layer of those. So each a CI job has to update puppet*
> and
> its dependencies - ruby/systemd as well. Reducing numbers of packages
> to
> update for each container makes sense for CI as well.
>
> Implementation related:
>
> WIP patches [0],[1] for early review, uses a config "pod" approach,
> does
> not require to maintain a two sets of config vs runtime images.
> Future
> work: a) cronie requires systemd, we'd want to fix that also off the
> base layer. b) rework to podman pods for docker-puppet.py instead of
> --volumes-from a side car container (can't be backported for Queens
> then, which is still nice to have a support for the Edge DCN case,
> at
> least downstream only perhaps).
>
> Some questions raised on IRC:
>
> Q: is having a service be able to configure itself really need to
> involve a separate pod?
> A: Highly likely yes, removing not-runtime things is a good idea and
> pods is an established PaaS paradigm already. That will require some
> changes in the architecture though (see the topic with WIP patches).

I'm a little confused on this one. Are you suggesting that we have 2
containers for each service? One with Puppet and one without?

That is certainly possible, but to pull it off would likely require you
to have things built like this:

 |base container| --> |service container| --> |service container w/
Puppet installed|

The end result would be Puppet being duplicated in a layer for each
services "config image". Very inefficient.

Again, I'm answering this assuming we aren't violating our container
constraints and best practices, where each container has the binaries
it needs to do its own configuration.

>
>  Q: that's (fetching a config container) actually more data that
> about to
>   download otherwise
> A: It's not, if thinking of Day 2, when have to re-fetch the base
> layer
> and top layers, when some unrelated to openstack CVEs got fixed
> there
> for ruby/puppet/systemd. Avoid the need to restart service
> containers
> because of those minor updates puched is also a nice thing.

Puppet is used only for configuration in TripleO. While security issues
do need to be addressed at any layer I'm not sure there would be an
urgency to re-deploy your cluster simply for a Puppet security fix
alone. Smart change management would help eliminate blindly deploying
new containers in the case where they provide very little security
benefit.

I think the focus on Puppet, and Ruby here is perhaps a bad example as
they are config time only. Rather than just think about them we should
also consider the rest of the things in our base container images as
well. This is always going to be a "balancing act". There are pros and
cons of having things in the base layer vs. the child/leaf layers.

It's interesting that puppet is required at config time only, yet it is kept in every image for the image's whole life. There is a sidecar pattern in Kubernetes where a side container configures what's needed for the main container and then exits.
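
A rough sketch of that pattern (hypothetical image names, only to illustrate the idea):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nova-api-example
spec:
  volumes:
  - name: config
    emptyDir: {}
  initContainers:
  - name: generate-config            # runs the config tool, writes the files, then exits
    image: example/nova-config-puppet:latest
    command: ["puppet", "apply", "/etc/puppet/manifests/nova.pp"]
    volumeMounts:
    - name: config
      mountPath: /etc/nova
  containers:
  - name: nova-api                   # runtime image, no puppet/ruby inside
    image: example/nova-api:latest
    volumeMounts:
    - name: config
      mountPath: /etc/nova
      readOnly: true
EOF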
 


>
> Q: the best solution here would be using packages on the host,
> generating the config files on the host. And then having an all-in-
> one
> container for all the services which lets them run in an isolated
> mannner.
> A: I think for Edge cases, that's a no go as we might want to
> consider
> tiny low footprint OS distros like former known Container Linux or
> Atomic. Also, an all-in-one container looks like an anti-pattern
> from
> the world of VMs.

This was suggested on IRC because it likely gives you the smallest
network/storage footprint for each edge node. The container would get
used for everything: running all the services, and configuring all the
services. Sort of a golden image approach. It may be an anti-pattern
but initially I thought you were looking to optimize these things.

It is an anti-pattern indeed. The smaller the container, the better: fewer potential security issues, less data to transfer over the network, less storage. In programming there are plenty of patterns for reusing code (OOP is one example). The same principle should be applied to containers, rather than blindly copying data into every container.
 

I think a better solution might be to have container registries, or
container mirrors (reverse proxies or whatever) that allow you to cache
things as you deploy to the edge and thus optimize the network traffic.

This solution is a good addition, but containers should still be tiny, not fat.
 


>
> [0] https://review.openstack.org/#/q/topic:base-container-reduction
> [1]
> https://review.rdoproject.org/r/#/q/topic:base-container-reduction
>
> > Here is a related bug [1] and implementation [1] for that. PTAL
> > folks!
> >
> > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > [1] https://review.openstack.org/#/q/topic:base-container-reduction
> >
> > > Let's also think of removing puppet-tripleo from the base
> > > container.
> > > It really brings the world-in (and yum updates in CI!) each job
> > > and each
> > > container!
> > > So if we did so, we should then either install puppet-tripleo and
> > > co on
> > > the host and bind-mount it for the docker-puppet deployment task
> > > steps
> > > (bad idea IMO), OR use the magical --volumes-from <a-side-car-
> > > container>
> > > option to mount volumes from some "puppet-config" sidecar
> > > container
> > > inside each of the containers being launched by docker-puppet
> > > tooling.
> >
> > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
> > redhat.com>
> > wrote:
> > > We add this to all images:
> > >
> > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
> > >
> > > /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2
> > > python
> > > socat sudo which openstack-tripleo-common-container-base rsync
> > > cronie
> > > crudini openstack-selinux ansible python-shade puppet-tripleo
> > > python2-
> > > kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
> > >
> > > Is the additional 276 MB reasonable here?
> > > openstack-selinux <- This package run relabling, does that kind
> > > of
> > > touching the filesystem impact the size due to docker layers?
> > >
> > > Also: python2-kubernetes is a fairly large package (18007990) do
> > > we use
> > > that in every image? I don't see any tripleo related repos
> > > importing
> > > from that when searching on Hound? The original commit message[1]
> > > adding it states it is for future convenience.
> > >
> > > On my undercloud we have 101 images, if we are downloading every
> > > 18 MB
> > > per image thats almost 1.8 GB for a package we don't use? (I hope
> > > it's
> > > not like this? With docker layers, we only download that 276 MB
> > > transaction once? Or?)
> > >
> > >
> > > [1] https://review.openstack.org/527927
> >
> >
> > --
> > Best regards,
> > Bogdan Dobrelya,
> > Irc #bogdando
>
>




--
Best Regards,
Sergii Golovatiuk


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Fox, Kevin M
In reply to this post by Dan Prince
Ok, so you have the workflow in place, but it sounds like the containers are not laid out to make the best use of that workflow. Puppet is in the base layer, which means that whenever puppet gets updated, all the other containers must be rebuilt and updated too, along with other such update-coupling issues.

I'm with you, that binaries should not be copied from one container to another though.

Thanks,
Kevin
________________________________________
From: Dan Prince [[hidden email]]
Sent: Wednesday, November 28, 2018 5:31 AM
To: Former OpenStack Development Mailing List, use openstack-discuss now; [hidden email]
Subject: Re: [openstack-dev] [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

On Wed, 2018-11-28 at 00:31 +0000, Fox, Kevin M wrote:

> The pod concept allows you to have one tool per container do one
> thing and do it well.
>
> You can have a container for generating config, and another container
> for consuming it.
>
> In a Kubernetes pod, if you still wanted to do puppet,
> you could have a pod that:
> 1. had an init container that ran puppet and dumped the resulting
> config to an emptyDir volume.
> 2. had your main container pull its config from the emptyDir volume.

We have basically implemented the same workflow in TripleO today. First
we execute Puppet in an "init container" (really just an ephemeral
container that generates the config files and then goes away). Then we
bind mount those configs into the service container.

One improvement we could make (which we aren't doing yet) is to use a
data container/volume to store the config files instead of using the
host. Sharing *data* within a 'pod' (set of containers, etc.) is
certainly a valid use of container volumes.
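
A minimal sketch of that improvement with a named volume (image names and
paths are illustrative, not the current docker-puppet implementation):

# one-shot config container: generates the config files into a shared volume, then exits
docker volume create nova_config
docker run --rm -v nova_config:/var/lib/config-data example/nova-config:latest \
    puppet apply /etc/puppet/manifests/nova.pp
# the runtime service container consumes the generated configs read-only
docker run -d --name nova_api -v nova_config:/var/lib/config-data:ro example/nova-api:latest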

None of this is what we are really talking about in this thread though.
Most of the suggestions and patches are about making our base
container(s) smaller in size. And the means by which the patches do
that is to share binaries/applications across containers with custom
mounts/volumes. I don't think it is a good idea at all as it violates
encapsulation of the containers in general, regardless of whether we
use pods or not.

Dan


>
> Then each container would have no dependency on each other.
>
> In full blown Kubernetes cluster you might have puppet generate a
> configmap though and ship it to your main container directly. Thats
> another matter though. I think the example pod example above is still
> usable without k8s?
>
> Thanks,
> Kevin
> ________________________________________
> From: Dan Prince [[hidden email]]
> Sent: Tuesday, November 27, 2018 10:10 AM
> To: OpenStack Development Mailing List (not for usage questions);
> [hidden email]
> Subject: Re: [openstack-dev] [TripleO][Edge] Reduce base layer of
> containers for security and size of images (maintenance) sakes
>
> On Tue, 2018-11-27 at 16:24 +0100, Bogdan Dobrelya wrote:
> > Changing the topic to follow the subject.
> >
> > [tl;dr] it's time to rearchitect container images to stop
> > incluiding
> > config-time only (puppet et al) bits, which are not needed runtime
> > and
> > pose security issues, like CVEs, to maintain daily.
>
> I think your assertion that we need to rearchitect the config images
> to
> container the puppet bits is incorrect here.
>
> After reviewing the patches you linked to below it appears that you
> are
> proposing we use --volumes-from to bind mount application binaries
> from
> one container into another. I don't believe this is a good pattern
> for
> containers. On baremetal if we followed the same pattern it would be
> like using an /nfs share to obtain access to binaries across the
> network to optimize local storage. Now... some people do this (like
> maybe high performance computing would launch an MPI job like this)
> but
> I don't think we should consider it best practice for our containers
> in
> TripleO.
>
> Each container should container its own binaries and libraries as
> much
> as possible. And while I do think we should be using --volumes-from
> more often in TripleO it would be for sharing *data* between
> containers, not binaries.
>
>
> > Background:
> > 1) For the Distributed Compute Node edge case, there is potentially
> > tens
> > of thousands of a single-compute-node remote edge sites connected
> > over
> > WAN to a single control plane, which is having high latency, like a
> > 100ms or so, and limited bandwith. Reducing the base layer size
> > becomes
> > a decent goal there. See the security background below.
>
> The reason we put Puppet into the base layer was in fact to prevent
> it
> from being downloaded multiple times. If we were to re-architect the
> image layers such that the child layers all contained their own
> copies
> of Puppet for example there would actually be a net increase in
> bandwidth and disk usage. So I would argue we are already addressing
> the goal of optimizing network and disk space.
>
> Moving it out of the base layer so that you can patch it more often
> without disrupting other services is a valid concern. But addressing
> this concern while also preserving our definiation of a container
> (see
> above, a container should contain all of its binaries) is going to
> cost
> you something, namely disk and network space because Puppet would
> need
> to be duplicated in each child container.
>
> As Puppet is used to configure a majority of the services in TripleO
> having it in the base container makes most sense. And yes, if there
> are
> security patches for Puppet/Ruby those might result in a bunch of
> containers getting pushed. But let Docker layers take care of this I
> think... Don't try to solve things by constructing your own custom
> mounts and volumes to work around the issue.
>
>
> > 2) For a generic security (Day 2, maintenance) case, when
> > puppet/ruby/systemd/name-it gets a CVE fixed, the base layer has to
> > be
> > updated and all layers on top - to be rebuild, and all of those
> > layers,
> > to be re-fetched for cloud hosts and all containers to be
> > restarted...
> > And all of that because of some fixes that have nothing to
> > OpenStack.
> > By
> > the remote edge sites as well, remember of "tens of thousands",
> > high
> > latency and limited bandwith?..
> > 3) TripleO CI updates (including puppet*) packages in containers,
> > not
> > in
> > a common base layer of those. So each a CI job has to update
> > puppet*
> > and
> > its dependencies - ruby/systemd as well. Reducing numbers of
> > packages
> > to
> > update for each container makes sense for CI as well.
> >
> > Implementation related:
> >
> > WIP patches [0],[1] for early review, uses a config "pod" approach,
> > does
> > not require to maintain a two sets of config vs runtime images.
> > Future
> > work: a) cronie requires systemd, we'd want to fix that also off
> > the
> > base layer. b) rework to podman pods for docker-puppet.py instead
> > of
> > --volumes-from a side car container (can't be backported for Queens
> > then, which is still nice to have a support for the Edge DCN case,
> > at
> > least downstream only perhaps).
> >
> > Some questions raised on IRC:
> >
> > Q: is having a service be able to configure itself really need to
> > involve a separate pod?
> > A: Highly likely yes, removing not-runtime things is a good idea
> > and
> > pods is an established PaaS paradigm already. That will require
> > some
> > changes in the architecture though (see the topic with WIP
> > patches).
>
> I'm a little confused on this one. Are you suggesting that we have 2
> containers for each service? One with Puppet and one without?
>
> That is certainly possible, but to pull it off would likely require
> you
> to have things built like this:
>
>  |base container| --> |service container| --> |service container w/
> Puppet installed|
>
> The end result would be Puppet being duplicated in a layer for each
> services "config image". Very inefficient.
>
> Again, I'm ansering this assumping we aren't violating our container
> constraints and best practices where each container has the binaries
> its needs to do its own configuration.
>
> >  Q: that's (fetching a config container) actually more data that
> > about to
> >   download otherwise
> > A: It's not, if thinking of Day 2, when have to re-fetch the base
> > layer
> > and top layers, when some unrelated to openstack CVEs got fixed
> > there
> > for ruby/puppet/systemd. Avoid the need to restart service
> > containers
> > because of those minor updates puched is also a nice thing.
>
> Puppet is used only for configuration in TripleO. While security
> issues
> do need to be addressed at any layer I'm not sure there would be an
> urgency to re-deploy your cluster simply for a Puppet security fix
> alone. Smart change management would help eliminate blindly deploying
> new containers in the case where they provide very little security
> benefit.
>
> I think the focus on Puppet, and Ruby here is perhaps a bad example
> as
> they are config time only. Rather than just think about them we
> should
> also consider the rest of the things in our base container images as
> well. This is always going to be a "balancing act". There are pros
> and
> cons of having things in the base layer vs. the child/leaf layers.
>
>
> > Q: the best solution here would be using packages on the host,
> > generating the config files on the host. And then having an all-in-
> > one
> > container for all the services which lets them run in an isolated
> > mannner.
> > A: I think for Edge cases, that's a no go as we might want to
> > consider
> > tiny low footprint OS distros like former known Container Linux or
> > Atomic. Also, an all-in-one container looks like an anti-pattern
> > from
> > the world of VMs.
>
> This was suggested on IRC because it likely gives you the smallest
> network/storage footprint for each edge node. The container would get
> used for everything: running all the services, and configuring all
> the
> services. Sort of a golden image approach. It may be an anti-pattern
> but initially I thought you were looking to optimize these things.
>
> I think a better solution might be to have container registries, or
> container mirrors (reverse proxies or whatever) that allow you to
> cache
> things as you deploy to the edge and thus optimize the network
> traffic.
>
>
> > [0] https://review.openstack.org/#/q/topic:base-container-reduction
> > [1]
> > https://review.rdoproject.org/r/#/q/topic:base-container-reduction
> >
> > > Here is a related bug [1] and implementation [1] for that. PTAL
> > > folks!
> > >
> > > [0] https://bugs.launchpad.net/tripleo/+bug/1804822
> > > [1]
> > > https://review.openstack.org/#/q/topic:base-container-reduction
> > >
> > > > Let's also think of removing puppet-tripleo from the base
> > > > container.
> > > > It really brings the world-in (and yum updates in CI!) each job
> > > > and each
> > > > container!
> > > > So if we did so, we should then either install puppet-tripleo
> > > > and
> > > > co on
> > > > the host and bind-mount it for the docker-puppet deployment
> > > > task
> > > > steps
> > > > (bad idea IMO), OR use the magical --volumes-from <a-side-car-
> > > > container>
> > > > option to mount volumes from some "puppet-config" sidecar
> > > > container
> > > > inside each of the containers being launched by docker-puppet
> > > > tooling.
> > >
> > > On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at
> > > redhat.com>
> > > wrote:
> > > > We add this to all images:
> > > >
> > > > https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35
> > > >
> > > > /bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2
> > > > python
> > > > socat sudo which openstack-tripleo-common-container-base rsync
> > > > cronie
> > > > crudini openstack-selinux ansible python-shade puppet-tripleo
> > > > python2-
> > > > kubernetes && yum clean all && rm -rf /var/cache/yum 276 MB
> > > >
> > > > Is the additional 276 MB reasonable here?
> > > > openstack-selinux <- This package run relabling, does that kind
> > > > of
> > > > touching the filesystem impact the size due to docker layers?
> > > >
> > > > Also: python2-kubernetes is a fairly large package (18007990)
> > > > do
> > > > we use
> > > > that in every image? I don't see any tripleo related repos
> > > > importing
> > > > from that when searching on Hound? The original commit
> > > > message[1]
> > > > adding it states it is for future convenience.
> > > >
> > > > On my undercloud we have 101 images, if we are downloading
> > > > every
> > > > 18 MB
> > > > per image thats almost 1.8 GB for a package we don't use? (I
> > > > hope
> > > > it's
> > > > not like this? With docker layers, we only download that 276 MB
> > > > transaction once? Or?)
> > > >
> > > >
> > > > [1] https://review.openstack.org/527927
> > >
> > > --
> > > Best regards,
> > > Bogdan Dobrelya,
> > > Irc #bogdando
>




Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Jiří Stránský
In reply to this post by Dan Prince
<snip>

>
> Reiterating again on previous points:
>
> -I'd be fine removing systemd. But lets do it properly and not via 'rpm
> -ev --nodeps'.
> -Puppet and Ruby *are* required for configuration. We can certainly put
> them in a separate container outside of the runtime service containers
> but doing so would actually cost you much more space/bandwidth for each
> service container. As both of these have to get downloaded to each node
> anyway in order to generate config files with our current mechanisms
> I'm not sure this buys you anything.

+1. I was actually under the impression that we concluded yesterday on
IRC that this is the only thing that makes sense to seriously consider.
But even then it's not a win-win -- we'd gain some security by leaner
production images, but pay for it with space+bandwidth by duplicating
image content (IOW we can help achieve one of the goals we had in mind
by worsening the situation w/r/t the other goal we had in mind.)

Personally i'm not sold yet but it's something that i'd consider if we
got measurements of how much more space/bandwidth usage this would
consume, and if we got some further details/examples about how serious
are the security concerns if we leave config mgmt tools in runtime images.

IIRC the other options (that were brought forward so far) were already
dismissed in yesterday's IRC discussion and on the reviews. Bin/lib bind
mounting being too hacky and fragile, and nsenter not really solving the
problem (because it allows us to switch to having different bins/libs
available, but it does not allow merging the availability of bins/libs
from two containers into a single context).

>
> We are going in circles here I think....

+1. I think too much of the discussion focuses on "why it's bad to have
config tools in runtime images", but IMO we all sorta agree that it
would be better not to have them there, if it came at no cost.

I think to move forward, it would be interesting to know: if we do this
(I'll borrow Dan's drawing):

|base container| --> |service container| --> |service container w/
Puppet installed|

How much more space and bandwidth would this consume per node (e.g.
separately per controller, per compute)? This could help with decision
making.
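
Roughly, something like the following would give those numbers (just an
illustrative way to collect them; any image names would do):

# total size per image
docker images --format '{{.Repository}}:{{.Tag}}\t{{.Size}}'
# per-layer breakdown for one service image, to see what an extra puppet layer would add
docker history --format '{{.Size}}\t{{.CreatedBy}}' tripleo-nova-compute:current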

>
> Dan
>

Thanks

Jirka


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Bogdan Dobrelya-2
On 11/28/18 6:02 PM, Jiří Stránský wrote:

> <snip>
>
>>
>> Reiterating again on previous points:
>>
>> -I'd be fine removing systemd. But lets do it properly and not via 'rpm
>> -ev --nodeps'.
>> -Puppet and Ruby *are* required for configuration. We can certainly put
>> them in a separate container outside of the runtime service containers
>> but doing so would actually cost you much more space/bandwidth for each
>> service container. As both of these have to get downloaded to each node
>> anyway in order to generate config files with our current mechanisms
>> I'm not sure this buys you anything.
>
> +1. I was actually under the impression that we concluded yesterday on
> IRC that this is the only thing that makes sense to seriously consider.
> But even then it's not a win-win -- we'd gain some security by leaner
> production images, but pay for it with space+bandwidth by duplicating
> image content (IOW we can help achieve one of the goals we had in mind
> by worsening the situation w/r/t the other goal we had in mind.)
>
> Personally i'm not sold yet but it's something that i'd consider if we
> got measurements of how much more space/bandwidth usage this would
> consume, and if we got some further details/examples about how serious
> are the security concerns if we leave config mgmt tools in runtime images.
>
> IIRC the other options (that were brought forward so far) were already
> dismissed in yesterday's IRC discussion and on the reviews. Bin/lib bind
> mounting being too hacky and fragile, and nsenter not really solving the
> problem (because it allows us to switch to having different bins/libs
> available, but it does not allow merging the availability of bins/libs
> from two containers into a single context).
>
>>
>> We are going in circles here I think....
>
> +1. I think too much of the discussion focuses on "why it's bad to have
> config tools in runtime images", but IMO we all sorta agree that it
> would be better not to have them there, if it came at no cost.
>
> I think to move forward, it would be interesting to know: if we do this
> (i'll borrow Dan's drawing):
>
> |base container| --> |service container| --> |service container w/
> Puppet installed|
>
> How much more space and bandwidth would this consume per node (e.g.
> separately per controller, per compute). This could help with decision
> making.

As I've already evaluated in the related bug, that is:

puppet-* modules and manifests ~ 16MB
puppet with dependencies ~61MB
dependencies of the seemingly largest dependency, systemd ~190MB

that would be an extra layer size for each of the container images to be
downloaded/fetched into registries.

Given that we should decouple systemd from all/some of those dependencies
(an example topic for RDO [0]), that could save ~190MB. But it seems we
cannot break the love between puppet and systemd, as the former heavily
relies on the latter, and changing the packaging like that would highly
likely affect baremetal deployments where puppet and systemd co-operate.

Long story short, we cannot shoot both rabbits with a single shot, not
with puppet :) Maybe we could with ansible replacing puppet fully...
So splitting config and runtime images is the only remaining choice to
address the raised security concerns. And let's forget about edge cases
for now; tossing around a few extra bytes over 40,000 WAN-distributed
computes ain't gonna be our biggest problem for sure.

[0] https://review.rdoproject.org/r/#/q/topic:base-container-reduction

>
>>
>> Dan
>>
>
> Thanks
>
> Jirka
>


--
Best regards,
Bogdan Dobrelya,
Irc #bogdando


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

James Slagle
On Wed, Nov 28, 2018 at 12:31 PM Bogdan Dobrelya <[hidden email]> wrote:
> Long story short, we cannot shoot both rabbits with a single shot, not
> with puppet :) May be we could with ansible replacing puppet fully...
> So splitting config and runtime images is the only choice yet to address
> the raised security concerns. And let's forget about edge cases for now.
> Tossing around a pair of extra bytes over 40,000 WAN-distributed
> computes ain't gonna be our the biggest problem for sure.

I think it's this last point that is the crux of this discussion. We
can agree to disagree about the merits of this proposal and whether
it's a pre-optimization or micro-optimization, which I admit are
somewhat subjective terms. Ultimately, it seems to be the "why do we
need to do this" that explains why the conversation keeps going in
circles a bit.

I'm all for reducing container image size, but the reality is that
this proposal doesn't necessarily help us with the Edge use cases we
are talking about trying to solve.

Why would we even run the exact same puppet binary + manifest
individually 40,000 times so that we can produce the exact same set of
configuration files that differ only by things such as IP address,
hostnames, and passwords? Maybe we should instead be thinking about
how we can do that *1* time centrally, and produce a configuration
that can be reused across 40,000 nodes with little effort. The
opportunity for a significant impact in terms of how we can scale
TripleO is much larger if we consider approaching these problems with
a wider net of what we could do. There's opportunity for a lot of
better reuse in TripleO, configuration is just one area. The plan and
Heat stack (within the ResourceGroup) are some other areas.

At the same time, if some folks want to work on smaller optimizations
(such as container image size), with an approach that can be agreed
upon, then they should do so. We just ought to be careful about how we
justify those changes so that we can carefully weigh the effort vs the
payoff. In this specific case, I don't personally see this proposal
helping us with Edge use cases in a meaningful way given the scope of
the changes. That's not to say there aren't other use cases that could
justify it though (such as the security points brought up earlier).

--
-- James Slagle
--


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Dan Prince
On Wed, 2018-11-28 at 13:28 -0500, James Slagle wrote:

> <snip>

We run Puppet for configuration because that is what we did on
baremetal, and we didn't break backwards compatibility for our
configuration options across upgrades. Our Puppet model relies on
being executed on each local host in order to splice in the correct IP
address and hostname. It executes in a distributed fashion, and works
fairly well considering the history of the project. It is robust,
guarantees no duplicate configs are being set, and is backwards
compatible with all the options TripleO supported on baremetal. Puppet
is arguably better for configuration than Ansible (which is what I
hear people most often suggest we replace it with). It suits our needs
fine, but it is perhaps a bit of overkill considering we are only
generating config files.

I think the answer here is moving to something like Etcd. Perhaps
skipping over Ansible entirely as a config management tool (it is
arguably less capable than Puppet in this category anyway). Or we could
use Ansible for "legacy" services only, switch to Etcd for a majority
of the OpenStack services, and drop Puppet entirely (my favorite
option). Consolidating our technology stack would be wise.
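
As a rough illustration only (assuming the third-party python-etcd3
client and made-up key names; this is a sketch, not an existing TripleO
mechanism), a service reading its options straight from etcd instead of
a Puppet-rendered file could look like:

# Sketch: read config options from etcd instead of a rendered file.
# Assumes the third-party python-etcd3 client; the host and key names
# are invented for illustration.
import etcd3

client = etcd3.client(host="192.0.2.10", port=2379)

def get_option(key, default=None):
    # python-etcd3's get() returns (value, metadata); value is bytes,
    # or None when the key does not exist.
    value, _meta = client.get(key)
    return value.decode() if value is not None else default

debug = get_option("/config/nova/DEFAULT/debug", "false")
my_ip = get_option("/config/nova/DEFAULT/my_ip")
print(debug, my_ip)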

We've already put some work and analysis into the Etcd effort. Just
need to push on it some more. Looking at the previous Kubernetes
prototypes for TripleO would be the place to start.

Config management migration is going to be tedious. It's technical
debt that needs to be handled at some point anyway. I think it is a
general TripleO improvement that could benefit all clouds, not just
Edge.

Dan




Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Bogdan Dobrelya-2
On 11/28/18 8:55 PM, Doug Hellmann wrote:
> I thought the preferred solution for more complex settings was config maps. Did that approach not work out?
>
> Regardless, now that the driver work is done if someone wants to take another stab at etcd integration it’ll be more straightforward today.
>
> Doug
>

While sharing configs is a feasible option to consider for large scale
configuration management, Etcd only provides strong consistency, which
puts it on the "Unavailable" side of the consistency map [0]. For edge
scenarios, to configure 40,000 remote computes over WAN connections, we
would rather want weaker consistency models, like "Sticky Available"
[0]. That would allow services to fetch their configuration either from
a central "uplink" or locally, when the former is not reachable from
the remote edge sites. Etcd cannot provide 40,000 local endpoints to
fit that case, I'm afraid, even if those were read-only replicas. That
is also something I'm highlighting in the paper [1] drafted for
ICFC-2019.

But if we had such a sticky-available key-value storage solution, we
would indeed have solved the problem James describes: executing a
configuration management system individually for thousands of nodes.

[0] https://jepsen.io/consistency
[1]
https://github.com/bogdando/papers-ieee/blob/master/ICFC-2019/LaTeX/position_paper_1570506394.pdf
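
To illustrate the "sticky available" behaviour meant here, a minimal
sketch (the uplink URL and cache path are made up): prefer the central
control plane, and fall back to the last locally cached copy when the
WAN link is unreachable.

# Sketch of "sticky available" config fetching on an edge node: prefer
# the central uplink, fall back to the last known-good local copy when
# the WAN link is down. URL and cache path are made up for illustration.
import json
import urllib.request
from pathlib import Path

CENTRAL_URL = "https://control-plane.example.com/config/edge-0001.json"
LOCAL_CACHE = Path("/var/lib/edge-config/edge-0001.json")

def fetch_config():
    try:
        with urllib.request.urlopen(CENTRAL_URL, timeout=5) as resp:
            data = resp.read()
        LOCAL_CACHE.parent.mkdir(parents=True, exist_ok=True)
        LOCAL_CACHE.write_bytes(data)  # refresh the "sticky" local copy
        return json.loads(data)
    except OSError:
        # WAN down or too slow: keep serving the last known-good config.
        return json.loads(LOCAL_CACHE.read_bytes())

config = fetch_config()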

On 11/28/18 11:22 PM, Dan Prince wrote:

> <snip>
>
> I think the answer here is moving to something like Etcd. <snip>

Not Etcd, I think; see my comment above. But you're absolutely right, Dan.



--
Best regards,
Bogdan Dobrelya,
Irc #bogdando


Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Jiří Stránský
In reply to this post by Bogdan Dobrelya-2
On 28. 11. 18 18:29, Bogdan Dobrelya wrote:

> On 11/28/18 6:02 PM, Jiří Stránský wrote:
>> <snip>
>>
>>>
>>> Reiterating again on previous points:
>>>
>>> -I'd be fine removing systemd. But lets do it properly and not via 'rpm
>>> -ev --nodeps'.
>>> -Puppet and Ruby *are* required for configuration. We can certainly put
>>> them in a separate container outside of the runtime service containers
>>> but doing so would actually cost you much more space/bandwidth for each
>>> service container. As both of these have to get downloaded to each node
>>> anyway in order to generate config files with our current mechanisms
>>> I'm not sure this buys you anything.
>>
>> +1. I was actually under the impression that we concluded yesterday on
>> IRC that this is the only thing that makes sense to seriously consider.
>> But even then it's not a win-win -- we'd gain some security by leaner
>> production images, but pay for it with space+bandwidth by duplicating
>> image content (IOW we can help achieve one of the goals we had in mind
>> by worsening the situation w/r/t the other goal we had in mind.)
>>
>> Personally i'm not sold yet but it's something that i'd consider if we
>> got measurements of how much more space/bandwidth usage this would
>> consume, and if we got some further details/examples about how serious
>> are the security concerns if we leave config mgmt tools in runtime images.
>>
>> IIRC the other options (that were brought forward so far) were already
>> dismissed in yesterday's IRC discussion and on the reviews. Bin/lib bind
>> mounting being too hacky and fragile, and nsenter not really solving the
>> problem (because it allows us to switch to having different bins/libs
>> available, but it does not allow merging the availability of bins/libs
>> from two containers into a single context).
>>
>>>
>>> We are going in circles here I think....
>>
>> +1. I think too much of the discussion focuses on "why it's bad to have
>> config tools in runtime images", but IMO we all sorta agree that it
>> would be better not to have them there, if it came at no cost.
>>
>> I think to move forward, it would be interesting to know: if we do this
>> (i'll borrow Dan's drawing):
>>
>> |base container| --> |service container| --> |service container w/
>> Puppet installed|
>>
>> How much more space and bandwidth would this consume per node (e.g.
>> separately per controller, per compute). This could help with decision
>> making.
>
> As I've already evaluated in the related bug, that is:
>
> puppet-* modules and manifests ~ 16MB
> puppet with dependencies ~61MB
> dependencies of the seemingly largest dependency, systemd ~190MB
>
> that would be an extra layer size for each of the container images to be
> downloaded/fetched into registries.

Thanks, I tried to do the math of the reduction vs. inflation in sizes
as follows. I think the crucial point here is the layering. If we do
this image layering:

|base| --> |+ service| --> |+ Puppet|

we'd drop ~267 MB from the base image, but we'd be installing that at
the topmost level, per component, right?

In my basic deployment, the undercloud seems to have 17 "components"
(49 containers), the overcloud controller 15 components (48
containers), and the overcloud compute 4 components (7 containers).
Accounting for overlaps, the total number of "components" used seems to
be 19. (By "components" here I mean whatever uses a different
ConfigImage than other services. I just eyeballed it, but I think I'm
not too far off the correct number.)

So we'd subtract 267 MB from the base image and add it to the 19 leaf
images used in this deployment. That means a difference of +4.8 GB over
the current image sizes. My /var/lib/registry dir on the undercloud,
with all the images, currently holds 5.1 GB. We'd almost double that,
to 9.9 GB.

Going from 5.1 to 9.9 GB seems like a lot of extra traffic for the CDNs
(both external and e.g. internal within OpenStack Infra CI clouds).

And for internal traffic between the local registry and the overcloud
nodes, it gives +3.7 GB per controller and +800 MB per compute. That
may not be as critical, but it still feels like a considerable
downside.
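
For reference, the arithmetic behind those estimates as I read the
figures above -- ~267 MB re-added once per component, minus the single
copy that leaves the base layer, in decimal GB:

# Rough reproduction of the size deltas quoted above.
EXTRA_MB = 267  # puppet modules + puppet deps + systemd deps

def delta_gb(components):
    # Pay ~267 MB once per component, save the one copy from the base.
    return (components - 1) * EXTRA_MB / 1000.0

print("whole registry (19 components): +%.1f GB" % delta_gb(19))  # +4.8
print("per controller (15 components): +%.1f GB" % delta_gb(15))  # +3.7
print("per compute     (4 components): +%.1f GB" % delta_gb(4))   # +0.8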

Another gut feeling is that this way of image layering would take
longer to build and longer to run the modify-image Ansible role we use
in CI, so it could endanger how our CI jobs fit into the time limit. We
could probably measure this too, but I'm not sure it's worth spending
the time.

All in all, I'd argue we should still be looking at different options.

>
> Given that we should decouple systemd from all/some of the dependencies
> (an example topic for RDO [0]), that could save a 190MB. But it seems we
> cannot break the love of puppet and systemd as it heavily relies on the
> latter and changing packaging like that would higly likely affect
> baremetal deployments with puppet and systemd co-operating.

Ack :/

>
> Long story short, we cannot shoot both rabbits with a single shot, not
> with puppet :) May be we could with ansible replacing puppet fully...
> So splitting config and runtime images is the only choice yet to address
> the raised security concerns. And let's forget about edge cases for now.
> Tossing around a pair of extra bytes over 40,000 WAN-distributed
> computes ain't gonna be our biggest problem for sure.
>
> [0] https://review.rdoproject.org/r/#/q/topic:base-container-reduction
>
>>
>>>
>>> Dan
>>>
>>
>> Thanks
>>
>> Jirka
>>



Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Fox, Kevin M
If the base layers are shared, you won't pay extra for the separate puppet container unless you have another container also installing Ruby in an upper layer. With OpenStack, that's unlikely.

The apparent size of a container is not equal to its actual size.

Thanks,
Kevin
________________________________________
From: Jiří Stránský [[hidden email]]
Sent: Thursday, November 29, 2018 9:42 AM
To: [hidden email]
Subject: Re: [openstack-dev] [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

<snip>

Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Fox, Kevin M
Oh, rereading the conversation again: the concern is having shared deps move up the layers? So more systemd-related than Ruby?

The conversation about --nodeps makes it sound like it's not actually used, just an artifact of how the RPMs are built... What about creating a dummy package that Provides: systemd? That would avoid using --nodeps.

Thanks,
Kevin
________________________________________
From: Fox, Kevin M [[hidden email]]
Sent: Thursday, November 29, 2018 11:20 AM
To: Former OpenStack Development Mailing List, use openstack-discuss now
Subject: Re: [openstack-dev] [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

<snip>

Re: [TripleO][Edge] Reduce base layer of containers for security and size of images (maintenance) sakes

Jiří Stránský
In reply to this post by Fox, Kevin M
On 29. 11. 18 20:20, Fox, Kevin M wrote:
> If the base layers are shared, you won't pay extra for the separate puppet container

Yes, and that's the state we're in right now.

>unless you have another container also installing ruby in an upper layer.

Not just Ruby but also Puppet and Systemd. I think that's what the
proposal we're discussing here suggests -- removing this content from
the base layer (so that we can get service runtime images without it)
and putting this content *on top* of individual service images. Unless
I'm missing some trick to start sharing *top* layers rather than *base*
layers, I think that effectively disables the space sharing for the
Ruby+Puppet+Systemd content.
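
A toy model of that trade-off (layer sizes in MB and image names are
made up; only the per-layer deduplication behaviour matters):

# Toy model: registry-side storage is roughly the sum of *unique* layer
# digests, so where the Puppet/Ruby/systemd content sits in the stack
# decides whether it is stored once, or once per image.
BASE, SERVICE, TOOLS = 500, 300, 267  # MB, made-up round numbers
N_IMAGES = 19

def unique_storage_mb(images):
    # Each image is a tuple of (layer_id, size_mb); layers with the same
    # id are shared, i.e. stored only once across all images.
    layers = {layer for image in images for layer in image}
    return sum(size for _layer_id, size in layers)

# Today: config tools live in the shared base layer.
shared_base = [(("base+tools", BASE + TOOLS), ("svc%d" % i, SERVICE))
               for i in range(N_IMAGES)]

# Proposed: lean base, tools re-added on top of every service image.
tools_on_top = [(("base", BASE), ("svc%d" % i, SERVICE),
                 ("svc%d+tools" % i, TOOLS))
                for i in range(N_IMAGES)]

print(unique_storage_mb(shared_base), "MB")   # tools stored once
print(unique_storage_mb(tools_on_top), "MB")  # tools stored 19 times

With these made-up numbers, the gap between the two layouts comes out
to roughly the same ~4.8 GB discussed earlier in the thread.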

> With OpenStack, that's unlikely.
>
> The apparent size of a container is not equal to its actual size.

Yes. :)

Thanks

Jirka


