Re: [openstack-dev] creating a default for oslo config variables within a project?

Ben Nemec Wed, 04 Dec 2013 11:15:38 -0800

On 2013-12-04 12:51, Sean Dague wrote:

On 12/04/2013 11:56 AM, Ben Nemec wrote:
On 2013-12-04 06:07, Sean Dague wrote:
On 12/03/2013 11:21 PM, Clint Byrum wrote:
Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
On 12/03/2013 06:13 PM, Ben Nemec wrote:
On 2013-12-03 17:09, Sean Dague wrote:
On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
On 2013-12-03 15:56, Sean Dague wrote:
This cinder patch - https://review.openstack.org/#/c/48935/
Is blocked on failing upgrade because the updated oslo
lockutils won't
function until there is a specific configuration variableadded
to the
cinder.conf.

That work around is proposed here -
https://review.openstack.org/#/c/52070/3
However I think this is exactly the kind of forward breaksthat we
want
to prevent with grenade, as cinder failing to function after a
rolling
upgrade because a config item wasn't added is exactly the kind
of pain
we are trying to prevent happening to ops.
So the question is, how is this done correctly so that adefault
can be
set in the cinder code for this value, and it not require aconfig
change to work?
You're absolutely correct, in principle - if the default valueforlock_path worked for users before, we should absolutely continueto
support it.
I don't know that I have a good answer on how to handle this,
but for
context this change is the result of a nasty bug in lockutilsthat
meant
external locks were doing nothing if lock_path wasn't set.
Basically
it's something we should never have allowed in the first place.

As far as setting this in code, it's important that all of the
processes
for a service are using the same value to avoid the same bad
situation
we were in before.  For tests, we have a lockutils wrapper
(https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
that sets an environment variable to address this, but thatonlyworks if all of the processes are going to be spawned fromwithin
the same wrapper, and I'm not sure how secure that is for
production
deployments since it puts all of the lock files in a temporary
directory.
Right, I don't think the previous default really "worked" - if
you used
the default, then external locking was broken.
I suspect most distros do set a default - I see RDO has this inits
default nova.conf:

  lock_path = /var/lib/nova/tmp

So, yes - this is all terrible.
IMHO, rather than raise an exception we should log a big fatwarningabout relying on the default and perhaps just treat the lock asanin-process lock in that case ... since that's essentially whatit
was
before, right?
So a default of lock_path = /tmp will work (FHS says that pathhas
to be
there), even if not optimal. Could we make it a default valuelike
that
instead of the current default which is null (and hence theproblem).
IIRC, my initial fix was something similar to that, but it gotshot
down
because putting the lock files in a known world writeable location
was a
security issue.
Although maybe if we put them in a subdirectory of /tmp andensured
that
the permissions were such that only the user running the servicecould
use that directory, it might be acceptable?  We could still log a
warning if we wanted.
This seems like it would have implications for people runningserviceson Windows too, but we can probably find a way to make that workif we
decide on a solution.
How is that a security issue? Are the lock files being written with
some
sensitive data in them and have g or o permissions on? The stickybit
(+t) on /tmp will prevent other users from deleting the file.
Right, but it won't prevent users from creating a symlink with thesame
name.
ln -s /var/lib/nova/instances/xxxxx/image.raw/tmp/well.known.location
Now when you do

with open('/tmp/well.known.location', 'w') as lockfile:
  lockfile.write('Stuff')

Nova has just truncated the image file and written 'Stuff' to it.
So that's the generic case (and the way people often write this). But
the oslo lockutils code doesn't work that way. While it does open the
file for write, it does not actually write, it's using fcntl to hold
locks. That's taking a data structure on the fd in kernel memory(IIRC),
so it correctly gives it up if the process crashes.

I'm not saying there isn't some other possible security vulnerability
here as well, but it's not jumping out at me. So I'd love tounderstand
that, because if we can close that exposure we can provide a working
default, plus a strong recommendation for how to do that *right*. I'dbetotally happy with printing WARNING level at startup if lock_path =/tmp
that this should be adjusted.
Full disclosure: I don't claim to be a security expert, so take my
thoughts on this with a grain of salt.

tldr: I still don't see a way to do this without breaking _something_.

Unfortunately, while we don't actually write to the file, just opening
it for write access truncates it.  So there remains the issue Clint
raised if someone can make that file a link.  I suppose we could check
for the bad things people might do, but I'm pretty sure that way lies
madness.
Fair point, more coffee before I make statements like that I guess :).

Follow up question: could we change it to open in "a" mode instead.
Still requires write access, but won't truncate.
As a followup to my suggestion to have lockutils create something like
/tmp/oslo-lock-path and make it accessible only to the owner, I think
there's still a problem because a malicious user could create that
directory ahead of time and when OpenStack is run for the first timeit
would happily use the existing directory and potentially overwrite any
linked files in it.  There might be ways to avoid that, but I suspect
they would be racy (which I believe is why you're not supposed to use
tmp paths that are known ahead of time).  Granted, I'm not sure you
should be installing to a system that already has malicious users onit,but I suspect security people would be less than thrilled with thatanswer.
So the workable options that I see right now are to have it break on
upgrade if lock_path is not set, or continue to break external locking
by having it fall back to process locks when lock_path is not set, as
Mark suggested. I'm not wild about either so I'd love to hear ifanyone
has a third option.
Honestly, I'd love us to be clever and figure out a not dangerous way
through this, even if unwise (where we can yell at the user in the LOGs
loudly, and fail them in J if lock_dir=/tmp) that lets us progress
through this while gracefully bringing configs into line.


Copying openstack-security since they should probably sign off on this.

-Ben

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] creating a default for oslo config variables within a project?

Reply via email to