I realise the chances of any help on this are slim with the current advice on
moving to cfengine3, but I don't have the resources to do that, and I've just
hit a bug in cfengine2 which is a royal pain, and I was hoping someone can help
me fix it if possible. Actually it's two things, one definite bug, and one
side-effect of the way cfengine works that prevents me hacking my way around
the first bug.
Consider a largish organisation using methods to do certain things (I use them
for setting values in /etc/sysctl.conf, for example). You have several admins,
editing different parts of the policy, probably for different sets of machines.
These should not interfere with each other, but I've found a case where they
do. Here's a toy example that shows the problem. Consider a tiny method which
does almost nothing: it appends a line to /tmp/foo and raises an alert to say
it has run:

    control:
      MethodName       = ( TestMethod )
      MethodParameters = ( data )
      actionsequence   = ( editfiles )

    classes:
      dummy = ( any )

    editfiles:
      linux::
        { /tmp/foo
          AppendIfNoSuchLine "${data}"
        }

    alerts:
      dummy::
        "Executing method with ${data}"
        ReturnVariables(void)
        ReturnClasses(void)

Now, call this from a cfengine policy:

    control:
      any::
        actionsequence = ( methods )

    methods:
      linux::
        TestMethod("wibble")
          action=cf.TestMethod returnv=null returnc=null

All is well, this works.
Problem 1)
Now another admin comes along and adds another stanza for some other system.
It might be in an import in a faraway part of the policy, but the effect would
be that this has been done:

    control:
      any::
        actionsequence = ( methods )

    methods:
      solaris::
        TestMethod("wibble")
          action=cf.TestMethod returnv=null returnc=null
      linux::
        TestMethod("wibble")
          action=cf.TestMethod returnv=null returnc=null

When you run this policy on a linux host, the method is never called, because
the solaris stanza causes the method-dispatch to lock with a 1 minute elapse
timer *even though the method isn't actually being executed in that context*,
and so the linux stanza never gets executed. I've replicated this bug on 2.2.8
and 2.2.10.
So what can I do about this? I realise I could refactor the cfengine
policy so that any given set of method invocation arguments is only ever
called once, but that doesn't stop the problem resurfacing accidentally,
especially if you want sysadmin teams to be reasonably independent in their
work on their own machines.
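
A minimal sketch of that refactoring, assuming the two stanzas from the toy
example can be merged under a single compound class expression so that the
TestMethod("wibble") invocation appears only once in the policy. This is an
illustration only and has not been tested against 2.2.8 or 2.2.10:

    control:
      any::
        actionsequence = ( methods )

    methods:
      # One invocation for both platforms, instead of one stanza per class
      linux|solaris::
        TestMethod("wibble")
          action=cf.TestMethod returnv=null returnc=null
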
One workaround I'm considering is hardcoding the ifelapsed and expireafter time
values in calls to GetLock for the methods-dispatch database to zero in do.c.
I don't *think* this is harmful, and in testing it does seem to work, but I'd
appreciate it if someone more familiar with the code could comment.
Problem 2)
Even that workaround doesn't completely solve the problem, though, because the various
calls to a method might also enable editing the same file (as indeed they do in
the above case of updating /tmp/foo), so in my real world case my fix still
doesn't work.
In fact, the problem's even worse than that - even if the parameters to the
method are different, such that Problem 1 doesn't occur, any second execution
of the method with different parameters will still fail, this time because of a
lock on editfiles and /tmp/foo.
I tried setting the editfiles stanza within the method to have IfElapsed 0, but
that doesn't work.
All in all, I think I'm probably screwed here. Methods are the only thing in
cfengine2 that really allow for parameterisation, but you can't actually
execute a single method more than once in any single invocation of cfagent.
Presumably this is all an artefact of the fact that methods don't inherit any
locks from the parent, but instead behave as completely independent
cfagent processes, which was perhaps a slightly unfortunate implementation
decision.
Am I missing something?
Does this all get better with cfengine3, or am I still going to be bitten by
the same problem, especially problem 2?
Have I got my approach completely arse-about-face, and is there a better way of
achieving what I'm trying to do?
Thanks,
Tim