On 26/06/2013, at 10:37 PM, Lars Marowsky-Bree <l...@suse.com> wrote:

> On 2013-06-26T21:31:14, Andrew Beekhof <and...@beekhof.net> wrote:
> 
>>> Distributions can take care of them when they integrate them; basically
>>> they'll trickle through until the whole stack the distributions ship
>>> builds again.
>> If we let 2.0.x be anything like 1.1.x, I suspect this would be rather 
>> difficult.
> 
> Not sure. With the sore exception of 1.1.8, the integration effort was
> reasonable, even for an Enterprise distribution. Yes, for changes so
> large and intrusive, a temporary branch (or a longer release cycle)
> would probably be preferable.

I wouldn't say the 6 months between 1.1.7 and 1.1.8 was a particularly 
aggressive release cycle.
Generally though, it has always been hard/dangerous to backport specific fixes 
because of the complexity of the interactions - particularly in the PE.

> 
>> The change I'm thinking of (CPG codepaths and global variables) was becoming 
>> a major support overhead and all-round headache.
>> I hadn't planned to make that change, but it was the best way to fix a bug 
>> that was holding up the release.
> 
> Yeah, that one. If it fixes a bug, it was probably unavoidable (though
> the specific commit (953bedf8f7f54f91a76672aeee5f44dc465741e9) didn't
> mention a bugzilla id).

It has always been the case that I find and fix far more bugs than people 
report.
I don't plan to start filing bugs for myself.

> But that trickles through all consumers here - OCFS2, DLM, sbd. Means we
> have to do more validation than a -rc should normally need - normally,
> during an rcX phase, I'd expect small, well-contained bugfixes for
> regressions only.
> 
> But perhaps this was one such exception.

Normally I would have waited until after the final release, and have done so in 
the past for other changes.

In this case though, I made an exception because the plan is to NOT have 
another 1.1.x and "it is still my intention not to have API changes in 2.0.x, 
so better before than after".

Granted I had completely forgotten about the plugin editions of ocfs2/dlm, but 
I was told you'd already deep frozen what you were planning to ship, so I don't 
understand the specific concern.

There is never a good point to make these changes. Even if I make them just 
after a release, people will just grumble when it comes time to look at the 
next one - just like you did above for 1.1.8.

> (Which bug did it fix, by the way? Can't immediately spot it from the
> commit code.)

Processes spinning for a few minutes while trying to send a CPG message.
First for corosync 2.x, then later for cman, then again for pacemakerd.

I balked at creating a third copy of that code when I noticed a bug in the 
second.
I much preferred the old cib_ais_dispatch() method signature, but making it 
work with corosync's API required all kinds of nastiness which made it very 
brittle.

>> Plus it is still my intention not to have API changes in 2.0.x, so better 
>> before than after.
> 
> I wonder how that will go ;-)

We did pretty well with 1.0 once the line was drawn (after about .5 iirc).

> I don't really mind the API changes much,
> for me it's mostly a question of timing and how big the pile is at every
> single release.

I thought you wanted longer release cycles... wouldn't that make the pile 
bigger?  
And is it not better to batch them up and have one point of incompatibility 
rather than a continuous stream of them?

> If you consider the API from a customer point of view, the things like
> build or runtime dependencies on shared libraries aren't so much of an
> issue - hopefully, the distribution provider hides that before they
> release updates. Hence my "Oh well, I don't care" stance.

Except if it affects ocfs2/dlm/sbd?

> What's more troublesome are changes to existing commands (even something
> minimal like "crm_resource -M" now generating a short location
> constraint,

I find it confusing how a contained 10-line change to a CLI tool is 
troublesome but you're prepared to wear the overhead of holding back API 
changes - which usually impact all sorts of code paths, sometimes across 
multiple projects.

Surely this would be the easiest of any possible change to hold back.

> which could potentially break scripts that interact with the
> CIB), or major changes to log messages (since those do break customer's
> scripts and monitoring environments).

CLI output I can usually be convinced of, but log messages are most definitely 
not something I will consider as a valid programming interface.

I have not and will not change them just to annoy people, but I must be allowed 
to reduce the level of noise and otherwise improve them, or rename the functions 
that produce them when appropriate (which changes the "functionname:" portion).

I have been hammered for years on the amount of logs Pacemaker produces, yet 
the moment I try to do something about it... sigh.

> 
>>> Important is to of course keep the major/minor numbers of the libraries
>>> updated so one doesn't get runtime problems.
>> I have been quite diligent running ./bumplibs.sh in preparation for releases 
>> for a while now.
> 
> Yes. Didn't mean to say it isn't working, just wanted to mention it.

The make target that generates the changelog prints out a reminder in bold 
magenta that ./bumplibs.sh needs to be run.
I have re-run it for 1.1.10 several times - it's "at your own risk" if you're 
taking something between 1.1.x and 1.1.y though.
Upstream can't be held responsible for that.

> Because an update that fails to install until all dependencies are fixed
> is (mostly) fine, but one that installs and then breaks really annoys
> customers ;-)

Yep.


_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
