+1 to the entire proposal.

On Thu, 2012-07-19 at 18:56 +0100, Gordon Sim wrote:
> I have been looking at what would be required to get AMQP 1.0 support
> alongside AMQP 0-10 support in the C++ broker, i.e. qpidd.
>
> As part of that it became clear some refactoring of the broker codebase
> would be required[1]. That in turn led me to believe that we should
> consider dropping certain features. These would be dropped *after* the
> pending 0.18 release; i.e. they would still be present in 0.18, but that
> would be the last release in which they were present if my proposal were
> accepted.
>
> The purpose of this mail is to list the features I would propose to drop
> and my reasons for doing so. For those who find it overly long, I
> apologise and offer a very short summary at the end!
>
> In each case the basic argument is that I believe the features are not
> very well implemented and keeping them working as part of my refactoring
> would take extra time that I would rather spend on achieving 1.0 support
> and making real improvements.
>
> The first feature I propose we drop is the 'legacy' versions of LVQ
> behaviour. These forced a choice in the behaviour of the queue when
> browsers (i.e. non-destructive subscribers) received messages from it.
> The choice was to either have browsers miss updates, or to suppress the
> replacing of one message by another with a matching key. This choice was
> really driven by a technical problem with the first implementation. We
> have since moved to an improved implementation where the
> distinction is not relevant. I see no good reason to keep the old
> behaviour any longer.
>
> The second feature is the old async queue replication mechanism. This is
> very fragile and I believe is no longer necessary given the new and
> improved HA solution that first appeared in 0.16 and has been improved
> significantly for 0.18.
>
> The third feature is the 'last man standing' or 'cluster durable'
> option. The biggest reason for dropping this comes later(!), but
> considered on its own my concern is that there are no system level tests
> for it so it is very hard to guarantee it still works without writing
> all those tests. I am entirely unconvinced by this solution, and think
> that again the new HA mechanism would be a better way to achieve this
> (you could start up a backup node that forced all the replicated
> messages to disk). I am therefore keen to avoid wasting time and effort.
>
> The fourth feature is - wait for it - the clustered broker capability as
> enabled by the cluster.so plugin. I believe this is nearing the end of
> its life anyway. It is currently only available on Linux with no real
> prospects of being ported to Windows. The design as it turns out was
> very fragile to changes in the codebase and there are still some
> difficult-to-solve bugs within it. A new HA mechanism has been developed
> (as alluded to above) and I believe that will replace the old cluster.
> The work needed to keep the cluster working through my refactor is
> sizeable. It would in any case have the potential to destabilise the
> cluster (the aforementioned issue with fragility). This seems to me to
> argue strongly for dropping this in releases after 0.18, and for anyone
> affected, that would give them some time to try out the new HA and give
> feedback as well.
>
> The fifth and final feature I propose we drop is the confusingly named
> 'flow to disk' feature. Now for this one I have no alternative to offer
> yet. The problem is supporting large queues whose aggregate size far
> exceeds a bounded amount of memory. I believe the current implementation
> is next to useless for the majority of cases as it keeps the headers of
> all messages in memory. It is useless unless your messages are large
> enough that the overhead of keeping these headers in memory is outweighed
> by the size of the body (this overhead is significantly larger than the
> transfer size of the headers). Further, since a common cause for large
> queues is a short-lived disparity between the rate of inflow and
> outflow, the current solution can compound the problem by radically
> slowing down consumers even more. I believe there is a better solution
> and I'm not convinced the current solution is worth the effort of
> maintaining any further. (I know Kim has been working on a new store
> interface and removing flow to disk would clean that up nicely as well!)
>
> I hope this makes sense. I'm keen to get any thoughts or feedback on
> these points. The purpose is not to deprive anyone of features they are
> using but rather to spend time on more important work.
>
> Summary:
>
> features to drop are:
>
> (i) legacy lvq modes; lvq support would still remain, only the two old
> and peculiar modes would go; I really doubt anyone actually depends on
> these anyway, they were more a limitation than a feature
>
> (ii) asynchronous queue replication; solution is not mature enough for
> real-world use anyway due to fragility and inability to resync; new HA
> mechanism as introduced in 0.16 and improved on in 0.18 should address
> the need anyway.
>
> (iii) clustering including last-man-standing mode; design is brittle and
> currently ties it to the Linux platform; new HA is the long-term solution
> here anyway.
>
> (iv) flow to disk; current solution really doesn't solve the problem anyway
>
> --Gordon
>
> [1] If you are interested at all, you can find my latest patch and some
> notes on the internal changes up on reviewboard:
> https://reviews.apache.org/r/5833/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> For additional commands, e-mail: users-h...@qpid.apache.org
>