+1 to the entire proposal.

On Thu, 2012-07-19 at 18:56 +0100, Gordon Sim wrote:
> I have been looking at what would be required to get AMQP 1.0 support 
> alongside AMQP 0-10 support in the C++ broker, i.e. qpidd.
> 
> As part of that it became clear some refactoring of the broker codebase 
> would be required[1]. That in turn led me to believe that we should 
> consider dropping certain features. These would be dropped *after* the 
> pending 0.18 release; i.e. they would still be present in 0.18, but that 
> would be the last release in which they were present if my proposal were 
> accepted.
> 
> The purpose of this mail is to list the features I would propose to drop 
> and my reasons for doing so. For those who find it overly long, I 
> apologise and offer a very short summary at the end!
> 
> In each case the basic argument is that I believe the features are not 
> very well implemented and keeping them working as part of my refactoring 
> would take extra time that I would rather spend on achieving 1.0 support 
> and making real improvements.
> 
> The first feature I propose we drop is the 'legacy' versions of LVQ 
> behaviour. These forced a choice in the behaviour of the queue when 
> browsers (i.e. non-destructive subscribers) received messages from it. 
> The choice was to either have browsers miss updates, or to suppress the 
> replacing of one message by another with a matching key. This choice was 
> really driven by a technical problem with the first implementation. We 
> have since already moved to an improved implementation where the 
> distinction is not relevant. I see no good reason to keep the old 
> behaviour any longer.
> 
> The second feature is the old async queue replication mechanism. This is 
> very fragile and I believe is no longer necessary given the new and 
> improved HA solution that first appeared in 0.16 and has been improved 
> significantly for 0.18.
> 
> The third feature is the 'last man standing' or 'cluster durable' 
> option. The biggest reason for dropping this comes later(!), but 
> considered on its own my concern is that there are no system level tests 
> for it so it is very hard to guarantee it still works without writing 
> all those tests. I am entirely unconvinced by this solution, and think 
> that again the new HA mechanism would be a better way to achieve this 
> (you could start up a backup node that forced all the replicated 
> messages to disk). I am therefore keen to avoid wasting time and effort.
> 
> The fourth feature is - wait for it - the clustered broker capability as 
> enabled by the cluster.so plugin. I believe this is nearing the end of 
> its life anyway. It is currently only available on Linux with no real 
> prospects of being ported to Windows. The design, as it turns out, was 
> very fragile to changes in the codebase and there are still some 
> difficult-to-solve bugs within it. A new HA mechanism has been developed 
> (as alluded to above) and I believe that will replace the old cluster. 
> The work needed to keep the cluster working through my refactor is 
> sizeable. It would in any case have the potential to destabilise the 
> cluster (the aforementioned issue with fragility). This seems to me to 
> argue strongly for dropping this in releases after 0.18, and for anyone 
> affected, that would give them some time to try out the new HA and give 
> feedback as well.
> 
> The fifth and final feature I propose we drop is the confusingly named 
> 'flow to disk' feature. Now for this one I have no alternative to offer 
> yet. The problem is supporting large queues whose aggregate size far 
> exceeds a bounded amount of memory. I believe the current implementation 
> is next to useless for the majority of cases as it keeps the headers of 
> all messages in memory. It is useless unless your messages are large 
> enough that the overhead of keeping these headers in memory is outweighed 
> by the size of the body (this overhead is significantly larger than the 
> transfer size of the headers). Further, since a common cause for large 
> queues is a short lived disparity between the rate of inflow and 
> outflow, the current solution can compound the problem by radically 
> slowing down consumers even more. I believe there is a better solution 
> and I'm not convinced the current solution is worth the effort of 
> maintaining any further. (I know Kim has been working on a new store 
> interface and removing flow to disk would clean that up nicely as well!)
> 
> I hope this makes sense. I'm keen to get any thoughts or feedback on 
> these points. The purpose is not to deprive anyone of features they are 
> using but rather to spend time on more important work.
> 
> Summary:
> 
> features to drop are:
> 
> (i) legacy LVQ modes; LVQ support would still remain, only the two old 
> and peculiar modes would go; I really doubt anyone actually depends on 
> these anyway, they were more a limitation than a feature
> 
> (ii) asynchronous queue replication; solution is not mature enough for 
> real world use anyway due to fragility and inability to resync; new HA 
> mechanism as introduced in 0.16 and improved on in 0.18 should address 
> the need anyway.
> 
> (iii) clustering including last-man-standing mode; design is brittle and 
> currently ties it to the Linux platform; new HA is the long term solution 
> here anyway.
> 
> (iv) flow to disk; current solution really doesn't solve the problem anyway
> 
> --Gordon
> 
> [1] If you are interested at all, you can find my latest patch and some 
> notes on the internal changes up on reviewboard: 
> https://reviews.apache.org/r/5833/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> For additional commands, e-mail: users-h...@qpid.apache.org
> 



