Firstly:  So, there is no /real/ load balancing, right?  

Even with the updateClusterClients/rebalanceClusterClients broker settings,
clients using failover COULD all still select the same broker, even if
randomize=true.  So if I have 2 clients, both connected to the only broker,
and I add a second broker to the cluster, and I'm using
updateClusterClients="true" and rebalanceClusterClients="true" and
updateClusterClientsOnRemove="true", it is possible that both clients
choose the same broker, so we don't end up with 1 client per broker.  (This
doesn't even cover the case where the clients are balanced, 1 client/1
broker, but the overall traffic isn't: one client could be transmitting
twice the traffic.  Since the brokers are networked, the load might still
get balanced between the brokers, but only for consumption, i.e. only after
the more heavily loaded broker has received the messages.)

Is this right?  There's no way to make sure that brokers are selected so
that client connections are evenly distributed, right?

Failover is not the same as balancing.  In fact, there isn't any balancing
going on with the failover protocol - it's failover, not balancing.  Randomly
picking one or going to the next one in the list isn't balancing.
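
To put a rough number on the random-pick case, here is a small sketch (mine,
not anything ActiveMQ ships).  It assumes each client picks a broker
independently and uniformly at random, which is a simplified model of
randomize=true, and it simply enumerates every possible assignment.  The
class and method names are made up for illustration.

```java
public class RandomPickOdds {
    /**
     * Fraction of all equally likely client-to-broker assignments in which
     * the per-broker connection counts differ by at most one.
     */
    static double evenSpreadProbability(int clients, int brokers) {
        long total = 1;
        for (int i = 0; i < clients; i++) {
            total *= brokers; // brokers^clients possible assignments
        }
        long even = 0;
        for (long code = 0; code < total; code++) {
            // Decode 'code' as one assignment: digit c (base 'brokers')
            // is the broker chosen by client c.
            int[] load = new int[brokers];
            long rest = code;
            for (int c = 0; c < clients; c++) {
                load[(int) (rest % brokers)]++;
                rest /= brokers;
            }
            int min = Integer.MAX_VALUE;
            int max = 0;
            for (int l : load) {
                min = Math.min(min, l);
                max = Math.max(max, l);
            }
            if (max - min <= 1) {
                even++;
            }
        }
        return (double) even / total;
    }

    public static void main(String[] args) {
        System.out.println(evenSpreadProbability(2, 2)); // 2 of 4 assignments: 0.5
        System.out.println(evenSpreadProbability(3, 3)); // 6 of 27 assignments
    }
}
```

Under that assumption, 2 clients and 2 brokers end up 1-to-1 only half the
time, and the odds of an even spread drop as the numbers grow.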

I suppose it's even less balanced if there are 2 clients and 3 brokers, since
at least one broker isn't going to be connected to a client (pigeonhole
principle).

======================================================

Just to continue the discussion, Secondly...

The way I see it, there are 3 places that could be 'load balanced'.  (If we
consider allowing some brokers to be unbalanced [think of the 'elite' line
at the airport check-in again], then perhaps it's more appropriate to call
this more generally 'load management' rather than 'load balancing'.)

1. Client - A client controls which broker it sends messages to by
selecting one from some 'list'.  The list can include more than one URL, be
dynamically updated, or contain a discovery endpoint (i.e. some agent at the
end of the URL creates the actual list).  But the balancing part (hopefully
in orchestration with other clients, either directly or indirectly via
feedback statistics) means the client spreads the messages it sends across
the available brokers.  Similarly for consumers, making sure messages are
getting consumed in a balanced way.  This should be dynamic: re-balancing
based upon some monitoring policy, not just when brokers are brought
up/down/added/removed, which is just the trivial case of throughput being 0.

Perhaps the fabric agent, which is already acting as the source of available
urls when using discovery:(fabric:group), could use its collective
statistical knowledge to orchestrate suggested changes to clients.
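
As a minimal sketch of what "balancing from feedback statistics" could mean
on the client (or agent) side: given some per-broker load figure, suggest the
least loaded broker.  Everything here is hypothetical, including the class
name, the idea that load arrives as a map of URL to messages/sec, and the
numbers themselves; nothing in ActiveMQ or fabric is being called.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LeastLoadedPicker {
    /** Returns the broker URL with the lowest reported load, or null if the map is empty. */
    static String pick(Map<String, Double> loadByBroker) {
        String best = null;
        double bestLoad = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> e : loadByBroker.entrySet()) {
            if (e.getValue() < bestLoad) {
                bestLoad = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up throughput figures (msgs/sec) standing in for monitored statistics.
        Map<String, Double> stats = new LinkedHashMap<>();
        stats.put("tcp://brokerA:61616", 1200.0);
        stats.put("tcp://brokerB:61616", 300.0);
        System.out.println(pick(stats)); // tcp://brokerB:61616
    }
}
```

A real policy would need damping so that all clients don't stampede to the
same "least loaded" broker at once, but the shape of the decision is this.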

2. Broker - Are all the brokers sharing the load of messages in the network
of brokers?  This seems to be already there, no?  Are there any brokers that
should be used, or used more, that are not available to more, or all, of the
clients?  The only criteria used in this algorithm are throughput and
transmission cost.  No consideration is given to other properties like QoS
levels.

3. Hosts - Perhaps one host has more power than another and could handle
twice the load, or could have twice the number of brokers running.  This
aspect involves looking to make sure that all the resources in our solution
are getting utilized.  If one host has only slaves and the other only
masters, then this is an imbalanced use of hosts, even if the number of
messages going through the masters is balanced.

Perhaps it starts out balanced, as with master/slave pairs spread across 2
hosts.  Then one host or a master broker goes down, and we have both masters
on the remaining host.  If the downed host/broker is then restarted, we'd
like to rebalance.  There is the priorityBackup "fail-back" option, which is
nice and would cover this scenario.  But it doesn't handle dynamically
favoring a host, e.g. making the slave the master on purpose when the
master's processing becomes very slow (perhaps a nightly batch run of some
other app eats up CPU cycles).
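
For reference, the fail-back setup I mean is just a failover URL with the
preferred broker listed first, plus randomize=false and priorityBackup=true.
The sketch below only builds that URL string; the host names and the helper
are made up, and handing the result to a connection factory (which needs the
ActiveMQ client library) is not shown.

```java
public class FailoverUriSketch {
    /**
     * Builds a failover URL that prefers the first broker and asks the
     * client to fail back to it when it becomes available again.
     */
    static String failoverUri(String preferred, String... others) {
        StringBuilder sb = new StringBuilder("failover:(").append(preferred);
        for (String other : others) {
            sb.append(',').append(other);
        }
        // randomize=false keeps the list order; priorityBackup=true enables fail-back.
        return sb.append(")?randomize=false&priorityBackup=true").toString();
    }

    public static void main(String[] args) {
        // Hypothetical hosts: hostA is the one we want clients to favor.
        System.out.println(failoverUri("tcp://hostA:61616", "tcp://hostB:61616"));
        // failover:(tcp://hostA:61616,tcp://hostB:61616)?randomize=false&priorityBackup=true
    }
}
```

Note that the preference is fixed in the URL, which is exactly why it can't
react to a host that is up but slow.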

Your thoughts?

ps. I'm not sure if by 'clients' we mean two separate apps each with its own
connection (two separate factory instances), or one app with two connections
(same factory instance) - does it matter?



--
View this message in context: 
http://activemq.2283324.n4.nabble.com/Load-Balancing-Producer-Messages-from-the-JMS-Client-side-tp4656611p4656823.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.
