Hello,

On Fri, Oct 29, 2010 at 12:35 PM, Dan Frincu <dfri...@streamwide.ro> wrote:
> Hi,
>
> Vladimir Legeza wrote:
>
> > Hello folks.
> >
> > I'm trying to set up four IP-balanced nodes, but I haven't found the
> > right way to balance the load between the nodes when some of them have
> > failed.
> >
> > I've done:
> >
> > [r...@node1 ~]# crm configure show
> > node node1
> > node node2
> > node node3
> > node node4
> > primitive ClusterIP ocf:heartbeat:IPaddr2 \
> >         params ip="10.138.10.252" cidr_netmask="32" clusterip_hash="sourceip-sourceport" \
> >         op monitor interval="30s"
> > clone StreamIP ClusterIP \
> >         meta globally-unique="true" clone-max="8" clone-node-max="2" \
> >         target-role="Started" notify="true" ordered="true" interleave="true"
> > property $id="cib-bootstrap-options" \
> >         dc-version="1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438" \
> >         cluster-infrastructure="openais" \
> >         expected-quorum-votes="4" \
> >         no-quorum-policy="ignore" \
> >         stonith-enabled="false"
> >
> > When all the nodes are up and running:
> >
> > [r...@node1 ~]# crm status
> > ============
> > Last updated: Thu Oct 28 17:26:13 2010
> > Stack: openais
> > Current DC: node2 - partition with quorum
> > Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
> > 4 Nodes configured, 4 expected votes
> > 2 Resources configured.
> > ============
> >
> > Online: [ node1 node2 node3 node4 ]
> >
> > Clone Set: StreamIP (unique)
> >      ClusterIP:0 (ocf::heartbeat:IPaddr2):  Started node1
> >      ClusterIP:1 (ocf::heartbeat:IPaddr2):  Started node1
> >      ClusterIP:2 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:3 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:4 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:5 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:6 (ocf::heartbeat:IPaddr2):  Started node4
> >      ClusterIP:7 (ocf::heartbeat:IPaddr2):  Started node4
> >
> > Everything is OK and each node takes 1/4 of all the traffic - wonderful.
> > But we end up with a 25% traffic loss if one of them goes down:
>
> Isn't this supposed to be normal behavior in a load-balancing situation?
> Four nodes receive 25% of the traffic each; one node goes down, the load
> balancer notices the failure and directs 33.33% of the traffic to each of
> the remaining nodes.

The only way I see to achieve the 33% split is to decrease the clone-max
parameter value (it should be a multiple of the number of online nodes),
and clone-max would also have to be changed on the fly (automatically).
Hmm... the idea is very interesting. =8-)

> > Just out of curiosity:
> >
> > [r...@node1 ~]# crm node standby node1
> > [r...@node1 ~]# crm status
> > ============
> > Last updated: Thu Oct 28 17:30:01 2010
> > Stack: openais
> > Current DC: node2 - partition with quorum
> > Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
> > 4 Nodes configured, 4 expected votes
> > 2 Resources configured.
> > ============
> >
> > Node node1: standby
> > Online: [ node2 node3 node4 ]
> >
> > Clone Set: StreamIP (unique)
> >      ClusterIP:0 (ocf::heartbeat:IPaddr2):  Stopped
> >      ClusterIP:1 (ocf::heartbeat:IPaddr2):  Stopped
> >      ClusterIP:2 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:3 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:4 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:5 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:6 (ocf::heartbeat:IPaddr2):  Started node4
> >      ClusterIP:7 (ocf::heartbeat:IPaddr2):  Started node4
> >
> > I found a solution (to prevent the loss): set clone-node-max to 3.
> >
> > [r...@node1 ~]# crm resource meta StreamIP set clone-node-max 3
> > [r...@node1 ~]# crm status
> > ============
> > Last updated: Thu Oct 28 17:35:05 2010
> > Stack: openais
> > Current DC: node2 - partition with quorum
> > Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
> > 4 Nodes configured, 4 expected votes
> > 2 Resources configured.
> > ============
> >
> > Node node1: standby
> > Online: [ node2 node3 node4 ]
> >
> > Clone Set: StreamIP (unique)
> >      ClusterIP:0 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:1 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:2 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:3 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:4 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:5 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:6 (ocf::heartbeat:IPaddr2):  Started node4
> >      ClusterIP:7 (ocf::heartbeat:IPaddr2):  Started node4
> >
> > The problem is that nothing changes when node1 comes back online:
> >
> > [r...@node1 ~]# crm node online node1
> > [r...@node1 ~]# crm status
> > ============
> > Last updated: Thu Oct 28 17:37:43 2010
> > Stack: openais
> > Current DC: node2 - partition with quorum
> > Version: 1.0.9-0a40fd0cb9f2fcedef9d1967115c912314c57438
> > 4 Nodes configured, 4 expected votes
> > 2 Resources configured.
> > ============
> >
> > Online: [ node1 node2 node3 node4 ]
> >
> > Clone Set: StreamIP (unique)
> >      ClusterIP:0 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:1 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:2 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:3 (ocf::heartbeat:IPaddr2):  Started node2
> >      ClusterIP:4 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:5 (ocf::heartbeat:IPaddr2):  Started node3
> >      ClusterIP:6 (ocf::heartbeat:IPaddr2):  Started node4
> >      ClusterIP:7 (ocf::heartbeat:IPaddr2):  Started node4
> >
> > There is NO TRAFFIC on node1.
> > If I set clone-node-max back to 2, all nodes revert to their original
> > state.
> >
> > So, my question is: how do I avoid such "hand-made" changes (or is it
> > possible to automate the clone-node-max adjustments)?
> >
> > Thanks!
>
> You could use location constraints for the clones, something like:
>
> location StreamIP:0 200: node1
> location StreamIP:0 100: node2
>
> This way, if node1 is up, it will run there, but if node1 fails it will
> move to node2. And if you don't define resource stickiness, when node1
> comes back online, the resource migrates back to it.
I already tried to do so, but such a configuration does not seem to be
accepted:

crm(live)configure# location location_marker_0 StreamIP:0 200: node1
crm(live)configure# commit
element rsc_location: Relax-NG validity error : Expecting an element rule, got nothing
element rsc_location: Relax-NG validity error : Element constraints has extra content: rsc_location
element configuration: Relax-NG validity error : Invalid sequence in interleave
element configuration: Relax-NG validity error : Element configuration failed to validate content
element cib: Relax-NG validity error : Element cib failed to validate content
crm_verify[20887]: 2010/10/29_16:00:21 ERROR: main: CIB did not pass DTD/schema validation
Errors found during check: config not valid

> I haven't tested this, but it should give you a general idea about how it
> could be implemented.
>
> Regards,
>
> Dan
>
> --
> Dan FRINCU
> Systems Engineer
> CCNA, RHCE
> Streamwide Romania
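For what it's worth, a location constraint that names the clone as a whole
(instead of the ClusterIP:0 instance) does match the crm shell syntax I
know of, for example (untested here, and the constraint id "prefer_node1"
is just a made-up name):

crm configure location prefer_node1 StreamIP 200: node1

But that only expresses a preference for the whole StreamIP clone towards
node1, not for a single instance, so I am not sure it gives the
per-instance placement you described.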
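As for automating the clone-node-max adjustment, here is the rough kind of
script I have in mind. It is only an untested sketch: the script name, the
parsing of the "Online:" line from crm status, and the way it is triggered
(cron, or some cluster event hook) are all my own assumptions; the only
commands it relies on are the crm status and crm resource meta calls shown
above.

#!/bin/bash
# adjust_clone_node_max.sh - hypothetical helper, untested.
# Recompute clone-node-max so that all clone-max instances of StreamIP
# can keep running on however many nodes are currently online.

CLONE_MAX=8          # must match clone-max in the StreamIP clone definition
RESOURCE=StreamIP

# Count online nodes from the "Online: [ node1 node2 ... ]" line of crm status.
ONLINE=$(crm status | awk '/^Online:/ { print NF - 3 }')

if [ -z "$ONLINE" ] || [ "$ONLINE" -eq 0 ]; then
    echo "No online nodes found, refusing to touch $RESOURCE" >&2
    exit 1
fi

# Ceiling of CLONE_MAX / ONLINE, e.g. 8 instances on 3 nodes -> 3 per node.
NEEDED=$(( (CLONE_MAX + ONLINE - 1) / ONLINE ))

echo "Online nodes: $ONLINE, setting clone-node-max=$NEEDED on $RESOURCE"
crm resource meta "$RESOURCE" set clone-node-max "$NEEDED"

Note that this would also set clone-node-max back to 2 once node1 returns,
which is exactly the revert I currently do by hand, so the open question is
really just what should trigger it.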
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker