Hmm, I'd have to check, but I think a Galera cluster needs 3+ nodes. I have only tested with 3 or more nodes here; I haven't tested with 2 nodes yet.
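
For what it's worth, here is a rough sketch of the majority arithmetic behind the 3+ nodes remark. It assumes every node has equal weight and that the other node crashes instead of shutting down cleanly; the function names are made up for illustration and are not part of Galera.

```shell
#!/bin/bash
# Sketch of strict-majority arithmetic only (equal node weights assumed).
# quorum_needed and has_quorum are hypothetical helper names, not Galera tools.

# Smallest number of nodes that is more than half of a cluster of size $1.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Exit status 0 if $2 surviving nodes are more than half of $1 total nodes.
has_quorum() {
    [ $(( $2 * 2 )) -gt "$1" ]
}

# Compare a 2-node and a 3-node cluster after one node crashes.
for total in 2 3; do
    survivors=$(( total - 1 ))
    if has_quorum "$total" "$survivors"; then
        echo "$total nodes, 1 down: $survivors left, still a majority"
    else
        echo "$total nodes, 1 down: $survivors left, no majority"
    fi
done
```

With 2 nodes, one crash leaves 1 of 2, which is not a majority, and that would fit the NON_PRIM view and the "failed to reach primary view" error in the log below. On a running node you can see what it thinks with SHOW STATUS LIKE 'wsrep_cluster_status';
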

2013/12/11 AskMonty KB <nore...@askmonty.org>:
> Hello,
>
> A new question has been asked in "MariaDB FAQ" by maximilianodipietro:
> --------------------------------
> Hi people, I have a two-node MariaDB cluster and I need some help. I
> have this simple script to handle failover. It works like this: if the
> master node dies, the script kills the slave and restarts it as master,
> and then restarts the old master as slave once the script detects that
> its process is down. Here's the code:
>
> --------------------------------
> #!/bin/bash
>
> if [ $1 = "start" ]
> then
>
>   PidMaria=$(ps -ef | grep wsrep_cluster | grep -v grep | awk {'print $2'})
>   Server2=$(nmap -v ipnodo2 | grep 3306 | tail -n1 | awk {'print $2'})
>   TipoPid=$(ssh ipnodo2 "ps -ef |grep mysql | grep -v grep | cut -d "=" -f 2 | cut -d "/" -f 3 | cut -d "-" -f 1 | cut -d "." -f 1")
>   TipoPidLocal=$(ps -ef |grep mysql | grep -v grep | cut -d "=" -f 2 | cut -d "/" -f 3 | cut -d "-" -f 1 | cut -d "." -f 1)
>
>   if [ -z $PidMaria ]
>   then
>     #sleep 20
>     Server2=$(nmap -v ipnodo2 | grep 3306 | tail -n1 | awk {'print $2'})
>     if [ -z $Server2 ]
>     then
>       if [ -z $TipoPid ]
>       then
>         if [ -z $(ssh ipnodo2 "ps -ef |grep mysql | grep -v grep " | awk {'print $2'}) ]
>         then
>           /usr/sbin/mysqld --wsrep_cluster_address=gcomm:// --user=mysql --wsrep_sst_auth=root:root --wsrep_provider=/usr/lib64/galera/libgalera_smm.so
>         fi
>       else
>         /usr/sbin/mysqld --wsrep_cluster_address=gcomm://ipnodo2 --wsrep_sst_auth=root:root --user=mysql --wsrep_provider=/usr/lib64/galera/libgalera_smm.so
>       fi
>     else
>       if [ -z $TipoPid ]
>       then
>         /usr/sbin/mysqld --wsrep_cluster_address=gcomm://ipnodo2 --user=mysql --wsrep_sst_auth=root:root --wsrep_provider=/usr/lib64/galera/libgalera_smm.so
>       fi
>     fi
>
>   else
>     if [ -z $Server2 ]
>     then
>       if [ -z $TipoPidLocal ]
>       then
>         echo "hola"
>       else
>         kill -9 $PidMaria
>         /usr/sbin/mysqld --wsrep_cluster_address=gcomm:// --wsrep_sst_auth=root:root --user=mysql --wsrep_provider=/usr/lib64/galera/libgalera_smm.so
>       fi
>     else
>       if [ -z $PidMaria ]
>       then
>         /usr/sbin/mysqld --wsrep_cluster_address=gcomm://ipnodo2 --user=mysql --wsrep_sst_auth=root:root --wsrep_provider=/usr/lib64/galera/libgalera_smm.so
>       fi
>     fi
>
>   fi
> fi
> --------------------------------
>
> This is the error I get in the slave's output when I run the script by hand.
>
> --------------------------------
> + /usr/sbin/mysqld --wsrep_cluster_address=gcomm://IPMASTER --user=mysql --wsrep_sst_auth=root:root --wsrep_provider=/usr/lib64/galera/libgalera_smm.so
> 131211 20:04:35 [Note] WSREP: Read nil XID from storage engines, skipping position init
> 131211 20:04:35 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
> 131211 20:04:35 [Note] WSREP: wsrep_load(): Galera 23.2.7(r157) by Codership Oy <i...@codership.com> loaded succesfully.
> 131211 20:04:35 [Note] WSREP: Found saved state: 1503cc31-6281-11e3-abfc-5bf96ca010d8:-1
> 131211 20:04:35 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
> 131211 20:04:35 [Note] WSREP: Passing config to GCS: base_host = IPLOCALHOST; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
> 131211 20:04:35 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
> 131211 20:04:35 [Note] WSREP: wsrep_sst_grab()
> 131211 20:04:35 [Note] WSREP: Start replication
> 131211 20:04:35 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
> 131211 20:04:35 [Note] WSREP: protonet asio version 0
> 131211 20:04:35 [Note] WSREP: backend: asio
> 131211 20:04:35 [Note] WSREP: GMCast version 0
> 131211 20:04:35 [Note] WSREP: (74e621f3-629f-11e3-a86f-0adb496b3ff6, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
> 131211 20:04:35 [Note] WSREP: (74e621f3-629f-11e3-a86f-0adb496b3ff6, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
> 131211 20:04:35 [Note] WSREP: EVS version 0
> 131211 20:04:35 [Note] WSREP: PC version 0
> 131211 20:04:35 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '162.243.62.104:'
> 131211 20:04:35 [Note] WSREP: declaring 15031b4b-6281-11e3-9ad7-8a59b1f0cba0 stable
> 131211 20:04:35 [Note] WSREP: view(view_id(NON_PRIM,15031b4b-6281-11e3-9ad7-8a59b1f0cba0,8) memb {
>     15031b4b-6281-11e3-9ad7-8a59b1f0cba0,
>     74e621f3-629f-11e3-a86f-0adb496b3ff6,
> } joined {
> } left {
> } partitioned {
>     18299990-6281-11e3-a268-975810126780,
>     a6d47aea-6281-11e3-b007-16991d53e685,
>     d52af178-6281-11e3-871d-03c1e15f1ac4,
> })
> 131211 20:05:05 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
>     at gcomm/src/pc.cpp:connect():139
> 131211 20:05:05 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():195: Failed to open backend connection: -110 (Connection timed out)
> 131211 20:05:05 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1289: Failed to open channel 'my_wsrep_cluster' at 'gcomm://IPMASTER': -110 (Connection timed out)
> 131211 20:05:05 [ERROR] WSREP: gcs connect failed: Connection timed out
> 131211 20:05:05 [ERROR] WSREP: wsrep::connect() failed: 6
> 131211 20:05:05 [ERROR] Aborting
>
> 131211 20:05:05 [Note] WSREP: Service disconnected.
> 131211 20:05:06 [Note] WSREP: Some threads may fail to exit.
> 131211 20:05:06 [Note] /usr/sbin/mysqld: Shutdown complete
> --------------------------------
>
> When the script starts the master from the crontab, the PID is already up
> and I can connect to MariaDB, but I can't write to any database, only list
> the databases.
>
> [root@xxx ~]# mysql -uxxx -pxxxx
> Welcome to the MariaDB monitor. Commands end with ; or \g.
> Your MariaDB connection id is 24
> Server version: 5.5.33a-MariaDB
>
> Copyright (c) 2000, 2013, Oracle, Monty Program Ab and others.
>
> Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
>
> MariaDB [(none)]> create database TESTEOKILL2;
> ERROR 1047 (08S01): Unknown command
> MariaDB [(none)]>
>
> Any idea? Thanks in advance!
> --------------------------------
>
> To view or answer this question please visit:
> http://mariadb.com/kb/en/simple-cluster-two-nodes-error/
>
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-discuss
> Post to : maria-discuss@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~maria-discuss
> More help : https://help.launchpad.net/ListHelp

--
Roberto Spadim
SPAEmpresarial