Thanks João, Daniel,

I see João's PR. I'm not keen on removing the mysql-ha plugin if it can be
improved or fixed. What users complain about is that it doesn't work, or that
the documentation isn't clear about how to use it (or, in fact, doesn't offer
any MySQL HA plan for use with CloudStack).


Regards.

________________________________
From: João Jandre Paraquetti <j...@scclouds.com.br>
Sent: Wednesday, August 23, 2023 01:26
To: us...@cloudstack.apache.org <us...@cloudstack.apache.org>; 
dev@cloudstack.apache.org <dev@cloudstack.apache.org>
Subject: Re: [Consultation] Remove DB HA feature (db.ha.enabled)

Sure, Daniel

PR #7895 is currently in draft as we need to do some more tests.
However, the intention is to enable users to configure the DB connection
URI directly through the `db.properties` file. These are the tests that
have been done so far with ACS without the changes from this PR:

Using the current version in a setup with MariaDB and Galera, with a
cluster size of 3 and the following configuration in the db.properties file:
```
# High Availability And Cluster Properties
db.ha.enabled=true
db.ha.loadBalanceStrategy=com.cloud.utils.db.StaticStrategy
# cloud stack Database
db.cloud.replicas=192.168.201.161,192.168.201.162
db.cloud.autoReconnect=false
db.cloud.failOverReadOnly=false
db.cloud.reconnectAtTxEnd=false
db.cloud.autoReconnectForPools=true
db.cloud.secondsBeforeRetrySource=1800
db.cloud.queriesBeforeRetrySource=5000
db.cloud.initialTimeout=3600
```
When the MariaDB service stops in the main node, ACS switches to one of
the other two nodes. However, if the host is shut down, the switch never
occurs.

Then, we also did tests using the changes proposed in the PR, by
configuring the db.cloud.uri:

```
db.cloud.uri=jdbc:mariadb:sequential://192.168.201.160:3306,192.168.201.161:3306,192.168.201.162:3306/cloud?autoReconnect=true&prepStmtCacheSize=517&cachePrepStmts=true&sessionVariables=sql_mode='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION'&serverTimezone=UTC

# These properties are ignored when setting the URI manually, so there is no
# need to set them.

# High Availability And Cluster Properties
# db.ha.enabled=true
# db.ha.loadBalanceStrategy=com.cloud.utils.db.StaticStrategy
# cloud stack Database
# db.cloud.replicas=192.168.201.161,192.168.201.162
# db.cloud.autoReconnect=false
# db.cloud.failOverReadOnly=false
# db.cloud.reconnectAtTxEnd=false
# db.cloud.autoReconnectForPools=true
# db.cloud.secondsBeforeRetrySource=1800
# db.cloud.queriesBeforeRetrySource=5000
# db.cloud.initialTimeout=3600
```

I was able to configure and use the sequential failover mode. This way,
when the MariaDB service stops in the main node and even if the host is
shut down, ACS is able to switch to the other DBs.
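
To double-check a hand-written URI like the one above before putting it in `db.properties`, it can help to split it into its parts. The following is a minimal Python sketch, not part of ACS or of the MariaDB driver; the function name `parse_mariadb_uri` is hypothetical, and the default port of 3306 is an assumption applied when a host entry omits it.

```python
from urllib.parse import parse_qsl


def parse_mariadb_uri(uri):
    """Split a jdbc:mariadb URI into (mode, hosts, database, options).

    Hypothetical helper for inspecting a URI by hand; it does not
    validate the options against what the driver actually accepts.
    """
    prefix = "jdbc:mariadb:"
    if not uri.startswith(prefix):
        raise ValueError("not a jdbc:mariadb URI")
    rest = uri[len(prefix):]
    # An optional failover mode (e.g. "sequential") comes before the "//".
    mode, _, rest = rest.partition("//")
    mode = mode.rstrip(":") or None
    # Everything up to the first "/" is the comma-separated host list.
    authority, _, tail = rest.partition("/")
    database, _, query = tail.partition("?")
    hosts = []
    for entry in authority.split(","):
        host, _, port = entry.partition(":")
        # Assumption: fall back to 3306 when no port is given.
        hosts.append((host, int(port) if port else 3306))
    options = dict(parse_qsl(query, keep_blank_values=True))
    return mode, hosts, database, options


uri = (
    "jdbc:mariadb:sequential://192.168.201.160:3306,"
    "192.168.201.161:3306,192.168.201.162:3306/cloud"
    "?autoReconnect=true&serverTimezone=UTC"
)
mode, hosts, database, options = parse_mariadb_uri(uri)
print(mode, hosts, database, options)
```

The host list comes back in the order it was declared, which is the order that matters for the sequential mode.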

There are two differences between defining the URI manually (as proposed
in PR #7895) and the URI generated by ACS.
The first is the `jdbc:mariadb` prefix, which selects the driver that
makes the connection with the DBMS and enables the use of MariaDB URL
configurations. This driver is being introduced into ACS with PR #7895.
The second is the `sequential` [1] failover mode, which tries to connect
to the hosts in the order in which they were declared in the connection
URL. The first available host is used for all queries, and if that host
is shut down, the driver tries to reconnect to the next one on the list.
As this mode only connects to a single DB at a time, the problems
referenced by Rohit are avoided, while the failover mechanism is still
in place.
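
The host selection described for the sequential mode can be illustrated with a small Python sketch. This is only a model of the behaviour described above, not the driver's implementation; `pick_host` and the `is_reachable` callback are hypothetical names, and reachability is simulated with a set of hosts that are down.

```python
def pick_host(hosts, is_reachable):
    """Return the first host, in declared order, that is reachable.

    Models the sequential idea: every query goes to a single host,
    and the next host is only tried when the earlier ones are down.
    """
    for host in hosts:
        if is_reachable(host):
            return host
    raise ConnectionError("no database host is reachable")


# Hosts in the order they are declared in the connection URL.
hosts = ["192.168.201.160", "192.168.201.161", "192.168.201.162"]

# Simulate the first node being shut down.
down = {"192.168.201.160"}
active = pick_host(hosts, lambda h: h not in down)
print(active)  # the second declared host, 192.168.201.161
```

Because only one host is selected at a time, all queries go to the same node until it fails.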

Best regards,
João Jandre

[1] - https://mariadb.com/kb/en/about-mariadb-connector-j/

On 22/08/2023 16:03, Daniel Salvador wrote:
> Hello Lucian and all,
>
> I am -1 on removing the whole DB HA feature from CloudStack.
>
> As we discussed in July [1], the current properties we have on
> "db.properties" regarding DB HA are hardcoded and only address some MySQL
> properties, which are not fully compatible with the properties for
> configuring DB HA on MariaDB. It indeed has some problems; however, I think
> we should keep the functionality and improve it, to enrich CloudStack and
> avoid using other layers to accomplish the goals. It is good to have a
> workaround, though.
>
> João Jandre and I are already working on a solution to make the DB
> parameters more flexible, in order to allow one to configure DB HA properly
> when using MariaDB (and also do several other configurations). João, could
> you point to the PR that addresses the changes and share the configurations
> and tests we have done so far?
>
> Best regards,
> Daniel Salvador (gutoveronezi)
>
> [1] - https://lists.apache.org/thread/j0mmwy9dfr9k2kbnnjxcr2m7y8zwd34c
>
> On Tue, Aug 22, 2023 at 12:42 PM Nux <n...@li.nux.ro> wrote:
>
>> New adopters may not go ahead with it in production because they won't
>> get it working unless they fix a lot of code. That would be a nice pull
>> request. :)
>>
>>
>> On 2023-08-22 16:25, K B Shiv Kumar wrote:
>>> Well, if it is broken and that is not prominently mentioned anywhere, new
>>> adopters may go ahead with it in production. So I guess it is best to
>>> remove it, or at least mention that it is not production grade.
>>>
>>> Thanks
>>> Shiv
>>>
>>>
>>>> On 22-Aug-2023, at 20:12, Nux <n...@li.nux.ro> wrote:
>>>>
>>>> But what do you think of the removal of DB HA code?
>>>>
>>>> When using Galera you need to query against a single node, don't
>>>> spread the load among all 3, as this will break certain locking
>>>> functionality in Cloudstack and lead to problems.
>>>>
>>>> In a Haproxy configuration you should be keeping just one active, eg:
>>>>         server galera1 10.0.3.2:3306 check
>>>>         server galera2 10.0.3.3:3306 check backup
>>>>         server galera3 10.0.3.4:3306 check backup
>>>>
>>>> Regards
>>>>
>>>> On 2023-08-22 15:36, K B Shiv Kumar wrote:
>>>>> We faced some issues when running Galera. We went back to master
>>>>> slave.
>>>>> Anyone using Galera in production for a long time?
>>>>> Regards,
>>>>> Shiv
>>>>>> On 22-Aug-2023, at 19:34, Nux <n...@li.nux.ro> wrote:
>>>>>> Happy to contribute a doc on how to achieve HA if we decide to
>>>>>> remove this.
>>>>>> Thanks
>>>>>> On 2023-08-22 15:01, Rohit Yadav wrote:
>>>>>>> +1 it's a broken feature that at least doesn't work with MySQL 8.x,
>>>>>>> I'm not sure if it worked with prior versions of MySQL. However, we
>>>>>>> need to document some sort of suggested MySQL HA setup in our docs.
>>>>>>> Regards.
>>>>>>> ________________________________
>>>>>>> From: Nux <n...@li.nux.ro>
>>>>>>> Sent: Tuesday, August 22, 2023 18:54
>>>>>>> To: us...@cloudstack.apache.org <us...@cloudstack.apache.org>; Dev
>>>>>>> <dev@cloudstack.apache.org>
>>>>>>> Subject: [Consultation] Remove DB HA feature (db.ha.enabled)
>>>>>>> Hello everyone,
>>>>>>> A few weeks ago I asked you if you use or managed to use the DB HA
>>>>>>> Cloudstack feature (db.ha.enabled)[1] and after reading some of the
>>>>>>> replies and doing intensive testing myself I have found out that the
>>>>>>> feature is indeed non-functional, it's broken.
>>>>>>> In my testing I discovered DB HA can easily be done outside of
>>>>>>> Cloudstack by employing load balancers and other techniques.
>>>>>>> Personally I have achieved that by using Haproxy in front of Galera
>>>>>>> cluster, but also introduced Keepalived (vrrp) in my setup to "balance"
>>>>>>> multiple Haproxies which also worked well.
>>>>>>> As such, since the feature is basically broken, it will not be trivial
>>>>>>> to fix it and there are better ways of doing HA, then I propose to
>>>>>>> remove it altogether.
>>>>>>> Thoughts? Anyone against it?
>>>>>>> Cheers
>>>>>>> [1] - https://docs.cloudstack.apache.org/en/latest/adminguide/reliability.html#database-high-availability
