Do we have a framework to do this kind of locking in ZK? I mean, you said "create a new InterProcessSemaphoreMutex which handles the locking mechanism". It feels like we would have to continue opening and closing this transaction manually, which is what causes a lot of our headaches with transactions (it is not entirely the fault of MySQL locks, but of our code structure).
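
To make the concern about manual open/close concrete: one possible way around it would be a small helper that takes the critical section as a lambda and always releases the lock in a finally block. This is only a minimal sketch under stated assumptions. ClusterLock, LocalLock and LockTemplate are hypothetical names, not existing CloudStack or Curator types, and LocalLock is a JVM-local, non-reentrant stand-in (a Semaphore) so the example runs without ZooKeeper. A ZK-backed implementation could presumably delegate to Curator's InterProcessSemaphoreMutex, whose acquire(long, TimeUnit) and release() methods have the same shape.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical lock abstraction (an assumption, not a CloudStack or Curator type). */
interface ClusterLock {
    /** Tries to take the lock; returns false if the timeout elapses first. */
    boolean acquire(long timeout, TimeUnit unit) throws Exception;

    /** Releases a lock previously taken with acquire. */
    void release() throws Exception;
}

/** JVM-local, non-reentrant stand-in so the sketch runs without ZooKeeper. */
final class LocalLock implements ClusterLock {
    private final Semaphore permit = new Semaphore(1);

    @Override
    public boolean acquire(long timeout, TimeUnit unit) throws InterruptedException {
        return permit.tryAcquire(timeout, unit);
    }

    @Override
    public void release() {
        permit.release();
    }
}

/** Runs a critical section under a lock and always releases it afterwards. */
final class LockTemplate {
    private LockTemplate() {
    }

    static <T> T withLock(ClusterLock lock, long timeout, TimeUnit unit, Supplier<T> body)
            throws Exception {
        if (!lock.acquire(timeout, unit)) {
            throw new IllegalStateException("could not acquire lock within timeout");
        }
        try {
            // The caller's code runs here; it never touches acquire/release itself.
            return body.get();
        } finally {
            // Released even when the body throws.
            lock.release();
        }
    }
}
```

With something like this, calling code would only write LockTemplate.withLock(lock, 5, TimeUnit.SECONDS, () -> reserveCapacity(host)), where reserveCapacity is again a hypothetical method, and the acquire/release pairing would live in one place.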

On Mon, Dec 18, 2017 at 7:47 AM, Marc-Aurèle Brothier <ma...@exoscale.ch> wrote:

> We added a ZK lock to fix this issue, but we will remove all current locks
> in favor of the ZK one. The ZK lock is already encapsulated in a project
> with an interface, but more work should be done to have a proper interface
> for locks which could be implemented with the "tool" you want, either a DB
> lock for simplicity, or ZK for more advanced scenarios.
>
> @Daan you will need to add the ZK libraries in CS and have a running ZK
> server somewhere. The configuration value is read from
> server.properties. If the line is empty, the ZK client is not created and
> any lock request will immediately return (not holding any lock).
>
> @Rafael: ZK is pretty easy to set up and keep running, as long as you don't
> put too much data in it. Regarding our scenario here, with only locks, it's
> easy. ZK would only be the gatekeeper to locks in the code, ensuring that
> multiple JVMs can request a true lock.
> From the code point of view, you open a connection to a ZK node (any
> of a cluster) and you create a new InterProcessSemaphoreMutex which handles
> the locking mechanism.
>
> On Mon, Dec 18, 2017 at 10:24 AM, Ivan Kudryavtsev <kudryavtsev...@bw-sw.com> wrote:
>
> > Rafael,
> >
> > - It's easy to configure and run ZK either as a single node or as a cluster
> > - Zookeeper should replace the MySQL locking mechanism used inside ACS code
> >   (places where ACS locks tables or rows).
> >
> > On the other hand, I don't think that moving from MySQL locks to ZK locks
> > is an easy and light (or even implementable) path.
> >
> > 2017-12-18 16:20 GMT+07:00 Rafael Weingärtner <rafaelweingart...@gmail.com>:
> >
> > > How hard is it to configure Zookeeper and get everything up and running?
> > > BTW: what would Zookeeper be managing? CloudStack management servers or
> > > MySQL nodes?
> > >
> > > On Mon, Dec 18, 2017 at 7:13 AM, Ivan Kudryavtsev <kudryavtsev...@bw-sw.com> wrote:
> > >
> > > > Hello, Marc-Aurele, I strongly believe that all MySQL locks should be
> > > > removed in favour of a true DLM solution like Zookeeper. The performance
> > > > of a 3-node ZK ensemble should be enough to hold up to 1000-2000 locks per
> > > > second, and it helps to move to a truly clustered MySQL like Galera without
> > > > a single master server.
> > > >
> > > > 2017-12-18 15:33 GMT+07:00 Marc-Aurèle Brothier <ma...@exoscale.ch>:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I was wondering how many of you are running CloudStack with a cluster
> > > > > of management servers. I would think most of you, but it would be nice
> > > > > to hear everyone's voice. And do you get hosts going over their capacity
> > > > > limits?
> > > > >
> > > > > We discovered that during VM allocation, if you get a lot of parallel
> > > > > requests to create new VMs, most notably with large profiles, the
> > > > > capacity increase is done too long after the host capacity checks, which
> > > > > results in hosts going over their capacity limits. To detail the steps:
> > > > > the deployment planner checks for cluster/host capacity and picks one
> > > > > deployment plan (zone, cluster, host). The plan is stored in the database
> > > > > under a VMwork job, and another thread picks up that entry and starts the
> > > > > deployment, increasing the host capacity and sending the commands. Here
> > > > > there's a time gap of a couple of seconds between the host being picked
> > > > > and the capacity increase for that host, which is more than enough to go
> > > > > over the capacity on one or more hosts. A few VMwork jobs can be added to
> > > > > the DB queue targeting the same host before one gets picked up.
> > > > >
> > > > > To fix this issue, we're using Zookeeper to act as the multi-JVM lock
> > > > > manager thanks to the Curator library
> > > > > (https://curator.apache.org/curator-recipes/shared-lock.html). We also
> > > > > changed the time when the capacity is increased, which now occurs pretty
> > > > > much right after the deployment plan is found and inside the Zookeeper
> > > > > lock. This ensures we don't go over the capacity of any host, and it has
> > > > > proven effective for a month in our management server cluster.
> > > > >
> > > > > This adds another potential requirement which should be discussed before
> > > > > proposing a PR. Today the code works seamlessly without ZK too, to ensure
> > > > > it's not a hard requirement, for example in a lab.
> > > > >
> > > > > Comments?
> > > > >
> > > > > Kind regards,
> > > > > Marc-Aurèle
> > > >
> > > > --
> > > > With best regards, Ivan Kudryavtsev
> > > > Bitworks Software, Ltd.
> > > > Cell: +7-923-414-1515
> > > > WWW: http://bitworks.software/ <http://bw-sw.com/>
> > >
> > > --
> > > Rafael Weingärtner
> >
> > --
> > With best regards, Ivan Kudryavtsev
> > Bitworks Software, Ltd.
> > Cell: +7-923-414-1515
> > WWW: http://bitworks.software/ <http://bw-sw.com/>

--
Rafael Weingärtner