Hi Sven,
You can use the ListVolumeStats API call (I've included an example request and
response below).
Keep in mind that this call goes over the management network. If your
management network is down but your storage network is up, the call could fail
even though your VMs still have perfectly good access to their volumes.
Talk to you later!
Mike
Request:
{
  "method": "ListVolumeStats",
  "params": {
    "volumeIDs": [1, 2]
  },
  "id": 1
}
Response:
{
  "id": 1,
  "result": {
    "volumeStats": [
      {
        "accountID": 1,
        "actualIOPS": 14,
        "asyncDelay": null,
        "averageIOPSize": 13763,
        "burstIOPSCredit": 0,
        "clientQueueDepth": 0,
        "desiredMetadataHosts": null,
        "latencyUSec": 552,
        "metadataHosts": {
          "deadSecondaries": [],
          "liveSecondaries": [],
          "primary": 5
        },
        "nonZeroBlocks": 10962174,
        "normalizedIOPS": 34,
        "readBytes": 747306804224,
        "readBytesLastSample": 0,
        "readLatencyUSec": 0,
        "readLatencyUSecTotal": 11041939920,
        "readOps": 19877559,
        "readOpsLastSample": 0,
        "samplePeriodMSec": 500,
        "throttle": 0,
        "timestamp": "2020-06-02T17:14:35.444789Z",
        "unalignedReads": 2176454,
        "unalignedWrites": 1438822,
        "volumeAccessGroups": [
          1
        ],
        "volumeID": 1,
        "volumeSize": 2147483648000,
        "volumeUtilization": 0.002266666666666667,
        "writeBytes": 3231402834432,
        "writeBytesLastSample": 106496,
        "writeLatencyUSec": 552,
        "writeLatencyUSecTotal": 44174792405,
        "writeOps": 340339085,
        "writeOpsLastSample": 7,
        "zeroBlocks": 513325826
      },
      {
        "accountID": 1,
        "actualIOPS": 0,
        "asyncDelay": null,
        "averageIOPSize": 11261,
        "burstIOPSCredit": 0,
        "clientQueueDepth": 0,
        "desiredMetadataHosts": null,
        "latencyUSec": 0,
        "metadataHosts": {
          "deadSecondaries": [],
          "liveSecondaries": [],
          "primary": 5
        },
        "nonZeroBlocks": 28816654,
        "normalizedIOPS": 0,
        "readBytes": 778768996864,
        "readBytesLastSample": 0,
        "readLatencyUSec": 0,
        "readLatencyUSecTotal": 7068679159,
        "readOps": 14977610,
        "readOpsLastSample": 0,
        "samplePeriodMSec": 500,
        "throttle": 0,
        "timestamp": "2020-06-02T17:14:35.445978Z",
        "unalignedReads": 890959,
        "unalignedWrites": 358758,
        "volumeAccessGroups": [
          1
        ],
        "volumeID": 2,
        "volumeSize": 2147483648000,
        "volumeUtilization": 0,
        "writeBytes": 8957684071424,
        "writeBytesLastSample": 0,
        "writeLatencyUSec": 0,
        "writeLatencyUSecTotal": 16780712096,
        "writeOps": 406101472,
        "writeOpsLastSample": 0,
        "zeroBlocks": 495471346
      }
    ]
  }
}
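
As a rough illustration of how the request and response above could be used to
check for volume activity, here is a minimal Python sketch. It is only a sketch
under assumptions: the helper names (build_list_volume_stats_request,
total_ops, active_volumes) are made up for this example, it assumes readOps and
writeOps are cumulative counters (which the large values in the sample
suggest), and it leaves out how the request is actually sent to the cluster.
Note that one request can carry several volume IDs, so polling does not have to
be one call per volume.

```python
import json


def build_list_volume_stats_request(volume_ids, request_id=1):
    """Build one ListVolumeStats request body covering several volumes."""
    return json.dumps({
        "method": "ListVolumeStats",
        "params": {"volumeIDs": list(volume_ids)},
        "id": request_id,
    })


def total_ops(response):
    """Map volumeID -> readOps + writeOps for every volume in a response."""
    return {
        stats["volumeID"]: stats["readOps"] + stats["writeOps"]
        for stats in response["result"]["volumeStats"]
    }


def active_volumes(previous, current):
    """Return sorted IDs of volumes whose op counters grew between two polls.

    Volumes missing from the previous poll are treated as "unknown" and are
    not reported as active.
    """
    before = total_ops(previous)
    after = total_ops(current)
    return sorted(
        vid for vid, ops in after.items() if ops > before.get(vid, ops)
    )


if __name__ == "__main__":
    # First poll: counters taken from the example response (trimmed).
    first = {"id": 1, "result": {"volumeStats": [
        {"volumeID": 1, "readOps": 19877559, "writeOps": 340339085},
        {"volumeID": 2, "readOps": 14977610, "writeOps": 406101472},
    ]}}
    # Second poll: hypothetical values, volume 1 has seen 7 more writes.
    second = {"id": 2, "result": {"volumeStats": [
        {"volumeID": 1, "readOps": 19877559, "writeOps": 340339092},
        {"volumeID": 2, "readOps": 14977610, "writeOps": 406101472},
    ]}}
    print(build_list_volume_stats_request([1, 2]))
    print(active_volumes(first, second))
```

Comparing two polls rather than reading a single sample avoids mistaking a
briefly idle VM for a dead one, but how long to wait between polls is a
judgment call this sketch does not try to answer.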
On 6/2/20, 9:11 AM, "Sven Vogel" <[email protected]> wrote:
Hi Paul,
Thanks for the answer and the help.
OK. As I understand it, secondary storage is not a good solution.
> 1. HAManager
> 2. HighAvailbilityManager
> 3. KVMHAConfig
Which of the three should we extend, and which one should be active?
@Mike, do you know whether there is a way to check for volume activity?
Maybe we can poll the API, but I think that would mean a massive amount of
polling (overload) if we poll for each volume.
At the moment I don't have any idea how this could work.
Cheers
Sven
__
Sven Vogel
Lead Cloud Solution Architect
EWERK DIGITAL GmbH
Brühl 24, D-04109 Leipzig
P +49 341 42649 - 99
F +49 341 42649 - 98
[email protected]
www.ewerk.com
> On 01.06.2020 at 19:30, Paul Angus <[email protected]> wrote:
>
> Hi Sven,
>
> I think that there is a piece of the jigsaw that you are missing.
>
> Given that the only thing we know is that we can no longer communicate with
> the host agent, CloudStack must determine whether the guest VMs are still
> running on the host in order to avoid split brain/corruption of VMs. The only
> way we can do that is to look for disk activity created by those VMs.
>
> Using a secondary storage heartbeat would give a false 'host is down' if,
> say, a switch carrying secondary storage and management traffic went down.
>
> With regard to SolidFire, you could poll SolidFire via its API for activity
> on the volumes which belong to the VMs on the unresponsive host. I don't know
> if there is an equivalent for Ceph.
>
> Kind regards
>
>
> Paul Angus
>
>
>
> [email protected]
> www.shapeblue.com
> 3 London Bridge Street, 3rd floor, News Building, London SE1 9SGUK
> @shapeblue
>
>
>
>
> -----Original Message-----
> From: Sven Vogel <[email protected]>
> Sent: 01 June 2020 12:30
> To: dev <[email protected]>
> Subject: Managed Storage and HA
>
> Hi Community,
>
> I am trying to understand how HA works. Our goal is to make it usable with
> managed storage (NetApp SolidFire; maybe it can work with Ceph too), if that
> is possible.
>
> This is a good guide, and over time we have fixed and added the missing
> keys.
>
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/High+Availability+Developer%27s+Guide
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Host+HA
>
> In the database I found that there are three different types of HA.
>
> If you query the configuration table with "SELECT * FROM `configuration`
> WHERE `component` LIKE '%ha%' LIMIT 0,1000;" you will get three types of
> components.
>
> 1. HAManager
> 2. HighAvailbilityManager
> 3. KVMHAConfig
>
> "HAManager" and "HighAvailbilityManager" are the base, which Rohit extended
> with "KVMHAConfig" (KVM with STONITH fencing).
>
> I understand that all of these work together, but maybe I need to understand
> the process a little better.
>
> ------------------------------------------------------------------------------------
> To clarify, I have written down what I think about each of them. This is my
> understanding, but please correct me or help me to understand it a little
> better.
>
> I found that if we use managed storage, a restart of virtual machines only
> works on the same host. As I understand it, this is due to the missing
> heartbeat file on shared storage, because we don't have shared storage like
> NFS.
>
> —
> "If the network ping investigation returns that it cannot detect the
status of the host, CloudStack HA then relies on the hypervisor specific
investigation. For VMware, there is no such investigation as the hypervisor
host handles its own HA. For XenServer and KVM, CloudStack HA deploys a
monitoring script that writes the current timestamp on to a heartbeat file on
shared storage."
> —
>
> And
>
> —
> For the initial release, only KVM with NFS storage will be supported.
However, the storage check component will be implemented in a modular fashion
allowing for checks using other storage platforms(e.g. Ceph) in the future.
> —
>
------------------------------------------------------------------------------------
>
> We would implement a plugin or extend this for managed storage, but at the
> moment I need to understand where this should happen. Since managed storage
> uses a different volume for each VM, it is not easy to build a storage
> heartbeat like the one for NFS. One missing volume doesn't mean the whole
> storage has a problem, so I think it is not easy to infer the state of the
> complete storage from a single volume.
>
> We don't use KVMHAConfig at the moment, and we have found that if a host goes
> down (offline), the virtual machines are not restarted on another host. They
> are only restarted on the same host once it comes back (online). We don't
> want hard fencing of the hosts, but we do want a correct determination of
> whether the host is still alive. Fencing would maybe be a little harsh in our
> case, because we don't have hard data corruption across the entire storage.
>
> Some questions:
> 1. Are we correct in assuming that HA doesn't work without shared storage
> and network ping? Is this the reason why our virtual machines are not
> restarted on another host, or do we have a configuration problem?
> 2. Where could the plugin be implemented? Is there a preferred place?
> 3. If point 1 is correct, I thought the idea would be to add a global flag
> to use the secondary storage (NFS) as a heartbeat to find out whether any
> host is inactive.
>
> Thanks and Cheers
>
> Sven
>
>