I use different collection schedules depending on the importance of the data I'm collecting. You can adjust yours as needed.

For your IO issues you can simply poll your machine's load statistics from the 1/5/15 minute averages. That will not give you precise intervals, but it will certainly tell you if something is going wrong.
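
As a rough illustration of polling the load averages, here is a minimal Python sketch. It assumes a Linux host, where /proc/loadavg starts with the 1-, 5- and 15-minute averages and where tasks blocked on disk IO count toward the load. The function names and the 1.5-per-CPU threshold are hypothetical choices for the example, not anything Zabbix itself uses.

```python
def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = (float(field) for field in text.split()[:3])
    return one, five, fifteen


def load_alerts(averages, cpu_count, factor=1.5):
    """Return the windows (in minutes) whose average exceeds factor * cpu_count.

    The factor of 1.5 is an arbitrary example threshold; tune it per host.
    """
    limit = factor * cpu_count
    return [window for window, value in zip((1, 5, 15), averages) if value > limit]
```

On a real host you would feed it the file content, for example load_alerts(parse_loadavg(open("/proc/loadavg").read()), os.cpu_count()), and run it from whatever scheduler or monitoring agent you already have.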

To be honest, Zabbix is flexible enough to get you what you need even if you're not monitoring the metric directly. Anything you can do to raise system visibility is good stuff!

Enjoy!

On 2024-02-13 14:32, Jorge Visentini wrote:
Hi.

I also use Zabbix here. Its weakness is collecting metrics in real time; that is not what it is designed for. There are alternatives such as Elasticsearch + Metricbeat, but from what I've tested that stack is very heavy and uses a lot of disk space lol.

I have never used Prometheus, but it looks interesting. I'll do some tests.

@Patrick Dubois How often do you collect information with Zabbix? Every minute? I ask because, for the data to be usable for analysis, the IO load has to last at least one continuous minute so that Zabbix can collect the correct information.

Cheers!

On Tue, Feb 13, 2024 at 13:44, Patrick Dubois via Users <[email protected]> wrote:

    For detailed monitoring I use Zabbix. This way I get detailed metrics
    on my hypervisors, VMs as well as my network storage.

    If a machine starts generating large IO I get alerts highlighting the
    responsible machine as well as the impacted services. For example, you
    might get high IO on a VM but also the correlated high latency on
    systems sharing the storage.

    Sometimes users will report the high latency, masking the real
    problem, so it's nice to have a holistic view of the entire environment.

    Patrick.Dubois

    On 2024-02-13 11:19, marek wrote:
    > hi,
    >
    > i have prometheus based ovirt hosts monitoring (node_exporter,
    > smartctl_exporter, ipmi_exporter)
    >
    > https://prometheus-community.github.io/ansible/branch/main/ and alerts
    > from https://samber.github.io/awesome-prometheus-alerts/
    >
    > after i started this monitoring i found that one VM is overloading
    > local storage (so i must check IO limiting documentation as a homework
    > :) )
    >
    > but my question is
    >
    > how do you monitor IO traffic per VM? (IOPS, read/write traffic,..)
    >
    > some qemu/libvirt exporter? some custom text file + node_exporter?
    >
    > thanks for tips
    >
    > Marek
    > _______________________________________________
    > Users mailing list -- [email protected]
    > To unsubscribe send an email to [email protected]
    > Privacy Statement: https://www.ovirt.org/privacy-policy.html
    > oVirt Code of Conduct:
    > https://www.ovirt.org/community/about/community-guidelines/
    > List Archives:
    > https://lists.ovirt.org/archives/list/[email protected]/message/6HVHFX464QJPJTVXUFCF7RAGAUFD33HE/



--
Att,
Jorge Visentini
+55 55 98432-9868