BTW, if you are not using the LVM Datastore, just replace LVM_SIZE_CMD with:

    LVM_SIZE_CMD=""

We are looking for a better method to handle this:
http://dev.opennebula.org/issues/2912

Note that this is not a change made because of the new monitoring system, but
because of the need to monitor the Datastore size. Also, any change made in
monitor_ds.sh can be propagated with `onehost sync` and versioned with the
VERSION attribute:
http://docs.opennebula.org/4.6/administration/hosts_and_clusters/host_guide.html#sync

On Wed, Jul 30, 2014 at 4:45 PM, Ruben S. Montero <[email protected]> wrote:

> Hi,
>
> 1.- monitor_ds.sh may use LVM commands (vgdisplay) that need sudo access.
> This should be set up automatically by the opennebula-node packages.
>
> 2.- It is not a real daemon; the first time a host is monitored, a process
> is left behind to periodically send information. OpenNebula restarts it if
> no information is received in 3 monitor steps. Nothing needs to be set up...
>
> Cheers
>
>
> On Wed, Jul 30, 2014 at 3:50 PM, Steven Timm <[email protected]> wrote:
>
>> On Wed, 30 Jul 2014, Ruben S. Montero wrote:
>>
>>> Maybe you could try to execute the monitor probes on the node:
>>>
>>> 1. ssh to the node
>>> 2. Go to /var/tmp/one/im
>>> 3. Execute run_probes kvm-probes
>>>
>>
>> When I do that (using sh -x), I get the following:
>>
>> -bash-4.1$ sh -x ./run_probes kvm-probes
>> ++ dirname ./run_probes
>> + source ./../scripts_common.sh
>> ++ export LANG=C
>> ++ LANG=C
>> ++ export PATH=/bin:/sbin:/usr/bin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
>> ++ PATH=/bin:/sbin:/usr/bin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
>> ++ AWK=awk
>> ++ BASH=bash
>> ++ CUT=cut
>> ++ DATE=date
>> ++ DD=dd
>> ++ DF=df
>> ++ DU=du
>> ++ GREP=grep
>> ++ ISCSIADM=iscsiadm
>> ++ LVCREATE=lvcreate
>> ++ LVREMOVE=lvremove
>> ++ LVRENAME=lvrename
>> ++ LVS=lvs
>> ++ LN=ln
>> ++ MD5SUM=md5sum
>> ++ MKFS=mkfs
>> ++ MKISOFS=genisoimage
>> ++ MKSWAP=mkswap
>> ++ QEMU_IMG=qemu-img
>> ++ RADOS=rados
>> ++ RBD=rbd
>> ++ READLINK=readlink
>> ++ RM=rm
>> ++ SCP=scp
>> ++ SED=sed
>> ++ SSH=ssh
>> ++ SUDO=sudo
>> ++ SYNC=sync
>> ++ TAR=tar
>> ++ TGTADM=tgtadm
>> ++ TGTADMIN=tgt-admin
>> ++ TGTSETUPLUN=tgt-setup-lun-one
>> ++ TR=tr
>> ++ VGDISPLAY=vgdisplay
>> ++ VMKFSTOOLS=vmkfstools
>> ++ WGET=wget
>> +++ uname -s
>> ++ '[' xLinux = xLinux ']'
>> ++ SED='sed -r'
>> +++ basename ./run_probes
>> ++ SCRIPT_NAME=run_probes
>> + export LANG=C
>> + LANG=C
>> + HYPERVISOR_DIR=kvm-probes.d
>> + ARGUMENTS=kvm-probes
>> ++ dirname ./run_probes
>> + SCRIPTS_DIR=.
>> + cd .
>> ++ '[' -d kvm-probes.d ']'
>> ++ run_dir kvm-probes.d
>> ++ cd kvm-probes.d
>> +++ ls architecture.sh collectd-client-shepherd.sh cpu.sh kvm.rb monitor_ds.sh name.sh poll.sh version.sh
>> ++ for i in '`ls *`'
>> ++ '[' -x architecture.sh ']'
>> ++ ./architecture.sh kvm-probes
>> ++ EXIT_CODE=0
>> ++ '[' x0 '!=' x0 ']'
>> ++ for i in '`ls *`'
>> ++ '[' -x collectd-client-shepherd.sh ']'
>> ++ ./collectd-client-shepherd.sh kvm-probes
>> ++ EXIT_CODE=0
>> ++ '[' x0 '!=' x0 ']'
>> ++ for i in '`ls *`'
>> ++ '[' -x cpu.sh ']'
>> ++ ./cpu.sh kvm-probes
>> ++ EXIT_CODE=0
>> ++ '[' x0 '!=' x0 ']'
>> ++ for i in '`ls *`'
>> ++ '[' -x kvm.rb ']'
>> ++ ./kvm.rb kvm-probes
>> ++ EXIT_CODE=0
>> ++ '[' x0 '!=' x0 ']'
>> ++ for i in '`ls *`'
>> ++ '[' -x monitor_ds.sh ']'
>> ++ ./monitor_ds.sh kvm-probes
>> [sudo] password for oneadmin:
>>
>> and it stays hung at the password prompt for oneadmin.
>>
>> What's going on?
>>
>> Also, you mentioned a collectd -- are you saying that OpenNebula 4.6 now
>> needs to run a daemon on every single VM host? Where is it documented
>> how to set it up?
>>
>> Steve
>>
>>> Make sure you do not have a host using the same hostname fgtest14 and
>>> running a collectd process
>>>
>>> On Jul 29, 2014 4:35 PM, "Steven Timm" <[email protected]> wrote:
>>>
>>> I am still trying to debug a nasty monitoring inconsistency.
>>>
>>> -bash-4.1$ onevm list | grep fgtest14
>>>   26 oneadmin oneadmin fgt6x4-26    runn    6   4G fgtest14 117d 19h50
>>>   27 oneadmin oneadmin fgt5x4-27    runn   10   4G fgtest14 117d 17h57
>>>   28 oneadmin oneadmin fgt1x1-28    runn   10 4.1G fgtest14 117d 16h59
>>>   30 oneadmin oneadmin fgt5x1-30    runn    0   4G fgtest14 116d 23h50
>>>   33 oneadmin oneadmin ip6sl5vda-33 runn    6   4G fgtest14 116d 19h57
>>> -bash-4.1$ onehost list
>>>   ID NAME     CLUSTER RVM   ALLOCATED_CPU   ALLOCATED_MEM STAT
>>>    3 fgtest11 ipv6      0    0 / 400 (0%)  0K / 15.7G (0%)  on
>>>    4 fgtest12 ipv6      0    0 / 400 (0%)  0K / 15.7G (0%)  on
>>>    7 fgtest13 ipv6      0    0 / 800 (0%)  0K / 23.6G (0%)  on
>>>    8 fgtest14 ipv6      5    0 / 800 (0%)  0K / 23.6G (0%)  on
>>>    9 fgtest20 ipv6      3 300 / 800 (37%) 12G / 31.4G (38%) on
>>>   11 fgtest19 ipv6      0    0 / 800 (0%)  0K / 31.5G (0%)  on
>>> -bash-4.1$ onehost show 8
>>> HOST 8 INFORMATION
>>> ID                    : 8
>>> NAME                  : fgtest14
>>> CLUSTER               : ipv6
>>> STATE                 : MONITORED
>>> IM_MAD                : kvm
>>> VM_MAD                : kvm
>>> VN_MAD                : dummy
>>> LAST MONITORING TIME  : 07/29 09:25:45
>>>
>>> HOST SHARES
>>> TOTAL MEM             : 23.6G
>>> USED MEM (REAL)       : 876.4M
>>> USED MEM (ALLOCATED)  : 0K
>>> TOTAL CPU             : 800
>>> USED CPU (REAL)       : 0
>>> USED CPU (ALLOCATED)  : 0
>>> RUNNING VMS           : 5
>>>
>>> LOCAL SYSTEM DATASTORE #102 CAPACITY
>>> TOTAL:                : 548.8G
>>> USED:                 : 175.3G
>>> FREE:                 : 345.6G
>>>
>>> MONITORING INFORMATION
>>> ARCH="x86_64"
>>> CPUSPEED="2992"
>>> HOSTNAME="fgtest14.fnal.gov"
>>> HYPERVISOR="kvm"
>>> MODELNAME="Intel(R) Xeon(R) CPU E5450 @ 3.00GHz"
>>> NETRX="234844577"
>>> NETTX="21553126"
>>> RESERVED_CPU=""
>>> RESERVED_MEM=""
>>> VERSION="4.6.0"
>>>
>>> VIRTUAL MACHINES
>>>
>>>   ID USER     GROUP    NAME         STAT UCPU UMEM HOST     TIME
>>>   26 oneadmin oneadmin fgt6x4-26    runn    6   4G fgtest14 117d 19h50
>>>   27 oneadmin oneadmin fgt5x4-27    runn   10   4G fgtest14 117d 17h57
>>>   28 oneadmin oneadmin fgt1x1-28    runn   10 4.1G fgtest14 117d 17h00
>>>   30 oneadmin oneadmin fgt5x1-30    runn    0   4G fgtest14 116d 23h50
>>>   33 oneadmin oneadmin ip6sl5vda-33 runn    6   4G fgtest14 116d 19h57
>>> -----------------------------------------------------------------------
>>>
>>> All of this looks great, right?
>>> Just one problem: there are no VMs running on fgtest14, and there
>>> haven't been for 4 days.
>>>
>>> [root@fgtest14 ~]# virsh list
>>>  Id    Name                           State
>>> ----------------------------------------------------
>>>
>>> [root@fgtest14 ~]#
>>>
>>> -----------------------------------------------------------------------
>>> Yet the monitoring reports no errors:
>>>
>>> Tue Jul 29 09:28:10 2014 [InM][D]: Host fgtest14 (8) successfully monitored.
>>>
>>> -----------------------------------------------------------------------
>>> At the same time, there is no evidence that ONE is actually trying,
>>> or succeeding, to monitor these five VMs, yet they are still stuck in
>>> "runn", which means I can't do a onevm restart to restart them.
>>> (The VM images of these 5 VMs are still out there on the VM host, and
>>> I would like to save and restart them if I can.)
>>>
>>> What is the remotes command that ONE 4.6 would use to monitor this host?
>>> Can I do it manually and see what output I get?
>>>
>>> Are we dealing with some kind of a bug, or just a very confused system?
>>> Any help is appreciated. I have to get this sorted out before
>>> I dare deploy one4.x in production.
>>>
>>> Steve Timm
>>>
>>> ------------------------------------------------------------------
>>> Steven C. Timm, Ph.D  (630) 840-8525
>>> [email protected]  http://home.fnal.gov/~timm/
>>> Fermilab Scientific Computing Division, Scientific Computing Services Quad.
>>> Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing
>>> _______________________________________________
>>> Users mailing list
>>> [email protected]
>>> http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
>>>
>>
>> ------------------------------------------------------------------
>> Steven C. Timm, Ph.D  (630) 840-8525
>> [email protected]  http://home.fnal.gov/~timm/
>> Fermilab Scientific Computing Division, Scientific Computing Services Quad.
>> Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing
>
> --
> Ruben S. Montero, PhD
> Project co-Lead and Chief Architect
> OpenNebula - Flexible Enterprise Cloud Made Simple
> www.OpenNebula.org | [email protected] | @OpenNebula

--
Ruben S. Montero, PhD
Project co-Lead and Chief Architect
OpenNebula - Flexible Enterprise Cloud Made Simple
www.OpenNebula.org | [email protected] | @OpenNebula
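The [sudo] hang reported above matches Ruben's point 1: monitor_ds.sh runs vgdisplay through sudo, and if no passwordless rule exists for oneadmin the probe blocks at the prompt. The opennebula-node packages are expected to install such a rule; a hypothetical sketch of what it could look like (the file name, command paths, and command list here are assumptions -- check what your distribution's package actually ships):

```
# /etc/sudoers.d/opennebula -- hypothetical example, not the packaged file.
# Let oneadmin run the LVM queries used by monitor_ds.sh without a password,
# and without a tty (probes run over non-interactive ssh).
Defaults:oneadmin !requiretty
oneadmin ALL=(ALL) NOPASSWD: /sbin/vgdisplay, /sbin/lvs
```

With a rule like this in place, `./monitor_ds.sh kvm-probes` should complete without prompting; alternatively, emptying LVM_SIZE_CMD as described at the top of the thread avoids the sudo call entirely on non-LVM hosts.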
_______________________________________________
Users mailing list
[email protected]
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
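The LVM_SIZE_CMD workaround from the top of the thread can be sketched as follows. This is a hedged illustration, not the actual monitor_ds.sh shipped with OpenNebula 4.6: the `ds_size` function name, the df-based fallback, and the TOTAL/USED/FREE output layout are assumptions for the non-LVM case.

```shell
#!/bin/bash
# Sketch of the non-LVM branch: when LVM_SIZE_CMD is empty, report
# datastore capacity with plain df instead of vgdisplay (which would
# need sudo access on the node).

# Set to "" when no LVM Datastore is present; on LVM setups this would
# instead be something like "sudo vgdisplay ..." (exact flags omitted).
LVM_SIZE_CMD=""

ds_size() {
    local dir=$1
    if [ -n "$LVM_SIZE_CMD" ]; then
        # LVM path: query the volume group (requires passwordless sudo)
        $LVM_SIZE_CMD "$dir"
    else
        # Plain filesystem usage in MB: TOTAL_MB USED_MB FREE_MB
        df -BM -P "$dir" | awk 'NR==2 {gsub(/M/,""); print $2, $3, $4}'
    fi
}

# Example: sizes for a directory that exists on any host
ds_size /tmp
```

Because monitor_ds.sh lives under the remotes directory, an edit like this can be pushed to every node with `onehost sync`, as noted at the top of the thread.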
