We have done more investigations. An error happens when the script /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh is executed:
MGMT LOGS: 2016-08-29 16:00:11,342 DEBUG [c.c.a.t.Request] (AgentManager-Handler-12:null) (logid:) Seq 8-1837187172990452475: Processing: { Ans: , MgmtId: 90520741415395, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.GetRouterAlertsAnswer":{"result":false,"details":"/usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: 1: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: h!/bin/bash: not found","wait":0}}] } 2016-08-29 16:00:11,345 DEBUG [c.c.a.t.Request] (RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Seq 8-1837187172990452475: Received: { Ans: , MgmtId: 90520741415395, via: 8(kvm702), Ver: v1, Flags: 10, { GetRouterAlertsAnswer } } *2016-08-29 16:00:11,345 DEBUG [c.c.a.m.AgentManagerImpl] (RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Details from executing class com.cloud.agent.api.routing.GetRouterAlertsCommand: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: 1: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: h!/bin/bash: not found* 2016-08-29 16:00:11,345 WARN [c.c.n.r.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:ctx-9cdc1e33) (logid:5f04bf82) Unable to get alerts from router r-221-VM /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: 1: /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh: h!/bin/bash: not found 2016-08-29 16:00:11,411 INFO [o.a.c.e.o.NetworkOrchestrator] (Network-Scavenger-1:ctx-64537bea) (logid:0c5d8104) NetworkGarbageCollector uses '600' seconds for GC interval. 2016-08-29 16:00:11,416 DEBUG [o.a.c.e.o.NetworkOrchestrator] (Network-Scavenger-1:ctx-64537bea) (logid:0c5d8104) We found network 219 to be free for the first time. Adding it to the list: 1472479211 2016-08-29 16:00:12,392 DEBUG [c.c.s.StatsCollector] (StatsCollector-6:ctx-c1aeb528) (logid:a364cc05) StorageCollector is running... 2016-08-29 16:00:12,403 DEBUG [c.c.h.o.r.Ovm3HypervisorGuru] (StatsCollector-6:ctx-c1aeb528) (logid:a364cc05) getCommandHostDelegation: class com.cloud.agent.api.GetStorageStatsCommand For debugging we modified the "router_proxy.sh" slightly on the KVM Host: kvm# grep -A 2 DEBUG /usr/share/cloudstack-common/scripts/network/domr/router_proxy.sh -n 46:# DEBUG CS Upgrade *47-echo "$(date): ssh -p 3922 -q -o StrictHostKeyChecking=no -i $cert root@$domRIp /opt/cloud/bin/$script $*" >> /tmp/router.txt* 48-ssh -p 3922 -q -o StrictHostKeyChecking=no -i $cert root@$domRIp "/opt/cloud/bin/$script $*" kvm# tail /tmp/router.txt Mon Aug 29 16:10:11 CEST 2016: ssh -p 3922 -q -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud root@169.254.2.67 /opt/cloud/bin/netusage.sh -g Mon Aug 29 16:10:11 CEST 2016: ssh -p 3922 -q -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud root@169.254.2.67 /opt/cloud/bin/netusage.sh -g On the router with IP 169.254.2.67 we see the failed logins as well: root@r-221-VM:~# tail -n 2 /var/log/auth.log Aug 29 14:10:11 r-221-VM sshd[7074]: Connection closed by 169.254.0.1 [preauth] Aug 29 14:10:11 r-221-VM sshd[7075]: Connection closed by 169.254.0.1 [preauth] These commands did not succeed. If we execute the command manually we get prompted for the key passphrase. kvm# ssh -p 3922 -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.cloud root@169.254.2.67 Enter passphrase for key '/root/.ssh/id_rsa.cloud': can someone confirm that the rsa key is protected with a passhrase? this looks suspicious to us... regards martin On 08/29/2016 10:09 AM, martin kolly wrote: > thanks Simon and Rohit for the valuable inputs! After applying the > following procedure the ssl errors are gone and host state is UP. > > # service cloudstack-management stop > > mysql> delete from configuration where name = "ssl.keystore" ; > # mv /etc/cloudstack/management/cloudmanagementserver.keystore > /etc/cloudstack/management/cloudmanagementserver.keystore.old > > # service cloudstack-management start > > # file /etc/cloudstack/management/cloudmanagementserver.keystore > /etc/cloudstack/management/cloudmanagementserver.keystore: Java KeyStore > # file /etc/cloudstack/management/cloudmanagementserver.keystore.old > /etc/cloudstack/management/cloudmanagementserver.keystore.old: data > > #java -version > java version "1.7.0_111" > > However new routers are not deployed, we still see entries that KVM > hosts are unreachable (logs underneath). > > After starting a new router it is shown with "virsh list" on the KVM: > kvm# virsh list | grep r-233-VM > 111 r-233-VM running > > After some seconds the router is deleted, > > kvm# virsh list | grep r-233-VM > # > > kvm#virsh -v > 1.2.2 > > (libvirt was already restarted) > > > > ############ Logs/Info KVM ################# > > kvm # dpkg -l | grep cloudstack > ii cloudstack-agent > 4.9.0 all CloudStack agent > ii cloudstack-common > 4.9.0 all A common package > which contains files which are shared by several CloudStack packages > > kvm# df -kh | grep cloud > 10.100.12.9:/export/cloud 188G 137G 51G 73% > /mnt/5db02c19-1e8f-3591-bdb4-02608362521e > > kvm# tail -f /var/log/cloudstack/agent/agent.log > ... > 2016-08-29 06:55:06.203+0000: 4425: error : > qemuMonitorFindBalloonObjectPath:1032 : internal error: Cannot determine > balloon device path > 2016-08-29 06:55:06.223+0000: 4426: error : > qemuMonitorFindBalloonObjectPath:1032 : internal error: Cannot determine > balloon device path > 2016-08-29 06:55:24.062+0000: 4425: warning : qemuDomainObjTaint:1628 : > Domain id=112 name='r-233-VM' uuid=d9dcd37a-242d-43ac-a18e-79a4bfa86ebb > is tainted: high-privileges > 2016-08-29 06:55:49.066+0000: 4423: error : qemuMonitorIO:656 : internal > error: End of file from monitor > > > > > ############ Logs/Info Management Server ######## > > # dpkg -l | grep cloudstack > ii cloudstack-agent > 4.9.0 all CloudStack agent > ii cloudstack-common > 4.9.0 all A common package > which contains files which are shared by several CloudStack packages > ii cloudstack-management > 4.9.0 all CloudStack server library > ii cloudstack-usage > 4.9.0 all CloudStack usage monitor > > # mysql -u root cloud -e "select id,name,path from cloud.storage_pool > where pool_type='Filesystem'" > +----+----------------------+-------------------------+ > | id | name | path | > +----+----------------------+-------------------------+ > | 1 | kvm704 Local Storage | /var/lib/libvirt/images | > | 2 | kvm701 Local Storage | /var/lib/libvirt/images | > | 3 | kvm702 Local Storage | /var/lib/libvirt/images | > | 4 | kvm703 Local Storage | /var/lib/libvirt/images | > +----+----------------------+-------------------------+ > > #tail -f /var/log/cloudstack/management/management-server.log | grep -v > DEBUG > ... > 2016-08-29 08:55:42,484 WARN [o.a.c.alerts] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) alertType:: 9 // dataCenterId:: 1 // podId:: 1 // > clusterId:: null // message:: Command: > com.cloud.agent.api.GetDomRVersionCommand failed while starting virtual > router > 2016-08-29 08:55:42,494 ERROR > [c.c.n.r.VirtualNetworkApplianceManagerImpl] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) GetDomRVersionCmd failed > 2016-08-29 08:55:42,495 WARN > [c.c.n.r.VirtualNetworkApplianceManagerImpl] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) Command: com.cloud.agent.api.GetDomRVersionCommand > failed while starting virtual router > 2016-08-29 08:55:42,495 INFO [c.c.v.VirtualMachineManagerImpl] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) The guru did not like the answers so stopping > VM[DomainRouter|r-233-VM] > 2016-08-29 08:55:49,184 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] > (AsyncJobMgr-Heartbeat-1:ctx-5bd7c179) (logid:86c0376e) Begin cleanup > expired async-jobs > 2016-08-29 08:55:49,192 INFO [o.a.c.f.j.i.AsyncJobManagerImpl] > (AsyncJobMgr-Heartbeat-1:ctx-5bd7c179) (logid:86c0376e) End cleanup > expired async-jobs > 2016-08-29 08:55:50,861 ERROR [c.c.v.VirtualMachineManagerImpl] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) Failed to start instance VM[DomainRouter|r-233-VM] > com.cloud.utils.exception.ExecutionException: Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107) > at > com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753) > at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > 2016-08-29 08:55:51,136 ERROR [c.c.v.VmWorkJobHandlerProxy] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) Invocation exception, caused by: > com.cloud.exception.AgentUnavailableException: Resource [Host:8] is > unreachable: Host 8: Unable to start instance due to Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > 2016-08-29 08:55:51,136 INFO [c.c.v.VmWorkJobHandlerProxy] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739 ctx-68467c7a) > (logid:337b354e) Rethrow exception > com.cloud.exception.AgentUnavailableException: Resource [Host:8] is > unreachable: Host 8: Unable to start instance due to Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > 2016-08-29 08:55:51,137 ERROR [c.c.v.VmWorkJobDispatcher] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739) (logid:337b354e) > Unable to complete AsyncJobVO {id:2739, userId: 2, accountId: 2, > instanceType: null, instanceId: null, cmd: com.cloud.vm.VmWorkStart, > cmdInfo: > rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAAAAAAAAAACAAAAAAAAAAIAAAAAAAAA6XQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAAAAAADHcIAAAAEAAAAAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw, > cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0, > result: null, initMsid: 90520741415395, completeMsid: null, lastUpdated: > null, lastPolled: null, created: Mon Aug 29 08:55:21 CEST 2016}, job > origin:2738 > com.cloud.exception.AgentUnavailableException: Resource [Host:8] is > unreachable: Host 8: Unable to start instance due to Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1120) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107) > at > com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753) > at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: com.cloud.utils.exception.ExecutionException: Unable to > start VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in > finalizeStart, not retrying > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > ... 17 more > 2016-08-29 08:55:51,168 INFO [o.a.c.f.j.i.AsyncJobMonitor] > (Work-Job-Executor-112:ctx-b46df979 job-2738/job-2739) (logid:337b354e) > Remove job-2739 from job monitoring > 2016-08-29 08:55:53,197 INFO [o.a.c.f.j.i.AsyncJobMonitor] > (Work-Job-Executor-113:ctx-6411b64b job-2738/job-2740) (logid:db064d99) > Add job-2740 into job monitoring > 2016-08-29 08:55:53,218 INFO [o.a.c.f.j.i.AsyncJobMonitor] > (Work-Job-Executor-113:ctx-6411b64b job-2738/job-2740) (logid:337b354e) > Remove job-2740 from job monitoring > 2016-08-29 08:55:53,247 ERROR [c.c.a.ApiAsyncJobDispatcher] > (API-Job-Executor-68:ctx-d2eb7557 job-2738) (logid:337b354e) Unexpected > exception while executing > org.apache.cloudstack.api.command.admin.router.StartRouterCmd > com.cloud.exception.AgentUnavailableException: Resource [Host:8] is > unreachable: Host 8: Unable to start instance due to Unable to start > VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in finalizeStart, > not retrying > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1120) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:107) > at > com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:4753) > at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:102) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:554) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103) > at > org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) > at > org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46) > at > org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:502) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: com.cloud.utils.exception.ExecutionException: Unable to > start VM:d9dcd37a-242d-43ac-a18e-79a4bfa86ebb due to error in > finalizeStart, not retrying > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:1084) > at > com.cloud.vm.VirtualMachineManagerImpl.orchestrateStart(VirtualMachineManagerImpl.java:4592) > at sun.reflect.GeneratedMethodAccessor290.invoke(Unknown Source) > ... 17 more > 2016-08-29 08:55:53,256 INFO [o.a.c.f.j.i.AsyncJobMonitor] > (API-Job-Executor-68:ctx-d2eb7557 job-2738) (logid:337b354e) Remove > job-2738 from job monitoring > > > > > regards > Martin > > > > > On 08/26/2016 08:47 PM, Rohit Yadav wrote: >> Hi Martin, >> >> >> I checked my openssl connect output and my management server on port 8250 >> was indeed returning valid certificates. On further investigation, when >> connections to management server (NioServer, handled by Link class that >> reads /etc/cloudstack/management/cloudmanagementserver.keystore) are >> received it uses a keystore file >> (/etc/cloudstack/management/cloudmanagementserver.keystore). Please check if >> that's valid, or backup your keystore file and do what Simon has suggested. >> >> >> >> For reference, here the output from my 4.9.0 mgmt server running on Ubuntu >> 14.04.4: >> >> >> $ openssl s_client -tls1 -connect 192.168.1.11:8250 >> CONNECTED(00000003) >> depth=0 C = Unknown, O = bluebox, OU = bluebox, CN = Cloudstack User >> verify error:num=18:self signed certificate >> verify return:1 >> depth=0 C = Unknown, O = bluebox, OU = bluebox, CN = Cloudstack User >> verify return:1 >> --- >> Certificate chain >> 0 s:/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User >> i:/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User >> --- >> Server certificate >> -----BEGIN CERTIFICATE----- >> MIIDPzCCAiegAwIBAgIEFHHX2DANBgkqhkiG9w0BAQsFADBQMRAwDgYDVQQGEwdV >> bmtub3duMRAwDgYDVQQKEwdibHVlYm94MRAwDgYDVQQLEwdibHVlYm94MRgwFgYD >> VQQDEw9DbG91ZHN0YWNrIFVzZXIwHhcNMTUwNTA2MTM0MjMzWhcNMjUwNTAzMTM0 >> MjMzWjBQMRAwDgYDVQQGEwdVbmtub3duMRAwDgYDVQQKEwdibHVlYm94MRAwDgYD >> VQQLEwdibHVlYm94MRgwFgYDVQQDEw9DbG91ZHN0YWNrIFVzZXIwggEiMA0GCSqG >> SIb3DQEBAQUAA4IBDwAwggEKAoIBAQC0wN8kfOJMzwlbrOnBj/jjvjjIwDVpYLtH >> WoKkNB+rzzKEUxaYwaQxe6E3M536ZuqcaJBIqcYPwTIkWyulvuHuJpSQak4VbuDV >> f7dqt5RacLFT0jUciqTvL5QDCrk0uNugKkWgEvNtokVGSBwLPVEcdcGWpku1EpeH >> vMYmpOkcWgbC8Z9D7QTlVw6oEWbPAtKr+gDrXdOFnpPPI45rteatIIgKm1Q6JjZM >> qrUKfqt7s8ts6ZgdAN2WmtieSsnUX1su9SJMYg2J8LK7UJeGqiNtE+g944GPqtnW >> aDldjharq54e79ug2ktxw29I3ulpRD/vgxwZmcPJrePUKwY91KEjAgMBAAGjITAf >> MB0GA1UdDgQWBBSUJvLY8RL/1fTVqj1rT8136da4yzANBgkqhkiG9w0BAQsFAAOC >> AQEAgDdFIgLvOH/UgRp2nnFUVcMp+uchSLj8CbCkukJBrUwrmJHp3Os+H1ggk8Vt >> j3conj06zJBNN/E0J8pcpagE1aR+l4R8WxF3g/Oc7bNyrUlkGSQ82vavg9sEkwHY >> eQY/4wj8CprICs9JilgZ6keeWNWgAW1goLZSzGVwz5eE0lPuc2Dg3laR5RsuTxie >> dgQhpbOx3UZun+dhuP5NUHc+KWyrNvSZNN8FruO602KWZwm0Hndl7RVbkNEd0kxq >> FhFK4Scc2HBrKMUrPTzO1nGCgR1gA015C2MFfmjeW49VTi95WnY8DDG2euUYAtpl >> lLVuNxJxq7eDJfP/M9kxSIKgrQ== >> -----END CERTIFICATE----- >> subject=/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User >> issuer=/C=Unknown/O=bluebox/OU=bluebox/CN=Cloudstack User >> --- >> No client certificate CA names sent >> Server Temp Key: ECDH, P-256, 256 bits >> --- >> SSL handshake has read 1325 bytes and written 331 bytes >> --- >> New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-SHA >> Server public key is 2048 bit >> Secure Renegotiation IS supported >> Compression: NONE >> Expansion: NONE >> No ALPN negotiated >> SSL-Session: >> Protocol : TLSv1 >> Cipher : ECDHE-RSA-AES256-SHA >> Session-ID: >> 57C08E5D841D9B74AB533A0000BDBE4275424640B5B7D1985D438B275598436F >> Session-ID-ctx: >> Master-Key: >> E1DE7D7464766DCECD82088561A7D928519CA20BA91FDF4AA38BB2E51AB7C27D3F1ABA5788ADED2AD49FE8597EC4A344 >> Key-Arg : None >> PSK identity: None >> PSK identity hint: None >> SRP username: None >> Start Time: 1472237149 >> Timeout : 7200 (sec) >> Verify return code: 18 (self signed certificate) >> --- >> >> >> >> >> Regards. >> >> ________________________________ >> From: martin kolly <martin.ko...@senselan.ch> >> Sent: 26 August 2016 20:36:02 >> To: dev@cloudstack.apache.org >> Subject: Re: CS 4.9 NIO Selector wait time PR-1601 >> >> >> good point, thanks Simon! with openssl we receive a response on port 8250. >> >> # telnet 10.100.12.10 8250 >> Trying 10.100.12.10... >> Connected to 10.100.12.10. >> Escape character is '^]'. >> Connection closed by foreign host. >> >> # nc -zv 10.100.12.10 8250 >> Connection to 10.100.12.10 8250 port [tcp/*] succeeded! >> >> # openssl s_client -tls1 -connect 10.100.12.10:8250 >> CONNECTED(00000003) >> write:errno=104 >> --- >> no peer certificate available >> --- >> No client certificate CA names sent >> --- >> SSL handshake has read 0 bytes and written 0 bytes >> --- >> New, (NONE), Cipher is (NONE) >> Secure Renegotiation IS NOT supported >> Compression: NONE >> Expansion: NONE >> SSL-Session: >> Protocol : TLSv1 >> Cipher : 0000 >> Session-ID: >> Session-ID-ctx: >> Master-Key: >> Key-Arg : None >> PSK identity: None >> PSK identity hint: None >> SRP username: None >> Start Time: 1472223447 >> Timeout : 7200 (sec) >> Verify return code: 0 (ok) >> --- >> >> >> On 08/26/2016 04:49 PM, Simon Weller wrote: >>> Martin, >>> >>> >>> Are you able to actually telnet to 8250 from the host to the mgmt server? >>> >>> >>> - Si >>> >>> >>> ________________________________ >>> From: martin kolly <martin.ko...@senselan.ch> >>> Sent: Friday, August 26, 2016 9:41 AM >>> To: dev@cloudstack.apache.org >>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601 >>> >>> Hi Rohit >>> >>> We highly appreciate your efforts! Unfortunately it still does not work. >>> - ulimit is increased on mgmt server >>> - jar file replaced >>> - we confirm that cloudstack-agent 4.9.0 is installed >>> >>> MGMT Server >>> # wget >>> https://github.com/rhtyd/cloudstack/releases/download/4.9.0-nioinbound/cloud-utils-4.9.0.jar >>> -O cloud-utils-4.9.0.jar.patch >>> # md5sum cloud-utils-4.9.0.jar.patch >>> c4496f42cc6741f562ac645c3a3d8a0c cloud-utils-4.9.0.jar.patch >>> # md5sum >>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar >>> c4496f42cc6741f562ac645c3a3d8a0c >>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar >>> >>> # ulimit -a >>> core file size (blocks, -c) 0 >>> data seg size (kbytes, -d) unlimited >>> scheduling priority (-e) 0 >>> file size (blocks, -f) unlimited >>> pending signals (-i) 64109 >>> max locked memory (kbytes, -l) 64 >>> max memory size (kbytes, -m) unlimited >>> open files (-n) 10240 >>> pipe size (512 bytes, -p) 8 >>> POSIX message queues (bytes, -q) 819200 >>> real-time priority (-r) 0 >>> stack size (kbytes, -s) 8192 >>> cpu time (seconds, -t) unlimited >>> max user processes (-u) 64109 >>> virtual memory (kbytes, -v) unlimited >>> file locks (-x) unlimited >>> >>> KVM Server >>> # wget >>> https://github.com/rhtyd/cloudstack/releases/download/4.9.0-nioinbound/cloud-utils-4.9.0.jar >>> -O cloud-utils-4.9.0.jar.patch >>> # md5sum cloud-utils-4.9.0.jar.patch >>> c4496f42cc6741f562ac645c3a3d8a0c cloud-utils-4.9.0.jar.patch >>> # md5sum /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar >>> c4496f42cc6741f562ac645c3a3d8a0c >>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar >>> >>> # apt-cache policy cloudstack-agent >>> cloudstack-agent: >>> Installed: 4.9.0 >>> Candidate: 4.9.0 >>> Version table: >>> *** 4.9.0 0 >>> 500 http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ >>> ./ Packages >>> 100 /var/lib/dpkg/status >>> >>> The logs are attached. By the way: the error message "Caught the Exception >>> in VmIpFetchTask" was already there with 4.8 release. >>> >>> Thanks >>> Martin >>> >>> On 08/25/2016 07:44 PM, Rohit Yadav wrote: >>> >>> Hi Martin, >>> >>> >>> Thanks for sharing. Alright, I'm not sure what's causing issue but based on >>> the logs seems like only KVM agents are having issues while connecting to >>> mgmt server as I don't see any Nio related exceptions in the management >>> server logs. >>> >>> >>> I could not see the cloudstack-agent version in the logs, I'm assuming that >>> they were all upgraded to 4.9.0, and there are no conflicting jars at >>> /usr/share/cloudstack-agent/lib. >>> >>> >>> First, can you make sure mgmt server has enough ulimit. I found that >>> Ubuntu/Debian's init.d script don't override this while CentOS >>> initd/systemd script sets ulimit. On your mgmt server, edit >>> /etc/init.d/cloudstack-management and add ulimit -n 10240 just before the >>> mgmt server is started in the 'state' section (for me it was at around line >>> #147 where it logs a message that it's starting the cloudstack-management >>> server). >>> >>> >>> Next, if this still does not solve the issue -- I created a special >>> cloud-utils.jar for you that you need to place on your mgmt server and on >>> the KVM agents and restart the mgmt server. This will increase verbosity of >>> the error while reduce the Nio polling loop timeout (from 100ms to 10ms). >>> On KVM agents, the error from the logs is that during SSL handshake inbound >>> connection/stream gets closed, and we want to know the exception message. >>> Please get the jar from here: >>> >>> https://github.com/rhtyd/cloudstack/releases/tag/4.9.0-nioinbound and place >>> them at: >>> >>> /usr/share/cloudstack-agent/lib/ (on kvm host) >>> >>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/ (on mgmt >>> server host) >>> >>> >>> Let me know what worked for you, and if it still failed can you share the >>> mgmt server and agent logs once again. Thanks. >>> >>> >>> Regards. >>> >>> ________________________________ >>> From: martin kolly >>> <martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch> >>> Sent: 25 August 2016 20:50:08 >>> To: dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org> >>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601 >>> >>> Hi Rohit >>> >>> We are running java version 1.7.0.111 on KVM and management server. >>> mgmt# java -version >>> java version "1.7.0_111" >>> kvm# java -version >>> java version "1.7.0_111" >>> >>> We get the same error message. Attached are the logs with TRACE enabled. >>> >>> "success consists of going from failure to failure without loss of >>> enthusiasm." >>> >>> regards >>> martin >>> >>> On 08/25/2016 02:02 PM, Rohit Yadav wrote: >>> >>> Hi Martin, >>> >>> >>> Thanks for sharing, on the surface there does not seem to be any issue in >>> configuration causing the failures. I'm personally running KVM and Ubuntu >>> hosts based env without issues, I'm on Ubuntu 14.04.4 (Linux bluebox >>> 3.16.0-45-generic #60~14.04.1-Ubuntu) and java 1.7.0_79. Can you try >>> upgrading your JRE7 to latest (openjdk-7-jre, 7u111-2.6.7-0ubuntu0.14.04.3) >>> on all mgmt server and kvm hosts? >>> >>> >>> If upgrading your JRE does not help, can you increase the logging verbosity >>> for both the agent and management server (in /etc/cloudstack/{agent, >>> management} there would be a log4j file, edit that and replace DEBUG/INFO >>> with TRACE for class/keys com.cloud and org.apache.cloudstack) and re-share >>> logs when the failures occur? I want to see what additional information we >>> can get from logs when it tries to connect to host 10.100.12.10 on port: >>> 8250. >>> >>> >>> Regards. >>> >>> ________________________________ >>> From: martin kolly >>> <martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch> >>> Sent: 25 August 2016 17:11:06 >>> To: >>> dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org> >>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601 >>> >>> >>> @Simon: We have one management server with local DB. KVMs connect >>> directly to the management server without any security/loadbalancing >>> device. >>> >>> Thanks >>> Martin >>> >>> On 08/25/2016 12:41 PM, Simon Weller wrote: >>> >>> >>> Martin, >>> >>> Can you provide more detail about your haproxy setup? >>> Are you running it on separate servers, or on the management server itself? >>> >>> - Si >>> >>> Simon Weller/ENA >>> (615) 312-6068 >>> >>> rohit.ya...@shapeblue.com<mailto:rohit.ya...@shapeblue.com> >>> www.shapeblue.com<http://www.shapeblue.com> >>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK >>> @shapeblue >>> >>> >>> >>> >>> -----Original Message----- >>> From: martin kolly >>> [martin.ko...@senselan.ch<mailto:martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch>] >>> Received: Thursday, 25 Aug 2016, 5:04AM >>> To: Rohit Yadav >>> [rohit.ya...@shapeblue.com<mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com>]; >>> >>> dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org> >>> >>> [dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org>] >>> Subject: Re: CS 4.9 NIO Selector wait time PR-1601 >>> >>> >>> thanks for your reply. >>> >>> This morning we repeated the upgrade process from 4.8 to 4.9 with the >>> following repository: >>> http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/. >>> <http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/><http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/><http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/><http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/> >>> Unfortunately we run into the same issue: >>> >>> /2016-08-25 09:49:00,660 INFO [utils.nio.NioClient] (main:null) >>> (logid:) Connecting to 10.100.12.10:8250// >>> //2016-08-25 09:49:00,668 WARN [utils.nio.Link] (main:null) (logid:) >>> This SSL engine was forced to close inbound due to end of stream.// >>> //2016-08-25 09:49:00,668 ERROR [utils.nio.NioClient] (main:null) >>> (logid:) SSL Handshake failed while connecting to host: 10.100.12.10 >>> port: 8250// >>> //2016-08-25 09:49:00,668 ERROR [utils.nio.NioConnection] (main:null) >>> (logid:) Unable to initialize the threads.// >>> //java.io.IOException: SSL Handshake failed while connecting to host: >>> 10.100.12.10 port: 8250// >>> // at com.cloud.utils.nio.NioClient.init(NioClient.java:67)// >>> // at com.cloud.utils.nio.NioConnection.start(NioConnection.java:88)// >>> // at com.cloud.agent.Agent.start(Agent.java:237)// >>> // at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399)// >>> // at >>> com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367)// >>> // at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351)// >>> // at com.cloud.agent.AgentShell.start(AgentShell.java:456)// >>> // at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)// >>> // at >>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)// >>> // at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)// >>> // at java.lang.reflect.Method.invoke(Method.java:606)// >>> // at >>> org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)// >>> //2016-08-25 09:49:00,669 INFO [utils.exception.CSExceptionErrorCode] >>> (main:null) (logid:) Could not find exception: >>> com.cloud.utils.exception.NioConnectionException in error code list for >>> exceptions// >>> //2016-08-25 09:49:00,669 WARN [cloud.agent.Agent] (main:null) (logid:) >>> NIO Connection Exception >>> com.cloud.utils.exception.NioConnectionException: SSL Handshake failed >>> while connecting to host: 10.100.12.10 port: 8250// >>> //2016-08-25 09:49:00,670 INFO [cloud.agent.Agent] (main:null) (logid:) >>> Attempted to connect to the server, but received an unexpected >>> exception, trying again.../ >>> >>> *KVM Hosts: >>> */# java -version >>> java version "1.7.0_95" >>> OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.14.04.1) >>> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode) >>> >>> # dpkg --get-selections | grep -e 'jdk' -e 'java' >>> ca-certificates-java install >>> java-common install >>> libcommons-daemon-java install >>> openjdk-7-jre-headless:amd64 install >>> tzdata-java install >>> >>> # apt-cache policy cloudstack-agent >>> cloudstack-agent: >>> Installed: 4.9.0 >>> Candidate: 4.9.0 >>> Version table: >>> *** 4.9.0 0 >>> 500 >>> http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ ./ Packages >>> 100 /var/lib/dpkg/status >>> >>> # find /usr/share/ -name "cloud-utils*.jar" >>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar >>> # md5sum /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar >>> a8de7306d7c80b5a73e93b83afdd119f >>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar >>> >>> >>> /*Management Server: >>> */# java -version// >>> //java version "1.7.0_95"// >>> //OpenJDK Runtime Environment (IcedTea 2.6.4) >>> (7u95-2.6.4-0ubuntu0.14.04.1)// >>> //OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)// >>> // >>> //# dpkg --get-selections | grep -e 'jdk' -e 'java'// >>> //ca-certificates-java install// >>> //java-common install// >>> //libcommons-collections3-java install// >>> //libcommons-daemon-java install// >>> //libcommons-dbcp-java install// >>> //libcommons-pool-java install// >>> //libecj-java install// >>> //libgeronimo-jta-1.1-spec-java install// >>> //libmysql-java install// >>> //libservlet2.5-java install// >>> //libtomcat6-java install// >>> //openjdk-7-jre-headless:amd64 install// >>> //tzdata-java install// >>> // >>> //# apt-cache policy cloudstack-management// >>> //cloudstack-management:// >>> // Installed: 4.9.0// >>> // Candidate: 4.9.0// >>> // Version table:// >>> // *** 4.9.0 0// >>> // 500 >>> http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ ./ Packages// >>> // 100 /var/lib/dpkg/status/// >>> >>> /# find /usr/share/ -name "cloud-utils*.jar"// >>> ///usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar// >>> ///usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar// >>> ///usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar// >>> //# md5sum >>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar// >>> //a8de7306d7c80b5a73e93b83afdd119f >>> /usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-utils-4.9.0.jar// >>> //# md5sum /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar// >>> //a8de7306d7c80b5a73e93b83afdd119f >>> /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar// >>> //# md5sum /usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar// >>> //a8de7306d7c80b5a73e93b83afdd119f >>> /usr/share/cloudstack-usage/lib/cloud-utils-4.9.0.jar/ >>> >>> The classpath.conf was not modified: >>> /# cat /etc/cloudstack/management/classpath.conf >>> #!/bin/bash >>> #... >>> >>> SYSTEMJARS="" >>> SCP=$(build-classpath $SYSTEMJARS 2>/dev/null) ; if [ $? != 0 ] ; then >>> export SCP="" ; fi >>> MCP="" >>> DCP="/usr/share/tomcat6/bin/bootstrap.jar:/usr/share/tomcat6/bin/tomcat-juli.jar" >>> CLASSPATH=$SCP:$DCP:$MCP:/etc/cloudstack/management:/usr/share/cloudstack-management/setup >>> for jarfile in ""/* ; do >>> if [ ! -e "$jarfile" ] ; then continue ; fi >>> CLASSPATH=$jarfile:$CLASSPATH >>> done >>> for plugin in ""/* ; do >>> if [ ! -e "$plugin" ] ; then continue ; fi >>> CLASSPATH=$plugin:$CLASSPATH >>> done >>> for vendorconf in "/etc/cloudstack/management"/vendor/* ; do >>> if [ ! -d "$vendorconf" ] ; then continue ; fi >>> CLASSPATH=$vendorconf:$CLASSPATH >>> done >>> export CLASSPATH >>> if ([ -z "$JAVA_HOME" ] || [ ! -d "$JAVA_HOME" ]) && [ -d >>> /usr/lib/jvm/jre-1.7.0 ]; then >>> export JAVA_HOME=/usr/lib/jvm/jre-1.7.0 >>> fi >>> PATH=$JAVA_HOME/bin:/sbin:/usr/sbin:$PATH >>> export PATH/ >>> >>> Regards >>> Martin >>> >>> On 08/24/2016 06:56 PM, Rohit Yadav wrote: >>> >>> >>> Martin, >>> >>> >>> Were you able to fix your issue after installing packages from the >>> repo Will shared and restarting the services? >>> >>> I've not personally tested the apt-get.eu repo, but I had earlier >>> built this repo which I'm personally using in my local KVM-trusty >>> based cloud: http://packages.shapeblue.com/cloudstack/upstream/debian/4.9/ >>> >>> >>> If you're still getting the error, can you share the JRE version >>> you're running, both on the mgmt server and on the KVM hosts? You can >>> run java -version, or share output of "dpkg --get-selections | grep -e >>> 'jdk' -e 'java'". Are you running CloudStack with any additional plugins? >>> >>> From the logs, looks like there are mixed jar files, >>> NioConnectionException class was not found -- something's wrong with >>> your installation. there must be a cloud-utils jar file make sure your >>> installation don't have multiple copies/versions of jars >>> (somewhere) in the in /usr/share/cloudstack-common and in >>> /usr/share/cloudstack-management/webapps/client/ paths: >>> >>> Could not find exception: >>> com.cloud.utils.exception.NioConnectionException in error code list for >>> exceptions >>> The error "Unable to initialize the threads." suggests, JVM was not >>> able to spawn threads. I would like to know your JRE version and any >>> other settings configured in /etc/cloudstack/management/classpath.conf >>> (and there are bunch of other files where JAVA_OPTS might have been >>> overridden). Note: For now you should only be using JRE1.7. >>> >>> >>> Regards. >>> >>> rohit.ya...@shapeblue.com<mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com> >>> www.shapeblue.com<http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com> >>> @shapeblue >>> >>> >>> >>> >>> ------------------------------------------------------------------------ >>> *From:* martin kolly >>> <martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch><mailto:martin.ko...@senselan.ch> >>> *Sent:* 24 August 2016 19:53:26 >>> *To:* >>> dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org>; >>> Rohit Yadav >>> *Subject:* Re: CS 4.9 NIO Selector wait time PR-1601 >>> >>> Thanks Will! >>> >>> yes the repo is pointing to 4.9 release for all KVMs and for the >>> management server: >>> /cloudstack:~# cat /etc/apt/sources.list.d/cloudstack.list // >>> //deb http://cloudstack.apt-get.eu/ubuntu trusty 4.9/ >>> >>> All KVM agents and the mgmt server are upgraded to release 4.9 based >>> on the documentation.We have restarted all the cloudstack-agents and >>> the cloudstack-management service as well. >>> >>> Network traces are showing packets from KVM <-> Mgmt on port 8250. >>> there is no security device in between. >>> >>> thanks >>> fanfarlo >>> >>> >>> >>> >>> On 08/24/2016 04:13 PM, Will Stevens wrote: >>> >>> >>> @rohit, I am guessing they should be installing the cloudstack-agent using >>> the following repo right? That is what is described in the upgrade (trusty >>> instead of precise though). >>> >>> http://cloudstack.apt-get.eu/ubuntu/dists/trusty/4.9/ >>> >>> @fanfarlo, are your repo's setup to point to the new 4.9 version? >>> >>> cheers, >>> >>> will >>> >>> On Wed, Aug 24, 2016 at 9:46 AM, Rohit Yadav >>> <rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com> >>> wrote: >>> >>> >>> >>> The PR and fix already exists in 4.9.0 release. Please make sure to >>> upgrade all of your management server(s) and KVM agents and then also >>> restart them after the upgrade. >>> >>> >>> If you are seeing SSL handshake failures, it could be due to network or >>> security issue and most likely due to mismatch between CloudStack mgmt >>> server and KVM agent version. >>> >>> >>> Regards. >>> >>> rohit.ya...@shapeblue.com<mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com> >>> www.shapeblue.com<http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com> >>> @shapeblue >>> >>> >>> >>> ------------------------------ >>> *From:* Will Stevens >>> <williamstev...@gmail.com><mailto:williamstev...@gmail.com><mailto:williamstev...@gmail.com><mailto:williamstev...@gmail.com> >>> *Sent:* 24 August 2016 18:17:17 >>> *To:* >>> dev@cloudstack.apache.org<mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org><mailto:dev@cloudstack.apache.org>; >>> Rohit Yadav >>> *Subject:* Re: CS 4.9 NIO Selector wait time PR-1601 >>> >>> >>> That PR is already merged, so you don't have to do anything to get that >>> code, you already have it. >>> >>> @rohit, can you review this? I think this is a similar to the issue Simon >>> reported earlier. >>> >>> Will >>> >>> On Aug 24, 2016 6:56 AM, "fanfarlo" >>> <fanfar...@gmail.com><mailto:fanfar...@gmail.com><mailto:fanfar...@gmail.com><mailto:fanfar...@gmail.com> >>> wrote: >>> >>> >>> >>> hi all >>> >>> We have the following environment: >>> - OS: Debian 14.04 (hypervisors and management) >>> - 4 KVM Hosts >>> - Cloudstack Release 4.9 with local database >>> >>> Since we upgraded to Release 4.9 the KVM hosts no longer connect to the >>> management Server. Upgrade procedure was followed as described: >>> http://docs.cloudstack.apache.org/projects/cloudstack-releas >>> e-notes/en/4.9.0/upgrade/upgrade-4.8.html >>> >>> >>> On the KVM hosts we have the following error message: >>> /2016-08-24 10:42:49,678 INFO [utils.exception.CSExceptionErrorCode] >>> (main:null) (logid:) Could not find exception: >>> com.cloud.utils.exception.NioConnectionException in error code list for >>> exceptions >>> 2016-08-24 10:42:49,678 WARN [cloud.agent.Agent] (main:null) (logid:) >>> NIO Connection Exception >>> com.cloud.utils.exception.NioConnectionException: SSL Handshake failed >>> while connecting to host: 10.100.12.10 port: 8250 >>> 2016-08-24 10:42:49,678 INFO [cloud.agent.Agent] (main:null) (logid:) >>> Attempted to connect to the server, but received an unexpected >>> exception, trying again... >>> 2016-08-24 10:42:54,679 INFO [utils.nio.NioClient] (main:null) (logid:) >>> Connecting to 10.100.12.10:8250 >>> 2016-08-24 10:42:54,684 WARN [utils.nio.Link] (main:null) (logid:) This >>> SSL engine was forced to close inbound due to end of stream. >>> 2016-08-24 10:42:54,684 ERROR [utils.nio.NioClient] (main:null) (logid:) >>> SSL Handshake failed while connecting to host: 10.100.12.10 port: 8250 >>> 2016-08-24 10:42:54,685 ERROR [utils.nio.NioConnection] (main:null) >>> (logid:) Unable to initialize the threads. >>> java.io.IOException: SSL Handshake failed while connecting to host: >>> 10.100.12.10 port: 8250 >>> at com.cloud.utils.nio.NioClient.init(NioClient.java:67) >>> at com.cloud.utils.nio.NioConnection.start(NioConnection.java:88) >>> at com.cloud.agent.Agent.start(Agent.java:237) >>> at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399) >>> at >>> com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367) >>> at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351) >>> at com.cloud.agent.AgentShell.start(AgentShell.java:456) >>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>> at >>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce >>> ssorImpl.java:57) >>> at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe >>> thodAccessorImpl.java:43) >>> at java.lang.reflect.Method.invoke(Method.java:606) >>> at >>> org.apache.commons.daemon.support.DaemonLoader.start(DaemonL >>> oader.java:243) >>> 2016-08-24 10:42:54,685 INFO [utils.exception.CSExceptionErrorCode] >>> (main:null) (logid:) Could not find exception: >>> com.cloud.utils.exception.NioConnectionException in error code list for >>> exceptions >>> 2016-08-24 10:42:54,685 WARN [cloud.agent.Agent] (main:null) (logid:) >>> NIO Connection Exception >>> com.cloud.utils.exception.NioConnectionException: SSL Handshake failed >>> while connecting to host: 10.100.12.10 port: 8250 >>> 2016-08-24 10:42:54,686 INFO [cloud.agent.Agent] (main:null) (logid:) >>> Attempted to connect to the server, but received an unexpected >>> exception, trying again.../ >>> >>> >>> Port is open on the management server, there is no firewall in between. >>> We found that there was a bug report here: >>> https://issues.apache.org/jira/browse/CLOUDSTACK-9348. There is a PR >>> changing the NIO Selector wait time: >>> https://github.com/apache/cloudstack/pull/1601 which was merged into >>> master branch. >>> >>> Since we installed Release 4.9 we probably need to patch the >>> NioConection.class as described in PR1601 , right? >>> >>> kvm03# unzip -v /usr/share/cloudstack-agent/lib/cloud-utils-4.9.0.jar | >>> grep NioConnection >>> 3923 Defl:N 1778 55% 2016-08-02 09:28 05aaf7d5 >>> com/cloud/utils/nio/NioConnection$1.class >>> 881 Defl:N 495 44% 2016-08-02 09:28 e378984c >>> com/cloud/utils/nio/NioConnection$ChangeRequest.class >>> 15410 Defl:N 7130 54% 2016-08-02 09:28 b3281f5a >>> com/cloud/utils/nio/NioConnection.class >>> 1134 Defl:N 584 49% 2016-08-02 09:28 8d5cb4a8 >>> com/cloud/utils/exception/NioConnectionException.class >>> >>> Due to a lack of java expertise we have some basic questions: >>> - Is there a patched jar file available ? public build server? >>> - Do we need to create the jar from sources ? procedure? >>> - How do we apply the patch ? >>> >>> many thanks! >>> fanfarlo >>> >>> >>> >>> >>> >>> >>> >>> >>> rohit.ya...@shapeblue.com<mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com><mailto:rohit.ya...@shapeblue.com> >>> www.shapeblue.com<http://www.shapeblue.com><http://www.shapeblue.com><http://www.shapeblue.com> >>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK >>> @shapeblue >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >> rohit.ya...@shapeblue.com >> www.shapeblue.com >> 53 Chandos Place, Covent Garden, London WC2N 4HSUK >> @shapeblue >> >> >> >> >