0sorkon opened a new issue, #9250: URL: https://github.com/apache/cloudstack/issues/9250
##### ISSUE TYPE <!-- Pick one below and delete the rest --> * Bug Report ##### COMPONENT NAME <!-- Categorize the issue, e.g. API, VR, VPN, UI, etc. --> ~~~ management server ~~~ ##### CLOUDSTACK VERSION <!-- New line separated list of affected versions, commit ID for issues on main branch. --> ~~~ 4.18.2 ~~~ Hello, I need help to resolve this. After updating 4.18.1 -> 4.18.2, the first updated host is stuck in the "Connecting" state. The MS update was successful, but the problem occurred with the first host I tried to update immediately after MS. In the MS logs: > 2024-06-13 16:12:30,872 DEBUG [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-12:null) (logid:) SeqA 60-328142: Sending Seq 60-328142: { Ans: , MgmtId: 20788631421694, via: 60, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":"true","wait":"0","bypassHostMaintenance":"false"}}] } > 2024-06-13 16:12:31,437 INFO [c.c.u.e.CSExceptionErrorCode] (AgentConnectTaskPool-149:ctx-31302c74) (logid:90640aee) Could not find exception: com.cloud.exception.ConnectionException in error code list for exceptions > 2024-06-13 16:12:31,437 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-149:ctx-31302c74) (logid:90640aee) Failed to handle host connection: > com.cloud.exception.ConnectionException: Unable to acquire lock on host 53f6efbf-5180-4475-8f24-b960a3e08db1 > at com.cloud.agent.manager.AgentManagerImpl.sendReadyAndGetAttache(AgentManagerImpl.java:1153) > at com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:1166) > at com.cloud.agent.manager.AgentManagerImpl$HandleAgentConnectTask.runInContext(AgentManagerImpl.java:1250) > at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48) > at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55) > at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102) > at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52) > at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45) > at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > 2024-06-13 16:12:31,438 WARN [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-149:ctx-31302c74) (logid:90640aee) Unable to create attache for agent: Seq 0-0: { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 1, [{"com.cloud.agent.api.StartupRoutingCommand":{"cpuSockets":"1","cpus":"48","speed":"2794","memory":"268619120640","dom0MinMemory":"1073741824","poolSync":"false","supportsClonedVolumes":"false","caps":"hvm,snapshot","pool":"/root","hypervisorType":"KVM","hostDetails":{"Host.OS.Kernel.Version":"5.15.0-105-generic","com.cloud.network.Networks.RouterPrivateIpStrategy":"HostLocal","Host.OS.Version":"22.04","host.volume.encryption":"true","secured":"true","Host.OS":"Ubuntu"},"hostTags":[],"groupDetails":{},"type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"7f57af20-c5cf-3ec7-95bc-7bf6463cfa84-LibvirtComputingResource","name":"cloud.nl.introserv.net","id":"0","version":"4.18.2.0","iqn":"iqn.2004-10.com.ubuntu:01:e7f6625b59c","publicIpAddress":"185.191.1.138","public Netmask":"255.255.255.248","publicMacAddress":"e2:c5:c6:ce:3b:69","privateIpAddress":"172.16.0.1","privateMacAddress":"b2:38:22:80:59:7d","privateNetmask":"255.255.255.0","storageIpAddress":"172.16.0.1","storageNetmask":"255.255.255.0","storageMacAddress":"b2:38:22:80:59:7d","resourceName":"LibvirtComputingResource","gatewayIpAddress":"185.191.1.137","msHostList":"178.132.3.175@static","wait":"0","bypassHostMaintenance":"false"}},{"com.cloud.agent.api.StartupStorageCommand":{"totalSize":"(0 bytes) 0","poolInfo":{"uuid":"8a76c6b2-f791-4f6b-a09e-85a581b8189f","host":"172.16.0.1","localPath":"/Storage/images","hostPath":"/Storage/images","poolType":"Filesystem","capacityBytes":"(97.87 GB) 105089261568","availableBytes":"(74.69 GB) 80196403200"},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","pod":"1","guid":"7f57af20-c5cf-3ec7-95bc-7bf6463cfa84-LibvirtComputingResource","name":"cloud.nl.introserv.net","id":"0","version":"4.18.2.0","resourceName":"Libvi rtComputingResource","msHostList":"178.132.3.175@static","wait":"0","bypassHostMaintenance":"false"}}] } > 2024-06-13 16:12:31,446 WARN [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-5:null) (logid:) Throwing away a request because it came through as the first command on a connect: Seq 0--1: { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 111, [{"com.cloud.agent.api.ShutdownCommand":{"reason":"sig.kill","removeHost":"false","wait":"0","bypassHostMaintenance":"false"}}] } > 2024-06-13 16:12:31,539 DEBUG [c.c.s.StatsCollector] (StatsCollector-5:ctx-7eb2d626) (logid:ac80bd51) ManagementServerCollector is running... > 2024-06-13 16:12:31,540 WARN [c.c.a.m.AgentManagerImpl] (AgentManager-Handler-13:null) (logid:) Throwing away a request because it came through as the first command on a connect: Seq 0-3: { Cmd , MgmtId: -1, via: 0, Ver: v1, Flags: 11, [{"com.cloud.agent.api.PingRoutingWithNwGroupsCommand":{"newGroupStates":{},"_hostVmStateReport":{},"_gatewayAccessible":"true","_vnetAccessible":"true","hostType":"Routing","hostId":"0","wait":"0","bypassHostMaintenance":"false"}}] } In the agent log: > 2024-06-13 16:17:09,751 INFO [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) Startup Response Received: agent id = 0 > 2024-06-13 16:19:09,746 INFO [cloud.agent.Agent] (AgentShutdownThread:null) (logid:) Stopping the agent: Reason = sig.kill > 2024-06-13 16:19:21,239 INFO [cloud.agent.AgentShell] (main:null) (logid:) Agent started > 2024-06-13 16:19:21,243 INFO [cloud.agent.AgentShell] (main:null) (logid:) Implementation Version is 4.18.2.0 > 2024-06-13 16:19:21,244 INFO [cloud.agent.AgentShell] (main:null) (logid:) agent.properties found at /etc/cloudstack/agent/agent.properties > 2024-06-13 16:19:21,283 INFO [cloud.agent.AgentShell] (main:null) (logid:) Defaulting to using properties file for storage > 2024-06-13 16:19:21,284 INFO [cloud.agent.AgentShell] (main:null) (logid:) Defaulting to the constant time backoff algorithm > 2024-06-13 16:19:21,321 INFO [cloud.utils.LogUtils] (main:null) (logid:) log4j configuration found at /etc/cloudstack/agent/log4j-cloud.xml > 2024-06-13 16:19:21,321 INFO [cloud.agent.AgentShell] (main:null) (logid:) Using default Java settings for IPv6 preference for agent connection > 2024-06-13 16:19:21,387 INFO [cloud.agent.Agent] (main:null) (logid:) id is 0 > 2024-06-13 16:19:21,396 ERROR [kvm.resource.LibvirtComputingResource] (main:null) (logid:) uefi properties file not found due to: Unable to find file uefi.properties. > 2024-06-13 16:19:21,435 INFO [kvm.resource.LibvirtConnection] (main:null) (logid:) No existing libvirtd connection found. Opening a new one > 2024-06-13 16:19:21,652 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) IO uring driver for Qemu: enabled > 2024-06-13 16:19:21,757 INFO [kvm.storage.KVMStoragePoolManager] (main:null) (logid:) adding storage adaptor for com.cloud.hypervisor.kvm.storage.LinstorStorageAdaptor > 2024-06-13 16:19:21,760 INFO [kvm.storage.KVMStoragePoolManager] (main:null) (logid:) adding storage adaptor for com.cloud.hypervisor.kvm.storage.StorPoolStorageAdaptor > 2024-06-13 16:19:21,761 WARN [kvm.storage.KVMStoragePoolManager] (main:null) (logid:) Duplicate StorageAdaptor type PowerFlex, not loading com.cloud.hypervisor.kvm.storage.ScaleIOStorageAdaptor > 2024-06-13 16:19:21,761 INFO [kvm.storage.KVMStoragePoolManager] (main:null) (logid:) adding storage adaptor for com.cloud.hypervisor.kvm.storage.IscsiAdmStorageAdaptor > 2024-06-13 16:19:21,762 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) No libvirt.vif.driver specified. Defaults to BridgeVifDriver. > 2024-06-13 16:19:21,888 INFO [cloud.serializer.GsonHelper] (main:null) (logid:) Default Builder inited. > 2024-06-13 16:19:21,888 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) iscsi session clean up is disabled > 2024-06-13 16:19:21,890 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Skipping the memory balloon stats period setting, since there are no VMs (active Libvirt domains) on this host. > 2024-06-13 16:19:21,891 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) The [vm.memballoon.stats.period] property is set to '0', this prevents memory statistics from being displayed correctly. Adjust (increase) the value of this parameter to correct this. > 2024-06-13 16:19:21,899 INFO [cloud.agent.Agent] (main:null) (logid:) Agent [id = 0 : type = LibvirtComputingResource : zone = 1 : pod = 1 : workers = 5 : host = 178.132.3.175 : port = 8250 > 2024-06-13 16:19:21,905 INFO [utils.nio.NioClient] (main:null) (logid:) Connecting to 178.132.3.175:8250 > 2024-06-13 16:19:21,907 INFO [utils.nio.Link] (main:null) (logid:) Conf file found: /etc/cloudstack/agent/agent.properties > 2024-06-13 16:19:22,163 INFO [utils.nio.NioClient] (main:null) (logid:) SSL: Handshake done > 2024-06-13 16:19:22,163 INFO [utils.nio.NioClient] (main:null) (logid:) Connected to 178.132.3.175:8250 > 2024-06-13 16:19:22,236 INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:null) (logid:) Fetching CPU speed from command "lscpu". > 2024-06-13 16:19:22,279 ERROR [utils.linux.KVMHostInfo] (Agent-Handler-1:null) (logid:) Unable to retrieve the CPU speed from lscpu. > java.lang.NullPointerException > at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1838) > at java.base/jdk.internal.math.FloatingDecimal.parseFloat(FloatingDecimal.java:122) > at java.base/java.lang.Float.parseFloat(Float.java:455) > at org.apache.cloudstack.utils.linux.KVMHostInfo.getCpuSpeedFromCommandLscpu(KVMHostInfo.java:124) > at org.apache.cloudstack.utils.linux.KVMHostInfo.getCpuSpeed(KVMHostInfo.java:99) > at org.apache.cloudstack.utils.linux.KVMHostInfo.getHostInfoFromLibvirt(KVMHostInfo.java:179) > at org.apache.cloudstack.utils.linux.KVMHostInfo.<init>(KVMHostInfo.java:65) > at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.initialize(LibvirtComputingResource.java:3522) > at com.cloud.agent.Agent.sendStartup(Agent.java:444) > at com.cloud.agent.Agent$ServerHandler.doTask(Agent.java:1105) > at com.cloud.utils.nio.Task.call(Task.java:83) > at com.cloud.utils.nio.Task.call(Task.java:29) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > 2024-06-13 16:19:22,281 INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:null) (logid:) Fetching CPU speed from file [/sys/devices/system/cpu/cpu0/cpufreq/base_frequency]. > 2024-06-13 16:19:22,281 ERROR [utils.linux.KVMHostInfo] (Agent-Handler-1:null) (logid:) Unable to retrieve the CPU speed from file [/sys/devices/system/cpu/cpu0/cpufreq/base_frequency] > java.io.FileNotFoundException: /sys/devices/system/cpu/cpu0/cpufreq/base_frequency (No such file or directory) > at java.base/java.io.FileInputStream.open0(Native Method) > at java.base/java.io.FileInputStream.open(FileInputStream.java:219) > at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157) > at java.base/java.io.FileInputStream.<init>(FileInputStream.java:112) > at java.base/java.io.FileReader.<init>(FileReader.java:60) > at org.apache.cloudstack.utils.linux.KVMHostInfo.getCpuSpeedFromFile(KVMHostInfo.java:135) > at org.apache.cloudstack.utils.linux.KVMHostInfo.getCpuSpeed(KVMHostInfo.java:104) > at org.apache.cloudstack.utils.linux.KVMHostInfo.getHostInfoFromLibvirt(KVMHostInfo.java:179) > at org.apache.cloudstack.utils.linux.KVMHostInfo.<init>(KVMHostInfo.java:65) > at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.initialize(LibvirtComputingResource.java:3522) > at com.cloud.agent.Agent.sendStartup(Agent.java:444) > at com.cloud.agent.Agent$ServerHandler.doTask(Agent.java:1105) > at com.cloud.utils.nio.Task.call(Task.java:83) > at com.cloud.utils.nio.Task.call(Task.java:29) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:829) > 2024-06-13 16:19:22,281 INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:null) (logid:) Fetching CPU speed from "host capabilities" > 2024-06-13 16:19:22,293 INFO [utils.linux.KVMHostInfo] (Agent-Handler-1:null) (logid:) Retrieved value [2794749000] from "host capabilities". This corresponds to a CPU speed of [2794] MHz. > 2024-06-13 16:19:22,302 INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:null) (logid:) Host uses control group [cgroup2fs]. > 2024-06-13 16:19:22,302 INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:null) (logid:) Calculating the max shares of the host. > 2024-06-13 16:19:22,302 INFO [kvm.resource.LibvirtComputingResource] (Agent-Handler-1:null) (logid:) The max shares of the host is [134112]. > 2024-06-13 16:19:22,344 ERROR [kvm.resource.LibvirtConnection] (Agent-Handler-1:null) (logid:) Connection with libvirtd is broken: invalid connection pointer in virConnectGetVersion > 2024-06-13 16:19:22,360 INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:null) (logid:) Attempting to create storage pool 8a76c6b2-f791-4f6b-a09e-85a581b8189f (Filesystem) in libvirt > 2024-06-13 16:19:22,382 INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:null) (logid:) Found existing defined storage pool 8a76c6b2-f791-4f6b-a09e-85a581b8189f, using it. > 2024-06-13 16:19:22,383 INFO [kvm.storage.LibvirtStorageAdaptor] (Agent-Handler-1:null) (logid:) Trying to fetch storage pool 8a76c6b2-f791-4f6b-a09e-85a581b8189f from libvirt > 2024-06-13 16:19:22,473 INFO [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) Process agent startup answer, agent id = 0 > 2024-06-13 16:19:22,473 INFO [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) Set agent id 0 > 2024-06-13 16:19:22,486 INFO [cloud.agent.Agent] (Agent-Handler-2:null) (logid:) Startup Response Received: agent id = 0 > 2024-06-13 16:21:22,481 INFO [cloud.agent.Agent] (AgentShutdownThread:null) (logid:) Stopping the agent: Reason = sig.kill -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cloudstack.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org