Thanks for clarifying. The cluster certificate error you pointed out may not be related to this issue. Still, if you are running more than one management server, can you stop the additional ones and keep only one running for now?
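For reference, the certificate error quoted further down in this thread is a hostname mismatch: the cluster service URL uses 127.0.0.1, while the subject alternative names (SANs) listed in the exception do not include that address. Below is a minimal Python sketch of that comparison. It is a simplification (exact matches only, no wildcard handling), and the helper name `matches_san` is hypothetical, not part of CloudStack or the Java HTTP client.

```python
import ipaddress

# SAN entries copied from the SSLPeerUnverifiedException quoted in this thread.
SANS = [
    "fe80:0:0:0:5ac2:32ff:fe02:ae4",
    "10.31.4.50",
    "10.31.3.1",
    "cloudstack-onexbh",
    "cloudstack.internal",
]


def _as_ip(value):
    """Return an ipaddress object if value is an IP literal, else None."""
    try:
        return ipaddress.ip_address(value)
    except ValueError:
        return None


def matches_san(host, sans):
    """Simplified check: IPs are compared as IPs, names case-insensitively."""
    host_ip = _as_ip(host)
    for entry in sans:
        entry_ip = _as_ip(entry)
        if host_ip is not None:
            # An IP host can only match an IP-type SAN entry.
            if entry_ip is not None and entry_ip == host_ip:
                return True
        elif entry_ip is None and entry.lower() == host.lower():
            return True
    return False


print(matches_san("127.0.0.1", SANS))   # False
print(matches_san("10.31.4.50", SANS))  # True
```

The loopback address is rejected while 10.31.4.50 would be accepted, which is consistent with the exception text. Whether this mismatch has any bearing on the system VM problem is a separate question.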
Also, what is the status of the agent on the system VMs? Is it Up? If it is not, can you check that the SSVMs can reach the management server IPs on port 8250? If that check finds no problems, please upload /var/log/cloud.log from the system VMs; it should give more information about what is causing the issue.

________________________________
From: Kayo Henrique <kayo.henri...@onexdatacenter.com.br>
Sent: Tuesday, August 12, 2025 2:04 PM
To: users@cloudstack.apache.org <users@cloudstack.apache.org>
Subject: Re: POSSIBLE SSL ERROR ON SYSTEM VMS - POSSÍVEL ERRO DE SSL NAS SYSTEM VMS

These logs are from this morning. I rebuilt my CloudStack environment and now the System VMs have connectivity: I can access the VMs via SSH and, from them, ping the management, public, and storage networks. The errors in the logs you sent me went away once I rebuilt the environment (and network) from scratch. The SSL logs I mentioned are reported below:

2025-08-11 20:34:11,995 DEBUG [c.c.c.ClusterManagerImpl] (Cluster-Worker-5:[ctx-f152a4c5]) (logid:099f2acb) Cluster PDU 97591085894372 -> 97591085894372 completed. time: 4ms. agent: 0, pdu seq: 263, pdu ack seq: 0, json: {"managementServerHostId":1,"managementServerHostUuid":"4cfdc9b0-565e-4920-bc32-d079f456fc8b","managementServerRunId":1754939509540,"collectionTime":"Aug 11, 2025, 8:34:11 PM","sessions":2,"cpuUtilization":0.0,"totalJvmMemoryBytes":1124597760,"freeJvmMemoryBytes":384206344,"maxJvmMemoryBytes":1908932607,"processJvmMemoryBytes":0,"jvmUptime":15765252,"jvmStartTime":1754939486632,"availableProcessors":32,"loadAverage":0.17,"totalInit":1591017472,"totalUsed":1012164680,"totalCommitted":1401356288,"pid":217209,"jvmName":"217209@cloudstack-onexbh","jvmVendor":"Ubuntu","jvmVersion":"17.0.15+6-Ubuntu-0ubuntu120.04","osDistribution":"Ubuntu 20.04.6 LTS","agentCount":5,"heapMemoryUsed":741131168,"heapMemoryTotal":1908932608,"threadsBlockedCount":0,"threadsDaemonCount":30,"threadsRunnableCount":23,"threadsTerminatedCount":0,"threadsTotalCount":1427,"threadsWaitingCount":1317,"systemMemoryTotal":101304950784,"systemMemoryFree":94171963392,"systemMemoryUsed":1912644,"systemMemoryVirtualSize":22671781888,"logInfo":"","systemTotalCpuCycles":43622.244,"systemLoadAverages":[0.17,0.1,0.09],"systemCyclesUsage":[652771,292009,862649072],"dbLocal":true,"usageLocal":false,"systemBootTime":"Aug 8, 2025, 5:24:32 PM","kernelVersion":"5.4.0-216-generic"}
2025-08-11 20:34:11,995 DEBUG [c.c.c.ClusterManagerImpl] (Cluster-Worker-5:[ctx-f152a4c5]) (logid:099f2acb) Cluster PDU 97591085894372 -> 97591085894372. agent: 0, pdu seq: 263, pdu ack seq: 0, json: {"managementServerHostId":1,"managementServerHostUuid":"4cfdc9b0-565e-4920-bc32-d079f456fc8b","managementServerRunId":1754939509540,"collectionTime":"Aug 11, 2025, 8:34:11 PM","sessions":2,"cpuUtilization":0.0,"totalJvmMemoryBytes":1124597760,"freeJvmMemoryBytes":384206344,"maxJvmMemoryBytes":1908932607,"processJvmMemoryBytes":0,"jvmUptime":15765252,"jvmStartTime":1754939486632,"availableProcessors":32,"loadAverage":0.17,"totalInit":1591017472,"totalUsed":1012164680,"totalCommitted":1401356288,"pid":217209,"jvmName":"217209@cloudstack-onexbh","jvmVendor":"Ubuntu","jvmVersion":"17.0.15+6-Ubuntu-0ubuntu120.04","osDistribution":"Ubuntu 20.04.6 LTS","agentCount":5,"heapMemoryUsed":741131168,"heapMemoryTotal":1908932608,"threadsBlockedCount":0,"threadsDaemonCount":30,"threadsRunnableCount":23,"threadsTerminatedCount":0,"threadsTotalCount":1427,"threadsWaitingCount":1317,"systemMemoryTotal":101304950784,"systemMemoryFree":94171963392,"systemMemoryUsed":1912644,"systemMemoryVirtualSize":22671781888,"logInfo":"","systemTotalCpuCycles":43622.244,"systemLoadAverages":[0.17,0.1,0.09],"systemCyclesUsage":[652771,292009,862649072],"dbLocal":true,"usageLocal":false,"systemBootTime":"Aug 8, 2025, 5:24:32 PM","kernelVersion":"5.4.0-216-generic"}
2025-08-11 20:34:11,995 DEBUG [c.c.c.ClusterServiceServletImpl] (Cluster-Worker-5:[ctx-f152a4c5]) (logid:099f2acb) Executing ClusterServicePdu with service URL: https://127.0.0.1:9090/clusterservice
2025-08-11 20:34:11,998 ERROR [c.c.c.ClusterServiceServletImpl] (Cluster-Worker-5:[ctx-f152a4c5]) (logid:099f2acb) Exception from : https://127.0.0.1:9090/clusterservice, method : null, exception : javax.net.ssl.SSLPeerUnverifiedException: Certificate for <127.0.0.1> doesn't match any of the subject alternative names: [fe80:0:0:0:5ac2:32ff:fe02:ae4, 10.31.4.50, 10.31.3.1, cloudstack-onexbh, cloudstack.internal]
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.verifyHostname(SSLConnectionSocketFactory.java:507)
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:437)
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384)
    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
    at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at com.cloud.cluster.ClusterServiceServletImpl.executePostMethod(ClusterServiceServletImpl.java:143)
    at com.cloud.cluster.ClusterServiceServletImpl.execute(ClusterServiceServletImpl.java:106)
    at com.cloud.cluster.ClusterManagerImpl.onSendingClusterPdu(ClusterManagerImpl.java:275)
    at com.cloud.cluster.ClusterManagerImpl$1.runInContext(ClusterManagerImpl.java:235)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
    at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
    at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
2025-08-11 20:34:11,998 DEBUG [c.c.c.ClusterManagerImpl] (Cluster-Worker-5:[ctx-f152a4c5]) (logid:099f2acb) Cluster PDU 97591085894372 -> 97591085894372 completed. time: 3ms. agent: 0, pdu seq: 263, pdu ack seq: 0, json: {"managementServerHostId":1,"managementServerHostUuid":"4cfdc9b0-565e-4920-bc32-d079f456fc8b","managementServerRunId":1754939509540,"collectionTime":"Aug 11, 2025, 8:34:11 PM","sessions":2,"cpuUtilization":0.0,"totalJvmMemoryBytes":1124597760,"freeJvmMemoryBytes":384206344,"maxJvmMemoryBytes":1908932607,"processJvmMemoryBytes":0,"jvmUptime":15765252,"jvmStartTime":1754939486632,"availableProcessors":32,"loadAverage":0.17,"totalInit":1591017472,"totalUsed":1012164680,"totalCommitted":1401356288,"pid":217209,"jvmName":"217209@cloudstack-onexbh","jvmVendor":"Ubuntu","jvmVersion":"17.0.15+6-Ubuntu-0ubuntu120.04","osDistribution":"Ubuntu 20.04.6 LTS","agentCount":5,"heapMemoryUsed":741131168,"heapMemoryTotal":1908932608,"threadsBlockedCount":0,"threadsDaemonCount":30,"threadsRunnableCount":23,"threadsTerminatedCount":0,"threadsTotalCount":1427,"threadsWaitingCount":1317,"systemMemoryTotal":101304950784,"systemMemoryFree":94171963392,"systemMemoryUsed":1912644,"systemMemoryVirtualSize":22671781888,"logInfo":"","systemTotalCpuCycles":43622.244,"systemLoadAverages":[0.17,0.1,0.09],"systemCyclesUsage":[652771,292009,862649072],"dbLocal":true,"usageLocal":false,"systemBootTime":"Aug 8, 2025, 5:24:32 PM","kernelVersion":"5.4.0-216-generic"}
2025-08-11 20:34:12,063 DEBUG [c.c.s.S.VmStatsCollector] (StatsCollector-3:[ctx-e9f9863e]) (logid:3a21ac65) VmStatsCollector is running to process VMs across 3 UP hosts
2025-08-11 20:34:12,067 DEBUG [c.c.a.m.ClusteredAgentManagerImpl] (StatsCollector-3:[ctx-e9f9863e]) (logid:3a21ac65) Wait time setting on com.cloud.agent.api.GetVmStatsCommand is 1800 seconds
2025-08-11 20:34:12,067 DEBUG [c.c.a.m.ClusteredDirectAgentAttache] (StatsCollector-3:[ctx-e9f9863e]) (logid:3a21ac65) Seq 5-3902650552093245880: Routed from 97591085894372
2025-08-11 20:34:12,068 DEBUG [c.c.a.m.D.Task] (DirectAgent-90:[ctx-4073f5c6]) (logid:8267a6c4) Seq 5-3902650552093245880: Executing request
2025-08-11 20:34:12,068 DEBUG [c.c.h.v.r.VmwareResource] (DirectAgent-90:[ctx-4073f5c6]) (logid:3a21ac65) Executing resource command GetVmStatsCommand: [].

I remain at your disposal and thank you for your support!!

Kayo Henrique
Infrastructure and Network Analyst
OneX Data Centers

On 2025-08-12 07:03, Prashanth Reddy wrote:
> From the logs I don't see an issue with the SSL certificates themselves; we clearly see that the management servers are unable to connect to the system VMs.
>
> For the SSVM and CPVM to work, the management servers must be able to SSH into the system VMs on their assigned pod IPs (in the case of VMware). From the logs, we see this failing, as shown below.
>
> Logs for the SSVM (similar logs are seen for the CPVM as well):
>
> 2025-08-11 11:07:53,220 DEBUG [c.c.h.v.r.VmwareResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) VM s-31-VM has been started successfully with hostname s-31-VM.
> 2025-08-11 11:07:53,220 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Trying to connect to 10.42.0.93
> 2025-08-11 11:07:56,276 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Could not connect to 10.42.0.93
> 2025-08-11 11:08:01,276 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Trying to connect to 10.42.0.93
> 2025-08-11 11:08:04,340 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Could not connect to 10.42.0.93
> 2025-08-11 11:08:09,340 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Unable to logon to 10.42.0.93
> 2025-08-11 11:08:09,340 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Trying to connect to 10.42.0.93
>
>
>
> 2025-08-11 11:23:52,828 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Trying to connect to 10.42.0.93
> 2025-08-11 11:23:55,892 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Could not connect to 10.42.0.93
> 2025-08-11 11:24:00,892 DEBUG [c.c.a.r.v.VirtualRoutingResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Unable to logon to 10.42.0.93
> 2025-08-11 11:24:03,956 ERROR [c.c.u.FileUtil] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Failed to scp files to system VM due to, No route to host
> 2025-08-11 11:24:07,028 ERROR [c.c.u.FileUtil] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Failed to scp files to system VM due to, No route to host
> 2025-08-11 11:24:10,100 ERROR [c.c.u.FileUtil] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Failed to scp files to system VM due to, No route to host
> 2025-08-11 11:24:10,100 ERROR [c.c.h.v.r.VmwareResource] (DirectAgent-180:[ctx-d02be6b8, 10.42.0.23, job-44/job-138, cmd: StartCommand]) (logid:2459ff24) Failed to scp files to system VM. Patching of systemVM failed
> com.cloud.utils.exception.CloudRuntimeException: Failed to scp files to system VM due to, No route to host
> 2025-08-11 11:24:10,123 DEBUG [c.c.a.t.Request] (Work-Job-Executor-91:[ctx-9e73a045, job-44/job-138, ctx-025b0546]) (logid:2459ff24) Seq 1-5696209103693548290: Received: { Ans: , MgmtId: 97591085894372, via: 1(10.42.0.23), Ver: v1, Flags: 110, { StartAnswer } }
> 2025-08-11 11:24:10,130 INFO [c.c.v.ClusteredVirtualMachineManagerImpl] (Work-Job-Executor-91:[ctx-9e73a045, job-44/job-138, ctx-025b0546]) (logid:2459ff24) Unable to start VM on Host {"id":1,"name":"10.42.0.23","type":"Routing","uuid":"8733a859-b04f-4341-9ceb-182b7917628f"} due to Failed to scp files to system VM. Patching of systemVM failed due to: Failed to scp files to system VM due to, No route to host
> 2025-08-11 11:24:10,139 DEBUG [c.c.v.ClusteredVirtualMachineManagerImpl] (Work-Job-Executor-91:[ctx-9e73a045, job-44/job-138, ctx-025b0546]) (logid:2459ff24) Cleaning up resources for the vm VM instance {"id":31,"instanceName":"s-31-VM","state":"Starting","type":"SecondaryStorageVm","uuid":"700c9d06-fc6e-40ce-a42f-d59e657771b3"} in Starting state
>
> From the logs, your pod network is using the following vSwitch on VMware: "name":"vSwitch_Storage,402,vmwaresvs"
>
> Once the system VMs are running on VMware, you can try to SSH into them directly from the management servers and see if that works. The steps for SSH access to system VMs on VMware are described here:
> The System VM Template (Apache CloudStack 4.20.1.0 documentation) <https://docs.cloudstack.apache.org/en/latest/adminguide/systemvm.html#accessing-system-vms>
>
> If SSH to the system VMs is not working, then to isolate the issue you could deploy a VM directly on VMware with a NIC on the vSwitch stated above, and see whether you can SSH into that VM from all of your management servers once it is running.
>
> Thanks
> Prashanth
>
> ________________________________
> From: Kayo Henrique <kayo.henri...@onexdatacenter.com.br>
> Sent: Tuesday, August 12, 2025 1:15 AM
> To: Users <users@cloudstack.apache.org>
> Subject: POSSIBLE SSL ERROR ON SYSTEM VMS - POSSÍVEL ERRO DE SSL NAS SYSTEM VMS
>
> Hello,
>
> I've rebuilt my CloudStack environment on VMware and I'm having a problem.
>
> My System VMs appear to be powered on, have connectivity, and can ping all networks, but the System VM services (SSVM and CPVM) aren't working.
>
> The evidence images and the management server log file are available at the link below:
> https://drive.onexdatacenter.com.br/s/gRreLjZ4bg5KPHM
>
> I did some research and found that it might be related to the SSL certificate, but I don't quite understand how that would work!
>
> I remain at your disposal!!
>
> --
> Best regards,
> Kayo Henrique
> Infrastructure and Network Analyst
> OneX Data Centers
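
The reachability checks discussed in this thread (system VM to management server on port 8250, and management server to system VM over SSH) both come down to whether a TCP connection can be opened. Below is a minimal Python sketch of such a check, assuming Python 3 is available on the machine where it is run. `can_connect` is a hypothetical helper name, not a CloudStack tool.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and "No route to host".
        return False
```

For example, running `can_connect("<management-server-ip>", 8250)` on the SSVM would test the agent port mentioned above, and `can_connect("10.42.0.93", 3922)` on a management server would test SSH to the system VM address from the logs. Port 3922 is assumed here to be the system VM SSH port; verify it against the linked documentation. A successful connection only shows that the network path is open; it does not by itself confirm that the agent is Up.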