gusmef opened a new issue, #13130:
URL: https://github.com/apache/cloudstack/issues/13130

   ### problem
   
   When trying to put a host into maintenance mode to trigger automatic
   migration of the VMs running on it, the operation fails with a
   NullPointerException. This happens with both normal maintenance and
   rolling maintenance.
   
   ### versions
   
   - CloudStack version: 4.22
   - Hypervisor: KVM running on Oracle Linux 9
   - Primary storage: local storage
   - Database: 8.0.45-36 Percona Server
   
   ### The steps to reproduce the bug
   
   1. Create a custom constrained offering
   ```sql
   mysql> select * from service_offering where uuid='c24ce908-ffaa-4ced-9610-f99fb9f8b4c8' \G
   *************************** 1. row ***************************
                         id: 14
                        cpu: NULL
                      speed: 2000
                   ram_size: NULL
                    nw_rate: NULL
                    mc_rate: NULL
                 ha_enabled: 0
              limit_cpu_use: 0
                   host_tag: NULL
                default_use: 0
                    vm_type: NULL
                   sort_key: 0
                is_volatile: 0
         deployment_planner: NULL
    dynamic_scaling_enabled: 1
                       uuid: c24ce908-ffaa-4ced-9610-f99fb9f8b4c8
                       name: Custom-local-sparse
               display_text: Custom-local-sparse
                unique_name: NULL
                 customized: 1
                    created: 2025-05-19 14:28:59
                    removed: NULL
                      state: Active
           disk_offering_id: 18
                 system_use: 0
   disk_offering_strictness: 0
            vgpu_profile_id: NULL
                  gpu_count: NULL
                gpu_display: 0
   1 row in set (0.00 sec)
   ```
   2. Deploy a VM with the above offering
   3. Attempt to place the host running the VM into Maintenance mode
   # Results
   ### **Normal maintenance**
   ```txt
   2026-05-08 09:13:37,635 DEBUG [c.c.h.d.HostDaoImpl] (API-Job-Executor-5:[ctx-f2a20d6a, job-3207, ctx-a1adfd52]) (logid:5e43ba84) Resource state update: [id = 46; name = cslab02; old state = Enabled; event = AdminAskMaintenance; new state = PrepareForMaintenance]
   
   2026-05-08 09:13:37,640 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-5:[ctx-f2a20d6a, job-3207, ctx-a1adfd52]) (logid:5e43ba84) Trying to deploy VM [error decoding VM instance {"id":181,"instanceName":"i-2-181-VM","state":"Running","type":"User","uuid":"fe2be975-720e-4ba3-aa12-707a3c86fd58"}] and details: Plan [{"_dcId":3,"_podId":3,"_clusterId":5,"_recreateDisks":false,"preferredHostIds":[],"migrationPlan":true,"hostPriorities":{}}]; avoid list [{"_hostIds":[46]}] and planner: [null].
   
   2026-05-08 09:13:37,642 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-5:[ctx-f2a20d6a, job-3207]) (logid:5e43ba84) Unexpected exception while executing org.apache.cloudstack.api.command.admin.host.PrepareForHostMaintenanceCmd
   java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "com.cloud.offering.ServiceOffering.getCpu()" is null
           at com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:314)
           at com.cloud.resource.ResourceManagerImpl.getDeployDestination(ResourceManagerImpl.java:1635)
           at com.cloud.resource.ResourceManagerImpl.migrateAwayVmWithVolumes(ResourceManagerImpl.java:1619)
           at com.cloud.resource.ResourceManagerImpl.doMaintain(ResourceManagerImpl.java:1557)
           at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1653)
           at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1710)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.base/java.lang.reflect.Method.invoke(Method.java:569)
           at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
           at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:109)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
           at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
           at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
           at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
           at jdk.proxy3/jdk.proxy3.$Proxy217.maintain(Unknown Source)
           at org.apache.cloudstack.api.command.admin.host.PrepareForHostMaintenanceCmd.execute(PrepareForHostMaintenanceCmd.java:99)
           at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:173)
           at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:110)
           at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:698)
           at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
           at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
           at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:646)
           at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
           at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
           at java.base/java.lang.Thread.run(Thread.java:840)
   
   2026-05-08 09:13:37,643 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-5:[ctx-f2a20d6a, job-3207]) (logid:5e43ba84) Complete async job-3207, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Cannot invoke "java.lang.Integer.intValue()" because the return value of "com.cloud.offering.ServiceOffering.getCpu()" is null"}
   ```
   
   ### **Rolling maintenance**
   ```txt
   2026-05-08 09:06:36,950 DEBUG [c.c.h.d.HostDaoImpl] (API-Job-Executor-3:[ctx-8363e8f5, job-3205, ctx-868bd55f]) (logid:be9b63d5) Resource state update: [id = 46; name = cslab02; old state = Enabled; event = AdminAskMaintenance; new state = PrepareForMaintenance]
   
   2026-05-08 09:06:36,956 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-3:[ctx-8363e8f5, job-3205, ctx-868bd55f]) (logid:be9b63d5) Trying to deploy VM [error decoding VM instance {"id":181,"instanceName":"i-2-181-VM","state":"Running","type":"User","uuid":"fe2be975-720e-4ba3-aa12-707a3c86fd58"}] and details: Plan [{"_dcId":3,"_podId":3,"_clusterId":5,"_recreateDisks":false,"preferredHostIds":[],"migrationPlan":true,"hostPriorities":{}}]; avoid list [{"_hostIds":[46]}] and planner: [null].
   
   2026-05-08 09:06:36,961 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-3:[ctx-8363e8f5, job-3205]) (logid:be9b63d5) Unexpected exception while executing org.apache.cloudstack.api.command.admin.resource.StartRollingMaintenanceCmd
   java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "com.cloud.offering.ServiceOffering.getCpu()" is null
           at com.cloud.deploy.DeploymentPlanningManagerImpl.planDeployment(DeploymentPlanningManagerImpl.java:314)
           at com.cloud.resource.ResourceManagerImpl.getDeployDestination(ResourceManagerImpl.java:1635)
           at com.cloud.resource.ResourceManagerImpl.migrateAwayVmWithVolumes(ResourceManagerImpl.java:1619)
           at com.cloud.resource.ResourceManagerImpl.doMaintain(ResourceManagerImpl.java:1557)
           at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1653)
           at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1710)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
           at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.base/java.lang.reflect.Method.invoke(Method.java:569)
           at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
           at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:109)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
           at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
           at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
           at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
           at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
           at jdk.proxy3/jdk.proxy3.$Proxy217.maintain(Unknown Source)
           at com.cloud.resource.RollingMaintenanceManagerImpl.putHostIntoMaintenance(RollingMaintenanceManagerImpl.java:410)
           at com.cloud.resource.RollingMaintenanceManagerImpl.startRollingMaintenanceHostInCluster(RollingMaintenanceManagerImpl.java:327)
           at com.cloud.resource.RollingMaintenanceManagerImpl.startRollingMaintenance(RollingMaintenanceManagerImpl.java:210)
           at org.apache.cloudstack.api.command.admin.resource.StartRollingMaintenanceCmd.execute(StartRollingMaintenanceCmd.java:129)
           at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:173)
           at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:110)
           at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:698)
           at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
           at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
           at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
           at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:646)
           at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
           at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
           at java.base/java.lang.Thread.run(Thread.java:840)
   
   2026-05-08 09:06:36,962 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-3:[ctx-8363e8f5, job-3205]) (logid:be9b63d5) Complete async job-3205, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Cannot invoke "java.lang.Integer.intValue()" because the return value of "com.cloud.offering.ServiceOffering.getCpu()" is null"}
   ```
   ## Workaround
   Update the service offering in the database:
   ```sql
   mysql> update service_offering set cpu = 2, ram_size = 2048 where uuid='c24ce908-ffaa-4ced-9610-f99fb9f8b4c8' \G
   ```
   
   ### Note
   After applying the workaround, migration under normal maintenance works
   fine between local storages: the VM is live migrated without a problem.
   However, rolling maintenance still fails. Is this related to storage
   locality? The logs show that the entire cluster is being put in the
   Disabled state:
   ```txt
   2026-05-08 09:16:16,047 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-7:[ctx-241b2d24, job-3209, ctx-4f1545be]) (logid:e374bad0) Trying to allocate a host and storage pools from datacenter [Zone {"id": "3", "name": "zone-lab", "uuid": "80be5019-767a-47e1-8ccf-1a91d5032c0c"}], pod [HostPod {"id":3,"name":"pod-lab","uuid":"0aba438c-f7a6-45db-864c-c82a49e3bcd6"}], cluster [Cluster {id: "5", name: "cluster-lab", uuid: "10364cd3-0b5b-4612-876a-ef28f00ff25e"}], to deploy VM [VM instance {"id":181,"instanceName":"i-2-181-VM","state":"Running","type":"User","uuid":"fe2be975-720e-4ba3-aa12-707a3c86fd58"}] with requested CPU [4000] and requested RAM [(2.00 GB) 2147483648].
   2026-05-08 09:16:16,048 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-7:[ctx-241b2d24, job-3209, ctx-4f1545be]) (logid:e374bad0) ROOT volume [Volume {"id":344,"instanceId":181,"name":"ROOT-181","uuid":"0b1bf8e0-3296-48b4-940b-fa6461b2b181","volumeType":"ROOT"}] is not ready to deploy VM [VM instance {"id":181,"instanceName":"i-2-181-VM","state":"Running","type":"User","uuid":"fe2be975-720e-4ba3-aa12-707a3c86fd58"}].
   2026-05-08 09:16:16,050 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-7:[ctx-241b2d24, job-3209, ctx-4f1545be]) (logid:e374bad0) Adding pods [] to the avoid set because these pods are in the Disabled state.
   2026-05-08 09:16:16,051 DEBUG [c.c.d.DeploymentPlanningManagerImpl] (API-Job-Executor-7:[ctx-241b2d24, job-3209, ctx-4f1545be]) (logid:e374bad0) Adding clusters [5] of pod [3] to the void set because these clusters are in the Disabled state.
   ```
   
   ### What to do about it?
   
   I saw https://github.com/apache/cloudstack/pull/9844, which should have
   fixed the rolling maintenance NPE, but the error now seems to be thrown
   from the planDeployment method in DeploymentPlanningManagerImpl.java,
   which both stack traces above pass through. The fix used for rolling
   maintenance should be ported to "normal" maintenance too.
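
For context, the exception message points at auto-unboxing: `getCpu()` returns a null `Integer` for this customized offering, and the planner uses it as an `int`. Below is a minimal Java sketch of that failure and of one possible null-safe fallback. It is not the code from the PR above or from CloudStack itself: the `Offering` class, the method names, and the detail keys `cpuNumber`, `cpuSpeed`, and `memory` are assumptions made for illustration.

```java
import java.util.Map;

/** Illustrative sketch only; every type and name here is hypothetical. */
public class CustomOfferingCpuSketch {

    /** Stand-in for an offering row; cpu and ramSizeMb are null when customized = 1. */
    static final class Offering {
        final Integer cpu;
        final Integer speedMhz;
        final Integer ramSizeMb;

        Offering(Integer cpu, Integer speedMhz, Integer ramSizeMb) {
            this.cpu = cpu;
            this.speedMhz = speedMhz;
            this.ramSizeMb = ramSizeMb;
        }
    }

    /** Failing pattern: unboxing a null Integer throws NullPointerException. */
    static int requestedCpuUnsafe(Offering o) {
        return o.cpu * o.speedMhz;
    }

    /** Use the offering value if set, otherwise the per-VM detail (assumed key names). */
    static int resolve(Integer fromOffering, Map<String, String> vmDetails, String key) {
        if (fromOffering != null) {
            return fromOffering;
        }
        String value = vmDetails.get(key);
        if (value == null) {
            throw new IllegalStateException("No value for '" + key + "' in offering or VM details");
        }
        return Integer.parseInt(value);
    }

    /** Requested CPU capacity in MHz: number of cores times speed per core. */
    static int requestedCpuMhz(Offering o, Map<String, String> vmDetails) {
        return resolve(o.cpu, vmDetails, "cpuNumber") * resolve(o.speedMhz, vmDetails, "cpuSpeed");
    }

    /** Requested RAM in bytes, converted from megabytes. */
    static long requestedRamBytes(Offering o, Map<String, String> vmDetails) {
        return resolve(o.ramSizeMb, vmDetails, "memory") * 1024L * 1024L;
    }

    public static void main(String[] args) {
        // Same shape as the offering in this report: cpu and ram_size NULL, speed 2000.
        Offering custom = new Offering(null, 2000, null);
        Map<String, String> vmDetails = Map.of("cpuNumber", "2", "memory", "2048");

        try {
            requestedCpuUnsafe(custom);
        } catch (NullPointerException e) {
            System.out.println("unsafe path threw NullPointerException");
        }
        System.out.println(requestedCpuMhz(custom, vmDetails));
        System.out.println(requestedRamBytes(custom, vmDetails));
    }
}
```

With 2 cores at 2000 MHz and 2048 MB of RAM, the sketch yields 4000 and 2147483648, the same figures the planner logs after the database workaround.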

