[jira] [Commented] (CLOUDSTACK-3294) CLONE - System VMs not coming up due to “InsufficientServerCapacityException”.(not consistently reproducible)

Nitin Mehta (JIRA) Sun, 30 Jun 2013 02:30:58 -0700

    [ 
https://issues.apache.org/jira/browse/CLOUDSTACK-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696310#comment-13696310
 ]


Nitin Mehta commented on CLOUDSTACK-3294:
-----------------------------------------

CLOUDSTACK-2813 has the short-term fix, but we need to look at resource 
cleanup holistically, at least for virtual machines, and have a better 
failover in case the cleanup fails.

Some ideas:
Add something like a cleanup flag that is set when the cleanup didn't work, 
and release the leftover resources before the next retry of VM deployment, the 
expunge thread, etc. I am not convinced this is the most elegant solution, 
though. Is this OK?
I was talking to Murali and he suggested that, long term, we could make 
acquiring resources transactional, or enhance a framework like Journal to keep 
a log of the resources acquired and then release them. Any ideas?
If we go down the path of checking, use case by use case, why resource cleanup 
can fail, as the fix in CLOUDSTACK-2813 does, we will end up with a lot of 
flags and if/else conditions. While that fix solves this problem, I still see 
loopholes in our cleanup approach. At the minimum we should start checking the 
cleanup() response. If it returns false, cleanup is not done yet and needs to 
be taken care of later (say, before another retry of VM deployment or in the 
next expunge cycle). The next step could be making the cleanup function itself 
more robust (for example, _networkMgr.release throws an exception and we just 
do nothing right now).
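
To make the ideas above concrete, here is a rough Java sketch of a journal of 
acquired resources combined with a "cleanup pending" flag. It is only an 
illustration under assumptions: Releasable, ResourceJournal, VmRecord, flaky 
and releasePendingBeforeRetry are hypothetical names made up for this example 
and are not existing CloudStack classes or methods. A release that throws is 
treated the same as one that returns false, so the failure is recorded instead 
of being silently ignored.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: none of these types exist in CloudStack.
final class CleanupSketch {

    /** Something acquired during VM deployment that can be released again. */
    interface Releasable {
        String name();
        boolean release() throws Exception; // false means "not released yet"
    }

    /** Journal of acquired resources, released in reverse acquisition order. */
    static final class ResourceJournal {
        private final Deque<Releasable> acquired = new ArrayDeque<>();

        void record(Releasable r) {
            acquired.push(r);
        }

        /** Returns true only if every recorded resource was released. */
        boolean releaseAll() {
            Deque<Releasable> failed = new ArrayDeque<>();
            while (!acquired.isEmpty()) {
                Releasable r = acquired.pop();
                boolean ok;
                try {
                    ok = r.release();
                } catch (Exception e) {
                    ok = false; // a throwing release counts as a failed release
                }
                if (!ok) {
                    failed.addLast(r);
                }
            }
            acquired.addAll(failed); // keep the failures for a later attempt
            return acquired.isEmpty();
        }
    }

    /** Stand-in for a VM entry that remembers whether cleanup is still owed. */
    static final class VmRecord {
        final ResourceJournal journal = new ResourceJournal();
        boolean cleanupPending;
    }

    /** Checks the cleanup result instead of assuming success. */
    static void cleanup(VmRecord vm) {
        vm.cleanupPending = !vm.journal.releaseAll();
    }

    /** To be called before a deployment retry or from an expunge cycle. */
    static void releasePendingBeforeRetry(VmRecord vm) {
        if (vm.cleanupPending) {
            cleanup(vm);
        }
    }

    /** Test resource whose release throws the given number of times first. */
    static Releasable flaky(String label, int failures) {
        return new Releasable() {
            private int remaining = failures;

            public String name() {
                return label;
            }

            public boolean release() throws Exception {
                if (remaining > 0) {
                    remaining--;
                    throw new Exception("release failed: " + label);
                }
                return true;
            }
        };
    }

    public static void main(String[] args) {
        VmRecord vm = new VmRecord();
        vm.journal.record(flaky("management-ip", 0));
        vm.journal.record(flaky("nic", 1)); // first release attempt throws
        cleanup(vm);
        System.out.println("pending after first cleanup: " + vm.cleanupPending);
        releasePendingBeforeRetry(vm);
        System.out.println("pending after retry: " + vm.cleanupPending);
    }
}
```

In this sketch the first cleanup leaves the flag set because one release 
throws, and the call made before the retry clears it. A real implementation 
would have to persist both the journal and the flag (for example in the 
database) so that they survive a management server restart.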

                
> CLONE - System VMs not coming up due to 
> “InsufficientServerCapacityException”.(not consistently reproducible)
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-3294
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3294
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the 
> default.) 
>          Components: Management Server
>    Affects Versions: 4.2.0
>            Reporter: Nitin Mehta
>            Priority: Critical
>             Fix For: 4.2.0
>
>         Attachments: management-server.zip
>
>
> Steps:
> 1.    Have a CS setup with an advanced zone.
> 2.    Create some user VMs.
> 3.    Create VPCs and VMs under the VPCs.
> 4.    Shut down the host (Xen) and the MS.
> 5.    Start the host and the MS.
> Observation:
> The SSVM and CPVM did not come up, failing with 
> “InsufficientServerCapacityException”.
> The dashboard showed the management IPs as exhausted.
> Deleted all the VMs; the IPs were still not released.
> The table below shows that all the management IPs are reserved.
> mysql> select * from op_dc_ip_address_alloc;
> +----+---------------+----------------+--------+--------+--------------------------------------+---------------------+-------------+
> | id | ip_address    | data_center_id | pod_id | nic_id | reservation_id                       | taken               | mac_address |
> +----+---------------+----------------+--------+--------+--------------------------------------+---------------------+-------------+
> |  1 | 10.147.40.181 |              1 |      1 |     34 | 48d95839-6fb1-4bc4-b23a-c9f1891bf1fa | 2013-05-31 17:10:06 |           1 |
> |  2 | 10.147.40.182 |              1 |      1 |      3 | a7b9610c-9319-478c-84e4-e70be099cd9d | 2013-05-31 17:07:29 |           2 |
> |  3 | 10.147.40.183 |              1 |      1 |      7 | 238830cd-8cbe-411e-8016-352129885df6 | 2013-05-31 17:07:30 |           3 |
> |  4 | 10.147.40.184 |              1 |      1 |      7 | 70f091d4-acb4-435b-bfde-9bdb35bcfa6b | 2013-05-31 17:09:15 |           4 |
> |  5 | 10.147.40.185 |              1 |      1 |     29 | 14690352-e9a0-4695-a834-0552175f7684 | 2013-05-31 17:08:45 |           5 |
> |  6 | 10.147.40.186 |              1 |      1 |     30 | 14690352-e9a0-4695-a834-0552175f7684 | 2013-05-31 17:08:45 |           6 |
> |  7 | 10.147.40.187 |              1 |      1 |      4 | a7b9610c-9319-478c-84e4-e70be099cd9d | 2013-05-31 17:07:29 |           7 |
> |  8 | 10.147.40.188 |              1 |      1 |      7 | ea8644d1-7801-4dbb-aa0c-204f31e922a1 | 2013-05-31 17:08:25 |           8 |
> |  9 | 10.147.40.189 |              1 |      1 |      7 | 245e0082-d697-454d-9689-b36cc3b6e113 | 2013-05-31 17:11:16 |           9 |
> | 10 | 10.147.40.190 |              1 |      1 |      7 | 094e371a-da69-44e0-80fd-14c2d090e935 | 2013-05-31 17:10:15 |          10 |
> +----+---------------+----------------+--------+--------+--------------------------------------+---------------------+-------------+
> As all the IPs were in the reserved state, the SSVM and CPVM could not come up.
> I was not able to reproduce this issue again.
> The server log is attached.

