Hi Alena,
thank you for your help.
The query returns no rows, i.e. nics.removed was not null. I removed
the row anyway to see what would happen: a new virtual router was created,
which also couldn't be started due to the same NPE. I reverted the
change by restoring from the dump.
I should mention that prior to the restart, r-7-VM was the router
used by my instances. I deleted it using the UI after the first
occurrence of the NPE, because a post about a similar problem suggested
that a deleted router would be recreated (and that this procedure
had solved the problem there).
Below I have attached the state of the two tables.
Anything else I can try?
Thank you
Kambiz
mysql> select n.id, n.removed, n.ip4_address, n.netmask, n.gateway, n.ip_type,
n.reserver_name, n.network_id, i.id as instance_id, i.name, i.state, i.type
from vm_instance i join nics n on n.instance_id = i.id where i.type =
'DomainRouter';
+----+---------------------+---------------+---------------+-------------+---------+--------------------------+------------+-------------+---------+-----------+--------------+
| id | removed             | ip4_address   | netmask       | gateway     | ip_type | reserver_name            | network_id | instance_id | name    | state     | type         |
+----+---------------------+---------------+---------------+-------------+---------+--------------------------+------------+-------------+---------+-----------+--------------+
|  9 | 2014-03-17 11:27:58 | 10.124.99.1   | 255.255.255.0 | NULL        | NULL    | ExternalGuestNetworkGuru |        204 |           4 | r-4-VM  | Expunging | DomainRouter |
| 10 | 2014-03-17 11:27:58 | NULL          | NULL          | NULL        | NULL    | ControlNetworkGuru       |        202 |           4 | r-4-VM  | Expunging | DomainRouter |
| 11 | 2014-03-17 11:27:58 | 10.193.17.139 | 255.255.255.0 | 10.193.17.1 | NULL    | PublicNetworkGuru        |        200 |           4 | r-4-VM  | Expunging | DomainRouter |
| 14 | 2014-03-17 11:27:52 | 10.124.99.1   | 255.255.255.0 | NULL        | NULL    | ExternalGuestNetworkGuru |        205 |           7 | r-7-VM  | Expunging | DomainRouter |
| 15 | 2014-03-17 11:27:52 | NULL          | NULL          | NULL        | NULL    | ControlNetworkGuru       |        202 |           7 | r-7-VM  | Expunging | DomainRouter |
| 16 | 2014-03-17 11:27:52 | 10.193.17.190 | 255.255.255.0 | 10.193.17.1 | NULL    | PublicNetworkGuru        |        200 |           7 | r-7-VM  | Expunging | DomainRouter |
| 26 | 2014-03-18 08:11:16 | 10.124.99.1   | 255.255.255.0 | NULL        | NULL    | ExternalGuestNetworkGuru |        205 |          18 | r-18-VM | Expunging | DomainRouter |
| 27 | 2014-03-18 08:11:16 | NULL          | NULL          | NULL        | NULL    | ControlNetworkGuru       |        202 |          18 | r-18-VM | Expunging | DomainRouter |
| 28 | 2014-03-18 08:11:16 | 10.193.17.190 | 255.255.255.0 | 10.193.17.1 | NULL    | PublicNetworkGuru        |        200 |          18 | r-18-VM | Expunging | DomainRouter |
| 29 | NULL                | 10.124.99.1   | 255.255.255.0 | NULL        | NULL    | ExternalGuestNetworkGuru |        205 |          19 | r-19-VM | Stopped   | DomainRouter |
| 30 | NULL                | NULL          | NULL          | NULL        | NULL    | ControlNetworkGuru       |        202 |          19 | r-19-VM | Stopped   | DomainRouter |
| 31 | NULL                | 10.193.17.190 | 255.255.255.0 | 10.193.17.1 | NULL    | PublicNetworkGuru        |        200 |          19 | r-19-VM | Stopped   | DomainRouter |
+----+---------------------+---------------+---------------+-------------+---------+--------------------------+------------+-------------+---------+-----------+--------------+
mysql> select * from router_network_ref;
+----+-----------+------------+------------+
| id | router_id | network_id | guest_type |
+----+-----------+------------+------------+
| 1 | 4 | 204 | Isolated |
| 2 | 7 | 205 | Isolated |
| 3 | 18 | 205 | Isolated |
| 4 | 19 | 205 | Isolated |
+----+-----------+------------+------------+
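
As a cross-check of the "no rows" result, here is a minimal sketch that
replays the check from the quoted message below against the rows shown
above. It uses Python's sqlite3 module as a stand-in for MySQL; the helper
names (load, stale_refs) and the reduced table schemas are assumptions made
for illustration and are not CloudStack code.

```python
import sqlite3

# Rows copied from the two result sets above, reduced to the columns the
# check needs. The reduced schema is an assumption for this sketch.
NICS = [  # (id, instance_id, network_id, removed)
    (9, 4, 204, "2014-03-17 11:27:58"),
    (10, 4, 202, "2014-03-17 11:27:58"),
    (11, 4, 200, "2014-03-17 11:27:58"),
    (14, 7, 205, "2014-03-17 11:27:52"),
    (15, 7, 202, "2014-03-17 11:27:52"),
    (16, 7, 200, "2014-03-17 11:27:52"),
    (26, 18, 205, "2014-03-18 08:11:16"),
    (27, 18, 202, "2014-03-18 08:11:16"),
    (28, 18, 200, "2014-03-18 08:11:16"),
    (29, 19, 205, None),
    (30, 19, 202, None),
    (31, 19, 200, None),
]
ROUTER_NETWORK_REF = [  # (id, router_id, network_id, guest_type)
    (1, 4, 204, "Isolated"),
    (2, 7, 205, "Isolated"),
    (3, 18, 205, "Isolated"),
    (4, 19, 205, "Isolated"),
]


def load(conn):
    """Create the two reduced tables and fill them with the rows above."""
    conn.executescript(
        """
        CREATE TABLE nics (id INTEGER PRIMARY KEY, instance_id INTEGER,
                           network_id INTEGER, removed TEXT);
        CREATE TABLE router_network_ref (id INTEGER PRIMARY KEY,
                                         router_id INTEGER,
                                         network_id INTEGER,
                                         guest_type TEXT);
        """
    )
    conn.executemany("INSERT INTO nics VALUES (?, ?, ?, ?)", NICS)
    conn.executemany(
        "INSERT INTO router_network_ref VALUES (?, ?, ?, ?)",
        ROUTER_NETWORK_REF,
    )


def stale_refs(conn, router_id):
    """Return router_network_ref rows of one router that have no matching
    nic with removed IS NULL (the same condition as the suggested query)."""
    sql = """
        SELECT id, router_id, network_id
        FROM router_network_ref
        WHERE router_id = ?
          AND network_id NOT IN (
              SELECT network_id FROM nics
              WHERE instance_id = ? AND removed IS NULL)
        ORDER BY id
    """
    return conn.execute(sql, (router_id, router_id)).fetchall()


conn = sqlite3.connect(":memory:")
load(conn)
for router_id in (4, 7, 18, 19):
    print(router_id, stale_refs(conn, router_id))
```

For r-19-VM the nic in network 205 is still present with removed = NULL, so
the check finds nothing to clean up; the refs it does flag all belong to
routers that are already in state Expunging.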
Alena Prokharchyk wrote:
>
> The error happens not because the IP is null, but because the nic in a certain
> network can't be found. Looks like there is a bug in the VPC nic
> plug/unplug process for Guest networks.
>
> Kambiz, please do the following to fix it:
>
> 1) Stop the MS
> 2) Take the DB dump of cloud db in case you have to revert back.
> 3) Run the query:
>
> select * from router_network_ref where router_id=<ID of your VR> and
> network_id not in (select network_id from nics where instance_id=<ID of
> your VR> and removed is null);
>
> It will give you the list of network refs that somehow weren't cleaned up
> during the nic detach. Remove the returned entries from the router_network_ref
> table.
>
> Let me know how it works.
>
> -Alena.
>
>
> On 3/21/14, 3:36 PM, "Kambiz Darabi" wrote:
>
>>Hello,
>>
>>as this is my first post to the list, I would like to thank all
>>contributors for CloudStack, which I have been using since last fall without any
>>problems. I run 4.1.1 with KVM and advanced networking.
>>
>>After a restart of the management server (stopping and starting the java
>>process), the virtual domain router doesn't start and
>>management-server.log shows a NullPointerException in
>>NetworkModelImpl.getIpInNetwork (cf. stack trace below).
>>
>>By putting the server in debug mode and remote debugging, I found out
>>that the reason is a row in the table nics which has NULL in ip4_address (cf. row
>>with id 30 in the result of the select statement below).
>>
>>What can I do to quickly solve this problem? Any pointers or suggestions
>>are appreciated as the system is currently unusable.
>>
>>Thank you for your help
>>
>>
>>Kambiz
>>
>>
>>management-server.log:
>>
>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking VirtualRouter to prepare for
>>Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking Ovs to prepare for
>>Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking SecurityGroupProvider to prepare for
>>Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking VpcVirtualRouter to prepare for
>>Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>2014-03-18 10:03:27,151 WARN [network.element.VpcVirtualRouterElement]
>>(Job-Executor-1:job-176) Network Ntwk[205|Guest|8] is not associated with
>>any VPC
>>2014-03-18 10:03:27,151 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking NiciraNvp to prepare for
>>Nic[29-19-30e229ba-21bd-4ab5-8570-9f495bce5019-10.124.99.1]
>>2014-03-18 10:03:27,151 DEBUG [network.element.NiciraNvpElement]
>>(Job-Executor-1:job-176) Checking if NiciraNvpElement can handle service
>>Connectivity on network net1
>>2014-03-18 10:03:27,153 DEBUG [cloud.network.NetworkModelImpl]
>>(Job-Executor-1:job-176) Service SecurityGroup is not supported in the
>>network id=205
>>2014-03-18 10:03:27,156 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Lock is acquired for network id 202 as a part of
>>network implement
>>2014-03-18 10:03:27,156 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Network id=202 is already implemented
>>2014-03-18 10:03:27,157 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Lock is released for network id 202 as a part of
>>network implement
>>2014-03-18 10:03:27,187 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking VirtualRouter to prepare for
>>Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>2014-03-18 10:03:27,187 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking Ovs to prepare for
>>Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>2014-03-18 10:03:27,187 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking SecurityGroupProvider to prepare for
>>Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>2014-03-18 10:03:27,187 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking VpcVirtualRouter to prepare for
>>Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>2014-03-18 10:03:27,187 WARN [network.element.VpcVirtualRouterElement]
>>(Job-Executor-1:job-176) Network Ntwk[202|Control|3] is not associated
>>with any VPC
>>2014-03-18 10:03:27,188 DEBUG [cloud.network.NetworkManagerImpl]
>>(Job-Executor-1:job-176) Asking NiciraNvp to prepare for
>>Nic[30-19-30e229ba-21bd-4ab5-8570-9f495bce5019-169.254.3.99]
>>2014-03-18 10:03:27,188 DEBUG [network.element.NiciraNvpElement]
>>(Job-Executor-1:job-176) Checking if NiciraNvpElement can handle service
>>Connectivity on network null
>>2014-03-18 10:03:27,190 DEBUG [cloud.storage.StorageManagerImpl]
>>(Job-Executor-1:job-176) Checking if we need to prepare 1 volumes for
>>VM[DomainRouter|r-19-VM]
>>2014-03-18 10:03:27,190 DEBUG [cloud.storage.StorageManagerImpl]
>>(Job-Executor-1:job-176) No need to recreate the volume:
>>Vol[24|vm=19|ROOT], since it already has a pool assigned: 200, adding
>>disk to VM
>>2014-03-18 10:03:27,224 DEBUG
>>[network.router.VirtualNetworkApplianceManagerImpl]
>>(Job-Executor-1:job-176) Boot Args for VM[DomainRouter|r-19-VM]:
>>template=domP name=r-19-VM eth2ip=10.193.17.190 eth2mask=255.255.255.0
>>gateway=10.193.17.1 eth0ip=10.124.99.1 eth0mask=255.255.255.0
>>domain=cs6cloud.internal dhcprange=10.124.99.1 eth0ip=169.254.3.99
>>eth0mask=255.255.0.0 type=router disable_rp_filter=true dns1=10.193.17.1
>>2014-03-18 10:03:27,343 DEBUG
>>[network.router.VirtualNetworkApplianceManagerImpl]
>>(Job-Executor-1:job-176) Found 8 ip(s) to apply as a part of domR
>>VM[DomainRouter|r-19-VM] start.
>>2014-03-18 10:03:27,415 DEBUG
>>[network.router.VirtualNetworkApplianceManagerImpl]
>>(Job-Executor-1:job-176) Resending ipAssoc, port forwarding, load
>>balancing rules as a part of Virtual router start
>>2014-03-18 10:03:27,499 DEBUG
>>[network.router.VirtualNetworkApplianceManagerImpl]
>>(Job-Executor-1:job-176) Found 12 firewall Egress rule(s) to apply as a
>>part of domR VM[DomainRouter|r-19-VM] start.
>>2014-03-18 10:03:27,593 ERROR [cloud.vm.VirtualMachineManagerImpl]
>>(Job-Executor-1:job-176) Failed to start instance VM[DomainRouter|r-19-VM]
>>java.lang.NullPointerException
>>    at com.cloud.network.NetworkModelImpl.getIpInNetwork(NetworkModelImpl.java:763)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.finalizeNetworkRulesForNetwork(VirtualNetworkApplianceManagerImpl.java:2346)
>>    at com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl.finalizeNetworkRulesForNetwork(VpcVirtualNetworkApplianceManagerImpl.java:928)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.finalizeCommandsOnStart(VirtualNetworkApplianceManagerImpl.java:2241)
>>    at com.cloud.network.router.VpcVirtualNetworkApplianceManagerImpl.finalizeCommandsOnStart(VpcVirtualNetworkApplianceManagerImpl.java:767)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.finalizeDeployment(VirtualNetworkApplianceManagerImpl.java:2205)
>>    at com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:763)
>>    at com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:471)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.start(VirtualNetworkApplianceManagerImpl.java:2616)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.startVirtualRouter(VirtualNetworkApplianceManagerImpl.java:1824)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.startRouters(VirtualNetworkApplianceManagerImpl.java:1924)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.deployVirtualRouterInGuestNetwork(VirtualNetworkApplianceManagerImpl.java:1902)
>>    at com.cloud.network.element.VirtualRouterElement.implement(VirtualRouterElement.java:175)
>>    at com.cloud.network.NetworkManagerImpl.implementNetworkElementsAndResources(NetworkManagerImpl.java:1518)
>>    at com.cloud.network.NetworkManagerImpl.implementNetwork(NetworkManagerImpl.java:1434)
>>    at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>>    at com.cloud.network.NetworkManagerImpl.startNetwork(NetworkManagerImpl.java:2435)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.startRouter(VirtualNetworkApplianceManagerImpl.java:2855)
>>    at com.cloud.network.router.VirtualNetworkApplianceManagerImpl.startRouter(VirtualNetworkApplianceManagerImpl.java:2824)
>>    at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>>    at org.apache.cloudstack.api.command.admin.router.StartRouterCmd.execute(StartRouterCmd.java:103)
>>
>>
>>table nics:
>>
>>mysql> select * from nics where reserver_name = 'ControlNetworkGuru';
>>+----+--------------------------------------+-------------+-------------------+---------------+-------------+-------------+---------+---------------+------------+--------+--------------+----------+--------------------+--------------------------------------+-----------+---------------------+---------------+-------------+-------------+--------------------+---------------------+---------------------+-------------+----------+
>>| id | uuid                                 | instance_id | mac_address       | ip4_address   | netmask     | gateway     | ip_type | broadcast_uri | network_id | mode   | state        | strategy | reserver_name      | reservation_id                       | device_id | update_time         | isolation_uri | ip6_address | default_nic | vm_type            | created             | removed             | ip6_gateway | ip6_cidr |
>>+----+--------------------------------------+-------------+-------------------+---------------+-------------+-------------+---------+---------------+------------+--------+--------------+----------+--------------------+--------------------------------------+-----------+---------------------+---------------+-------------+-------------+--------------------+---------------------+---------------------+-------------+----------+
>>|  2 | 289aacb8-cfd7-4879-a632-6cfbda36cbf4 |           1 | 0e:00:a9:fe:00:55 | 169.254.0.85  | 255.255.0.0 | 169.254.0.1 | Ip4     | NULL          |        202 | Static | Reserved     | Start    | ControlNetworkGuru | 993864b4-9dde-47d6-8fd6-cf94050442c6 |         0 | 2014-03-17 22:21:38 | NULL          | NULL        |           0 | SecondaryStorageVm | 2013-09-06 12:44:42 | NULL                | NULL        | NULL     |
>>|  6 | 5fdf4b1a-b90c-4c79-9d42-9eaf87eaa042 |           2 | 0e:00:a9:fe:02:d3 | 169.254.2.211 | 255.255.0.0 | 169.254.0.1 | Ip4     | NULL          |        202 | Static | Reserved     | Start    | ControlNetworkGuru | 852e0a65-c72a-448f-ac71-2bb3549a5a41 |         0 | 2014-03-17 22:21:38 | NULL          | NULL        |           0 | ConsoleProxy       | 2013-09-06 12:44:42 | NULL                | NULL        | NULL     |
>>| 10 | 4c4e6368-95d7-419a-a9b3-a5bb394197f0 |           4 | NULL              | NULL          | NULL        | NULL        | NULL    | NULL          |        202 | Static | Deallocating | Start    | ControlNetworkGuru | c28e8ddc-c106-462e-96c8-5d5216dad9b7 |         1 | 2014-03-17 12:27:58 | NULL          | NULL        |           0 | DomainRouter       | 2013-09-10 08:08:39 | 2014-03-17 11:27:58 | NULL        | NULL     |
>>| 15 | 1f2e99c0-9cd9-47aa-ab10-f190efd7a2dc |           7 | NULL              | NULL          | NULL        | NULL        | NULL    | NULL          |        202 | Static | Deallocating | Start    | ControlNetworkGuru | ca1aa99e-e630-4533-9642-523d8a8b1fea |         1 | 2014-03-17 12:27:52 | NULL          | NULL        |           0 | DomainRouter       | 2013-09-12 10:58:03 | 2014-03-17 11:27:52 | NULL        | NULL     |
>>| 27 | 1c98c4f2-f604-4a38-a813-f68833b1d250 |          18 | NULL              | NULL          | NULL        | NULL        | NULL    | NULL          |        202 | Static | Deallocating | Start    | ControlNetworkGuru | ad8e0e50-72aa-4c68-8634-8dc89f12fe01 |         1 | 2014-03-18 09:11:16 | NULL          | NULL        |           0 | DomainRouter       | 2014-03-17 11:28:50 | 2014-03-18 08:11:16 | NULL        | NULL     |
>>| 30 | cabd4cd9-c39f-423f-ad6a-ee3affe0bd9d |          19 | NULL              | NULL          | NULL        | NULL        | NULL    | NULL          |        202 | Static | Allocated    | Start    | ControlNetworkGuru | e81ba56d-a101-4c60-b44f-a0890d56aad9 |         1 | 2014-03-18 09:11:44 | NULL          | NULL        |           0 | DomainRouter       | 2014-03-18 08:11:32 | NULL                | NULL        | NULL     |
>>+----+--------------------------------------+-------------+-------------------+---------------+-------------+-------------+---------+---------------+------------+--------+--------------+----------+--------------------+--------------------------------------+-----------+---------------------+---------------+-------------+-------------+--------------------+---------------------+---------------------+-------------+----------+
>>
>>