Hi guys,

ACS 4.2 [1] checks whether _pifs.get("private") is null only once, at line 815:

if (_pifs.get("private") == null) {
   s_logger.debug("Failed to get private nic name");
   throw new ConfigurationException("Failed to get private nic name");
}

Also, it keeps looking for a file at "/sys/devices/virtual/net" + _guestBridgeName and executing the command "ovs-vsctl list-br | sed '{:q;N;s/\\n/%/g;t q}'".

The script that configures the pifs under "/sys/devices/virtual/net" is modifyvxlan.sh [2]; it is called from com.cloud.hypervisor.kvm.resource.BridgeVifDriver.java [3] when creating or deleting a Vnet (createVnet, deleteVnetBr).

We need to understand whether something is missing in the flow that should write those files at the expected path. It may be related to the fact that you deleted those bridges; just to check, how did you delete them?

I will see if I can get a better idea of what is going on.

Cheers,
Gabriel.

[1] https://github.com/apache/cloudstack/blob/4.2/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java
[2] https://github.com/apache/cloudstack/blob/87ef8137534fa798101f65c6691fcf71513ac978/scripts/vm/network/vnet/modifyvxlan.sh
[3] https://github.com/apache/cloudstack/blob/87ef8137534fa798101f65c6691fcf71513ac978/plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/resource/BridgeVifDriver.java

On 20/10/2016 13:29, Cloud List wrote:
Hi Rafael,

Many thanks for your help and reply!

Here's the list of files inside /sys/class/net folder:

root@test-kvm-03:/sys/class/net# ls -la
total 0
drwxr-xr-x  2 root root 0 Sep 28 14:49 .
drwxr-xr-x 54 root root 0 Sep 28 14:49 ..
lrwxrwxrwx  1 root root 0 Sep 28 14:49 cloudbr1 -> ../../devices/virtual/net/cloudbr1
lrwxrwxrwx  1 root root 0 Sep 28 14:49 eth0 -> ../../devices/pci0000:00/0000:00:01.0/0000:01:00.1/net/eth0
lrwxrwxrwx  1 root root 0 Sep 28 14:49 eth1 -> ../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/net/eth1
lrwxrwxrwx  1 root root 0 Sep 28 14:49 eth2 -> ../../devices/pci0000:00/0000:00:03.0/0000:02:00.0/net/eth2
lrwxrwxrwx  1 root root 0 Sep 28 14:49 eth3 -> ../../devices/pci0000:00/0000:00:03.0/0000:02:00.1/net/eth3
lrwxrwxrwx  1 root root 0 Sep 28 14:49 eth4 -> ../../devices/pci0000:00/0000:00:09.0/0000:07:00.0/net/eth4
lrwxrwxrwx  1 root root 0 Sep 28 14:49 eth5 -> ../../devices/pci0000:00/0000:00:09.0/0000:07:00.1/net/eth5
lrwxrwxrwx  1 root root 0 Sep 28 14:49 lo -> ../../devices/virtual/net/lo
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet0 -> ../../devices/virtual/net/vnet0
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet1 -> ../../devices/virtual/net/vnet1
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet10 -> ../../devices/virtual/net/vnet10
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet11 -> ../../devices/virtual/net/vnet11
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet12 -> ../../devices/virtual/net/vnet12
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet13 -> ../../devices/virtual/net/vnet13
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet14 -> ../../devices/virtual/net/vnet14
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet2 -> ../../devices/virtual/net/vnet2
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet3 -> ../../devices/virtual/net/vnet3
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet4 -> ../../devices/virtual/net/vnet4
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet5 -> ../../devices/virtual/net/vnet5
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet6 -> ../../devices/virtual/net/vnet6
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet7 -> ../../devices/virtual/net/vnet7
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet8 -> ../../devices/virtual/net/vnet8
lrwxrwxrwx  1 root root 0 Oct 20 16:35 vnet9 -> ../../devices/virtual/net/vnet9

And this is the list of files under /sys/devices/virtual/net folder:

root@test-kvm-03:/sys/devices/virtual/net# pwd
/sys/devices/virtual/net
root@test-kvm-03:/sys/devices/virtual/net# ls -la
total 0
drwxr-xr-x 19 root root 0 Sep 28 14:49 .
drwxr-xr-x 16 root root 0 Sep 28 14:49 ..
drwxr-xr-x  7 root root 0 Sep 28 14:49 cloudbr1
drwxr-xr-x  5 root root 0 Sep 28 14:49 lo
drwxr-xr-x  6 root root 0 Oct 20 16:20 vnet0
drwxr-xr-x  5 root root 0 Oct 20 16:20 vnet1
drwxr-xr-x  6 root root 0 Oct 20 16:22 vnet10
drwxr-xr-x  6 root root 0 Oct 20 16:23 vnet11
drwxr-xr-x  6 root root 0 Oct 20 16:23 vnet12
drwxr-xr-x  6 root root 0 Oct 20 16:23 vnet13
drwxr-xr-x  6 root root 0 Oct 20 16:23 vnet14
drwxr-xr-x  5 root root 0 Oct 20 16:20 vnet2
drwxr-xr-x  6 root root 0 Oct 20 16:20 vnet3
drwxr-xr-x  6 root root 0 Oct 20 16:20 vnet4
drwxr-xr-x  5 root root 0 Oct 20 16:21 vnet5
drwxr-xr-x  6 root root 0 Oct 20 16:21 vnet6
drwxr-xr-x  6 root root 0 Oct 20 16:21 vnet7
drwxr-xr-x  6 root root 0 Oct 20 16:21 vnet8
drwxr-xr-x  6 root root 0 Oct 20 16:21 vnet9

Notably, cloud0 and virbr0 are not listed there, but that could be because I
deleted both bridges earlier as a test. Here's the current output of the
"brctl show" command:

root@test-kvm-03:/sys/devices/virtual/net/cloudbr1# brctl show
bridge name     bridge id               STP enabled     interfaces
cloudbr1                8000.d067e5ec82c0       no              eth1
                                                         vnet0
                                                         vnet10
                                                         vnet11
                                                         vnet12
                                                         vnet13
                                                         vnet14
                                                         vnet3
                                                         vnet4
                                                         vnet6
                                                         vnet7
                                                         vnet8
                                                         vnet9

Really appreciate your help.

Looking forward to your reply, thank you.

Cheers.

-ip-


On Thu, Oct 20, 2016 at 10:11 PM, Rafael Weingärtner <rafaelweingart...@gmail.com> wrote:

I think the source code of 4.9 will be more or less the same.

I missed another bit of code, at lines 1087-1094:
“if (_pifs.get("private") == null)”.

It looks for a file at “/sys/class/net/” + _guestBridgeName. If the file
exists, it executes “_pifs.put("private", _guestBridgeName)”. Does that file
exist for you?
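
The fallback described above can be sketched roughly as follows. This is only
an illustration, not the actual ACS code: the class name, the method name and
the idea of passing the directory in as a parameter are made up here so the
logic can be tried against any directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "private" PIF fallback; names are illustrative only.
final class PrivatePifFallbackSketch {

    // If no "private" entry exists yet and <netClassDir>/<guestBridgeName> exists,
    // register the guest bridge name under the "private" key.
    static void applyFallback(Map<String, String> pifs, Path netClassDir, String guestBridgeName) {
        if (pifs.get("private") == null
                && guestBridgeName != null
                && Files.exists(netClassDir.resolve(guestBridgeName))) {
            pifs.put("private", guestBridgeName);
        }
    }

    public static void main(String[] args) {
        Map<String, String> pifs = new HashMap<>();
        // "cloudbr1" is the bridge name seen in this thread; adjust as needed.
        applyFallback(pifs, Paths.get("/sys/class/net"), "cloudbr1");
        System.out.println("private pif: " + pifs.get("private"));
    }
}
```

If the entry for the guest bridge is missing under the directory, the map is
left untouched and the later null check would still fail.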

There is also another piece of code at lines 1059-1071 that may be used to
get the private PIF name. For that, it lists the files from
“/sys/devices/virtual/net”.
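
To make the directory scan easier to picture, here is a rough sketch of that
kind of lookup, based only on what the agent log prints ("looking in file
.../bridge", "did not find an eth*, bond*, vlan*, em*, or p*p* in .../brif").
It is not the ACS implementation: the class and method names are invented, and
the name patterns are read as plain shell-style globs, which is an assumption.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a bridge/PIF scan; names and glob reading are assumptions.
final class BridgeScanSketch {

    // Glob-style reading of "eth*, bond*, vlan*, em*, or p*p*" from the log message.
    static boolean looksLikePif(String name) {
        return name.matches("(eth|bond|vlan|em).*|p.*p.*");
    }

    // An entry under netDir is treated as a bridge if it contains a "bridge" entry.
    static List<String> findBridges(Path netDir) throws IOException {
        List<String> bridges = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(netDir)) {
            for (Path entry : entries) {
                if (Files.exists(entry.resolve("bridge"))) {
                    bridges.add(entry.getFileName().toString());
                }
            }
        }
        Collections.sort(bridges);
        return bridges;
    }

    // Returns the first (alphabetically) matching name under <bridge>/brif, or null.
    static String findPif(Path netDir, String bridge) throws IOException {
        Path brif = netDir.resolve(bridge).resolve("brif");
        if (!Files.isDirectory(brif)) {
            return null;
        }
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(brif)) {
            for (Path entry : entries) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        for (String name : names) {
            if (looksLikePif(name)) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Path netDir = Paths.get("/sys/devices/virtual/net");
        if (!Files.isDirectory(netDir)) {
            System.out.println(netDir + " not found on this machine");
            return;
        }
        for (String bridge : findBridges(netDir)) {
            System.out.println(bridge + " -> " + findPif(netDir, bridge));
        }
    }
}
```

Under this reading, a bridge whose brif directory holds only vnet* entries
yields no physical interface, while one that contains eth1 yields eth1.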

Can you list the directories “/sys/class/net/” and “/sys/devices/virtual/net”?

I will now check the source code for ACS 4.2, and see if it is the same.

On Thu, Oct 20, 2016 at 12:01 PM, Cloud List <cloud-l...@sg.or.id> wrote:

Hi Rafael,

Thanks for your reply.

Here's the output of the command:

root@test-kvm-03:/var/log/cloudstack/agent# ovs-vsctl list-br | sed '{:q;N;s/\\n/%/g;t q}'
The program 'ovs-vsctl' is currently not installed.  You can install it by typing:
apt-get install openvswitch-switch

I believe the command is only applicable if we are using Open vSwitch. We
are using the normal Ubuntu network bridges rather than Open vSwitch.
Furthermore, we are trying to roll back to 4.2, and this issue happens when
I try to start the agent after downgrading it to 4.2. Shouldn't we be
checking the 4.2 source code instead?

Looking forward to your reply, thank you.

Cheers.


On Thu, Oct 20, 2016 at 8:38 PM, Rafael Weingärtner <rafaelweingart...@gmail.com> wrote:

Hi, Anonymous fellow ;)
Let’s see if I can help you a little bit. I am checking the ACS 4.9 source
code.

The error is thrown at line 896 of class
“com.cloud.hypervisor.kvm.resource.LibvirtComputingResource”.
The condition that causes the error is “_pifs.get("private") == null”.
“_pifs” is a map. The key “private” is added to the map at line 1124 if the
condition “_guestBridgeName != null && bridge.equals(_guestBridgeName)” is
met.
The variable “_guestBridgeName” is a String that receives either the value
of the “guest.network.device” parameter or the value of the
“_privBridgeName” variable. This assignment happens at lines 752-755. The
assignment of “_privBridgeName” happens at lines 747-750. The default value
for “_privBridgeName” is “cloudbr1”; that default can be overridden by the
“private.network.device” parameter.
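
Put as code, the resolution order described above looks roughly like the
sketch below. It is a simplified illustration and not a copy of the ACS
source: the class and method names are invented, the parameters are modelled
as a plain Map, and storing the bridge name as the value of the “private” key
is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the bridge-name resolution; names are illustrative only.
final class BridgeNameSketch {

    // "private.network.device" overrides the default "cloudbr1".
    static String resolvePrivateBridge(Map<String, String> params) {
        String name = params.get("private.network.device");
        return name != null ? name : "cloudbr1";
    }

    // "guest.network.device" wins; otherwise fall back to the private bridge name.
    static String resolveGuestBridge(Map<String, String> params) {
        String name = params.get("guest.network.device");
        return name != null ? name : resolvePrivateBridge(params);
    }

    // The "private" key is only set when one of the listed bridges equals the guest bridge name.
    static Map<String, String> collectPifs(List<String> bridges, String guestBridgeName) {
        Map<String, String> pifs = new HashMap<>();
        for (String bridge : bridges) {
            if (guestBridgeName != null && bridge.equals(guestBridgeName)) {
                pifs.put("private", bridge); // assumption: the value is the bridge name
            }
        }
        return pifs;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>(); // no overrides configured
        String guest = resolveGuestBridge(params);
        List<String> bridges = Arrays.asList("cloud0", "cloudbr1", "virbr0");
        System.out.println(collectPifs(bridges, guest));
    }
}
```

With no overrides, the guest bridge resolves to “cloudbr1”, so “private” is
only set if “cloudbr1” appears in the list of bridges the agent discovers. If
that list comes back empty, the map has no “private” entry and the check
fails.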

Having detailed the parameters, let's see how ACS gets the “bridge” value.
It gets that value at line 1113, from “cmdout.split("%")”. The variable
“cmdout” contains the output of the following OS command:
“ovs-vsctl list-br | sed '{:q;N;s/\\n/%/g;t q}'”.
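
As a small illustration of that parsing step: the sed expression joins the
lines of the bridge list with “%”, and the Java side splits on “%” again. The
sketch below mimics both halves in plain Java so the behaviour can be checked
without Open vSwitch installed. The class and method names are made up, and
the sample bridge names in main are only examples.

```java
import java.util.Arrays;

// Hypothetical sketch of the join-then-split handling of the bridge list.
final class OvsBridgeListSketch {

    // Mimics the effect of the sed loop: join all lines with '%'.
    static String joinLines(String rawOutput) {
        return String.join("%", rawOutput.trim().split("\\R"));
    }

    // Mirrors the cmdout.split("%") step mentioned above.
    static String[] parse(String cmdout) {
        return cmdout.split("%");
    }

    public static void main(String[] args) {
        // Two bridges, one per line, as a bridge listing would print them.
        System.out.println(Arrays.toString(parse(joinLines("cloudbr0\ncloudbr1\n"))));
        // Empty output, e.g. when the command produces nothing on stdout.
        System.out.println(parse(joinLines("")).length);
    }
}
```

Note that splitting an empty string in Java yields a single empty element, so
with no output there is one “bridge” whose name is the empty string, and it
can never equal the configured bridge name.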

Can you run the command and check its output?


On Thu, Oct 20, 2016 at 8:49 AM, Cloud List <cloud-l...@sg.or.id> wrote:
Hi,

We are using ACS versions 4.2 and 4.9 in our test environment, with Ubuntu
12.04 as the operating system and KVM as the hypervisor.

We are trying to simulate an upgrade from ACS 4.2 to 4.9 and a roll-back
from 4.9 to 4.2 in our test environment. The upgrade went smoothly, and the
roll-back also went well, except when we needed to start the agent after
downgrading it.

After uninstalling cloudstack-agent version 4.9 and reinstalling
cloudstack-agent version 4.2, I am not able to start the agent; it fails
with the error messages below:

====
2016-10-20 17:32:28,187 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/cloud0/bridge
2016-10-20 17:32:28,187 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Found bridge cloud0
2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/lo/bridge
2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/cloudbr1/bridge
2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Found bridge cloudbr1
2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet4/bridge
2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet5/bridge
2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet6/bridge
2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet7/bridge
2016-10-20 17:32:28,188 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet8/bridge
2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet9/bridge
2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet0/bridge
2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet1/bridge
2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet2/bridge
2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet3/bridge
2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/virbr0/bridge
2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Found bridge virbr0
2016-10-20 17:32:28,189 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet11/bridge
2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet12/bridge
2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet13/bridge
2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet14/bridge
2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet10/bridge
2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking for pif for bridge cloud0
2016-10-20 17:32:28,190 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'vnet1'
2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'vnet2'
2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'vnet5'
2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) failing to get physical interface from bridge cloud0, did not find an eth*, bond*, vlan*, em*, or p*p* in /sys/devices/virtual/net/cloud0/brif
2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking for pif for bridge cloudbr1
2016-10-20 17:32:28,191 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'eth1'
2016-10-20 17:32:28,192 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking for pif for bridge virbr0
2016-10-20 17:32:28,192 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) failing to get physical interface from bridge virbr0, did not find an eth*, bond*, vlan*, em*, or p*p* in /sys/devices/virtual/net/virbr0/brif
2016-10-20 17:32:28,192 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) done looking for pifs, no more bridges
2016-10-20 17:32:28,192 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Failed to get private nic name
2016-10-20 17:32:28,192 ERROR [cloud.agent.AgentShell] (main:null) (logid:) Unable to start agent: Failed to get private nic name
====

Below is the result of brctl show and the content of
/etc/network/interfaces:

====
root@test-kvm-03:/var/log/cloudstack/agent# brctl show
bridge name     bridge id               STP enabled     interfaces
cloud0          8000.fe00a9fe00f8       no              vnet1
                                                         vnet2
                                                         vnet5
cloudbr1                8000.d067e5ec82c0       no              eth1
                                                         vnet0
                                                         vnet10
                                                         vnet11
                                                         vnet12
                                                         vnet13
                                                         vnet14
                                                         vnet3
                                                         vnet4
                                                         vnet6
                                                         vnet7
                                                         vnet8
                                                         vnet9
virbr0          8000.000000000000       yes
====

/etc/network/interfaces:

====
# The loopback network interface
auto lo
iface lo inet loopback

auto eth1
#iface eth1 inet static
iface eth1 inet manual

auto cloudbr1
iface cloudbr1 inet static
bridge_ports eth1
         address 192.168.0.201
         netmask 255.255.255.0
         network 192.168.0.0
         broadcast 192.168.0.255
         gateway 192.168.0.1
         dns-nameservers 8.8.8.8 8.8.4.4
         dns-search xxxxx.com

auto cloudbr1:0
iface cloudbr1:0 inet static
         address 192.168.3.201
         netmask 255.255.255.0
         network 192.168.3.0
         broadcast 192.168.3.255
====

It seems that the error messages are complaining that no physical interface
has been added to the cloud0 and virbr0 bridges. I tried adding eth1 and
cloudbr1 to those bridges, but it didn't work. Deleting the cloud0 and
virbr0 bridges doesn't help either; the agent still complains that it cannot
find the "pifs":

====
2016-10-20 18:00:48,331 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/lo/bridge
2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/cloudbr1/bridge
2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Found bridge cloudbr1
2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet4/bridge
2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet5/bridge
2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet6/bridge
2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet7/bridge
2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet8/bridge
2016-10-20 18:00:48,332 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet9/bridge
2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet0/bridge
2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet1/bridge
2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet2/bridge
2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet3/bridge
2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet11/bridge
2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet12/bridge
2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet13/bridge
2016-10-20 18:00:48,333 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet14/bridge
2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking in file /sys/devices/virtual/net/vnet10/bridge
2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) looking for pif for bridge cloudbr1
2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) matchPifFileInDirectory: file name 'eth1'
2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) done looking for pifs, no more bridges
2016-10-20 18:00:48,334 DEBUG [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Failed to get private nic name
2016-10-20 18:00:48,334 ERROR [cloud.agent.AgentShell] (main:null) (logid:) Unable to start agent: Failed to get private nic name
====

I understand that the required bridge information is supposed to be added
by CloudStack when the host is added. Is there a way to add the bridge
information again manually, without having to delete and re-add the host in
CloudStack? We want to keep the VMs running during the downgrade, and
deleting and re-adding the host in CloudStack will shut down the VMs.

Any advice is greatly appreciated.

Thank you.

-ip-



--
Rafael Weingärtner