Another update:
It's 100% confirmed to be traffic shaping set by CloudStack. I don't know
where/how/why, and I'd love some help with this. Should I create a new
thread? As previously mentioned, I don't believe I've set a cap below
100 Mb/s ANYWHERE in CloudStack: not in compute offerings, not in network
offerings, and not in the default throttle (which is set at 200).
What am I missing?
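In case it helps anyone reproduce the check, this is roughly how I
confirmed the shaping on the host (vnet1 is just an example; substitute
the vnet interface backing your instance):

# Show the qdiscs attached to the guest interface
tc -s qdisc show dev vnet1
# The ingress rate cap, if present, lives in a policing filter here
tc filter show dev vnet1 parent ffff:
# Egress side: classes and filters under the root qdisc
tc class show dev vnet1
tc filter show dev vnet1 root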
I removed the tc rules on the host for two test instances and bandwidth
shot up.
Before:
ubuntu@testserver01:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.4 sec 6.62 MBytes 5.35 Mbits/sec
[ 5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
[ 5] 0.0-10.5 sec 6.62 MBytes 5.28 Mbits/sec
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
[ 4] 0.0-10.4 sec 6.62 MBytes 5.37 Mbits/sec
[ 5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
[ 5] 0.0-10.3 sec 6.62 MBytes 5.37 Mbits/sec
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
[ 4] 0.0-10.5 sec 6.62 MBytes 5.30 Mbits/sec
Removed the rules for two instances on the same host:
ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
And all of a sudden, those two instances are at blazing speeds:
ubuntu@testserver01:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 14.8 GBytes 12.7 Gbits/sec
[ 5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
[ 5] 0.0-10.0 sec 19.1 GBytes 16.4 Gbits/sec
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
[ 4] 0.0-10.0 sec 19.0 GBytes 16.3 Gbits/sec
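As a stopgap, something like the following should re-apply a sane cap by
hand until the CloudStack side is sorted out (a rough sketch on my part,
untested; 200mbit matches my intended default, and vnet1 is again just an
example interface):

# Egress: token bucket filter at 200 Mbit/s
sudo tc qdisc add dev vnet1 root tbf rate 200mbit burst 256k latency 50ms
# Ingress: police incoming traffic to the same rate
sudo tc qdisc add dev vnet1 ingress
sudo tc filter add dev vnet1 parent ffff: protocol ip u32 \
    match ip src 0.0.0.0/0 police rate 200mbit burst 256k drop flowid :1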
On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <[email protected]> wrote:
First,
THANK YOU FOR REPLYING!
Second, yes, it's currently set at 200.
The network rate in the compute offering is either blank (or, when I
tested it, 1000). The network rate limit in the network offering is either
100, 1000, or blank.
Those are the only network throttling parameters I'm aware of; are there
any others that I missed? Is it possible disk I/O is somehow coming into
play here?
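For reference, the throttling-related global settings I know of can be
read back via the API, e.g. with cloudmonkey (setting names as I
understand them from the 4.3 docs):

cloudmonkey list configurations name=network.throttling.rate
cloudmonkey list configurations name=vm.network.throttling.rate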
This happens regardless of whether the instance network uses a virtual
router or is directly connected to a VLAN (i.e., no virtual router), and
it happens when two instances are directly connected to each other.
On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <[email protected]> wrote:
Nick
Have you checked the network throttle settings in "global settings" and
wherever else they may be defined?
regards
ilya
On 8/17/14, 11:27 AM, Nick Burke wrote:
Update:
After running iperf between instances on the same virtual network, it
looks like no instance can get more than 2 Mb/s. Additionally, it's
sporadic, ranging from under 1 Mb/s up to 2 Mb/s, but never more:
user@localhost:~$ iperf -c 10.1.0.1 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.1.0.1, TCP port 5001
TCP window size: 86.8 KByte (default)
------------------------------------------------------------
[ 5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-11.0 sec 1.25 MBytes 950 Kbits/sec
[ 4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
[ 4] 0.0-11.1 sec 2.50 MBytes 1.89 Mbits/sec
user@localhost:~$ iperf -c 10.1.0.1 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.1.0.1, TCP port 5001
TCP window size: 50.3 KByte (default)
------------------------------------------------------------
[ 5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-12.6 sec 1.25 MBytes 834 Kbits/sec
[ 4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
[ 4] 0.0-11.9 sec 2.13 MBytes 1.49 Mbits/sec
On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <[email protected]> wrote:
I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything and
it was all working great. However, I had to perform some maintenance and
had to restart everything. Now, I'm seeing packet loss on all virtuals,
even ones on the same host.
sudo ping -c 500 -f 172.20.1.1
PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
........................................
--- 172.20.1.1 ping statistics ---
500 packets transmitted, 460 received, 8% packet loss, time 864ms
rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328 ms
No interface errors are reported anywhere, and the host itself isn't
under load at all. It doesn't matter whether the instance uses e1000 or
virtio drivers. The only thing I'm aware of that changed is that I had to
reboot all the physical servers.
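For completeness, "no interface errors" means checks along these lines on
both the guest vnet interfaces and the physical uplink (eth0 is an
example name):

# Per-interface RX/TX error and drop counters
ip -s link show dev vnet1
# NIC hardware counters on the uplink
sudo ethtool -S eth0 | grep -iE 'err|drop'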
It could be related, but I was hit by the
https://issues.apache.org/jira/browse/CLOUDSTACK-6464 bug. I did follow
Marcus' suggestion:
*"This is a shot in the dark, but there have been some issues around
upgrades that involve the cloud.vlan table expected contents
changing.
New
4.3 installs using vlan isolation don't seem to reproduce the issue.
I'll
see if I can reproduce anything like this with basic and/or non-vlan
isolated upgrades/installs. Can anyone experiencing an issue look at
their
database via something like "select * from cloud.vlan" and look
at the
vlan_id. If you see something like "untagged" instead of
"vlan://untagged",
please try changing it and see if that helps."*
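For anyone following along, the check and fix Marcus describes would look
something like this in MySQL (the UPDATE is my reading of his suggestion,
not an official migration step; back up the database first):

mysql> SELECT id, vlan_id FROM cloud.vlan;
mysql> UPDATE cloud.vlan SET vlan_id = 'vlan://untagged'
    ->   WHERE vlan_id = 'untagged';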
--
Nick
*'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
unafraid to destroy itself in growing into a tree.' -David Zindell, A
Requiem for Homo Sapiens*