Another update:
It's 100% confirmed to be traffic shaping set by CloudStack. I don't know
where/how/why, and I'd love some help with this. Should I create a new
thread? As previously mentioned, I don't believe I've set a cap below
100 Mb/s ANYWHERE in CloudStack: not in compute offerings, not in network
offerings, and not in the default throttle (which is set at 200).
What am I missing?
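In case it helps anyone reproduce the check, this is roughly how I
confirmed the shaping on the host (vnet1 is just an example; substitute
the vnet interface backing your instance):

# Show the qdiscs attached to the guest interface
tc -s qdisc show dev vnet1
# The ingress rate cap, if present, lives in a policing filter here
tc filter show dev vnet1 parent ffff:
# Egress side: classes and filters under the root qdisc
tc class show dev vnet1
tc filter show dev vnet1 root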
I removed the tc rules on the host for two test instances and bandwidth
shot up.
Before:
ubuntu@testserver01:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.4 sec 6.62 MBytes 5.35 Mbits/sec
[ 5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
[ 5] 0.0-10.5 sec 6.62 MBytes 5.28 Mbits/sec
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
[ 4] 0.0-10.4 sec 6.62 MBytes 5.37 Mbits/sec
[ 5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
[ 5] 0.0-10.3 sec 6.62 MBytes 5.37 Mbits/sec
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
[ 4] 0.0-10.5 sec 6.62 MBytes 5.30 Mbits/sec
Removed the rules for two instances on the same host:
ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
And all of a sudden, those two instances are at blazing speeds:
ubuntu@testserver01:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 14.8 GBytes 12.7 Gbits/sec
[ 5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
[ 5] 0.0-10.0 sec 19.1 GBytes 16.4 Gbits/sec
[ 4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
[ 4] 0.0-10.0 sec 19.0 GBytes 16.3 Gbits/sec
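As a stopgap, something like the following should re-apply a sane cap by
hand until the CloudStack side is sorted out (a rough sketch on my part,
untested; 200mbit matches my intended default, and vnet1 is again just an
example interface):

# Egress: token bucket filter at 200 Mbit/s
sudo tc qdisc add dev vnet1 root tbf rate 200mbit burst 256k latency 50ms
# Ingress: police incoming traffic to the same rate
sudo tc qdisc add dev vnet1 ingress
sudo tc filter add dev vnet1 parent ffff: protocol ip u32 \
    match ip src 0.0.0.0/0 police rate 200mbit burst 256k drop flowid :1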
On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <[email protected]> wrote:
First,
THANK YOU FOR REPLYING!
Second, yes, it's currently set at 200.
The network rate in the compute offering is either blank (or, when I
tested it, 1000). The network rate limit in the network offering is either
100, 1000, or blank.
Those are the only network throttling parameters I'm aware of; are there
any others that I missed? Is it possible disk I/O is somehow coming into
play here?
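For reference, the throttling-related global settings I know of can be
read back via the API, e.g. with cloudmonkey (setting names as I
understand them from the 4.3 docs):

cloudmonkey list configurations name=network.throttling.rate
cloudmonkey list configurations name=vm.network.throttling.rate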
This happens regardless of whether the instance network uses a virtual
router or is directly connected to a VLAN (i.e., no virtual router), and
it happens when two instances are directly connected to each other.
On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <[email protected]> wrote:
Nick
Have you checked the network throttle settings in "global settings" and
wherever else they may be defined?
regards
ilya
On 8/17/14, 11:27 AM, Nick Burke wrote:
Update:
After running iperf between instances on the same virtual network, it
looks like no instance can get more than 2 Mb/s. Additionally, it's
sporadic, ranging from under 1 Mb/s up to 2 Mb/s, but never more:
user@localhost:~$ iperf -c 10.1.0.1 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.1.0.1, TCP port 5001
TCP window size: 86.8 KByte (default)
------------------------------------------------------------
[ 5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-11.0 sec 1.25 MBytes 950 Kbits/sec
[ 4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
[ 4] 0.0-11.1 sec 2.50 MBytes 1.89 Mbits/sec
user@localhost:~$ iperf -c 10.1.0.1 -d
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
------------------------------------------------------------
Client connecting to 10.1.0.1, TCP port 5001
TCP window size: 50.3 KByte (default)
------------------------------------------------------------
[ 5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 5] 0.0-12.6 sec 1.25 MBytes 834 Kbits/sec
[ 4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
[ 4] 0.0-11.9 sec 2.13 MBytes 1.49 Mbits/sec
On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <[email protected]> wrote:
I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything and
it was all working great. However, I had to perform some maintenance and
had to restart everything. Now, I'm seeing packet loss on all virtuals,
even ones on the same host.
sudo ping -c 500 -f 172.20.1.1
PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
........................................
--- 172.20.1.1 ping statistics ---
500 packets transmitted, 460 received, 8% packet loss, time 864ms
rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328 ms
No interface errors are reported anywhere, and the host itself isn't
under load at all. It doesn't matter whether the instance uses e1000 or
virtio drivers. The only thing I'm aware of that changed is that I had to
reboot all the physical servers.
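For completeness, "no interface errors" means checks along these lines on
both the guest vnet interfaces and the physical uplink (eth0 is an
example name):

# Per-interface RX/TX error and drop counters
ip -s link show dev vnet1
# NIC hardware counters on the uplink
sudo ethtool -S eth0 | grep -iE 'err|drop'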
It could be related, but I was hit by the
https://issues.apache.org/jira/browse/CLOUDSTACK-6464 bug. I did follow
Marcus' suggestion:
*"This is a shot in the dark, but there have been some issues around
upgrades that involve the cloud.vlan table expected contents
changing.
New
4.3 installs using vlan isolation don't seem to reproduce the issue.
I'll
see if I can reproduce anything like this with basic and/or non-vlan
isolated upgrades/installs. Can anyone experiencing an issue look at
their
database via something like "select * from cloud.vlan" and look
at the
vlan_id. If you see something like "untagged" instead of
"vlan://untagged",
please try changing it and see if that helps."*
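For anyone following along, the check and fix Marcus describes would look
something like this in MySQL (the UPDATE is my reading of his suggestion,
not an official migration step; back up the database first):

mysql> SELECT id, vlan_id FROM cloud.vlan;
mysql> UPDATE cloud.vlan SET vlan_id = 'vlan://untagged'
    ->   WHERE vlan_id = 'untagged';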
--
Nick
*'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
unafraid to destroy itself in growing into a tree.' -David Zindell, A
Requiem for Homo Sapiens*