Another update:
100% confirmed to be traffic shaping set by CloudStack. I don't know
where/how/why, and I'd love some help with this. Should I create a new
thread? As previously mentioned, I don't believe I've set a cap below
100 Mb/s ANYWHERE in CloudStack: not in compute offerings, not in network
offerings, and not in the default throttle (which is set at 200). What am
I missing?

I removed the tc rules on the host for two test instances and bandwidth
shot up.

Before:

ubuntu@testserver01:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59276
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.4 sec  6.62 MBytes  5.35 Mbits/sec
[  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59277
[  5]  0.0-10.5 sec  6.62 MBytes  5.28 Mbits/sec
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59278
[  4]  0.0-10.4 sec  6.62 MBytes  5.37 Mbits/sec
[  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59291
[  5]  0.0-10.3 sec  6.62 MBytes  5.37 Mbits/sec
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59306
[  4]  0.0-10.5 sec  6.62 MBytes  5.30 Mbits/sec

Removed the rules for two instances on the same host:

ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 root
ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 root
ubuntu@dom02:~$ sudo tc qdisc del dev vnet3 ingress
ubuntu@dom02:~$ sudo tc qdisc del dev vnet1 ingress
ubuntu@dom02:~$ tc -s qdisc ls dev vnet1
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 7136572 bytes 1048 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

And all of a sudden, those two instances are at blazing speeds:

ubuntu@testserver01:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59322
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  14.8 GBytes  12.7 Gbits/sec
[  5] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59329
[  5]  0.0-10.0 sec  19.1 GBytes  16.4 Gbits/sec
[  4] local 10.1.1.101 port 5001 connected with 10.1.1.102 port 59330
[  4]  0.0-10.0 sec  19.0 GBytes  16.3 Gbits/sec
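For whoever picks this up: before deleting anything, it may help to see
exactly what rate the installed rules enforce, and to double-check every
throttle value the management server knows about. A rough sketch of what I
mean (the vnet* interface names are libvirt's defaults and the mysql
credentials/database name are assumptions, adjust for your install; on KVM
the throttle usually shows up as an htb root qdisc plus an ingress policing
filter):

# On the KVM host: dump the shaping setup on every guest interface.
for path in /sys/class/net/vnet*; do
    dev=$(basename "$path")
    echo "== $dev =="
    tc -s qdisc show dev "$dev"             # root qdisc (htb if throttled)
    tc class show dev "$dev"                # rate/ceil live on the classes
    tc filter show dev "$dev" parent ffff:  # ingress policer, if any
done

# On the management server: list every throttle-related global setting
# (network.throttling.rate should be the "200" mentioned above).
mysql -u cloud -p cloud -e \
  "SELECT name, value FROM configuration WHERE name LIKE '%throttl%';"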
On Sun, Aug 17, 2014 at 12:46 PM, Nick Burke <[email protected]> wrote:

> First,
>
> THANK YOU FOR REPLYING!
>
> Second, yes, it's currently set at 200.
>
> The compute offering for network is either blank (or, when I tested it, 1000).
> The network offering for network limit is either 100, 1000, or blank.
>
> Those are the only network throttling parameters that I'm aware of; are
> there any others that I missed? Is it possible disk I/O is for some reason
> coming into play here?
>
> This happens regardless of whether the instance network uses a virtual
> router or is directly connected to a VLAN (i.e., no virtual router), when
> two instances are directly connected to each other.
>
> On Sun, Aug 17, 2014 at 12:09 PM, ilya musayev <[email protected]> wrote:
>
>> Nick,
>>
>> Have you checked the network throttle settings in "global settings" and
>> wherever else it may be defined?
>>
>> regards
>> ilya
>>
>> On 8/17/14, 11:27 AM, Nick Burke wrote:
>>
>>> Update:
>>>
>>> After running iperf on instances on the same virtual network, it looks
>>> like no instance can get more than 2 Mbit/s. Additionally, it's sporadic,
>>> ranging from under 1 Mbit/s but never exceeding 2 Mbit/s:
>>>
>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>> ------------------------------------------------------------
>>> Server listening on TCP port 5001
>>> TCP window size: 85.3 KByte (default)
>>> ------------------------------------------------------------
>>> ------------------------------------------------------------
>>> Client connecting to 10.1.0.1, TCP port 5001
>>> TCP window size: 86.8 KByte (default)
>>> ------------------------------------------------------------
>>> [  5] local 10.1.0.10 port 50432 connected with 10.1.0.1 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  5]  0.0-11.0 sec  1.25 MBytes   950 Kbits/sec
>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53839
>>> [  4]  0.0-11.1 sec  2.50 MBytes  1.89 Mbits/sec
>>> user@localhost:~$ iperf -c 10.1.0.1 -d
>>> ------------------------------------------------------------
>>> Server listening on TCP port 5001
>>> TCP window size: 85.3 KByte (default)
>>> ------------------------------------------------------------
>>> ------------------------------------------------------------
>>> Client connecting to 10.1.0.1, TCP port 5001
>>> TCP window size: 50.3 KByte (default)
>>> ------------------------------------------------------------
>>> [  5] local 10.1.0.10 port 52248 connected with 10.1.0.1 port 5001
>>> [ ID] Interval       Transfer     Bandwidth
>>> [  5]  0.0-12.6 sec  1.25 MBytes   834 Kbits/sec
>>> [  4] local 10.1.0.10 port 5001 connected with 10.1.0.1 port 53840
>>> [  4]  0.0-11.9 sec  2.13 MBytes  1.49 Mbits/sec
>>>
>>> On Fri, Aug 15, 2014 at 11:40 AM, Nick Burke <[email protected]> wrote:
>>>
>>>> I upgraded from 4.0 to 4.3.0 some time ago. I didn't restart anything
>>>> and it was all working great. However, I had to perform some maintenance
>>>> and had to restart everything. Now I'm seeing packet loss on all
>>>> instances, even ones on the same host.
>>>>
>>>> sudo ping -c 500 -f 172.20.1.1
>>>> PING 172.20.1.1 (172.20.1.1) 56(84) bytes of data.
>>>> ........................................
>>>> --- 172.20.1.1 ping statistics ---
>>>> 500 packets transmitted, 460 received, 8% packet loss, time 864ms
>>>> rtt min/avg/max/mdev = 0.069/0.218/1.290/0.139 ms, ipg/ewma 1.731/0.328 ms
>>>>
>>>> No interface errors reported anywhere. The host itself isn't under load
>>>> at all. It doesn't matter whether the instance uses e1000 or virtio
>>>> drivers. The only thing I'm aware of that changed is that I had to
>>>> reboot all the physical servers.
>>>>
>>>> It could be related, but I was also hit by the
>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6464 bug. I did follow
>>>> Marcus' suggestion:
>>>>
>>>> *"This is a shot in the dark, but there have been some issues around
>>>> upgrades that involve the cloud.vlan table expected contents changing.
>>>> New 4.3 installs using vlan isolation don't seem to reproduce the issue.
>>>> I'll see if I can reproduce anything like this with basic and/or non-vlan
>>>> isolated upgrades/installs. Can anyone experiencing an issue look at
>>>> their database via something like "select * from cloud.vlan" and look at
>>>> the vlan_id. If you see something like "untagged" instead of
>>>> "vlan://untagged", please try changing it and see if that helps."*
>>>>
>>>> --
>>>> Nick
>>>>
>>>> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
>>>> unafraid to destroy itself in growing into a tree.' -David Zindell, A
>>>> Requiem for Homo Sapiens*
>
> --
> Nick
>
> *'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
> unafraid to destroy itself in growing into a tree.' -David Zindell, A
> Requiem for Homo Sapiens*

--
Nick

*'What is a human being, then?' 'A seed' 'A... seed?' 'An acorn that is
unafraid to destroy itself in growing into a tree.' -David Zindell, A
Requiem for Homo Sapiens*
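P.S. For anyone else landing here via CLOUDSTACK-6464: Marcus' check from the
quoted thread above translates to roughly the following against the
management-server database. This is only a sketch; the 'cloud' user and
database are the usual defaults but may differ on your install, and you
should back up the database before changing anything.

# Look for isolation URIs that lost their scheme during the upgrade.
mysql -u cloud -p cloud -e \
  "SELECT id, vlan_id FROM vlan WHERE vlan_id NOT LIKE 'vlan://%';"

# If rows show plain 'untagged' instead of 'vlan://untagged', rewrite them.
mysql -u cloud -p cloud -e \
  "UPDATE vlan SET vlan_id = 'vlan://untagged' WHERE vlan_id = 'untagged';"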
