FreeBSD-8.2: Channel Bonding: LACP or Roundrobin? With Cisco Catalyst
hi,

I want to bond two e1000 (1 Gb/s) channels and am using LACP at the moment, but the maximum throughput is lower than without channel bonding. I get roughly 70 MB/s instead of more than 150 MB/s (up to 200 MB/s).

I used iperf with standard options:

    :~$ iperf -f M -c 1.2.3.4
    Client connecting to 1.2.3.4, TCP port 5001
    TCP window size: 0.02 MByte (default)
    [  3] local 1.2.3.5 port 58637 connected with 1.2.3.4 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec  705 MBytes  70.5 MBytes/sec

If a second PC does the same, my 70 MB/s is split into ~30 MB/s and ~40 MB/s.

config:

    root@iscsihead-m:~# ifconfig lagg0
    lagg0: flags=8843 metric 0 mtu 1500
            options=219b
            ether 00:15:17:f1:5d:5f
            inet6 fe80::215:17ff:fef1:5d5f%lagg0 prefixlen 64 scopeid 0x5
            inet 1.2.3.4 netmask 0xffc0 broadcast 1.2.3.255
            nd6 options=3
            media: Ethernet autoselect
            status: active
            laggproto lacp
            laggport: em1 flags=1c
            laggport: em0 flags=1c

Config from the Cisco:

    cisco#sh run int po3
    Building configuration...

    Current configuration : 119 bytes
    !
    interface Port-channel3
     description iscsi-test
     switchport
     switchport access vlan 111
     switchport mode access
    end

    #sh etherchannel summary
    Flags:  D - down        P - bundled in port-channel
            I - stand-alone s - suspended
            R - Layer3      S - Layer2
            U - in use      f - failed to allocate aggregator

            M - not in use, minimum links not met
            u - unsuitable for bundling
            w - waiting to be aggregated
            d - default port

    Number of channel-groups in use: 3
    Number of aggregators:           3

    Group  Port-channel  Protocol    Ports
    ------+-------------+-----------+------------------------------
    1      Po1(SU)          -        Gi1/1(P)   Gi1/2(P)
    2      Po2(SU)          -        Gi6/17(P)  Gi6/18(P)  Gi6/19(P)
    3      Po3(SU)         LACP      Gi5/41(P)  Gi5/44(P)

    cisco#sh run int gi5/41
    Building configuration...

    Current configuration : 182 bytes
    !
    interface GigabitEthernet5/41
     description iscsi-head1
     switchport access vlan 111
     switchport mode access
     no cdp enable
     channel-group 3 mode active
     spanning-tree portfast
    end

    #sh run int gi5/44
    Building configuration...

    Current configuration : 183 bytes
    !
    interface GigabitEthernet5/44
     description GRAU1_iscsi2
     switchport access vlan 111
     switchport mode access
     no cdp enable
     channel-group 3 mode active
     spanning-tree portfast
    end

    #sh etherchannel load-balance
    EtherChannel Load-Balancing Configuration:
            src-dst-mac

    EtherChannel Load-Balancing Addresses Used Per-Protocol:
    Non-IP: Source XOR Destination MAC address
      IPv4: Source XOR Destination MAC address
      IPv6: Source XOR Destination MAC address

What I saw with tcpdump: it seems that only one device is used.

Maybe Cisco uses MAC and FreeBSD uses IP?

Any suggestions?
powerd / cpufreq question
Hello guys,

I have a new machine with a Xeon(R) CPU X5650 2666.77-MHz and I would like to use powerd(8) on it. However, when I run `powerd -v -r90' I see something like this:

    load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
    load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
    load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
    load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
    load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
    load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz

even though the machine is ~90% idle according to top(1).

So I figured that powerd might take the load as the sum of the loads of all the cores (12), and I tried to tweak the powerd arguments like this: `powerd -v -r 1000 -i 600', but that fails with:

    root@[s1-a ~]# powerd -v -r 1000 -i 600
    powerd: 1000 is not a valid percent

Well, that makes sense, but why does powerd itself know about load > 100% yet not allow me to specify it? Is this a bug? I suppose not, if it works for other people...

Another question is why powerd wants to set the frequency to 5336 MHz when it is not available at all (it would be nice to have it, heh):

    dev.cpu.0.freq_levels: 2668/109000 2533/81000 2400/69000 2267/58000 2133/48000 2000/4 1867/32000 1733/26000 1600/2 1400/17500 1200/15000 1000/12500

The symptoms seem to show that there's a bug in the code calculating the CPU load. Any ideas what may be wrong?

Example of two consecutive kern.cp_times sysctl outputs:

    kern.cp_times:
        4182996 0 306925 85623 13563403
        3164971 0 201479 93110 14679313
        3450792 0 258166 80198 14349717
        2795270 0 180252 76701 15086650
        2952777 0 217156 119627 14849313
        2418067 0 158594 73497 15488715
        2408492 0 175131 104377 15450873
        2003803 0 131790 75753 15927527
        2456736 0 178894 36963 15466280
        1607095 0 117396 4197 16410185
        2127878 0 147639 30804 15832552
        1406621 0 92686 1058 16638508

    kern.cp_times:
        4183013 0 306927 85626 13563469
        3164980 0 201482 93110 14679390
        3450796 0 258167 80199 14349800
        2795274 0 180252 76701 15086735
        2952780 0 217157 119629 14849396
        2418070 0 158597 73497 15488798
        2408499 0 175132 104377 15450954
        2003804 0 131791 75753 15927614
        2456744 0 178897 36963 15466358
        1607098 0 117398 4197 16410269
        2127880 0 147640 30804 15832638
        1406621 0 92686 1058 16638597

Thanks!

-- 
S pozdravom / Best regards
Daniel Gerzo
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: powerd / cpufreq question
Hi.

On 08.04.2011 14:12, Daniel Geržo wrote:
> I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like to utilize powerd(8) on it however, when I run `powerd -v -r90' I see something like this:
>
> load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
>
> even though the machine is according to top(1) ~90% idle;
>
> So I realized, that powerd might take the load as the sum of loads of all the cores (12), so I tried to tweak powerd arguments like this: `powerd -v -r 1000 -i 600' but that errors for me with:
>
> root@[s1-a ~]# powerd -v -r 1000 -i 600
> powerd: 1000 is not a valid percent
>
> Well, that makes sense, but why powerd itself knows about load > 100% but doesn't allow me to specify it? Is this bug? I suppose not if it works for other people...

It is a reasonable limitation. powerd can't know how the load is distributed among multiple cores over time. If all cores are equally busy at, let's say, 10% (which gives 120% total) and the cores are never waiting for each other, then obviously the frequency could be reduced. But if the same 120% means 100%+20%, or if the load is equally spread but processes on different cores are waiting for each other, then reducing the frequency will reduce performance. powerd can't know that and so stays on the safe side.

> Other question would be why powerd wants to set freq 5336, when it is not available at all (would be nice to have it heh.):

As you can see, that is the "wanted" frequency, not the real one. :) It is an internal implementation detail. This is how powerd keeps the full frequency for some time after the load has dropped. It's not a bug.

On multi-core systems like this, power management is better done on a per-core basis. powerd can't control frequencies on a per-core basis (also because that requires non-trivial interoperation with the scheduler). But if your ACPI BIOS allows it, you can try to put unused cores into deeper C-states, which may give better power savings and TurboBoost on busy cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE some benefit can still be achieved. You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption

-- 
Alexander Motin
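To make the "120% total" arithmetic concrete, here is a minimal sketch that derives a per-core busy percentage from two `kern.cp_times` samples and then sums the results. It uses only the first two cores from the samples posted earlier in the thread. The field order (user, nice, system, interrupt, idle) and the idea that the cores' loads are simply added together are assumptions made for illustration; this is not powerd's actual code, and `per_core_load` is a hypothetical helper name.

```python
# Assumed layout of kern.cp_times: 5 counters per core, in the order
# user, nice, system, interrupt, idle.
STATES_PER_CORE = 5
IDLE_INDEX = 4


def per_core_load(before, after):
    """Return the busy percentage of each core between two samples."""
    loads = []
    for start in range(0, len(before), STATES_PER_CORE):
        end = start + STATES_PER_CORE
        deltas = [a - b for a, b in zip(after[start:end], before[start:end])]
        total = sum(deltas)
        if total == 0:
            # No ticks elapsed for this core; treat it as idle.
            loads.append(0)
            continue
        # Integer arithmetic: busy% = 100 - idle share of elapsed ticks.
        loads.append(100 - deltas[IDLE_INDEX] * 100 // total)
    return loads


# First two cores of the two samples quoted in the thread.
before = [4182996, 0, 306925, 85623, 13563403,
          3164971, 0, 201479, 93110, 14679313]
after = [4183013, 0, 306927, 85626, 13563469,
         3164980, 0, 201482, 93110, 14679390]

loads = per_core_load(before, after)
summed = sum(loads)            # what a summed-load model would report
average = summed / len(loads)  # what a per-core average would report
print(loads, summed, average)
```

Under this model, twelve cores at similar levels could easily add up to more than 100% while each individual core remains mostly idle, which would be consistent with both the powerd output and the top(1) reading.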
Re: FreeBSD-8.2: Channel Bonding: LACP or Roundrobin? With Cisco Catalyst
Denny,

Since LACP uses hashing to determine which channel to send a packet on, the traffic between two nodes (ip:mac ip:mac) will always be bound to only one of the two channels. I am using HP ProCurves here, and they do seem to hash by IP too, although I can't see or tweak that as on the Catalyst.

If you can add the IP address to the hashing function of the switch, then you have a better chance of achieving what you're trying to do by bringing up an IP alias (you might have to try a few until you find one that maps to the opposite channel) and then running two iperfs between different IP addresses. In a real-life scenario this would be equivalent to having multiple iSCSI targets between the two hosts on top of different IPs, if that works for you.

As for the load balancing option: my HP 2910al's seemed to get CPU-overloaded when I pushed a gigabit of traffic with this option. My guess is that the crazy MAC address changes might be exhausting the CPU much faster. I'd be interested to see how this affects the Catalyst; let me know.

Good luck,
Rumen Telbizov

On Fri, Apr 8, 2011 at 3:29 AM, Denny Schierz wrote:
> hi,
>
> I want to bound two e1000 (1Gb/s) channels and use at the moment LCAP,
> but the max throughput is slower, than without channel bounding. I've
> got round about 70MB/s instead of > 150MB/s - 200MB/s.
-- 
Rumen Telbizov
http://telbizov.com
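The hashing argument above can be illustrated with a deliberately simplified model. The sketch below XORs source and destination addresses and takes the result modulo the number of links. This is an assumption made for illustration only: it is not the actual algorithm of the Catalyst, the ProCurve, or FreeBSD's lagg(4). The client MAC address is hypothetical; the server MAC and the IP addresses are taken from the original post.

```python
import ipaddress


def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def pick_link_src_dst_mac(src_mac, dst_mac, n_links):
    """Simplified src-dst-mac policy: XOR the MACs, then pick a link."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % n_links


def pick_link_src_dst_ip(src_ip, dst_ip, n_links):
    """Simplified src-dst-ip policy: XOR the IPv4 addresses, then pick a link."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_links


SERVER_MAC = "00:15:17:f1:5d:5f"   # lagg0 MAC from the original post
CLIENT_MAC = "00:11:22:33:44:55"   # hypothetical client MAC

# With MAC-based hashing, every flow between these two hosts maps to
# the same link, no matter how many TCP connections or IP aliases exist.
mac_link = pick_link_src_dst_mac(CLIENT_MAC, SERVER_MAC, 2)

# With IP-based hashing, a second client address can map elsewhere.
ip_link_a = pick_link_src_dst_ip("1.2.3.5", "1.2.3.4", 2)
ip_link_b = pick_link_src_dst_ip("1.2.3.6", "1.2.3.4", 2)  # hypothetical alias
print(mac_link, ip_link_a, ip_link_b)
```

In this toy model the alias 1.2.3.6 happens to land on the other link, which matches the advice to try a few aliases until one maps to the opposite channel. Real switches use their own hash functions, so which alias works has to be found by experiment.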
Re: powerd / cpufreq question
On Fri, 08 Apr 2011 14:42:04 +0300, Alexander Motin wrote:

Hello Alexander,

thanks for the quick reply;

>> root@[s1-a ~]# powerd -v -r 1000 -i 600
>> powerd: 1000 is not a valid percent
>>
>> Well, that makes sense, but why powerd itself knows about load > 100% but doesn't allow me to specify it? Is this bug? I suppose not if it works for other people...
>
> It is reasonable limitation. powerd can't know how load distributed among multiple cores in time. If all cores are equally busy at lets say 10% (that gives 120% total) and cores are never waiting for each other then obviously frequency could be reduced. But if the same 120% mean 100%+20%, or if load is equally spread, but processes on different cores are waiting for each other, then reducing frequency will reduce performance. powerd can't know that and so stays on a safe side.

OK, I understand what you are saying here. On the other hand, I know pretty well how the load is distributed: in this particular case, the box is a web server running ~30 php-cgi processes. This kind of workload doesn't require a very high frequency, and I suspect the cores are never waiting for each other. There could be an option that lets an administrator decide whether this is the case and set higher -r and -i values. What do you think?

>> Other question would be why powerd wants to set freq 5336, when it is not available at all (would be nice to have it heh.):
>
> You may see there it is a "wanted" frequency, not real one. :) It is internal implementation details. In such way powerd implements keeping a full frequency for some time after the load dropped. It's not a bug.

OK :-) I actually thought powerd always honors the values from dev.cpu.0.freq_levels (and 5336 is not there), so it looked a little weird to me.

> On multi-core systems like this power management can better be done on per-core bases. Powerd can't control frequencies on per-core basis (also because it require non-trivial interoperation with scheduler). But if your ACPI BIOS allows, you can try to put unused cores into deeper C-states, that may give better power saving and TurboBoost on busy cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE some bonuses still could be achieved.

Any idea what I should look for in the BIOS? This is 8-STABLE; any idea whether there's an MFC plan for the extra 9-CURRENT bonuses?

> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption

From reading this, are you referring above to the C2 states? (It seems like C3 is not optimal for this kind of operation...)

Thanks.

-- 
Kind regards
Daniel
Re: powerd / cpufreq question
On 08.04.2011 17:42, Daniel Gerzo wrote:
> On Fri, 08 Apr 2011 14:42:04 +0300, Alexander Motin wrote:
>>> root@[s1-a ~]# powerd -v -r 1000 -i 600
>>> powerd: 1000 is not a valid percent
>>>
>>> Well, that makes sense, but why powerd itself knows about load > 100% but doesn't allow me to specify it? Is this bug? I suppose not if it works for other people...
>>
>> It is reasonable limitation. powerd can't know how load distributed among multiple cores in time. If all cores are equally busy at lets say 10% (that gives 120% total) and cores are never waiting for each other then obviously frequency could be reduced. But if the same 120% mean 100%+20%, or if load is equally spread, but processes on different cores are waiting for each other, then reducing frequency will reduce performance. powerd can't know that and so stays on a safe side.
>
> OK, I understand what you are saying here. On the other side, I know pretty well how the load is distributed - in this particular case, the box is a web server, running ~30 php-cgi processes. This kind of operation doesn't require very high frequency and I suspect the cores are never waiting for each other. There could be an option which would allow an administrator to decide whether this is the case and allow him to set a higher -r and -i values, what do you think?

I think it should be possible with minimal changes.

>>> Other question would be why powerd wants to set freq 5336, when it is not available at all (would be nice to have it heh.):
>>
>> You may see there it is a "wanted" frequency, not real one. :) It is internal implementation details. In such way powerd implements keeping a full frequency for some time after the load dropped. It's not a bug.
>
> OK :-) I actually though powerd always honors the values from dev.cpu.0.freq_levels (and 5336 is not there), so it looked a little weird to me.

It does so on the left side, but no longer on the right side. Abstracting from real frequencies made the behavior more universal and predictable.

>> On multi-core systems like this power management can better be done on per-core bases. Powerd can't control frequencies on per-core basis (also because it require non-trivial interoperation with scheduler). But if your ACPI BIOS allows, you can try to put unused cores into deeper C-states, that may give better power saving and TurboBoost on busy cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE some bonuses still could be achieved.
>
> Any idea what I should look for in the BIOS?

Something about C-states or Cx-states on the CPU page. But first look at dev.cpu.X.cx_supported to make sure they are not already present and just unused.

> This is 8-STABLE, any idea whether there's a MFC plan for the extra 9-CURRENT bonuses?

I suppose around May.

>> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption
>
> From reading this, are you reffering above to the C2 states? (seems like C3 is not optimal for this kind of operation...)

The deeper the state, the more power is saved. To get the most out of it and to get TurboBoost working, you need at least the C3 CPU state (ACPI may report it with a different number). Some of the latest Intel CPUs do not have the described problems with C3 and the LAPIC; for others, the described system tuning is required.

PS: Using powerd will, in the best case, not hurt performance, while using C-states may even increase it in some cases because of TurboBoost.

-- 
Alexander Motin
Re: powerd / cpufreq question
On Fri, 08 Apr 2011 18:02:28 +0300, Alexander Motin wrote:
>> OK, I understand what you are saying here. On the other side, I know pretty well how the load is distributed - in this particular case, the box is a web server, running ~30 php-cgi processes. This kind of operation doesn't require very high frequency and I suspect the cores are never waiting for each other. There could be an option which would allow an administrator to decide whether this is the case and allow him to set a higher -r and -i values, what do you think?
>
> I think it should be possible with minimal changes.

So, here is my attempt to implement it: http://danger.rulez.sk/powerd.diff
Can you please review & comment? I should be able to commit it myself if you consider it acceptable. It seems to work for me :)

>> Any idea what I should look for in the BIOS?
>
> Something about C-states, or Cx-states on the CPU page. But first look at dev.cpu.X.cx_supported to make sure it is not already present and just unused.

Seems like it was enabled by default. I have these:

    dev.cpu.0.cx_supported: C1/3 C2/96 C3/128

Does that mean I only need to set these in rc.conf?

    performance_cx_lowest="C3"
    economy_cx_lowest="C3"

Then run /etc/rc.d/power_profile 0x00? Might it cause any instability?

>> This is 8-STABLE, any idea whether there's a MFC plan for the extra 9-CURRENT bonuses?
>
> I suppose around May.

Do you have some patches? If not, you don't really need to make them just for me; I can wait a little.

>>> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption
>>
>> From reading this, are you reffering above to the C2 states? (seems like C3 is not optimal for this kind of operation...)
>
> The deeper state, the more power saved. To get most of it and to get TurboBoost working you need at least C3 CPU state (ACPI may report it with different number). Some latest Intel CPUs have no described problems with C3 and LAPIC, for others described system tuning requited.

I believe this is a pretty recent CPU (6-core Xeon X5650). Do you know about any problems?

> PS: Using powerd in best case wont hurt performance, while using C-states may even increase it in some cases because of TurboBoost.

If I want to use C-states, should I stop using powerd, or is it possible to use both together?

Thanks!

-- 
Kind regards
Daniel
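As a small illustration of the question above, the following sketch parses a `dev.cpu.0.cx_supported` value and prints the two rc.conf lines for the deepest listed state. The helper names are hypothetical, and reading the number after the slash as a transition latency is an assumption. The sysctl value and the rc.conf variable names are the ones quoted in the message above. Whether C3 is actually safe on a given machine depends on the timer issues discussed in the thread.

```python
def parse_cx_supported(value):
    """Parse a string such as 'C1/3 C2/96 C3/128' into (state, number) pairs.

    The number after the slash is assumed to be a transition latency.
    """
    states = []
    for entry in value.split():
        name, latency = entry.split("/")
        states.append((name, int(latency)))
    return states


def deepest_state(value):
    """Return the name of the deepest listed C-state (highest Cx index)."""
    states = parse_cx_supported(value)
    return max(states, key=lambda s: int(s[0][1:]))[0]


supported = "C1/3 C2/96 C3/128"  # value reported in the message above
lowest = deepest_state(supported)

# rc.conf variable names as quoted in the message above.
print(f'performance_cx_lowest="{lowest}"')
print(f'economy_cx_lowest="{lowest}"')
```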
Re: powerd / cpufreq question
On 08.04.2011 19:53, Daniel Gerzo wrote:
> On Fri, 08 Apr 2011 18:02:28 +0300, Alexander Motin wrote:
>>> OK, I understand what you are saying here. On the other side, I know pretty well how the load is distributed - in this particular case, the box is a web server, running ~30 php-cgi processes. This kind of operation doesn't require very high frequency and I suspect the cores are never waiting for each other. There could be an option which would allow an administrator to decide whether this is the case and allow him to set a higher -r and -i values, what do you think?
>>
>> I think it should be possible with minimal changes.
>
> So, here is my attempt to implement it: http://danger.rulez.sk/powerd.diff
> Can you please review & comment? I should be able to commit it mysqlf if you consider it acceptable. It seems to work for me :)

Looks fine, except that the -f option has to come first, which is not obvious. Another point: I've noticed some load constants hardcoded there. They should also be handled so that higher values work properly.

>>> Any idea what I should look for in the BIOS?
>>
>> Something about C-states, or Cx-states on the CPU page. But first look at dev.cpu.X.cx_supported to make sure it is not already present and just unused.
>
> Seems like it was enabled by default. I have like these: dev.cpu.0.cx_supported: C1/3 C2/96 C3/128
> Does that mean I only need to set these in rc.conf?:
> performance_cx_lowest="C3"
> economy_cx_lowest="C3"
> Then run /etc/rc.d/power_profile 0x00?

In short, yes. For the long answer, read the link I gave.

> May it cause any instability?

If you don't switch from the LAPIC to another timer and it stops, your system will freeze, or at least not work well. You should notice problems immediately, if there are any.

>>> This is 8-STABLE, any idea whether there's a MFC plan for the extra 9-CURRENT bonuses?
>>
>> I suppose around May.
>
> Do you have some patches? If not you don't really need to make them just for me, I can wait a little.

The last ones I generated are five months old: http://people.freebsd.org/~mav/timers_merge/
They are large, and I am not sure how well they apply now.

>>>> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption
>>>
>>> From reading this, are you reffering above to the C2 states? (seems like C3 is not optimal for this kind of operation...)
>>
>> The deeper state, the more power saved. To get most of it and to get TurboBoost working you need at least C3 CPU state (ACPI may report it with different number). Some latest Intel CPUs have no described problems with C3 and LAPIC, for others described system tuning requited.
>
> I believe this is pretty recent CPU (6 core Xeon X5650). Do you know about any problems?

I have no idea about these Xeons. I just know that the LAPIC of my Core i5 works fine in C3, while the one in my Core i7 doesn't.

>> PS: Using powerd in best case wont hurt performance, while using C-states may even increase it in some cases because of TurboBoost.
>
> If I want to use C-states, should I stop to use powerd, or is it possible to use them both together?

I am using both together on my laptop.

-- 
Alexander Motin