FreeBSD-8.2: Channel Bounding: LACP or Roundrobin? With Cisco Catalyst

2011-04-08 Thread Denny Schierz
hi,

I want to bond two e1000 (1 Gb/s) links and am currently using LACP,
but the maximum throughput is lower than without channel bonding. I
get about 70 MB/s instead of the 150-200 MB/s I expected.

I used iperf with standard options:

:~$ iperf -f M -c 1.2.3.4

Client connecting to 1.2.3.4, TCP port 5001
TCP window size: 0.02 MByte (default)

[  3] local 1.2.3.5 port 58637 connected with 1.2.3.4 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   705 MBytes  70.5 MBytes/sec

If a second PC does the same, my 70 MB/s is split into ~30 MB/s and
~40 MB/s.

config:

root@iscsihead-m:~# ifconfig lagg0
lagg0: flags=8843 metric 0 mtu 1500
options=219b
ether 00:15:17:f1:5d:5f
inet6 fe80::215:17ff:fef1:5d5f%lagg0 prefixlen 64 scopeid 0x5 
inet 1.2.3.4 netmask 0xffc0 broadcast 1.2.3.255
nd6 options=3
media: Ethernet autoselect
status: active
laggproto lacp
laggport: em1 flags=1c
laggport: em0 flags=1c


Config from the Cisco:


cisco#sh run int po3
Building configuration...

Current configuration : 119 bytes
!
interface Port-channel3
 description iscsi-test
 switchport
 switchport access vlan 111
 switchport mode access
end



#sh etherchannel summary
Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port


Number of channel-groups in use: 3
Number of aggregators:           3

Group  Port-channel  Protocol    Ports
------+-------------+-----------+---------------------------------------
1      Po1(SU)          -        Gi1/1(P)    Gi1/2(P)
2      Po2(SU)          -        Gi6/17(P)   Gi6/18(P)   Gi6/19(P)
3      Po3(SU)         LACP      Gi5/41(P)   Gi5/44(P)

cisco#sh run int gi5/41
Building configuration...

Current configuration : 182 bytes
!
interface GigabitEthernet5/41
 description iscsi-head1
 switchport access vlan 111
 switchport mode access
 no cdp enable
 channel-group 3 mode active
 spanning-tree portfast
end



#sh run int gi5/44
Building configuration...

Current configuration : 183 bytes
!
interface GigabitEthernet5/44
 description GRAU1_iscsi2
 switchport access vlan 111
 switchport mode access
 no cdp enable
 channel-group 3 mode active
 spanning-tree portfast
end


#sh etherchannel load-balance 
EtherChannel Load-Balancing Configuration:
src-dst-mac

EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
  IPv4: Source XOR Destination MAC address
  IPv6: Source XOR Destination MAC address


From what I saw with tcpdump, it seems that only one device is used.

Maybe Cisco hashes on MAC addresses and FreeBSD on IP addresses?

Any suggestions?





powerd / cpufreq question

2011-04-08 Thread Daniel Geržo

Hello guys,

I have a new machine with a Xeon(R) X5650 CPU (2666.77 MHz) and I would
like to use powerd(8) on it. However, when I run `powerd -v -r90' I see
something like this:


load  64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load  62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load  82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz

even though, according to top(1), the machine is ~90% idle. So I realized
that powerd might take the load as the sum of the loads of all the cores
(12), and I tried to tweak the powerd arguments like this:


`powerd -v -r 1000 -i 600'

but that fails with:

root@[s1-a ~]# powerd -v -r 1000 -i 600
powerd: 1000 is not a valid percent

Well, that makes sense, but why does powerd itself know about load > 100%
but not allow me to specify it? Is this a bug? I suppose not, if it
works for other people...


My other question is why powerd wants to set a frequency of 5336 MHz
when it is not available at all (it would be nice to have it, heh):


dev.cpu.0.freq_levels: 2668/109000 2533/81000 2400/69000 2267/58000 
2133/48000 2000/4 1867/32000 1733/26000 1600/2 1400/17500 
1200/15000 1000/12500


The symptoms seem to show that there's a bug in the code calculating the 
cpu load. Any ideas what may be wrong?


Example of two consecutive kern.cp_times sysctl outputs:

kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479 93110 
14679313 3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650 
2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715 2408492 
0 175131 104377 15450873 2003803 0 131790 75753 15927527 2456736 0 
178894 36963 15466280 1607095 0 117396 4197 16410185 2127878 0 147639 
30804 15832552 1406621 0 92686 1058 16638508


kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482 93110 
14679390 3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735 
2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798 2408499 
0 175132 104377 15450954 2003804 0 131791 75753 15927614 2456744 0 
178897 36963 15466358 1607098 0 117398 4197 16410269 2127880 0 147640 
30804 15832638 1406621 0 92686 1058 16638597
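
For reference, the two samples above can be turned into per-core load
figures. The sketch below is only an illustration: it assumes the usual
five counters per CPU in kern.cp_times (user, nice, sys, intr, idle, in
that order) and assumes that powerd's reported load is the sum, over all
cores, of each core's non-idle share of ticks. The function and variable
names are made up for this example.

```python
# Per-core load from two consecutive kern.cp_times samples (values from the post).
CPUSTATES = 5  # assumed field order per CPU: user, nice, sys, intr, idle
CP_IDLE = 4

SAMPLE_1 = """
4182996 0 306925 85623 13563403 3164971 0 201479 93110 14679313
3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650
2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715
2408492 0 175131 104377 15450873 2003803 0 131790 75753 15927527
2456736 0 178894 36963 15466280 1607095 0 117396 4197 16410185
2127878 0 147639 30804 15832552 1406621 0 92686 1058 16638508
"""

SAMPLE_2 = """
4183013 0 306927 85626 13563469 3164980 0 201482 93110 14679390
3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735
2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798
2408499 0 175132 104377 15450954 2003804 0 131791 75753 15927614
2456744 0 178897 36963 15466358 1607098 0 117398 4197 16410269
2127880 0 147640 30804 15832638 1406621 0 92686 1058 16638597
"""


def per_core_load(before: str, after: str) -> list[float]:
    """Return the busy percentage of each core between two samples."""
    a = [int(x) for x in before.split()]
    b = [int(x) for x in after.split()]
    loads = []
    for i in range(0, len(a), CPUSTATES):
        deltas = [b[i + j] - a[i + j] for j in range(CPUSTATES)]
        total = sum(deltas)
        busy = total - deltas[CP_IDLE]
        loads.append(100.0 * busy / total if total else 0.0)
    return loads


loads = per_core_load(SAMPLE_1, SAMPLE_2)
summed = sum(loads)             # what a "sum over cores" load would look like
average = summed / len(loads)   # closer to what top(1) shows overall

print("per-core:", [round(x, 1) for x in loads])
print("summed: %.1f%%  average: %.1f%%" % (summed, average))
```

For these two samples the summed figure comes out at roughly 96% while
the per-core average is about 8%, which would be consistent with top(1)
showing the machine ~90% idle while powerd reports loads around or
above 100%.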


Thanks!

--
S pozdravom / Best regards
  Daniel Gerzo
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: powerd / cpufreq question

2011-04-08 Thread Alexander Motin

Hi.

On 08.04.2011 14:12, Daniel Geržo wrote:

I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like
to utilize powerd(8) on it however, when I run `powerd -v -r90' I see
something like this:

load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz

even though the machine is according to top(1) ~90% idle; So I realized,
that powerd might take the load as the sum of loads of all the cores
(12), so I tried to tweak powerd arguments like this:

`powerd -v -r 1000 -i 600'

but that errors for me with:

root@[s1-a ~]# powerd -v -r 1000 -i 600
powerd: 1000 is not a valid percent

Well, that makes sense, but why powerd itself knows about load > 100%
but doesn't allow me to specify it? Is this bug? I suppose not if it
works for other people...


It is a reasonable limitation. powerd can't know how the load is
distributed among multiple cores over time. If all cores are equally
busy at, let's say, 10% (which gives 120% total) and the cores are never
waiting for each other, then obviously the frequency could be reduced.
But if the same 120% means 100%+20%, or if the load is equally spread
but processes on different cores are waiting for each other, then
reducing the frequency will reduce performance. powerd can't know that
and so stays on the safe side.
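
The ambiguity described above can be shown with a toy calculation. This
is only an illustrative sketch with invented per-core numbers, not
powerd code: two very different load distributions on a 12-core machine
give the same summed load, so the sum alone cannot tell whether lowering
the frequency is safe.

```python
# Two hypothetical per-core load distributions (percent) on 12 cores.
even = [10] * 12                 # every core 10% busy
skewed = [100, 20] + [0] * 10    # one saturated core, one lightly used


def summed_load(per_core: list[int]) -> int:
    """Load as a single number: the sum over all cores."""
    return sum(per_core)


def busiest_core(per_core: list[int]) -> int:
    """Load of the most heavily used core."""
    return max(per_core)


for name, dist in (("even", even), ("skewed", skewed)):
    print(name, "summed:", summed_load(dist), "busiest core:", busiest_core(dist))
```

Both cases sum to 120%, but in the skewed case one core is already
saturated, so slowing it down would directly cost performance.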



Other question would be why powerd wants to set freq 5336, when it is
not available at all (would be nice to have it heh.):


As you can see there, it is a "wanted" frequency, not a real one. :) It
is an internal implementation detail: this is how powerd keeps the full
frequency for some time after the load has dropped. It's not a bug.


On multi-core systems like this, power management is better done on a
per-core basis. powerd can't control frequencies per core (partly
because that requires non-trivial interoperation with the scheduler).
But if your ACPI BIOS allows it, you can try to put unused cores into
deeper C-states, which may give better power savings and, as a bonus,
TurboBoost on busy cores. It works better on 9-CURRENT, but on 8-STABLE
some gains can still be achieved.


You may want to look here:
http://wiki.freebsd.org/TuningPowerConsumption

--
Alexander Motin


Re: FreeBSD-8.2: Channel Bounding: LACP or Roundrobin? With Cisco Catalyst

2011-04-08 Thread Rumen Telbizov
Denny,

Since LACP uses hashing to determine which channel to send a packet to,
the traffic between two nodes (ip:mac ip:mac) will always be bound to
only one of the two channels.
I am using HP ProCurves here and they do seem to hash by IP too,
although I can't see or tweak that as on the Catalyst. If you can add
the IP address to the switch's hashing function, then you have a better
chance of achieving what you're trying to do by bringing up an IP alias
(you might have to try a few until you find one that maps to the
opposite channel) and then running two iperfs between different IP
addresses. In a real-life scenario this would be equivalent to having
multiple iSCSI targets between the two hosts on top of different IPs,
if this works for you.
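
A small sketch of the idea: the hash below is a deliberately simplified
stand-in (XOR of the last address byte, modulo the number of links) and
is an assumption for illustration only; it is not the actual Catalyst or
lagg(4) hash. The client MAC and the alias addresses are hypothetical.
It shows why one address pair always lands on the same link, and why an
alias may or may not map to the other one.

```python
import ipaddress


def mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to raw bytes."""
    return bytes.fromhex(text.replace(":", ""))


def ip(text: str) -> bytes:
    """Convert a dotted-quad address to raw bytes."""
    return ipaddress.ip_address(text).packed


def pick_link(src: bytes, dst: bytes, n_links: int) -> int:
    """Toy src XOR dst hash: same address pair -> always the same link."""
    return (src[-1] ^ dst[-1]) % n_links


N_LINKS = 2
server_mac = mac("00:15:17:f1:5d:5f")  # from the ifconfig output in the thread
client_mac = mac("00:11:22:33:44:55")  # hypothetical client MAC

# Hashing on MAC: a single client/server pair never changes link.
mac_link = pick_link(client_mac, server_mac, N_LINKS)

# Hashing on IP: each server address (real or alias) is hashed separately.
ip_links = {
    addr: pick_link(ip("1.2.3.5"), ip(addr), N_LINKS)
    for addr in ("1.2.3.4", "1.2.3.6", "1.2.3.7")  # .6 and .7 are hypothetical aliases
}

print("MAC hash link:", mac_link)
print("IP hash links:", ip_links)
```

With this toy hash the alias 1.2.3.6 ends up on the same link as
1.2.3.4, while 1.2.3.7 maps to the other link, which is the "try a few"
effect described above.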

As for the load balancing option - my HP 2910al's seemed to get CPU
overloaded when I pushed a gigabit of traffic with this option. My guess
is that the constant MAC address changes might be exhausting the CPU
much faster. I'd be interested to see how this affects the Catalyst -
let me know.

Good luck,
Rumen Telbizov




-- 
Rumen Telbizov
http://telbizov.com


Re: powerd / cpufreq question

2011-04-08 Thread Daniel Gerzo

On Fri, 08 Apr 2011 14:42:04 +0300, Alexander Motin wrote:

Hello Alexander, thanks for the quick reply.


root@[s1-a ~]# powerd -v -r 1000 -i 600
powerd: 1000 is not a valid percent

Well, that makes sense, but why powerd itself knows about load > 100%
but doesn't allow me to specify it? Is this bug? I suppose not if it
works for other people...


It is reasonable limitation. powerd can't know how load distributed
among multiple cores in time. If all cores are equally busy at lets
say 10% (that gives 120% total) and cores are never waiting for each
other then obviously frequency could be reduced. But if the same 120%
mean 100%+20%, or if load is equally spread, but processes on
different cores are waiting for each other, then reducing frequency
will reduce performance. powerd can't know that and so stays on a safe
side.


OK, I understand what you are saying here. On the other hand, I know
pretty well how the load is distributed: in this particular case, the
box is a web server running ~30 php-cgi processes.
This kind of operation doesn't require a very high frequency and I
suspect the cores are never waiting for each other. There could be an
option that lets an administrator declare that this is the case and set
higher -r and -i values. What do you think?


Other question would be why powerd wants to set freq 5336, when it is
not available at all (would be nice to have it heh.):


You may see there it is a "wanted" frequency, not real one. :) It is
internal implementation details. In such way powerd implements keeping
a full frequency for some time after the load dropped. It's not a bug.


OK :-) I actually thought powerd always honors the values from
dev.cpu.0.freq_levels (and 5336 is not there), so it looked a little
weird to me.



On multi-core systems like this power management can better be done
on per-core bases. Powerd can't control frequencies on per-core basis
(also because it require non-trivial interoperation with scheduler).
But if your ACPI BIOS allows, you can try to put unused cores into
deeper C-states, that may give better power saving and TurboBoost on
busy cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE
some bonuses still could be achieved.


Any idea what I should look for in the BIOS?
This is 8-STABLE; any idea whether there's an MFC plan for the extra
9-CURRENT bonuses?



You may want to look here:
http://wiki.freebsd.org/TuningPowerConsumption


From reading this, are you referring above to the C2 state? (It seems
like C3 is not optimal for this kind of operation...)


Thanks.

--
Kind regards
  Daniel


Re: powerd / cpufreq question

2011-04-08 Thread Alexander Motin

On 08.04.2011 17:42, Daniel Gerzo wrote:

On Fri, 08 Apr 2011 14:42:04 +0300, Alexander Motin wrote:

root@[s1-a ~]# powerd -v -r 1000 -i 600
powerd: 1000 is not a valid percent

Well, that makes sense, but why powerd itself knows about load > 100%
but doesn't allow me to specify it? Is this bug? I suppose not if it
works for other people...


It is reasonable limitation. powerd can't know how load distributed
among multiple cores in time. If all cores are equally busy at lets
say 10% (that gives 120% total) and cores are never waiting for each
other then obviously frequency could be reduced. But if the same 120%
mean 100%+20%, or if load is equally spread, but processes on
different cores are waiting for each other, then reducing frequency
will reduce performance. powerd can't know that and so stays on a safe
side.


OK, I understand what you are saying here. On the other side, I know
pretty well how the load is distributed - in this particular case, the
box is a web server, running ~30 php-cgi processes.
This kind of operation doesn't require very high frequency and I suspect
the cores are never waiting for each other. There could be an option
which would allow an administrator to decide whether this is the case
and allow him to set a higher -r and -i values, what do you think?


I think it should be possible with minimal changes.


Other question would be why powerd wants to set freq 5336, when it is
not available at all (would be nice to have it heh.):


You may see there it is a "wanted" frequency, not real one. :) It is
internal implementation details. In such way powerd implements keeping
a full frequency for some time after the load dropped. It's not a bug.


OK :-) I actually though powerd always honors the values from
dev.cpu.0.freq_levels (and 5336 is not there), so it looked a little
weird to me.


It does so on the left side (the current frequency), but no longer on
the right side (the wanted frequency). Abstracting from real
frequencies made the behavior more universal and predictable.



On multi-core systems like this power management can better be done
on per-core bases. Powerd can't control frequencies on per-core basis
(also because it require non-trivial interoperation with scheduler).
But if your ACPI BIOS allows, you can try to put unused cores into
deeper C-states, that may give better power saving and TurboBoost on
busy cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE
some bonuses still could be achieved.


Any idea what I should look for in the BIOS?


Something about C-states, or Cx-states on the CPU page. But first look 
at dev.cpu.X.cx_supported to make sure it is not already present and 
just unused.



This is 8-STABLE, any idea whether there's a MFC plan for the extra
9-CURRENT bonuses?


I suppose around May.


You may want to look here:
http://wiki.freebsd.org/TuningPowerConsumption


 From reading this, are you referring above to the C2 states? (seems
like C3 is not optimal for this kind of operation...)


The deeper the state, the more power is saved. To get the most out of
it and to get TurboBoost working, you need at least the C3 CPU state
(ACPI may report it under a different number). Some of the latest Intel
CPUs do not have the problems with C3 and the LAPIC described there;
for others, the system tuning described there is required.


PS: At best, using powerd won't hurt performance, while using C-states
may even increase it in some cases because of TurboBoost.


--
Alexander Motin


Re: powerd / cpufreq question

2011-04-08 Thread Daniel Gerzo

On Fri, 08 Apr 2011 18:02:28 +0300, Alexander Motin wrote:

OK, I understand what you are saying here. On the other side, I know
pretty well how the load is distributed - in this particular case, the
box is a web server, running ~30 php-cgi processes.
This kind of operation doesn't require very high frequency and I suspect
the cores are never waiting for each other. There could be an option
which would allow an administrator to decide whether this is the case
and allow him to set a higher -r and -i values, what do you think?


I think it should be possible with minimal changes.


So, here is my attempt to implement it:
http://danger.rulez.sk/powerd.diff
Can you please review & comment? I should be able to commit it myself
if you consider it acceptable. It seems to work for me :)





Any idea what I should look for in the BIOS?


Something about C-states, or Cx-states on the CPU page. But first
look at dev.cpu.X.cx_supported to make sure it is not already present
and just unused.


It seems it was enabled by default. I have these:
dev.cpu.0.cx_supported: C1/3 C2/96 C3/128

Does that mean I only need to set these in rc.conf?
performance_cx_lowest="C3"
economy_cx_lowest="C3"

And then run /etc/rc.d/power_profile 0x00?
Could it cause any instability?
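
As a side illustration, the cx_supported string above can be parsed to
see which states are offered. The sketch assumes each entry has the form
"state/latency" (with the latency presumably in microseconds, as on
8.x); the helper names are invented for this example and nothing here
touches sysctl or rc.conf.

```python
def parse_cx_supported(value: str) -> list[tuple[str, int]]:
    """Split 'C1/3 C2/96 C3/128' into (state, latency) pairs."""
    pairs = []
    for entry in value.split():
        state, latency = entry.split("/")
        pairs.append((state, int(latency)))
    return pairs


def deepest_state(value: str) -> str:
    """Return the name of the highest-numbered C-state listed."""
    states = parse_cx_supported(value)
    return max(states, key=lambda s: int(s[0][1:]))[0]


supported = "C1/3 C2/96 C3/128"  # value quoted in the message above
print(parse_cx_supported(supported))
print("deepest:", deepest_state(supported))
```

Here the deepest state listed is C3, which matches the value proposed
for performance_cx_lowest and economy_cx_lowest.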


This is 8-STABLE, any idea whether there's a MFC plan for the extra
9-CURRENT bonuses?


I suppose around May.


Do you have some patches? If not you don't really need to make them 
just for me, I can wait a little.



You may want to look here:
http://wiki.freebsd.org/TuningPowerConsumption


 From reading this, are you referring above to the C2 states? (seems
like C3 is not optimal for this kind of operation...)


The deeper state, the more power saved. To get most of it and to get
TurboBoost working you need at least C3 CPU state (ACPI may report it
with different number). Some latest Intel CPUs have no described
problems with C3 and LAPIC, for others described system tuning
required.



I believe this is a pretty recent CPU (6-core Xeon X5650). Do you know
about any problems?



PS: Using powerd in best case wont hurt performance, while using
C-states may even increase it in some cases because of TurboBoost.


If I want to use C-states, should I stop using powerd, or is it
possible to use both together?


Thanks!

--
Kind regards
  Daniel


Re: powerd / cpufreq question

2011-04-08 Thread Alexander Motin

On 08.04.2011 19:53, Daniel Gerzo wrote:

On Fri, 08 Apr 2011 18:02:28 +0300, Alexander Motin wrote:

OK, I understand what you are saying here. On the other side, I know
pretty well how the load is distributed - in this particular case, the
box is a web server, running ~30 php-cgi processes.
This kind of operation doesn't require very high frequency and I suspect
the cores are never waiting for each other. There could be an option
which would allow an administrator to decide whether this is the case
and allow him to set a higher -r and -i values, what do you think?


I think it should be possible with minimal changes.


So, here is my attempt to implement it:
http://danger.rulez.sk/powerd.diff
Can you please review & comment? I should be able to commit it myself if
you consider it acceptable. It seems to work for me :)


Looks fine, except that the -f option has to come first, which is not
obvious. Another point: I've noticed some load constants hardcoded
there. They should also be handled so that higher values work properly.



Any idea what I should look for in the BIOS?


Something about C-states, or Cx-states on the CPU page. But first
look at dev.cpu.X.cx_supported to make sure it is not already present
and just unused.


Seems like it was enabled by default. I have like these:
dev.cpu.0.cx_supported: C1/3 C2/96 C3/128

Does that mean I only need to set these in rc.conf?:
performance_cx_lowest="C3"
economy_cx_lowest="C3"

Then run /etc/rc.d/power_profile 0x00?


In short, yes. For the long version, read the link I've given.


May it cause any instability?


If you don't switch from the LAPIC to another timer and the LAPIC
stops, your system will freeze, or at least not work well. You should
notice problems immediately, if there are any.



This is 8-STABLE, any idea whether there's a MFC plan for the extra
9-CURRENT bonuses?


I suppose around May.


Do you have some patches? If not you don't really need to make them just
for me, I can wait a little.


The last ones I generated are five months old:
http://people.freebsd.org/~mav/timers_merge/
They are large and I am not sure how well they apply now.


You may want to look here:
http://wiki.freebsd.org/TuningPowerConsumption


From reading this, are you referring above to the C2 states? (seems
like C3 is not optimal for this kind of operation...)


The deeper state, the more power saved. To get most of it and to get
TurboBoost working you need at least C3 CPU state (ACPI may report it
with different number). Some latest Intel CPUs have no described
problems with C3 and LAPIC, for others described system tuning
required.


I believe this is pretty recent CPU (6 core Xeon X5650). Do you know
about any problems?


I have no idea about these Xeons. I only know that the LAPIC of my Core
i5 works fine in C3, while that of my Core i7 doesn't.



PS: Using powerd in best case wont hurt performance, while using
C-states may even increase it in some cases because of TurboBoost.


If I want to use C-states, should I stop to use powerd, or is it
possible to use them both together?


I am using both together on my laptop.

--
Alexander Motin