Pyun YongHyeon wrote:
On Mon, Apr 12, 2010 at 10:57:01AM -0700, Pyun YongHyeon wrote:
On Sun, Apr 11, 2010 at 03:15:16AM -0600, Erich Jenkins, Fuujin Group Ltd wrote:
I've been muddling around in src/sys/dev on the old system and the new system and there appear to be rather major changes to MII and bge, possibly the whole stack?

It was not completely rewritten but many improvements were made.

There are a number of things that seem to have been merged with other parts of the network stack, or perhaps written into the individual drivers (someone working on the net stack would have to verify that).

For instance, some files called in 5.3-REL seem to have gone away completely, and in the new (unpatched) version of if_bge.c under 7.3-REL, calls to these modules are gone:

- #include <vm/vm.h>              /* for vtophys */
- #include <vm/pmap.h>            /* for vtophys */
One of the most significant changes is the bus_dma(9) conversion,
which is required of all drivers to make them work correctly on a
variety of platforms. bus_dma(9) does not use vtophys directly
anymore, so these headers were nuked.

- #include <machine/clock.h>      /* for DELAY */
- #include <machine/bus_memio.h>

- #include <dev/pci/pcireg.h> (called but something changed in here)
- #include <dev/pci/pcivar.h> (ditto above)

No, these headers are still present.

It appears that the checksum features have been completely rewritten,
Checksum offloading was not completely rewritten, but a workaround
for buggy controllers was added.

and some of the ring settings have changed. It's interesting that the driver only fills 256 of the rx rings in the hopes that the cpu is "fast enough to keep up with the NIC". Would a subroutine here to grab the cpu
That magic number 256 is adequate for most cases, but it may not be
enough to handle heavy loads. Internally the controller uses a fixed
512 RX buffers, but bge(4) uses only half of them to save
resources. I think you can increase SSLOTS to 512 to get the full 512
RX buffers.

clock and count (number of procs/pipelines) be more trouble than it's worth to "automagically" increase the number of rx rings the driver fills based on the system in which it's installed?

Dynamically increasing the number of RX buffers is doable, but it
would add much more code. If there is high demand for that, I would
just increase the number of RX buffers to 512. The controller can't
be configured to have more than 512 RX buffers.
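
To make the trade-off between 256 and 512 posted RX buffers concrete, here is a minimal userland sketch. It is not code from bge(4); the names RX_RING_SLOTS, rx_clamp_posted() and rx_burst_drops() are hypothetical, and the model assumes the simplest possible case: a burst of frames arrives before the host refills any buffer, so every frame beyond the posted count is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Limit stated in the thread: at most 512 standard RX buffers. */
#define RX_RING_SLOTS 512

/*
 * Hypothetical helper: clamp a requested number of posted RX buffers
 * to what the controller can be configured for.
 */
size_t
rx_clamp_posted(size_t requested)
{
	return (requested > RX_RING_SLOTS ? RX_RING_SLOTS : requested);
}

/*
 * Hypothetical model: 'posted' buffers are available and 'burst' frames
 * arrive before the host refills anything. Returns the number of frames
 * that find no buffer and are dropped.
 */
size_t
rx_burst_drops(size_t posted, size_t burst)
{
	assert(posted <= RX_RING_SLOTS);
	return (burst > posted ? burst - posted : 0);
}
```

Under this model, rx_burst_drops(256, 400) loses 144 frames while rx_burst_drops(512, 400) loses none. A request above the limit, such as rx_clamp_posted(1024), is simply held at 512, which is why bursts larger than 512 frames cannot be absorbed by ring sizing alone.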

Something also changed in pci/pcireg.h and pci/pcivar.h, but I haven't had the time to hunt down and expand the source tree from the 5.3-REL branch yet.

I have other machines with copper nics utilizing the bge driver, and there are no issues at all. Perhaps I'm getting ahead of things, but
Yes, that is expected. :-)

since this seems to have been broken through several releases, would it make any sense to split the support between the BCM5701KHB chipset and the more recent BCM chipset to avoid causing issues with cards/systems not currently experiencing troubles?

I'd like to if I can. Supporting a huge number of different
controllers in a single driver is a maintenance nightmare. However,
rewriting the parts that require special handling for certain
controllers/revisions is too risky because I don't have access to
most controllers.

One theory for the issue that I came up with while reading the code
is link state handling. As I said in my previous mail, link state
handling for TBI is somewhat tricky in bge(4), and the driver seems
to rely on periodic register access to keep track of link state. I
guess polling(4) may behave differently for link state handling as
it does not rely on interrupts at all. So would you try polling(4)
and see whether that makes any difference on your box?

If polling(4) makes it work, try the attached patch.


------------------------------------------------------------------------

_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

I'll get this set up. I've got jail issues on 7.0-REL that I'm trying to figure out too, so it might take a few hours before I get to this.

I just checked on a reported iSCSI error on a machine using a BCM5721 nic (copper gigE) and I'm seeing issues like this:

Apr 11 06:24:59 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
Apr 11 06:24:59 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
Apr 11 16:51:52 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
Apr 11 16:51:52 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
Apr 12 10:32:49 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
Apr 12 10:32:49 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
Apr 12 11:55:42 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
Apr 12 11:55:42 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed
Apr 12 14:07:13 san0 iscsi-target: pid 863:iscsi.c:1149: ***ERROR*** Bad "Opcode": Got 0 expected 5.
Apr 12 14:07:13 san0 iscsi-target: pid 863:target.c:1317: ***ERROR*** iscsi_write_data_decap() failed

Any chance this could be because of the NIC chipset? I don't see this on any of the machines configured identically, using the em driver for Intel GigE nics.


Erich M. Jenkins
Fuujin Group Limited

"You should never, never doubt what no one is sure about."
-- Gene Wilder