Re: alpha kernel build failure (w/patch)

1999-07-05 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Steve Price had 
to walk into mine and say:

> [trimmed -alpha from cc: list to keep the cross posting
>  police from coming after me :)]
> 
> On Mon, 5 Jul 1999, Parag Patel wrote:
> 
> # On Mon, 05 Jul 1999 00:33:57 CDT, Steve Price wrote:
> # >+#ifdef __i386__
> # >   sc->wb_btag = I386_BUS_SPACE_IO;
> # >+#endif
> # >+#ifdef __alpha__
> # >+  sc->wb_btag = ALPHA_BUS_SPACE_IO;
> # >+#endif
> # 
> # Just curious, but is there a reason that these lines aren't simply
> # 
> # sc->wb_btag = BUS_SPACE_IO;
> # 
> # with this macro being set to the correct machine-specific one in some
> # appropriate header file?  I'm sure I'm missing something...
> 
> I wondered that as well.  For both the i386 and alpha port
> the definitions end up in /usr/include/machine/bus.h and
> stripping off the arch-specific prefix shows that their value
> is the same.  In fact they appear to be the only #define in
> bus.h with the arch-specific prefix besides the multiple-inclusion
> #defines.  I think they could be combined, but defer the
> decision (commit) to the folks working on the new bus code
> as they know their way around this code much better than I
> do.

The reason it's not done that way is because the bus_space code is
incomplete. The NetBSD code from which it was taken has a routine
that sets up the bus tag for you (and I think the handle too) based
on the actual bus type. In other words, you're supposed to be passed
a handle to the bus on which your device resides, and you pass that
to bus_space_create() or whatever, and it figures out all the right
machine specific details for you.

Why don't we have this routine? Because we don't have the NetBSD bus
architecture and at the time we only ran on the i386 arch, so we took
a shortcut and fiddled with the bus space handle and bus space tag
directly.
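
To make the single-macro idea from the quoted question concrete, here is a minimal sketch, not what the tree does today. BUS_SPACE_IO is a hypothetical alias that a shared header would select once per architecture; the numeric values and the cut-down softc are placeholders for illustration and are not copied from machine/bus.h.

```c
/* Placeholder values: stand-ins for the real definitions in machine/bus.h. */
#define I386_BUS_SPACE_IO   0
#define ALPHA_BUS_SPACE_IO  0

/* Hypothetical alias, selected once per architecture in a shared header. */
#if defined(__alpha__)
#define BUS_SPACE_IO    ALPHA_BUS_SPACE_IO
#else
#define BUS_SPACE_IO    I386_BUS_SPACE_IO
#endif

/* Cut-down stand-in for a driver softc; only the tag field matters here. */
struct wb_softc_sketch {
    int wb_btag;
};

/* What the attach code would look like with the alias: no #ifdef needed. */
void
wb_set_io_tag(struct wb_softc_sketch *sc)
{
    sc->wb_btag = BUS_SPACE_IO;
}
```

This only hides the #ifdef; it does not supply the missing routine that would derive the tag from the parent bus.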

If we're really lucky then some day this will get fixed correctly,
by somebody who is not me, as I have plenty of other things to keep
me busy.

-Bill

-- 
=
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work: [EMAIL PROTECTED] | Center for Telecommunications Research
Home:  [EMAIL PROTECTED] | Columbia University, New York City
=
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: wi0 almost works with Wavelan Turbo card

1999-07-18 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Ernie Elu had 
to walk into mine and say:

> I am looking for help with getting a Lucent Wavelan Turbo ISA (Bronze) 
> card running.

What speed is the card, exactly?

> Having read a few posts about the Wavelan IEEE 802.11 card not working with
> the wi driver I thought I would give it a go anyway with the turbo card.

"Not working with the wi driver?" I hope you meant "now working."

> I installed a Wavelan Turbo PCMCIA card in my Toshiba 2520CDT notebook, and
> an identical card with the Wavelan ISA adapter board into an Advantech 6154
> Slot PC.

I am not familiar with an "Advantech 6154 Slot PC." Please don't assume that
everyone automatically knows your hardware by name. Describe it. In detail.

> Both computers are running FreeBSD 4.0-CURRENT with their IRQ set
> to 10 in pccard.conf, all other settings are default.
> 
> It sort of works, the notebook end seems fine, but the Advantech end keeps
> coming up with the same error on the console every few seconds when there is
> traffic between them:
> 
> wi0: oversized packet received (wi_dat_len=24576, wi_status=0x2000)
> 
> No such error on the laptop.
> 
> When the error occurs ftp or whatever you were doing stalls for a bit then
> continues.
> 
> Any suggestions?

No. I never obtained any real documentation from Lucent (they won't release
the Hermes programming manual without NDA) and I don't have a turbo WaveLAN
card so I'm unable to duplicate your problem on my own equipment. If I can't
duplicate the problem and analyze it, I can't even begin to fix it.

-Bill




Re: wi0 almost works with Wavelan Turbo card

1999-07-19 Thread Bill Paul
ver is *supposed* to work with the turbo cards, but
since I don't actually have any I can't verify this. The only high speed
wireless cards I have are the Aironet PC4800 and ISA4800 (for which I'm
currently writing another driver) and I don't appear to have any problems
with these. I cite them for comparison because they have a very similar
programming interface to the WaveLANs (which is surprising really since I
thought the Hermes API was proprietary to Lucent; it's possible that all
of the people making 802.11 equipment got together and agreed on a general
hardware spec, but if so it's news to me).

The only thing that I can think of is that the driver is having trouble
reading the packet data out of the card at high speed. The way the
cards work, data is read from/written to the card 16 bits at a time
via I/O registers (these are programmed I/O devices; there's no memory
mapping). It's possible that at high speeds, the I/O gets thrown out of
whack sometimes. It's actually possible to go back and re-read the frame
from the NIC, although I'd much rather figure out the exact cause of
the problem and deal with that than apply a bandaid to work around
it (re-reading the received frame would hurt performance).
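
A rough sketch of the access pattern described above, using a simulated buffer in place of the card's data register. Everything here is an assumption made for illustration: the names, the SKETCH_MAX_FRAME bound, and the frame layout (a length word followed by the payload) are not the real Hermes layout and are not taken from the wi driver.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_FRAME 2400   /* assumed upper bound, illustration only */

/* Simulated NIC buffer: stands in for the card's 16-bit data register. */
struct fake_nic {
    const uint16_t *words;  /* frame contents as the card would present them */
    size_t nwords;          /* number of words available */
    size_t pos;             /* current read offset, in words */
};

/* One programmed-I/O read: 16 bits per access, like a port read. */
uint16_t
nic_read16(struct fake_nic *nic)
{
    if (nic->pos >= nic->nwords)
        return 0;
    return nic->words[nic->pos++];
}

/* Rewind to the start of the frame so that it can be read a second time. */
void
nic_rewind(struct fake_nic *nic)
{
    nic->pos = 0;
}

/*
 * Read one frame: the first word is taken to be the payload length in
 * bytes, followed by the payload. Returns the length, or -1 when the
 * length word is implausible (the "oversized packet" case).
 */
int
nic_read_frame(struct fake_nic *nic, uint16_t *buf, size_t bufwords)
{
    uint16_t len = nic_read16(nic);
    size_t need = ((size_t)len + 1) / 2;    /* bytes rounded up to words */
    size_t i;

    if (len > SKETCH_MAX_FRAME || need > bufwords)
        return -1;
    for (i = 0; i < need; i++)
        buf[i] = nic_read16(nic);
    return (int)len;
}
```

A re-read would amount to nic_rewind() followed by a second nic_read_frame(), which is why it costs a full extra pass over the frame.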
 
> There is also a version 4 linux driver for the silver card, that's the one with 
> the WEV encryption.

WEP, not WEV, and whether you have encryption or not depends on if it's
supported by the firmware in your card, not on the driver (though you
do have to do a few things in the driver to turn the encryption on).

-Bill




Re: wi0 almost works with Wavelan Turbo card

1999-07-19 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Ernie Elu had 
to walk into mine and say:
 
> I forgot to mention if you want to telnet in and poke around the system just
> let me know a login and password you would like  and I will set it up.

Gee, you know, I'd love to reply to you, but every time I do I get an
error:

   - The following addresses had permanent fatal errors -
<[EMAIL PROTECTED]>

   - Transcript of session follows -
... while talking to spooky.eis.net.au.:
>>> MAIL From:<[EMAIL PROTECTED]> SIZE=1454
<<< 553 <[EMAIL PROTECTED]>... Access Denied
501 <[EMAIL PROTECTED]>... Data format error

Looks like it doesn't like anything under columbia.edu. It likes
freebsd.org though.

-Bill




Re: sysinstall network performance

1999-07-28 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Brian Dean had 
to walk into mine and say:

> Hi,
> 
> During recent installs using the 7/26 snap, I noticed that the
> transfer rate for the "ports" distribution was about twice as fast as
> SNAPs from the beginning of the month.  Previously I was seeing a
> download rate of around 13 KB/s, while now I'm seeing around 28 KB/s
> (while these rates may sound horrendous, it's typical of what we get
> when installing ports - other distributions generally hit upwards of
> about 800 KB/s).
> 
> I was wondering what to attribute this better performance to.  Could
> this be due to the new network driver / newbus integration?

Well, since you didn't tell us what kind of network card(s) you have,
that's impossible to say.

-Bill




IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm

1999-07-28 Thread Bill Paul

Somehow I knew this was going to happen. I just got done upgrading one
of my Indigo2s to IRIX 6.5.4, with am-utils 6.0 for NFS automounting.
IRIX 6.5.4 supports NFS v3 and TCP. I tried cd'ing to a directory
served on a FreeBSD 3.2-RELEASE system which happens to have the build
tree for the Alteon Tigon firmware (that's where I compiled the last
firmware image for the Tigon driver). I did a 'du' and after a short
while, it exploded with the following messages:

mbuf siz=33524
panic: Bad nfs svc reply

Inspecting a crash dump showed that the mbuf chain was trashed. The
same thing happens with a 4.0-current snapshot from the 15th: this time
I just manually mounted /usr from the FreeBSD server under /mnt on
the SGI and did cd /mnt; du. Pow: died right away.

The FreeBSD 3.2-RELEASE host has a 3Com 3c509 card. The 4.0-CURRENT
host has a 3Com 3c900-COMBO PCI card. Each uses different drivers and
networking works fine otherwise, so I'm pretty sure the problem is in
NFS somewhere and not in the drivers.

This doesn't happen when using UDP. Given that I can reproduce this
on demand, I should be able to debug it eventually, but hints in the
right direction would be useful.

-Bill




Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm

1999-07-28 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say: 

> 
> Ok, so if I understand this correctly you have a FreeBSD server and
> an IRIX client.  UDP mounts work, TCP mounts do not.  You are using
> the AMD automounting software running on the ... client I presume?
> It is the server that is panicking.

Yes. But it's nothing to do with AMD. I can do it manually:

irix# mount -o vers=3,proto=tcp freebsd:/usr /mnt
irix# cd /mnt; du

FreeBSD box explodes.

> First of all, if these are production machines stick with UDP so's
> you don't tear your hair out.  Also double check that the bug still
> exists with the absolute latest CURRENT if you can.

I'd love to except *somebody* hasn't gotten around to fixing
current.freebsd.org yet. *Nudges jkh*

And I'm not going to stick with UDP mounts because that's hiding the
problem, not fixing it. You're just going to get more grief from the
next poor fool who runs afoul of this problem.
 
> Also please run this (on the FreeBSD server running CURRENT).  It
> will tell me whether NFS is being forced to realign packet data
> coming from your ethernet controller.  (In the example below, my
> NFS server has to realign the data).
> 
>   # sysctl -a | fgrep nfs
>   vfs.nfs.realign_test: 1583064
>   vfs.nfs.realign_count: 1583064

In the case of the 4.0 box, it explodes almost immediately: there's
no chance to actually obtain this data there. I added some printfs
to the 3.2 box briefly and it didn't look like the realign code was
being triggered.
 
> We fixed a serious data corruption bug with NFSv3 over TCP that 
> could result in panics.  This fix was made on May 2nd to current
> and MFC'd to stable on May 8th.  This fix made it into 3.2.

FreeBSD mcmillan.ctr.columbia.edu 4.0-19990715-CURRENT FreeBSD 
4.0-19990715-CURRENT #2: Tue Jul 20 17:07:35 EDT 1999 
[EMAIL PROTECTED]:/usr/src/sys/compile/TEST  i386

Should be in there. I don't think that's it.

Note that the 'mbuf siz' value that gets printed is exactly the
same every time.

-Bill




Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm

1999-07-28 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

> :This is yet another problem that we have run into here.  If you check the
> :digest for -hackers it was reported awhile ago (mike smith even cc-ed it
> :to security since it may have been a kernel stack overflow) .  Anyway, the
> :problem is that IRIX defaults to 32K packets on TCP NFSv3 mounts, and
> :16K on UDP NFSv3 mounts.  I recommend using UDP and setting rsize=8192,
> :wsize=8192 in your amd maps (as we do now, no problems at all).
> :
> :--
> :David Cross   | email: [EMAIL PROTECTED] 
> :Systems Administrator/Research Programmer | Web: http://www.cs.rpi.edu/~crossd 
> 
>Ah ha!  Yes, 32K packets will certainly screw up NFS under FreeBSD.

Uh could you elaborate a little? No, strike that: could you elaborate
a *lot*. A whole lot.
 
>We need to fix that panic to have it simply drop the packet, I guess.

No, we need to fix the code so it handles 32K "packets" (datagrams)
correctly.

-Bill




Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm

1999-07-28 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

> :>Ah ha!  Yes, 32K packets will certainly screw up NFS under FreeBSD.
> :
> :Uh could you elaborate a little? No, strike that: could you elaborate
> :a *lot*. A whole lot.
> 
> Sure.  There is a constant called NFS_MAXDATA defined in ..mmm..
> nfs/nfsproto.h.  Set to 32768 for TCP connections, 16384 for UDP
> connections.  The code is a mess though, so usually just the higher
> limit is used.  The fsinfo rpc returns this maximum to the client.
> 
> The client is supposed to limit NFS packets to the specified size.

Okay. Well, I experimented a bit, and found that if I increased
NFS_MAXPACKET by 512 bytes, the machines no longer panic. (Yes, that's
NFS_MAXPACKET, not NFS_MAXDATA.) 512 is just a number I pulled out of my 
ass: initially I just tried increasing it by 372 bytes (33544 - 
NFS_MAXPACKET == 372) which got me a little further along, but later I 
got another crash where mbuf siz was 33632. So I tried 512 and was able 
to do a complete du on /usr without any problems.
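
The arithmetic behind those two attempts can be spelled out as a small sketch. It uses only figures quoted in this message; the NFS_MAXPACKET value is the one implied by "33544 - NFS_MAXPACKET == 372" rather than one read out of the header, and the helper name is made up for illustration.

```c
/* Figures quoted in this message. */
#define OBSERVED_SIZ_1  33544   /* reply size behind the first bump */
#define OBSERVED_SIZ_2  33632   /* reply size seen in the later crash */
#define FIRST_BUMP      372
#define SECOND_BUMP     512

/* NFS_MAXPACKET as implied by "33544 - NFS_MAXPACKET == 372". */
#define IMPLIED_MAXPACKET   (OBSERVED_SIZ_1 - FIRST_BUMP)

/* Bytes by which a reply of size siz exceeds limit (0 if it fits). */
int
overshoot(int siz, int limit)
{
    return siz > limit ? siz - limit : 0;
}
```

With the 372-byte bump the 33632-byte reply is still 88 bytes over; with 512 it fits, with only 52 bytes to spare, so 512 is not a safe bound either.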

As for the trashed mbuf chain I thought I saw, I was confused by a
couple of factors:

- When you do gdb -k vmunix vmcore.X, values on the stack such as
  automatic variables aren't reliably preserved. In this case I
  was trying to do a "print *m" to observe the contents of the last
  used mbuf and this pointed me off into space somewhere. It should
  have been NULL since m_next off the last mbuf in a chain is NULL.

- I was looking at m_pkthdr.rcvif and m_pkthdr.len of mreq, which were
  not initialized and hence were also bogus (which makes sense since
  this was an mbuf chain to be transmitted, not the request that
  was received). Following the mbuf chain along showed that it
  was in fact sane. 
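
For reference, "following the mbuf chain along" amounts to something like the sketch below. The struct is a cut-down stand-in for struct mbuf with only the two fields that matter here, and chain_len is a made-up name; the "mbuf siz" value in the panic message is presumably a sum of this kind, though that is an assumption.

```c
#include <stddef.h>

/* Cut-down stand-in for struct mbuf: just the link and the data length. */
struct mbuf_sketch {
    struct mbuf_sketch *m_next;  /* next mbuf in the chain, NULL at the end */
    int m_len;                   /* bytes of data held in this mbuf */
};

/* Walk the chain until m_next is NULL, summing the data lengths. */
int
chain_len(const struct mbuf_sketch *m)
{
    int siz = 0;

    for (; m != NULL; m = m->m_next)
        siz += m->m_len;
    return siz;
}
```

A sane chain terminates with m_next == NULL; a trashed one would send this loop off into space.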

I don't know where these extra bytes are coming from. Presumably there
is some upper bound to the size of an NFS v3 RPC; either we are computing
it wrong or SGI is. What I'd love to be able to do is snoop the requests
coming from the SGI but that's hard since they're encapsulated in a TCP
stream.

-Bill




Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm

1999-07-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

> :Yes, we do.  I've run into this problem elsewhere but a quick fix was needed
> :so it just got hacked.  NT NFS clients tend to trigger it too.
> :
> :The problem is that the sanity check is a fair way away from where the problem
> :packet is generated.  The bad reply is generated in the readdirplus routine,
> :gets replied (without checking) and cached.  The client drops the (oversize)
> :packet, resends, and the nfsd replies from the cache and this time hits
> :the sanity check and panics.
> :
> :...
> :
> :I will have another look shortly.  Anyway, the clue is that the server
> :readdirplus routine is the apparent culprit.
> :
> :Cheers,
> :-Peter
> 
> This makes a lot of sense.  A report of du causing the panic, and
> the good possibility that readdirplus is caching an oversized response
> packet.  Tell me what you come up with!  I'll take a crack at it if you
> don't find anything.

Caching doesn't enter into it. The problem is bad arithmetic.

In /sys/nfs/nfs_serv.c:nfsrv_readdirplus(), we have the following
code:

    /*
     * If either the dircount or maxcount will be
     * exceeded, get out now. Both of these lengths
     * are calculated conservatively, including all
     * XDR overheads.
     */
    len += (7 * NFSX_UNSIGNED + nlen + rem + NFSX_V3FH +
        NFSX_V3POSTOPATTR);
    dirlen += (6 * NFSX_UNSIGNED + nlen + rem);
    if (len > cnt || dirlen > fullsiz) {
        eofflag = 0;
        break;
    }


I observed that the value of "len" didn't agree with the actual amount
of data being consumed in the mbuf chain. It turns out that each
time through the loop, len is being incremented by 4 bytes too little.
In other words, 7 * NFSX_UNSIGNED should really be 8 * NFSX_UNSIGNED.
When I change 7 to 8, I no longer get oversized replies and everything
adds up.

This sanity code is trying to add up the amount of data consumed by
the entryplus3 that gets generated for each directory entry. The
entryplus3 is defined in nfs_prot.x like this:

struct entryplus3 {
    fileid3         fileid;
    filename3       name;
    cookie3         cookie;
    post_op_attr    name_attributes;
    post_op_fh3     name_handle;
    entryplus3      *nextentry;
};

Unfortunately I haven't been able to wrap my brain around how this is
being counted up for the "len" calculation. Whatever it's doing, it's
off by 4 bytes. Possibly somebody forgot that "filename3" is a string,
which in XDR format consists of the string bytes, plus padding to a longword
boundary, *plus* a longword length value. Some comments would have been
useful here. (Hint, hint.)

What I don't know is whether the calculation for dirlen is also
wrong. Hopefully, now that I've shown everyone the light,
somebody can tell me for sure.
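
One way to do the count by hand is sketched below, going field by field through the XDR encoding of entryplus3. The assumptions: NFSX_V3POSTOPATTR covers the attributes_follow flag plus an 84-byte fattr3, NFSX_V3FH covers only the handle bytes (length passed in as fhlen, taken to be a multiple of 4), and the optional nextentry pointer costs one flag word per entry. The SK_ constants and function names are made up for this sketch.

```c
#define NFSX_UNSIGNED   4                           /* one XDR unit, in bytes */
#define SK_FATTR3       84                          /* fattr3 per RFC 1813 */
#define SK_POSTOPATTR   (SK_FATTR3 + NFSX_UNSIGNED) /* flag + fattr3 (assumed) */

/* Pad bytes needed to round a string up to a 4-byte boundary ("rem"). */
int
xdr_pad(int nlen)
{
    return (4 - (nlen & 3)) & 3;
}

/* Bytes on the wire for one entryplus3 with attributes and handle present. */
int
entryplus3_wire_len(int nlen, int fhlen)
{
    int len = 0;

    len += 2 * NFSX_UNSIGNED;                       /* fileid3: 64 bits */
    len += NFSX_UNSIGNED + nlen + xdr_pad(nlen);    /* filename3: length + bytes + pad */
    len += 2 * NFSX_UNSIGNED;                       /* cookie3: 64 bits */
    len += NFSX_UNSIGNED + SK_FATTR3;               /* post_op_attr: flag + fattr3 */
    len += 2 * NFSX_UNSIGNED + fhlen;               /* post_op_fh3: flag + length + handle */
    len += NFSX_UNSIGNED;                           /* value-follows flag for nextentry */
    return len;
}

/* The kernel-style estimate, with the word count left as a parameter. */
int
counted_len(int words, int nlen, int fhlen)
{
    return words * NFSX_UNSIGNED + nlen + xdr_pad(nlen) + fhlen + SK_POSTOPATTR;
}

/* Directory information only: fileid3, name, cookie3, value-follows flag. */
int
dir_info_len(int nlen)
{
    return 6 * NFSX_UNSIGNED + nlen + xdr_pad(nlen);
}
```

Under these assumptions the wire size matches the estimate with 8 words and is 4 bytes larger than the estimate with 7. If dircount is read the way RFC 1813 describes it (directory information only, excluding the attributes and file handle), the same count gives 6 words for dirlen, but treat that as a reading of the spec rather than a confirmed answer.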

-Bill




Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm

1999-07-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

> I counted it all up.  It definitely needs to be 8 * NFSX_UNSIGNED.

Yes, I know that. :)

But what about the check for dirlen:

> :dirlen += (6 * NFSX_UNSIGNED + nlen + rem);

Should this be 7 * NFSX_UNSIGNED, or is it correct as it is? I don't
know how dirlen relates to the entryplus3 structure.

-Bill




Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm

1999-07-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
> I think dirlen is supposed to be a calculation of the size of the 
> struct dirent that the client will eventually synthesize from all
> of this, in order to ensure that the result synthesized by the client
> does not cross a 512 byte boundary.  But if it is, it is being *very*
> conservative.
> 
> I think this may simply be because different clients have different
> structural sizes for struct dirent.  I am guessing that the 
> (6 * NFSX_UNSIGNED) is basically a NFS constant.

Okay. I committed the fix to the length calculation to -current and
-stable (I just love one-line patches that stop panics). I just got 
done patching my NFS server machines and they all seem to get along
nicely with the SGI now. Now I can upgrade the other SGIs without
worrying about them clobbering my FreeBSD machines.

Hm. I wonder what would happen if the FreeBSD host was the client
and the SGI was the server.

-Bill




Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm

1999-07-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
> And here is something *really* scary.  For the last month I've been
> running NFS over TCP without even realizing it.  I had set up my 
> machines to run NFS/TCP as a test instead of NFS/UDP and then forgot
> to change it back!
> 
>   -Matt

And here is something even scarier: readdirplus from the client side
doesn't appear to work correctly either. This time, you don't need an
IRIX machine to trigger the problem (though it helps :). Do the following:

client# mount -o nfsv3,tcp,rdirplus server:/somefs /mnt
client# ls /mnt; du /mnt; etc...

Seems okay so far, right? Ah, but now try to unmount the filesystem:

# umount /mnt



With an IRIX server, the machine wedges as soon as you do ls /mnt. With
a FreeBSD server, nothing happens until you try to unmount the filesystem.
The umount process looks like this:

    0   418   388   0  -2  0   312  176 vnlock D+    p0    0:00.01 umount /mnt

When the machine got stuck when I tested it with IRIX, I had to take
a crash dump in order to analyze things; the kernel doesn't seem to
be wedged, but I see these:

 1063   362   293   0  -2  0   356    0 vnlock D+    v0    0:00.00  (ls)
 1063   318     1  17  -2 17   748    0 vnlock DN    p0    0:00.00  (mailck)

Actually, it looks like it wedges if you use UDP too, so I guess it's
not related to the transport.

Anybody have any ideas? I did my good deed for the day.

-Bill




Re: readdirplus client side fix (was Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm)

1999-07-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

> :And here is something even scarier: readdirplus from the client side
> :doesn't appear to work correctly either. This time, you don't need an
> :IRIX machine to trigger the problem (though it helps :). Do the following
> :
> :client# mount -o nfsv3,tcp,rdirplus server:/somefs /mnt
> :client# ls /mnt; du /mnt; etc...
> :
> :Seems okay so far, right? Ah, but now try to unmount the filesystem:
> :
> :# umount /mnt
> :
> :...
> :-Bill
> 
> But, on the bright side, readdirplus is somewhat experimental in that
> it is not used by default, so very little testing of it has been done
> to date.  Thus the bug is not unexpected :-).  At least the bugs we
> are getting now tend to be in the 'outlying areas' of NFS and not so much
> with the core code.

Well, IRIX is using it by default, and option or not, it's documented
and implemented, so it should work.

> Another area that is probably full of bugs:  nqleasing.

Well, the problem there is: what commercial UNIXes implement NQNFS?
I stumbled over these problems because I was testing things with a
commercial implementation of NFS.

> --
> 
> Ok, I was able to reproduce the above bug and fix it.  The problem on
> the FreeBSD client is in nfs_readdirplusrpc() in nfs/nfs_vnops.c.  It 
> can obtain the vnode being used to populate the additional directory 
> info in one of two ways.  When it gets the vnode via nfs_nget(), the
> returned vnode is locked.  When it gets it via a hit against NFS_CMPFH()
> (which I presume is for '.'), it simply VREF()'s the vnode.
> 
> In the one case the vnode is returned locked, in the other it is not.
> 
> However, the internal loop vrele()'s the vnode rather than vput()'s it,
> so the vnodes in the directory scan are never unlocked.  This leads to
> the lockup.

Uh, yeah.

One of these days I'll be able to understand everything that you just
said. But not today.
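
As far as the explanation above can be boiled down, a toy model looks like this. It is a sketch under heavy simplification: a vnode is reduced to a reference count and one lock flag, and the sk_ functions are made-up stand-ins that model only the part of nfs_nget(), VREF(), vrele() and vput() relevant here.

```c
/* Toy vnode: a reference count plus a single lock flag. */
struct vnode_sketch {
    int v_usecount;
    int v_locked;
};

/* Stand-in for VREF(): take a reference, leave the lock alone. */
void
sk_vref(struct vnode_sketch *vp)
{
    vp->v_usecount++;
}

/* Stand-in for nfs_nget(): the vnode comes back referenced and locked. */
void
sk_nget(struct vnode_sketch *vp)
{
    vp->v_usecount++;
    vp->v_locked = 1;
}

/* Stand-in for vrele(): drop the reference only. */
void
sk_vrele(struct vnode_sketch *vp)
{
    vp->v_usecount--;
}

/* Stand-in for vput(): unlock, then drop the reference. */
void
sk_vput(struct vnode_sketch *vp)
{
    vp->v_locked = 0;
    vp->v_usecount--;
}

/* Release in the shape of the quoted patch: vrele for '.', vput otherwise. */
void
sk_release(struct vnode_sketch *newvp, struct vnode_sketch *dvp)
{
    if (newvp == dvp)
        sk_vrele(newvp);
    else
        sk_vput(newvp);
}
```

In this model a vnode obtained with sk_nget() and released with sk_vrele() stays locked forever, which is the kind of leftover lock that a later umount would then sleep on.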
 
> If you could test and then commit this patch (w/ me as the submitter),
> I would appreciate it!  It seems to fix the problem for me.  This patch
> is relative to CURRENT.  The fix ought to be MFCable to STABLE.

Close, but not quite. You didn't beat up on it hard enough. The secret
is to think like a kid with a new toy, or more precisely, a sysadmin with
a new toy (amounts to the same thing :). The first thing any sysadmin
wants to do when you hand him a new gizmo is to push the buttons, turn
the knobs and flip the switches, in order to try out all those great
new features he's heard about. That's how you find the bugs.

Anyway, in this case, I found another problem: with your patch applied,
I mounted a filesystem from a 3.2-RELEASE server (which I fixed today
with the readdirplus server side patch) which happened to have a
directory containing the unpacked source code for Ghostscript 5.50,
plus objects left over from a build. There are a crapload of files
in the gs 5.50 distribution, plus another crapload created by compiling
it. I did the following:

client# mount -o nfsv3,tcp,rdirplus server:/fs /mnt
client# cd /mnt
client# ls
client# du



There seems to be another problem in nfs_readdirplusrpc(). The following
diff shows the changes I made to stop the panic:



 
> The funny thing is that the error termination code actually got it
> right and the loop got it wrong.  Usually it's the other way around. 
> 
> --
> 
> Presumably this will not fix the SGI client.  I've no idea what the
> problem there is.  There may be a bug in the SGI client or there may
> be a bug in the client & server implementation of the protocol in FreeBSD.
> 
>   -Matt
>   Matthew Dillon 
>   <[EMAIL PROTECTED]>
> 
> 
> Index: nfs_vnops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/nfs/nfs_vnops.c,v
> retrieving revision 1.135
> diff -u -r1.135 nfs_vnops.c
> --- nfs_vnops.c   1999/07/01 13:32:54 1.135
> +++ nfs_vnops.c   1999/07/29 23:57:06
> @@ -2367,7 +2367,10 @@
>   nfsm_adv(nfsm_rndup(i));
>   }
>   if (newvp != NULLVP) {
> - vrele(newvp);
> + if (newvp == vp)
> + vrele(newvp);
> +     else
> + vput(newvp);
>   newvp = NULLVP;
>   }
>   nfsm_dissect(tl, u_int32_t *, NFSX_

Re: readdirplus client side fix (was Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm)

1999-07-29 Thread Bill Paul

Crap, I just sent out an incomplete message. Let me pick up from where
I left off. Here's a diff that shows the changes I made to nfs_vnops.c:

*** nfs_vnops.c.orig    Thu Jul 29 22:46:28 1999
--- nfs_vnops.c Thu Jul 29 22:36:39 1999
***************
*** 2342,2348 ****
          IFTODT(VTTOIF(np->n_vattr.va_type));
      ndp->ni_vp = newvp;
      cnp->cn_hash = 0;
!     for (cp = cnp->cn_nameptr, i = 1; i <= len;
          i++, cp++)
          cnp->cn_hash += (unsigned char)*cp * i;
      cache_enter(ndp->ni_dvp, ndp->ni_vp, cnp);
--- 2342,2351 ----
          IFTODT(VTTOIF(np->n_vattr.va_type));
      ndp->ni_vp = newvp;
      cnp->cn_hash = 0;
!     /*for (cp = cnp->cn_nameptr, i = 1; i <= len;*/
!     if (len != cnp->cn_namelen)
!         printf("bogus: %d %d\n", len, cnp->cn_namelen);
!     for (cp = cnp->cn_nameptr, i = 1; i <= cnp->cn_namelen;
          i++, cp++)
          cnp->cn_hash += (unsigned char)*cp * i;
      cache_enter(ndp->ni_dvp, ndp->ni_vp, cnp);




Basically, at some point, the code tries to calculate a new hash value
(what for I don't know) of a name that was read from the directory
listing. However it uses "len" as the length of the name, which for
some reason I can't understand turns out not to match the cn_namelen
value in cnp. The "bogus" printf shows about a half dozen occasions
where len and cn_namelen don't agree. Sometimes "len" is larger
than "cnp->cn_namelen," sometimes it's smaller. By using cn_namelen
instead of len, everything seems to work correctly. It looks like
this loop makes more than one pass over directory entries, so it could
be that len is sometimes stale. If you can make sense of why this
happens, I would appreciate it: I don't like to commit changes when
I don't fully understand what's going on.
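
To show why the length matters, here is the hash loop from the diff pulled out into a standalone function. This is only a sketch: name_hash is a made-up name and the function takes a plain string rather than a componentname, but the weighting (each byte times its 1-based position) is the same as in the loop above.

```c
#include <stddef.h>

/*
 * Same weighting as the loop in the diff: each byte is multiplied by its
 * 1-based position and the products are summed.
 */
unsigned long
name_hash(const char *name, size_t namelen)
{
    unsigned long hash = 0;
    size_t i;

    for (i = 1; i <= namelen; i++)
        hash += (unsigned char)name[i - 1] * i;
    return hash;
}
```

A length that is too short produces a hash for a different name, and a length that is too long reads past the end of the name, so a stale "len" is bad news either way.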
 
> Presumably this will not fix the SGI client.  I've no idea what the
> problem there is.  There may be a bug in the SGI client or there may
> be a bug in the client & server implementation of the protocol in FreeBSD.

Er, I think you misunderstood: there's nothing wrong with the SGI in
this case. I mentioned it because I was using it as a _server_ when
testing the client side readdirplus support: the behavior with the
SGI server was slightly different from another FreeBSD server (the
FreeBSD client blew up right away with the SGI acting as server,
where it took a little longer with the FreeBSD server). I think this
was just a consequence of the filesystems being laid out differently.
The patched FreeBSD client works fine now with the SGI server.

-Bill




Re: readdirplus client side fix (was Re: IRIX 6.5.4 NFS v3 TCP client + FreeBSD server = bewm)

1999-07-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
> Look up a bit in the code.  If bigenough is not true, cnp does not 
> get initialized.  This could lead to the bogus length -- or rather,
> it would be the cnp that is bogus, not the 'len'.
> 
> The question is how to fix it.  I think we can safely avoid doing the
> cache_enter so try changing the 'if (doit)' to 'if (doit && bigenough)'.
> I've included the patch below.
> 
> I am not 100% sure about this.

Hm. Well, it cures the panic that I was experiencing quite nicely.
I'm going to commit this latest patch for now since it fixes both
the vnode locking problem and a crash condition, which are pretty
serious problems. If you come up with something different, I'll be happy 
to try it out.

Not a bad day's work. :)

-Bill




FYI for those with 3Com 3c905C cards

1999-08-02 Thread Bill Paul

I've gotten a couple of reports from people claiming to have trouble
with the 3c905C, usually along the lines of "the autonegotiation works
and I can bring the card up, but I can't receive or send any packets."

(For those who don't know, the 3c905C differs from the 3c905B in that
it has a new 3Com XL ASIC revision called the "Tornado," plus has
management features including wake on LAN and SMBUS interfaces, and
it has a built-in boot ROM with support for BOOTP, DHCP, PXE, and a
bunch of other acronyms.)

First off, this is not entirely correct: in the few cases where I've
been able to squeeze information out of people, I've found that running
tcpdump on the interface shows that the 3c905C does at least receive
some traffic, so "can't send or receive any packets" is not entirely
correct. I haven't gotten anyone to do proper transmitter testing.

Diagnosing this has been hard, largely due to the fact that I haven't
been able to duplicate the problem with my hardware: I've tried a
Dell PowerEdge 2300/400 dual PII 400 SMP box, a P200 GW2K box, a
486/66 and my alphastation 200, and in each case I was able to send
and receive traffic without any apparent problems. The only peculiarity
I encountered was with the PowerEdge system. The two cards I have were
sent to me by 3Com; when the first one arrived, I plugged it into one
of the primary PCI slots in the Dell (one of the same slots I'd used
before with several other cards) and the machine refused to power on.
Initially, I thought the card was hosed, and 3Com sent me a second one,
which exhibited the same behavior. Finally I discovered that if I
put the card in one of the secondary PCI slots (which turns out to
be PCI bus 2, connected to PCI bus 0 via a DEC PCI-PCI bridge), the
machine would power on fine and the card would work correctly in
FreeBSD.

(For all you armchair technicians out there, no, it's not a problem
with the card drawing too much power or a shorted trace on the card.
I don't know what it is, though I suspect some PCI BIOS weirdness.)

Today I was experimenting with another machine, a Dell PowerEdge 4300/500
dual PIII 500MHz system, and I did finally encounter a problem: for some
reason, the 3c905C refused to receive certain NFS packets (in particular
NFS create requests). I don't know why this happened exactly, however
I fixed the problem by modifying the xl driver code to issue individual
"reset TX block" and "reset RX block" commands after issuing the global
reset command in xl_reset(). Now that the transmitter and receiver are
reset as part of the card initialization, everything seems to work
correctly.
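
The reset sequence described above can be sketched roughly as follows. This is not the xl driver code: the command names, the fake_xl structure and both functions are hypothetical stand-ins that record commands in a log instead of touching hardware, and the TX-before-RX order is only an assumption taken from the order in which the text mentions them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command codes for illustration; not the values from the driver. */
enum xl_cmd { CMD_GLOBAL_RESET, CMD_TX_RESET, CMD_RX_RESET };

#define CMD_LOG_MAX 8

/* Stand-in for the card: records each command instead of writing a register. */
struct fake_xl {
	enum xl_cmd	log[CMD_LOG_MAX];
	size_t		ncmds;
};

static void
fake_xl_command(struct fake_xl *sc, enum xl_cmd cmd)
{
	assert(sc->ncmds < CMD_LOG_MAX);
	sc->log[sc->ncmds++] = cmd;
	/* A real driver would wait here for the command to complete. */
}

/* Global reset first, then the individual transmitter and receiver resets. */
void
fake_xl_reset(struct fake_xl *sc)
{
	fake_xl_command(sc, CMD_GLOBAL_RESET);
	fake_xl_command(sc, CMD_TX_RESET);
	fake_xl_command(sc, CMD_RX_RESET);
}
```

The point of the sketch is only the ordering: the two block resets follow the global reset rather than replacing it.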

I committed this change to the xl driver in both the current and stable
branches today, plus I updated the code at http://www.freebsd.org/~wpaul/3Com.
If you have a 3c905C card and have been having trouble getting it to
work with FreeBSD, I would appreciate it if you could cvsup to the
latest -stable or -current or download the driver from the web server
and give it a try. I'm not positive that this will fix all the problems
people have been seeing, but I'm very curious to see what effect it has.

If you notice any improvements (or not), please let me know at
[EMAIL PROTECTED]

-Bill




PCI bus on 486/66 no longer detected

1999-08-09 Thread Bill Paul

Today I thought I would upgrade my 4.0 box from a July 15th snapshot
to an August 9th snapshot. Only problem is I can't, because the August
9th snapshot's boot kernel refuses to locate my 486's PCI bus.
Previously, the bus was detected as follows:

pcib0:  on motherboard
pci0:  on pcib0
chip0:  at device 0.0 on pci0
tl0:  irq 9 at device 13.0 on pci0
tl0: Ethernet address: 00:80:5f:9a:58:f1
xl0: <3Com 3cSOHO100-TX OfficeConnect> irq 12 at device 14.0 on pci0
xl0: Ethernet address: 00:10:5a:e3:60:9c
xl0: autoneg complete, link status good (half-duplex, 100Mbps)

pciconf -l shows the following:

mcmillan# pciconf -l
chip0@pci0:0:0: class=0x068000 card=0x chip=0x884910e0 rev=0x02 hdr=0x00
tl0@pci0:13:0:  class=0x028000 card=0x chip=0xf1300e11 rev=0x10 hdr=0x00
xl0@pci0:14:0:  class=0x02 card=0x764610b7 chip=0x764610b7 rev=0x30 hdr=0x00

No, I didn't change any hardware setting anywhere. The older kernel still
boots and works properly (I couldn't install the new snapshot because I
need to do a network install, and my network cards are PCI).

I have no idea where to look for the problem. Anybody have any bright
ideas?

-Bill




Re: buildworld fails in /usr/src/sys/modules/mii/..

1999-08-21 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Doug had to 
walk into mine and say:

>   Cvsup'ed today, make -DNOCLEAN world failed in gcc, did a 'make includes'
> and 'make world' and now it fails in:
> 
> ===> sys/modules/mii
> @ -> /usr/src/sys
> machine -> /usr/src/sys/i386/include
> perl /usr/src/sys/modules/mii/../../kern/makedevops.pl -h
> /usr/src/sys/modules/mii/../../kern/bus_if.m
> make: don't know how to make
> /usr/src/sys/modules/mii/../../dev/mii/miibus_if.m. Stop

Ai! The stupid bus interface description file escaped before I
could commit it with the rest of the miibus code. Okay, I just fixed
it. Thanks for the heads up and sorry for the trouble. My turn to
wear the pointy hat again.

-Bill




Monday strikes again

1999-08-23 Thread Bill Paul

Must... control... fist of death...

I just tried to boot the latest -current snapshot (Aug 23) on my little
486/66 machine. The kern.flp kernel panics right after saying
"Probing for PnP devices:". Now, this machine has a PCI bus but it
doesn't support ISA plug and play, so before any of you lot start
theorizing about possible PnP BIOS problems, don't. The panic 
message says the kernel dies because of a page fault trying to reference 
memory location 0x4 (which is in page 0, which isn't mapped, which means 
this is a NULL pointer dereference) at PC 0xc0175b20.
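
A fault address of 0x4 is what you would expect from reading a structure member at offset 4 through a NULL pointer. The sketch below illustrates that arithmetic only; struct example and null_member_fault_addr() are hypothetical and are not the structure or code actually involved in this panic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure: the second member sits 4 bytes from the start. */
struct example {
	int32_t	first;
	int32_t	second;
};

/*
 * Address that would be touched by reading p->second when p is NULL:
 * the base address (0) plus the member offset.
 */
size_t
null_member_fault_addr(void)
{
	return (0 + offsetof(struct example, second));
}
```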

Running "nm kernel | grep c0175b" on the install kernel yields:

c0175b68 t cnuninit
c0175bc8 t sysctl_kern_consmute

Running "nm kernel | grep c0175a" on the install kernel yields:

c0175a50 T cninit
c0175afc T cninit_finish
c0175a50 t gcc2_compiled.
c0175a08 t l_noclose
c0175a14 T l_noread
c0175a2c t l_norint
c0175a38 t l_nostart
c0175a20 T l_nowrite
c0175a44 t l_nullioctl
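
The by-hand lookup above (find the last symbol that starts at or before the faulting PC) can be sketched as a small helper. The table holds only the four function symbols nearest the PC, copied from the nm output; lookup_pc() and struct sym are hypothetical names for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct sym {
	unsigned long	 addr;
	const char	*name;
};

/* Function symbols nearest the PC, from the nm output, sorted by address. */
static const struct sym syms[] = {
	{ 0xc0175a50UL, "cninit" },
	{ 0xc0175afcUL, "cninit_finish" },
	{ 0xc0175b68UL, "cnuninit" },
	{ 0xc0175bc8UL, "sysctl_kern_consmute" },
};

/*
 * Return the last symbol whose address is <= pc, or NULL if pc lies
 * below the first symbol in the table.
 */
const struct sym *
lookup_pc(unsigned long pc)
{
	const struct sym *best = NULL;
	size_t i;

	for (i = 0; i < sizeof(syms) / sizeof(syms[0]); i++) {
		if (syms[i].addr > pc)
			break;
		best = &syms[i];
	}
	return (best);
}
```

Calling lookup_pc(0xc0175b20UL) walks past cninit and cninit_finish and stops at cnuninit, so the PC falls 0x24 bytes into cninit_finish.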

My money says the problem is cninit_finish(). The hardware config of
this machine is as follows:

486DX2-S 66MHz CPU
16MB RAM
Diamond Speedstar ISA SVGA adapter (ET4000 chipset, 1MB RAM)
IDE disk controller
Maxtor LTX-200A IDE disk
3.5" floppy drive
2 serial ports, one parallel port
Integrated Micro Systems PCI bridge
Compaq NetFlex 3/P PCI ethernet adapter
D-Link DFE-550TX PCI ethernet adapter

The machine is running a -current snapshot from August 9th which works
fine (or at least, it did after I fixed the PCI bridge detection 
breakage that screwed it up last time). It looks like the major 
difference is that /sys/i386/i386/cons.c was taken away and replaced with 
some MI console routines in /sys/kern. My gut tells me that console 
initialization is failing because it can't find the ISA graphics
adapter for some reason.

Anybody have any bright ideas where I can start looking for the problem?

-Bill




Re: USB broken?

1999-12-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Eric D. Futch 
had to walk into mine and say:

> I'm running -current that's about a week old.

Erm... are you sure? I'm having trouble believing you.

>  I configed my kernel for
> USB support.  After turning on the USB interface in BIOS kernel panics
> after it probes uchi0.  Below is the panic screen, I don't have much else
> to go on.
> 
> ---
> uhci0:  rev 0x01 int d irq 10 on pci0.7.2>
> kernel trap 12 with interrupts disabled

See this kernel probe output here? This is not from a 4.0-CURRENT
kernel from a week ago. This is what the probe output from a recent
-current system should look like:

uhci0:  irq 11 at device 7.2 on pci0

Notice the difference? It's been like that for a *long* time now.
Therefore I can only conclude that either you're not actually running 
-current, or else you thought it would be okay to substitute in a really 
stale entry from a system log file from a 3.x system. Either way, you
need to re-evaluate the situation and provide more info.

Now rather than being vague, go back and show us what uname -a says
on this allegedly -current system. Show us the *entire* dmesg output
too, while you're at it.

Furthermore, you should be able to test USB support without recompiling
the kernel. All you need to do is kldload usb. That will load the usb.ko
kernel module, which should find the UHCI controller.

From the panic message you showed here, you're using SMP. Have you
tested it with a UP kernel? (Yes, it's supposed to work either way,
but it would be nice if you would just test it to rule out some sort
of SMP-related condition.)

What you should do is this:

- Compile a kernel with options DDB, but *WITHOUT* USB support.
- Boot this kernel.
- Type kldload usb
- See if the system crashes.
- If it does, it will drop into the debugger.
- Type 'trace'
- Report what it says.

-Bill




Re: panic in uipc_mbuf.c or if_aue.c

2000-01-12 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Jun Kuriyama 
had to walk into mine and say:

> 
> I got more panic with DEBUG=-g and INVARIANTS.  I saved core dump at
> this time.

We need version information! How recent is your version of -current?
What's the rcsid from if_aue.c? Details please!
 
> This panic is caused when I tested heavy traffic via aue0 (USB
> ethernet adaptor) with "while looped" large file scp.  I think that is 
> only active process.
> 
> My ipfw is set as default like as "65535 allow ip from any to any".

*sigh* No, this is not what you meant to say. What you meant to say
is: "Oh, by the way, I also use ipfw. And oh, by the way, I didn't
think to repeat the same test without ipfw."

Try the test again with a new kernel *without* ipfw. Maybe the problem
is in ipfw. Maybe it isn't, but you have to do some testing to eliminate
the possibility!

> Should I give some data to solve this problem?

No, you should sit there and wait for the bug fairy to come and tap
you with her magic wand.

Print out the contents of the mbuf!! Show us what it thinks the real
length is!

-Bill




Re: buildworld fails on Alpha

2000-01-14 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Wilko Bulte had 
to walk into mine and say:

> On a very freshly supped -current on Alpha:
> 
> ===> sys/modules
> ===> sys/modules/aha
> rm -f aha.h setdef0.c setdef1.c setdefs.h setdef0.o setdef1.o aha.ko aha.o
> aha_isa.o @ machine symb.tmp tmp.o opt_cam.h opt_scsi.h bus_if.h device_if.h
> isa_if.h
> rm -f .depend /usr/src/sys/modules/aha/GPATH /usr/src/sys/modules/aha/GRTAGS
> /usr/src/sys/modules/aha/GSYMS /usr/src/sys/modules/aha/GTAGS
> ===> sys/modules/amr
> rm -f setdef0.c setdef1.c setdefs.h setdef0.o setdef1.o amr.ko amr.o
> amr_pci.o amr_disk.o @ machine symb.tmp tmp.o bus_if.h device_if.h pci_if.h
> rm -f .depend /usr/src/sys/modules/amr/GPATH /usr/src/sys/modules/amr/GRTAGS
> /usr/src/sys/modules/amr/GSYMS /usr/src/sys/modules/amr/GTAGS
> ===> sys/modules/an
> cd: can't cd to /usr/src/sys/modules/an
> *** Error code 2
> 
> Stop in /usr/src/sys/modules.
> *** Error code 1

Roar. I swear I checked in this module Makefile. Honest and for true.

Okay, I think I've really got it this time. Please try cvsupping
again: you should get a src/sys/modules/an/Makefile for compiling
the Aironet driver module.

-Bill




Re: Current, XEON and MP performance

2000-01-16 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Achim Patzner 
had to walk into mine and say:

> I don't know where to ask first (or what to look at) so I'd like some
> creative guessing by some people closer to the sources...
> 
> Running the same programs on nearly identically configured -CURRENT kernels
> on a

> HP NetServer LH4
> (four 550 MHz PIII Xeon with 512MB Cache,

512MB cache? I think you mean KB.
 
> supposed to be an INTEL 450NX-based chipset)
> with one GB RAM


> and a home-grown ASUS P2-BDS based system
> (two 450 MHz PIII)
> with 512 MB RAM

> I find that the
> programs (running on the same input data) on the "smaller" machine tend to
> take only a third of the CPU time they need on the LH4.

Can you show us the actual results from your testing (and hopefully your
testing methods as well) that led you to this conclusion? Details matter.

Are these programs I/O bound, CPU bound, or a little of both? FreeBSD's
SMP support still depends largely on the big giant lock approach which
means that while you can indeed get processes running on multiple CPUs
at the same time, you end up using only one CPU once you enter the kernel.
And you have to enter the kernel in order to perform any disk, network
or even console I/O. If your programs suck large datasets into memory,
do lots of number crunching on them, then spit the results back out to
a disk file, then they should benefit from more CPUs. However if they
read and write data a lot while running, you're going to be limited by
the big giant lock.
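
One rough way to put numbers on this limit is Amdahl's law. It is not part of the original argument and is only a model: it assumes a fixed fraction of the run time is serialized under the single lock and the rest spreads perfectly across CPUs. The function name giant_lock_speedup() is made up for illustration.

```c
#include <assert.h>

/*
 * Ideal speedup on ncpu processors when a fraction `serial` (0.0 to 1.0)
 * of the run time is spent under one global lock and the remainder
 * parallelizes perfectly.
 */
double
giant_lock_speedup(double serial, int ncpu)
{
	assert(serial >= 0.0 && serial <= 1.0);
	assert(ncpu > 0);
	return (1.0 / (serial + (1.0 - serial) / ncpu));
}
```

Under this model a workload that spends half its time in the kernel gets about 1.6x from four CPUs, while a purely compute-bound one gets the full 4x.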

There may also be scalability issues (i.e. does FreeBSD perform better
as you add more CPUs or does it spend so much time trying to stay out
of its own way that it actually performs worse) however I don't know
enough to say if you could be running into such problems as the only
SMP machines I have access to have only 2 CPUs.

> [Worse: The LH4
> behaves like a spoilt brat when it comes to hardware, disliking the Intel
> EtherExpress that came with it (generating bus mastering problems after
> bringing it up),

Which model Intel EtherExpress? What chipset? What bus mastering problems 
exactly?

> having interrupt routing problems with two DEC TULIP based
> ethernet cards sharing the same IRQ

Which tulip cards? What driver? What kind of problems? I find it unusual
that two PCI devices would wind up with the same IRQ with the APIC enabled
since it's supposed to give you a lot more IRQs than in UP mode.

> and being picky just which 3C906B-TX it
> gets plugged in.

There is no such card as a 3c906B. There's a 3c905B, and there's a
3c905C. Unfortunately, 3Com did go through several different ASIC
revisions with the 3c905B series, some of which work better than others,
but again, I see no details here.

> It's a bitch and I'd like shooting it. Oh yes - HP has been
> very helpful, telling me that I was at least 10 years behind wanting to run
> a BSD and that only WinNT, HP-Sux and Linux were supported on this hardware.]

If somebody at HP actually told you that HP-UX runs on anything besides
the PA-RISC architecture (and, in the distant past, the m68k architecture), 
they were either a) jerking your chain, b) working at HP in a parallel 
dimension, c) misinformed, or d) not terribly bright.

(I'm sure HP wouldn't mind having HP-UX/x86, but they certainly don't
offer it as a product now.)

> Back to the topic: Are there any reasons for these observations? If someone
> liked taking a closer look at it I could provide them with access to the
> machine (and its console). I ran out of clues...

Hard to tell really without more info. We don't know what your test
programs do, so it's impossible to predict what their behavior
should or shouldn't be.

-Bill




Re: Current, XEON and MP performance

2000-01-17 Thread Bill Paul
ing the meaning of the full
  duplex enable bit in CSR6.
- The chip defaults to autoneg turned on after a reset. It also
  defaults to half duplex. So if you want to manually enable full
  duplex, you *must* first turn off the autoneg (by clearing CSR14).

The de driver doesn't know about this, and the result is that full
duplex mode just won't work. If your board uses an external transceiver,
the internal NWAY support should be turned off, but it isn't. So
selecting full duplex mode manually or autonegotiating full duplex
with a link partner doesn't work, and you stay in half duplex mode
no matter what you do.

The if_dc driver handles this correctly. You could tweak the de driver
to handle it too, however looking at the code, I don't think it handles
non-MII cards properly. The dc driver should handle those correctly
(at the very least, you can always manually override the mode with
ifconfig even if the autoneg gets it wrong).
 
> > I find it unusual
> > that two PCI devices would wind up with the same IRQ with the APIC enabled
> > since it's supposed to give you a lot more IRQs than in UP mode.
> 
> *ROTFL* (sorry). HP's algorithm for allocating IRQs is giving "same devices"
> the same IRQs if they are running out of IRQS.
> 
> Just take a look at this:

Uhm...

[...]

> APIC_IO: Testing 8254 interrupt delivery
> APIC_IO: Broken MP table detected: 8254 is not connected to IO APIC int pin 2
> APIC_IO: routing 8254 via 8259 on pin 0
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #3 Launched!

I'm a little worried about this. You might want to forward all this
information to the freebsd-smp list: I'm a little out of my depth.

> > 
> > > and being picky just which 3C906B-TX it
> > > gets plugged in.
> > 
> > There is no such card as a 3c906B. There's a 3c905B, and there's a
> > 3c905C. Unfortunately, 3Com did go through several different ASIC
> > revisions with the 3c905B series, some of which work better than others,
> > but again, I see no details here.
> 
> 2nd time. Sorry. I should really be more carefull:
> 
> xl0@pci1:3:0:   class=0x02 card=0x905510b7 chip=0x905510b7 rev=0x64 hdr=0x00
> xl1@pci2:2:0:   class=0x02 card=0x905510b7 chip=0x905510b7 rev=0x24 hdr=0x00
> 
> The one not working stated rev=0x00. And it is working perfectly well in
> another machine.

Again, there is a distinct lack of details. You can't just say it
doesn't work. You have to describe the failure.

> 
> Ok. Tell me what info to gather. Any preferred benchmarks?

Again, you didn't show us the actual results from your original tests.
You just said "it's not as fast." We don't want your interpretation of
the data: we want the data itself. Again, I would ask the freebsd-smp
list about this. Also, I would try installing a more current version
of -current just to be sure you're on the same page as they are.

-Bill




Re: USB D-Link DSB-650 kue0: failed to load code

2000-01-17 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Eric J. Haug 
had to walk into mine and say:

> Hi all,
> 
> I have a Toshiba 2100CDS laptop with an OHCI USB controller
> that gives a kue0: failed to load code segment error message
> Rather than clutter the list, the conf file and the dmesg boot
> file is available at
> ftp.eas.slu.edu:/pub/incoming/[usbdmesg, usbbootmsg, usbltaconf]
> The usbbootmsg is from yesterdays kernel sources with some of the debug
> variables set to 15.
> The changes from today did not seem to make any difference.
> 
> the stripped mesg output from a boot follows:
> 
> ohci0:  mem 0xf7fff000-0xf7ff irq 11 at device 11.0 
>on pci0
> usb0: OHCI version 1.0
> usb0:  on ohci0
> usb0: USB revision 1.0
> uhub0: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> kue0: D-Link Corp 10Mbps ethernet adapter, rev 1.00/0.02, addr 2
> kue0: failed to load code segment: IOERROR
> device_probe_and_attach: kue0 attach returned 6

An important point which you neglect to mention is: how long did it
take before the IOERROR message appeared? (That is, how much time
passed between the first kue0 probe message and the next?) Getting the 
Kawasaki chip to work requires downloading firmware into it, and the 
code segment of the firmware is about 3800 bytes, which makes for a 
fairly large control transfer. I had to set things up with a longer than 
normal timeout to make this work on my laptop.

If the IOERROR message appears after only a second or two (or maybe
three), then the timeout may not be long enough for your machine. If
it sits there for a long time (ten seconds or longer) then it's probably
something else.

To see if this is in fact the problem, do the following:

- Bring up /sys/dev/usb/if_kue.c in your favorite editor.
- Find the kue_do_request() function.
- Change the timeout from 50 to 100, i.e. change this:

	usbd_setup_default_xfer(xfer, dev, 0, 50, req,
	    data, UGETW(req->wLength), USBD_SHORT_XFER_OK, 0);

  to this:

	usbd_setup_default_xfer(xfer, dev, 0, 100, req,
	    data, UGETW(req->wLength), USBD_SHORT_XFER_OK, 0);

Then recompile your kernel/module/whatever and try again. (And let
me know what happens, of course.)

-Bill




Re: /dev/sndstat

2000-01-17 Thread Bill Paul

Of all the gin joints in all the towns in all the world, George Cox had 
to walk into mine and say:

> I cvsupped yesterday.

I installed a complete snapshot today.
 
>   extremis /dev # ./MAKEDEV sndstat0
>   expr: non-numeric argument
>   bad node: mknod mixerstat0
> 
> Something's wrong :-)

No, nothing is wrong:

x-ctr# cd /dev
x-ctr# ./MAKEDEV snd0
x-ctr# ls -l /dev/sndstat
crw-rw-rw-  1 root  wheel   30,   6 Jan 17 10:51 /dev/sndstat

/dev/sndstat is created as a consequence of doing MAKEDEV snd0.


-Bill




Re: USB D-Link DSB-650 kue0: failed to load code

2000-01-17 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Eric J. Haug 
had to walk into mine and say:

> 
> >  - Find the kue_do_request() function.
> >  - Change the timeout from 50 to 100, i.e. change this:
> >  
> >  usbd_setup_default_xfer(xfer, dev, 0, 100, req,
> >  data, UGETW(req->wLength), USBD_SHORT_XFER_OK, 0);
> 
> After this change
> Again, the boot message rushes by.
> But later, say about 8 seconds or so
> about the time the system is printing out ppc0 messages
> i get
> panic: removing other than first element.

*sigh*

Whatever. Fortunately, it turns out I had an OHCI controller here
and I was able to duplicate this problem. (It doesn't happen with
my laptop and UHCI controller, otherwise I could have spotted it
before.) I just updated the driver in -current to fix this. The
driver tries to check if the firmware is already running by reading
the MAC address. It turns out that doing this when the firmware is
not running generates a 0 length transfer. This doesn't seem to do
any harm with a UHCI controller, but it makes the OHCI controller
(or its driver) mad. I changed the code to test for the presence
of already running firmware in a different way, and now it seems
to work with my test system.

Make sure to get src/sys/dev/usb/if_kue.c revision 1.16. This version
should fix the problem.

I also noticed that performance with the OHCI controller is significantly
better than with the UHCI controller. Just my rotten luck I'm stuck
with a UHCI one in my laptop.

-Bill




Re: ...(file transfer crashes system) ethernet driver or IP stack bug?

2000-01-20 Thread Bill Paul

Of all the gin joints in all the towns in all the world, arnee had to walk 
into mine and say: 
 
> FreeBSD 4.0-CURRENT-2118-08:18 #4: Tue Jan 18 13:17:49 PST 2000
> root@groovinit:/usr/src/sys/compile/GROOVINIT
> Timecounter "i8254"  frequency 1193182 Hz
> Timecounter "TSC"  frequency 249919443 Hz
> CPU: Cyrix 6x86MX (249.92-MHz 686-class CPU)

Ack...

>   Origin = "CyrixInstead"  Id = 0x601  Stepping = 1  DIR=0x1453
>   Features=0x80a135
> real memory  = 29360128 (28672K bytes)
> avail memory = 25792512 (25188K bytes)
> Preloaded elf kernel "kernel" at 0xc02c9000.
> md0: Malloc disk
> npx0:  on motherboard
> npx0: INT 16 interface
> pcib0:  on motherboard

No, please...

> pci0:  on pcib0
> isab0:  at device 1.0 on pci0
> isa0:  on isab0
> ata-pci0:  port
> 
>0x4000-0x400f,0xa8c42268-0xa8c4226b,0x35d8a0c0-0x35d8a0c7,0xc7cc0-0xc7cc3,0x42140728-0x4214072f
> irq 0 at device 1.1 on pci0
> ata-pci0: Busmastering DMA supported
> ata-pci1:  port
> 0xf080-0xf0bf,0xf7e0-0xf7e3,0xf790-0xf797,0xf7f0-0xf7f3,0xf7a0-0xf7a7 mem
> 0xffac-0xffad irq 11 at device 13.0 on pci0
> ata-pci1: Busmastering DMA supported

Stop, you're killing me...

> ata2 at 0xf7a0 irq 11 on ata-pci1
> dc0: <82c169 PNIC 10/100BaseTX> port 0xf200-0xf2ff mem 0xffaaff00-0xffaa irq 9 at
> device 15.0 on pci0

Aie!

Oh for the love of *god*. Rip this thing out of your machine now. Put it
on the ground. Stomp on it. Repeatedly. Then set it on fire. Bury the
remains in the back yard. Then run - don't walk! - to your local computer
store, put a crowbar in your wallet, and buy a better ethernet card.

Don't whine, don't bitch, don't moan, don't complain that it works with
Windows (actually, I bet it doesn't; not on that hardware, using bus
master DMA for both controllers). Just do it.

You have a Cyrix CPU, and I'm willing to bet you've overclocked it. ("What?
You mean that might have some effect on the situation?") You have a PCI 
chipset which the probe messages don't even identify, and you're using 
not one but *two* bus master devices on it, one of which is a PNIC ethernet
controller that can barely do bus master DMA correctly on a good day. Not 
only that, but you have two ATA controllers in this machine, one of which has 
no devices on it, and you have the damn hard disk on the *second* controller. 
Why do you have two controllers? Why not just use the built-in one?

I'm also missing some information here since I arrived in the middle of
this exchange. You didn't say if you're using 10Mbps or 100Mbps with this
card. And you don't say what it's connected to (hub? switch? vendor?
model?)

Does this have something to do with the PNIC? Oh, probably. But I'm not going
to even attempt to debug this. There's no way in hell I could duplicate this 
hardware configuration locally, and even if I did, I probably couldn't 
duplicate the exact same problem. And even if I could do *that*, I
still wouldn't know how to fix it. I'm sorry, but I've pulled out enough
hair over this damn chip. I'm sure I've wasted at least a couple months
of my life trying to make this thing work reliably. I've added several
software workarounds, I defaulted the stupid transmit threshold setting
to store and forward mode. Well I've had it: no more. I'm through trying
to bitchslap this rotten piece of silicon to its senses. I'm not going to
go nuts every time somebody comes up with yet another oddball hardware
combination. There's a limit to my patience and this chip has reached it.
Now it's somebody else's turn to lose their sanity. I have the datasheet for 
the PNIC at http://www.freebsd.org/~wpaul/PNIC. You have the driver source.
Somebody *else* try and figure it out, and then tell me the answer when
you have it.

That said, if you have overclocked this machine, then un-overclock it *right* 
*now* and never, ever do it again! PCI bus master DMA is goofy enough without
people sticking their fingers in the works.

Man, why does PC hardware have to suck so hard.

-Bill

-- 
=
-Bill Paul(212) 854-6020 | System Manager, Master of Unix-Fu
Work: [EMAIL PROTECTED] | Center for Telecommunications Research
Home:  [EMAIL PROTECTED] | Columbia University, New York City
=
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: Problems with an0 and ISA Aironet Card..

2000-01-27 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Paul Reece had 
to walk into mine and say:

> On Thu, 20 Jan 2000, Bill Paul wrote:
> 
> 
> 
> > Back up. You're leaving out some info.
> > 
> > - When did you buy these cards? (The firmware rev may be an issue.
> >   knowing when you bought the card helps me figure out if your firmware
> >   is newer than mine.)
> 
> Cards were purchased in the past 6 months.  Revision of the card I'm using
> at the moment is 3.13 - I upgraded the firmware to the latest. (Win DGS
> under 'status' reports 3.13).

You can view the firmware rev with the if_an driver (when it works) by
doing ancontrol -i an0 -I. The newest PCMCIA card that I have seems to be 
using revision 3.10. The ISA card that I have is using 2.06.

The trouble is I don't have a Windows machine set up to run the firmware
update utility. What I tried to do today was swap the PCMCIA module on
my existing ISA card with one of the new ones with the later firmware.
I did this a while back when I got our first batch of cards. However, I
can't do it now.

One of the problems I had with the Aironet cards initially is that they
were set up so that they would operate in two modes: if you applied +5volts
to the vpp1 and vpp2 pins on the PCMCIA module, it would work in PCMCIA
mode such that you could get at the CIS data and configure it like any
other PCMCIA card. Without the +5volts, the module would work in a
special 'dumb bus' mode that would allow it to interface with the ISA
and PCI bridge adapter cards that Aironet uses for their ISA and PCI
cards. Basically, this allows them to make just one PCMCIA module and
use it in all three kinds of cards.

However the latest PCMCIA cards that we just got are different: now they
always work in PCMCIA mode regardless of how vpp1 and vpp2 are set. On
the one hand, this is good because it means you don't have to frob
sys/pccard/pccard.c to enable the vpp voltage when the card is inserted.
(My older cards will not work with FreeBSD unless I apply this tweak to
the kernel.) On the other hand, this means that the newer PCMCIA cards
won't work in the ISA and PCI bridge adapters.

This sort of stymied my attempts to duplicate your problem here in the
lab. What would be nice is if you could somehow set up a scratch box
with an Aironet ISA4800 card in it that I could access remotely. I'm
reasonably confident I could make it work if I could just experiment
with it for a while. Unfortunately, this may not be possible depending
on various technical or political constraints, especially since I need
to twiddle around as root in order to examine register contents and
test a new driver.

> pcpaul#   ./testa 
> COMMAND: 0
> PARAM0: ff11
> PARAM0: 0x
> 
> (still no lights on card)
> if I run it again:
> 
> pcpaul#   ./testa
> COMMAND: 0
> PARAM0: 1234
> PARAM0: 0x
> 
> (and still no lights).
> 
> 
> This info help at all?

Well, yes. It tells me two things. First, it tells me that I made a
typo on the program that I gave you. :)

Second, it shows me that the card is at the I/O address that it's
supposed to be, although it appears to not be responding to the
'read SSID list' command that the if_an driver issues during the
probe phase. Unfortunately, as I said earlier, I need to be able to
experiment on this thing in order to figure out the problem, and I
can't do that unless you can somehow arrange remote access.

-Bill




Re: Problems with an0 and ISA Aironet Card..

2000-01-20 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Paul Reece had to 
walk into mine and say:
 
> Having a few problems trying to get an ISA Aironet 4800 card working under
> FreeBSD 4.0-CURRENT.  I did try with 3.4-RELEASE first with the
> appropriate drivers, but had even less luck.
 
> What I'm seeing at boot:

Back up. You're leaving out some info.

- When did you buy these cards? (The firmware rev may be an issue.
  Knowing when you bought the card helps me figure out if your firmware
  is newer than mine.)
- What sort of machine are you using? (Show us the *whole* dmesg output.
  Timing may also be an issue, in which case I need to know the CPU speed.)
 
> first suspect lines:
> 
> isa0: unexpected tag 14
> isa0: unexpected tag 14

I'm not sure if this is related.
 
> then:
> 
> an0: reset failed
> unknown0:  at port 0x100-0x13f irq 5 on isa0
> an0: reset failed
> unknown1:  at port 0x140-0x17f irq 10 on isa0
> 
> 
> (machine has 2 cards in it).  When trying with NON PNP mode, the cards
> also have the same problem.

Tell us what kernel config line you use when using the card in non-PnP
mode. Note that the switches on the card must all be in the correct
position in order to enable PnP mode: consult your user's manual for
the proper settings. I believe they all need to be in the off position,
however I don't have the manual here at home with me so I could be
mistaken. (I do remember they all have to be set the same way.)

>  PCI cards work fine, just not the ISA
> equivalents..
> 
> Anyone have any clues/hints/tips etc?

Not really. My one and only ISA card works fine, or at least it did when
I did my tests right before I imported the driver. It would help if you
could actually look at the card when the kernel boots to see if the LEDs
flash at all. If the reset is screwing up, then you should see the LEDs
flicker when it tries to access the board. If it's failing to access the
board at all, the LEDs won't change at all.

Try commenting out the code in an_reset() (i.e. make it an empty
function that does nothing) and see if it works then. If it *still*
doesn't work, then there's something else wrong. Try to run the
following program as root:

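#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

#include <machine/cpufunc.h>

#define IOADDR  0x100 /* change to 0x140 for other card */

int
main(void)
{
	int f;

	/* Opening /dev/io gives this process access to the I/O ports. */
	f = open("/dev/io", O_RDWR);
	if (f == -1) {
		perror("/dev/io");
		exit(1);
	}

	printf("COMMAND: %x\n", inw(IOADDR));
	printf("PARAM0: %x\n", inw(IOADDR + 0x2));
	/* Write a test pattern to PARAM0 and read it back. */
	outw(IOADDR + 0x2, 0x1234);
	printf("PARAM0: 0x%x\n", inw(IOADDR + 0x2));

	exit(0);
}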

This will print out the command and status registers for the card
at iobase 0x100. If the card has been properly activated, you should
see  for the COMMAND and PARAM0 registers initially, then the
program will try to write 0x1234 to the PARAM0 register and read it
back. If it reads back 0x1234, then the card is configured right
and the reset is screwing up. If on the other hand the program prints
 for all of the register contents, then the card is not really
configured properly for address 0x100.
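
The decision described above can be written down as a small helper. This is only a sketch: classify_probe() and the enum names are made up for illustration and are not part of the if_an driver, and it assumes that a port nobody decodes reads back as all ones (0xffff).

```c
#include <stdint.h>

#define TEST_PATTERN	0x1234	/* value the test program writes to PARAM0 */
#define FLOATING_BUS	0xffff	/* assumption: nothing answers at this port */

enum probe_result {
	PROBE_CARD_OK_RESET_SUSPECT,	/* PARAM0 held the test pattern */
	PROBE_NOT_AT_THIS_ADDRESS,	/* every read came back all ones */
	PROBE_INCONCLUSIVE		/* anything else needs a closer look */
};

/*
 * Interpret the three values printed by the test program:
 * COMMAND, PARAM0 before the write, and PARAM0 after the write.
 */
enum probe_result
classify_probe(uint16_t command, uint16_t param0_before,
    uint16_t param0_after)
{
	/* The write stuck, so the card is decoding this I/O address. */
	if (param0_after == TEST_PATTERN)
		return (PROBE_CARD_OK_RESET_SUSPECT);

	/* Nothing but ones: probably no card at this address. */
	if (command == FLOATING_BUS && param0_before == FLOATING_BUS &&
	    param0_after == FLOATING_BUS)
		return (PROBE_NOT_AT_THIS_ADDRESS);

	return (PROBE_INCONCLUSIVE);
}
```

Feeding it the three values the program prints gives a quick yes/no on whether to keep looking at an_reset() or at the card's address configuration.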

> 
> Cheers.
> 
> 
> Regards,
> Paul.
> 
> (replies to me direct please - not on list)

I'm doing both. Deal with it.

-Bill




Re: ifconfig hang

2000-02-05 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Chuck Robey had to 
walk into mine and say:

> I'm trying to get current up on another test box,

Whose exact CPU type and hardware configuration must be a state secret,
since you didn't describe them here.

Come _on_ people, how often do I have to keep harping on this? Don't just
tell me "I have a box." Tell me about it!

> and this one has a CNET
> AX8814 equipped network card.  One second after I do a ifconfig:
> 
> ifconfig dc0 inet (somaddr) netmask (somemask)
> 
> it hangs.  It does this with a completely static kernel (shouldn't be
> loading any modules), even if I start up in single-user.  My config has:
> 
> device  isa
> device  eisa
> device  pci
> device  miibus  # MII bus support
> device  dc0
> 
> as far as network.  My dmesg on the machine shows what I take to be a
> normal dc0 entry, but something I don't recognize for "amphy0" (I added
> cariage returns 'cause I know my mailer will do a worse job if I don't):
> 
> dc0:  port 0x6100-0x617f
>  mem 0xf0201000-0xf020107f irq 12 at device 19.0 on pci0
> dc0: Ethernet address: 00:80:ad:41:4a:95
> miibus0:  on dc0
> amphy0:  on miibus0
> amphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> 
> Any idea why my hang might be happening?

amphy is the driver for the transceiver on the card with the ASIX ethernet
controller. The ASIX AX88140A doesn't have a built-in transceiver. It's
actually the transceiver (PHY) that does the autonegotiation. I suspect
it's really a Davicom PHY, but the Davicom parts look like they're designed
to duplicate the register layout and operation of certain AMD PHYs, and
they claim to have the same vendor/device ID info.

Anyway. This is almost certainly a hardware problem. You haven't provided
enough evidence for me to suspect it could be anything else (it would have
helped if you had tried compiling the kernel with options DDB and attempted
to break into the debugger to see where it was stuck -- if you actually
did try this and it was wedged so bad that you couldn't break into the
debugger, then you should have said so). The usual suspect in this sort of
thing is some sort of problem with bus master DMA. Maybe you tried to 
overclock this system and got the timings wrong. Maybe the PCI chipset has
bugs. Maybe it doesn't get along well with the ASIX part. Maybe you have
an old machine that doesn't support bus master DMA on all of its slots,
and you put the card in a slave-only slot without realizing it.

As soon as you ifconfig an interface up, the kernel tries to send a gratuitous
ARP through it, which triggers a transmission and a DMA. If there's a 
problem, this DMA operation could wedge the bus. Some of the other cards
need to do a DMA just to program the receive filter (though the ASIX is
not one of these).

I have tested the dc driver with an ASIX card and I'm pretty sure I didn't
do anything recently to goof it up, otherwise somebody else would have
complained by now. (Right guys? Right? Bah.) I would try to scrounge up
an MS-DOG boot floppy and run the diagnostics on the diskette supplied
with the card. If the vendor-supplied diags also wedge the system during
a transmission, then you need to check your hardware.

-Bill




Re: Suggestions for Gigabit cards for -CURRENT

2000-02-03 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Kenneth D. Merry had 
to walk into mine and say: 

> On Wed, Feb 02, 2000 at 13:03:09 -0500, Thomas Stromberg wrote:
> > We're currently looking at upgrading several of our FreeBSD servers
> > (dual PIII-600's, 66MHz PCI) and some Sun Ultra's to Gigabit Ethernet.
> > We plan to hook these machines into our Cisco Catalyst 5000 server. They
> > will most likely move to be running FreeBSD 4.x by the time that we
> > actually get our budget approved. What experiences do you guys have with
> > the cards?
> > 
> > Currently we're looking at the ~$1000 range,  specifically at Alteon
> > 512k's ($1000) for the FreeBSD servers and Sun Gigabit 2.0's ($2000) for
> > the Sun servers. I was interested in the Myrinet cards (for obvious
> > reasons), but they appear to require a Myrinet switch (though I found
> > myself slightly confused so I may be wrong) rather then being able to
> > hook into our Catalyst 5000. The Intel PRO/1000 Gigabit cards look
> > rather nice too, but I haven't seen drivers yet for FreeBSD (Linux yes).
> > 
> > I'm pretty much purchasing on marketing and reputation rather then any
> > experience here, so any help would be much appreciated. 
> 
> I would recommend getting Alteon boards.  It is likely that the Sun boards
> are Alteon OEM, although I'm not positive.

I think the first gigabit cards Sun had on the market were OEMed from Alteon,
but I've been told that their newer cards are something else entirely. I don't
know exactly what, but they're not Tigon-based.
 
> One thing to keep in mind is that both Netgear and 3Com are OEMing Alteon
> boards, and you'll get them much cheaper that way.  The boards are pretty
> much identical to the Alteon branded boards (which have no identifying
> marks on them).  The performance is the same, at least for the Netgear
> boards.  (I don't have any 3Com boards.)

There are a number of companies selling OEM'ed Alteon boards for various
prices. IBM sells two cards, one for PC-based hardware and one for RS/6000s
which I think are basically the same hardware with different driver kits.
Of course, the RS/6000 card is $2100 while the PC-based one is probably
around $600 or so. My guess is they're Alteon cards with different PCI device
IDs, but I can't confirm this as I don't have one. The SGI gigabit adapter,
NEC gigabit adapter, DEC EtherWORKS/1000, 3Com 3c985 and 3c985B, and
the Netgear GA620 are all Tigon boards (not to mention the Alteon ACEnic)
and should all work fine with the ti driver.

Oh, I found another one recently: Farallon also sells a gigabit PCI NIC
for the Mac which is Tigon-based.

> The Netgear GA620 is a 512K Tigon 2 board, and generally goes for around
> $300 or so.  The 3Com boards have 1MB of SRAM, but I'm not sure whether
> they're Tigon 1 or Tigon 2.  You really want a Tigon 2 board.  Maybe
> someone who has one can comment.

The original 3Com 3c985 was a Tigon 1 board (I have one) and the 3c985B is
a Tigon 2. The Tigon 1 is no longer in production, though of course I try
to maintain support for it for those people who still have them. The Tigon 1
had only a single R4000 CPU in it while the Tigon 2 has two.

The Netgear GA620 is by far the cheapest at about $320. The various OEM
cards sold for the PC are usually around $600, give or take $100. The GA620
only has 512K of SRAM compared to 1MB on most of the others, however you're
not likely to notice a problem with that unless you try to push the card
really hard with a really big TCP window size and jumbo frames.
 
> The Intel cards may look nice, and there is a FreeBSD driver for them, but
> I wouldn't get one.  The first problem with the Intel boards is that there
> are no docs for them.  Supposedly they're using a Cisco chip, and the specs
> for the chip are top secret.

This is why I don't buy or recommend Intel NICs. But that's just my
personal bias.
 
> The FreeBSD driver (written by Matt Jacob) is based on the Linux driver,
> which Intel wrote, and he hasn't yet managed to get decent throughput
> through the cards.  (Maybe Matt will comment.)  They also only have 64K of
> memory on board, which is insufficient for a heavily loaded server, IMO.
> 
> Even with the 512K Alteon boards, you have a minimum of about 200K, and
> probably more like 300K of cache for transmit and receive.

The Alteon cards also need a certain amount of SRAM to run the firmware.
 
> The Intel boards also don't have the features necessary to really support
> zero copy TCP receive.
> 
> The Alteon boards, on the other hand, have most of the features necessary,
> and if I get some time, I may add the last feature (header splitting) t

Re: Suggestions for Gigabit cards for -CURRENT

2000-02-03 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Kenneth D. Merry had 
to walk into mine and say:

> [ Thanks for the info Bill! ]

No problemo.

[...]

> > Both the Alteon and SysKonnect NICs are 64-bit PCI cards. (Actually, I'm
> > pretty sure all of the PCI gigabit NICs are 64-bit.) Both kinds of cards
> > can do jumbograms on FreeBSD. Also, both vendors have released pretty good
> > hardware documentation, which makes them good choices for custom applications,
> > if you're into that sort of thing.
> 
> Alteon also provides firmware source, which can really come in handy.  Do
> you know if SysKonnect has released firmware?

The SysKonnect GEnesis controller and the XaQti XMAC II chips are both static
devices and do not require firmware. If you go to www.syskonnect.com and
search their online knowledge base for the word "manual" you should be
able to find the gigabit NIC programmer's manual. Similarly, XaQti has
the full datasheet for the XMAC II at www.xaqti.com somewhere. (As I recall,
you have to go through a brief registration procedure to get it, but once
that's done you should be able to download it right away.)

Talking of the XMAC II, there's one other thing I forgot to mention earlier.
The FreeBSD sk driver does jumbo frames, but the SysKonnect drivers don't.
At least, not yet. The XMAC II's receive FIFO is 8K. By default, the chip
operates in 'store and forward' mode in order to perform error checking on
received frames (it has to get the entire frame in the FIFO in order to
do a CRC on it, I think). This is fine for normal frames, but if you want
to handle jumbograms larger than 8192 bytes, you have to put the chip into
'streaming' mode, otherwise any frame larger than 8192 bytes will be truncated.
To get 'streaming' mode to work, you have to disable all of the RX error
checking.
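
The trade-off above boils down to one comparison against the FIFO size. Here is a minimal sketch of that choice. The struct and function names are invented for illustration and are not taken from the sk driver; the only number used is the 8K receive FIFO size mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XMAC_RX_FIFO_BYTES	8192	/* receive FIFO size quoted above */

/* Hypothetical description of how the receiver would be set up. */
struct rx_config {
	bool streaming;		/* false means store and forward */
	bool rx_error_check;	/* needs the whole frame in the FIFO */
};

/*
 * Pick a receive mode for the largest frame we expect to see.
 * Frames that fit in the FIFO can use store and forward with error
 * checking; anything bigger forces streaming mode with the RX error
 * checks turned off.
 */
struct rx_config
choose_rx_config(size_t max_frame_bytes)
{
	struct rx_config cfg;

	assert(max_frame_bytes > 0);

	cfg.streaming = max_frame_bytes > XMAC_RX_FIFO_BYTES;
	cfg.rx_error_check = !cfg.streaming;
	return (cfg);
}
```

With a normal 1518 byte frame this selects store and forward; with a 9018 byte jumbogram it selects streaming and gives up the error checking.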

Also, the default TX FIFO threshold on the XMAC is very small (8 bytes, I
think). The FreeBSD sk driver bumps this up a bit (to 512 bytes, if I
remember correctly). This is to deal with the case where you have a dual
port card and are pumping data through both XMAC chips at once: with the
default FIFO threshold, I would often see TX FIFO underruns from one of
the XMACs and performance on that port would get spotty. I think the total
TX FIFO memory on the XMAC II is 2K.

-Bill




Re: Suggestions for Gigabit cards for -CURRENT

2000-02-04 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Kenneth D. 
Merry had to walk into mine and say:

> > Talking of the XMAC II, there's one other thing I forgot to mention earlier.
> > The FreeBSD sk driver does jumbo frames, but the SysKonnect drivers don't.
> > At least, not yet. The XMAC II's receive FIFO is 8K. By default, the chip
> > operates in 'store and forward' mode in order to perform error checking on
> > received frames (it has to get the entire frame in the FIFO in order to
> > do a CRC on it, I think). This is fine for normal frames, but if you want
> > to handle jumbograms larger than 8192 bytes, you have to put the chip into
> > 'streaming' mode, otherwise any frame larger than 8192 bytes will be truncated.
> > To get 'streaming' mode to work, you have to disable all of the RX error
> > checking.
> 
> That is unfortunate, since it means you can't do checksum offloading with
> jumbo frames.

Uhm. I'm not sure about that. The 8K FIFO limitation is in the XMAC II,
not in the GEnesis controller. And I believe it's the GEnesis that actually 
does the hardware checksumming stuff.

Oh, and the XMAC appears to have a 4K TX FIFO, not 2K. My mistake.

> FWIW, of the three gigabit ethernet implementations I've seen anything of
> (Alteon, Intel, SysKonnect), none have implemented all of the hooks
> necessary for a seamless zero copy receive implementation.
> 
> Alteon comes the closest, but they don't support splitting out the headers
> (yet), which is a requirement for us.  The only way to do zero copy receive
> with our VM architecture (that I know of) is page flipping, i.e. receive
> the page in the kernel, and then trade it for the user's page.  You can't
> do it on anything less than page-sized granularity, and things have to be
> page aligned.  (The IO-Lite stuff from Rice is an exception to all this.)
> 
> The nice thing about the Alteon boards, though, is that you can modify the
> firmware, and so header splitting is an option there.  It would even be
> possible to split the headers off of IPv6 packets, or any other protocol
> that you have knowlege of.

If you can actually modify the firmware to do this then you have a lot
more guru points than I do. :) I've looked at the Alteon firmware code
but it's all quite opaque to me.

-Bill




Re: Monday strikes again

1999-08-23 Thread Bill Paul

Okay, further investigation shows that configure() has the following
code:

#if NPNP > 0
	/* Activate PNP. If no drivers are found, let ISA probe them.. */
	pnp_configure();
#endif

	/*
	 * Explicitly probe and attach ISA last.  The isa bus saves
	 * it's device node at attach time for us here.
	 */
	if (isa_bus_device)
		bus_generic_attach(isa_bus_device);

However isa_bus_device is still NULL so we never get any ISA devices
attached. No ISA devices means no console (the VGA card and serial ports
are both ISA devices), so we explode.

Since the ISA bus in this machine is on-board instead of being hung
off a PCI to ISA bridge, I suspect that somebody broke the handling
of on-board ISA buses.

Thank you very much, may I have another.

-Bill




Monday part II: The Terror Continues

1999-08-24 Thread Bill Paul

So today my ISA bus is detected properly and the kernel gets as far as
trying to launch /stand/sysinstall, but then, just when I thought it was
safe to try and load a new snapshot:

rootfs is 2880 Kbyte compiled in MFS
spec_getpages: I/O read failure: (error code=0) bp 0xc34fc3a0 vp 0xc7ed8ec0
   size: 0, resid: 0, a_count: 49152, valid: 0x0
   nread: 0, reqpage: 0, pindex: 0, pcount: 12
exec /stand/sysinstall: error 5
init: not found
panic: no init

This has nothing to do with the 486 though; I tried it on a laptop
that was handy and it blew up the same way. I tried yesterday's mfsroot
image and it doesn't work with that either.

The August 16th snapshot's kernel and mfsroot images seem to work.
The August 16th snapshot's kernel and yesterday's mfsroot image also
works.

This is the third unusable snap in a row that I've had the misfortune
to encounter. I'm starting to think this is more than a coincidence. Did 
somebody launch a "Piss Bill Off" contest when I wasn't looking or 
something? If so, let me stress that you really don't want to find out
what first prize is.

-Bill




Heads up: 3Com XL driver converted to miibus

1999-08-29 Thread Bill Paul

I just committed changes to if_xl to make it use the miibus support
instead of its own MII support code (which I ripped out with much gusto).
I've tested this with a 3c905, 3c905B, 3cSOHO100-TX Office Connect and a 
3c905C and it works fine for me, however Murphy's Law dictates that I may 
have goofed something up without realizing it. Non-MII cards should still 
work as before, however if they don't please let me know quickly.

Also, I'm not 100% sure about the 3c905B-COMBO. The 10/100 support ought
to work fine just as with the other cards, however I can't be certain
about the BNC and AUI ports. If you've got one of these, please test it.
You should see the 10base5 and 10base2 media types available on the
interface when you do ifconfig xl0 and you should be able to turn them
on, as well as switch back to the RJ45 port.

Assuming I haven't broken something horribly, the major benefit of
this is that the xl driver code should be smaller than
before, and the 'wait until autoneg completes' delay at boot is now
gone. Lastly, the link state should now be properly reflected by
ifconfig (active or no carrier).

-Bill




Re: kernel build fail- /pci/if_xl.c:133: miibus_if.h: No such file or directory

1999-08-30 Thread Bill Paul

Of all the gin joints in all the towns in all the world, FreeBSD mailing 
list had to walk into mine and say:

> Subject says it all.

No, the subject does not say it all: the subject says nothing about how
you forgot to update your kernel config file to include:

controller miibus0

The subject also fails to mention that you didn't go back and read
previous postings on this list, especially the one where I said that I
had converted the xl driver to use miibus.

Of course, nowadays you don't even need to include the xl driver in your
kernel. You can just do:

kldload mii
kldload xl

Or you can include the following in /boot/loader.conf and reboot:

mii_load="YES"
xl_load="YES"

-Bill




Re: Problems with the latest changes to ifconfig (I guess) -> Bad

1999-08-31 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Peter Jeremy 
had to walk into mine and say:

> Ian Whalley <[EMAIL PROTECTED]> wrote:
> >My card is identified as <3Com 3c905B-TX Fast Etherlink XL>.
> 
> FWIW, I'm running a kernel about 30 hours old with a <3Com 3c905-TX
> Fast Etherlink XL> and I'm not seeing this problem.
> 
> At a quick quess, something in the miibus support broke the 3C905B
> support.

Not quite.

The original 3c905-TX NIC used an external NatSemi PHY chip which
was mapped to MII address 24. The 3c905B uses an internal transceiver,
which is also mapped to MII address 24 for compatibility purposes.
However, there are several different 3c905B ASIC revisions and
at least one of them, for some peculiar reason, maps the transceiver
to *all* MII addresses (0 through 31). Technically this isn't a
big problem since if you always assume that the PHY is at address
24 (which I'm sure is what 3Com's drivers do) you'll never notice
the difference. But you have to watch out for it.

The old code in if_xl.c would probe for PHYs and stop the moment
it encountered the first one, which would work fine: it would stop at 
address 0 for the broken ASIC and 24 for the working ones. But the miibus 
code probes at all addresses because there are some NICs that actually 
have more than one transceiver. But with the buggy 3Com ASIC, we end up 
incorrectly trying to map the same PHY several times over, which the 
xlphy driver doesn't like, so the probe fails, the miibus attach fails, 
and bad things happen later.

I just committed a patch to -current to deal with this: the
xl_miibus_readreg() and xl_miibus_writereg() routines will now
only return values at MII address 24. This will make the buggy
ASIC appear to work correctly so that only one PHY instance will
be detected.
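
To illustrate the idea, here is a rough user-space simulation rather than the actual if_xl.c code. The function names, the canned status value 0x7809 and the "0 or 0xffff means nobody home" probe test are all assumptions made for this sketch; only the 32 MII addresses and address 24 come from the description above. It covers the read side only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MII_NPHY	32	/* MII addresses 0 through 31 */
#define XL_PHY_ADDR	24	/* where the transceiver is expected */
#define MII_BMSR	0x01	/* standard MII status register */

/* Stand-in for the buggy ASIC: same answer at every MII address. */
uint16_t
buggy_asic_readreg(int phy, int reg)
{
	(void)phy;
	return ((reg == MII_BMSR) ? 0x7809 : 0);
}

/* Filtered accessor: pretend nothing exists except at address 24. */
uint16_t
filtered_readreg(int phy, int reg)
{
	if (phy != XL_PHY_ADDR)
		return (0);
	return (buggy_asic_readreg(phy, reg));
}

/* Probe every address and count the ones that look like a live PHY. */
int
count_phys(uint16_t (*readreg)(int, int))
{
	int phy, found = 0;

	assert(readreg != NULL);

	for (phy = 0; phy < MII_NPHY; phy++) {
		uint16_t bmsr = readreg(phy, MII_BMSR);

		/* Simplified test: 0 or all ones means no PHY here. */
		if (bmsr != 0 && bmsr != 0xffff)
			found++;
	}
	return (found);
}
```

Probing through buggy_asic_readreg() finds a PHY at all 32 addresses, while probing through filtered_readreg() finds exactly one, which is the behavior the bus-wide probe needs.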

Why didn't I catch this earlier? Well, the 3c905B NIC that I tested 
happens to work correctly. So did the 3c905C that I tried after it.
In fact, I think the only place I encountered the buggy ASIC locally
is with the embedded 3c905B NIC in some of the Dell machines in the
lab, which aren't currently running FreeBSD.

Don't you just love hardware programming?

-Bill




HEADS UP: 3Com xl driver changes

1999-09-19 Thread Bill Paul

I've just committed some changes to the xl driver to implement an
alternate and hopefully more efficient transmit queueing scheme for
3c90xB adapters. The changes are based on the mechanism used in the
3Com 3c90xB driver for Linux which was just recently released (I
didn't use any of the actual 3Com driver code though since it's GPLed).

The changes involve using transmit descriptor polling; the NIC is
set up to poll the transmit descriptor ring in order to determine
when new packets are ready for transmission. The advantage to this is
that the host doesn't have to perform any register accesses to the
NIC during transmissions: it basically just queues the packets into
the transmit ring and the card takes care of the rest. By contrast,
the old mechanism requires several register accesses in order to
stall the TX engine, queue new packets, update the transmit list
pointer register, and then unstall the TX engine again.
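
The difference between the two schemes can be shown with a toy model. This is a simplified sketch, not the xl driver or the 3c90xB descriptor layout: the ring structure, the ready flag and the register access counter are all invented here purely to count how often the host would have to touch the NIC.

```c
#include <assert.h>
#include <stddef.h>

#define TX_RING_SIZE	4

struct tx_desc {
	const void *buf;
	size_t len;
	volatile int ready;	/* host sets to 1, "NIC" clears when sent */
};

struct tx_ring {
	struct tx_desc desc[TX_RING_SIZE];
	unsigned prod;		/* next slot the host will fill */
	unsigned cons;		/* next slot the NIC will look at */
	unsigned reg_accesses;	/* simulated host register accesses */
};

/* Old style: stall, update the list pointer, unstall. */
int
tx_enqueue_stall(struct tx_ring *r, const void *buf, size_t len)
{
	struct tx_desc *d;

	assert(r != NULL);
	d = &r->desc[r->prod % TX_RING_SIZE];
	if (d->ready)
		return (-1);	/* ring full */
	r->reg_accesses++;	/* stall the TX engine */
	d->buf = buf;
	d->len = len;
	d->ready = 1;
	r->reg_accesses++;	/* write the transmit list pointer */
	r->reg_accesses++;	/* unstall the TX engine */
	r->prod++;
	return (0);
}

/* Polling style: the host only writes to memory. */
int
tx_enqueue_poll(struct tx_ring *r, const void *buf, size_t len)
{
	struct tx_desc *d;

	assert(r != NULL);
	d = &r->desc[r->prod % TX_RING_SIZE];
	if (d->ready)
		return (-1);	/* ring full */
	d->buf = buf;
	d->len = len;
	d->ready = 1;		/* publish the descriptor last */
	r->prod++;
	return (0);
}

/* NIC side: walk the ring and "send" everything marked ready. */
unsigned
nic_poll(struct tx_ring *r)
{
	unsigned sent = 0;

	assert(r != NULL);
	while (r->desc[r->cons % TX_RING_SIZE].ready) {
		r->desc[r->cons % TX_RING_SIZE].ready = 0;
		r->cons++;
		sent++;
	}
	return (sent);
}
```

In this model, queueing three packets with tx_enqueue_poll() costs zero register accesses, while two packets through tx_enqueue_stall() cost six.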

I've had good results with these changes here, however Murphy's Law
applies, which means somebody somewhere will probably stumble over
a bug that I missed. If people find that things are horribly broken,
I'm prepared to back everything out.

The cards for which these changes will have an effect are:

- 3c900B-TPO
- 3c900B-TPC
- 3c900B-COMBO
- 3c900B-FL
- 3c905B-TX
- 3c905B-FX
- 3cSOHO100-TX
- 3c905C-TX
- 3c980-TX
- 3c980B-TX
- 3c980C-TX

If you have one of these cards, you might want to try the following:
run systat -iostat 1 and generate some heavy traffic on the xl interface
both before upgrading the driver and after. Ideally, CPU and interrupt
load should be lower after the driver has been updated. At the very
least, there should not be any problems with packet transmission that
weren't present before.

If somebody notices a problem, please let me know immediately. For
that matter, if you notice an improvement, let me know as well.

-Bill




Re: Page fault with ethernet xl0

1999-09-20 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Stephan van 
Beerschoten had to walk into mine and say:

> I have experienced something nasty. After cvs'ing my tree 3 hours
> ago which would be approx 16:00 CET, I did a buildworld, installed
> it, compiled a new kernel.

Dammit. You didn't even tell me what kind of card you have. Do you
really need me to ask you for this? Go back and boot the kernel in 
verbose mode and show me *EXACTLY* what it says.

-Bill




Beneath The Planet Of The Mondays

1999-09-25 Thread Bill Paul

Uh, hello? Hi? Is this thing on? *tap* *tap* *squeal!* Oops,
sorry. Listen, does anybody have any idea why there hasn't been a
-current snapshot on current.freebsd.org since September 18th?
Anyone? People do realize that make release is not the same as
make buildworld, and are testing the former at least once in a while,
right?

I realize this is -current and all and mistakes happen, but make
release basically constitutes a 'full build' of FreeBSD and if it
doesn't work, especially for a whole week, it looks kinda bad.
Unfortunately, when the snapshots on current.freebsd.org fall over,
nobody knows exactly what causes the problem except Jordan, and he
keeps gnawing through his limbs to escape the traps I set for him.
I wish the build logs were online somewhere (hint hint).

When somebody breaks the build at M$, the offender is made to wear
a viking hat. I would suggest a similar punishment for those who
break our own build, except that I suspect many of you are wearing
viking hats already.

-Bill
 



Re: Is the wb driver broken?

1999-09-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, John Polstra 
had to walk into mine and say:

> > Is there any reason why you're not letting it autodetect (which is
> > what it does by default, or with media autoselect)?
> 
> I inherited an /etc/rc.conf file that somebody else set up.  I'll
> try letting it autodetect and see what happens.
> 
> But ... after I sent off that last mail, it acquired the carrier
> after a small delay.  I don't know why it didn't see it at first.

It could be that either the PHY or the hub port takes a second to
get its brains together.
 
> > Make sure it's plugged in, make sure the link light is lit.
> 
> Er, I happen to be 1000 miles away from the link light.  Maybe if I
> squint real hard ... :-)

You know, I have a rule about not doing software upgrades on
equipment that I can't actually get my hands on.
 
> Thanks for the help.  I seem to be over the hump on the wb interface
> anyway.

I trust this means it's actually passing traffic.

-Bill




Re: Is the wb driver broken?

1999-09-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, John Polstra 
had to walk into mine and say:

> I've run into a problem with the wb0 interface (Winbond) on a
> machine running -current from yesterday.  (That's before any of
> the segset_t changes went in.)  Unfortunately, the machine is
> cvsup-master.freebsd.org, which makes this pretty urgent.

I converted the wb driver to miibus ages ago. Your description makes
it sound like the problem just magically appeared yesterday. That's
a no-no, m'kay?
 
> When I try to ifconfig the device, I get "ifconfig: SIOCSIFMEDIA:
> Device not configured":
> 
> cvsup-master# ifconfig wb0 inet 204.216.27.25  netmask 255.255.255.240 
>  media 100baseTX mediaopt half-duplex
> ifconfig: SIOCSIFMEDIA: Device not configured

You don't need to explicitly specify mediaopt half-duplex anymore.
Specifying media 100baseTX without mediaopt full-duplex implies
half-duplex. Leave off the mediaopt half-duplex part and it will work.


-Bill




Re: Is the wb driver broken?

1999-09-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, John Polstra 
had to walk into mine and say:
 
> >> cvsup-master# ifconfig wb0 inet 204.216.27.25  netmask 255.255.255.240 
> >>  media 100baseTX mediaopt half-duplex
> >> ifconfig: SIOCSIFMEDIA: Device not configured
> > 
> > You don't need to explicitly specify mediaopt half-duplex anymore.
> > Specifying media 100baseTX without mediaopt full-duplex implies
> > half-duplex. Leave off the mediaopt half-duplex part and it will work.
> 
> OK, I did that and it made the SIOCSIFMEDIA message go away.  But
> now it's not showing carrier:
> 
> Doing initial network setup: hostname domain.
> wb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> inet 204.216.27.25 netmask 0xfffffff0 broadcast 204.216.27.31
> ether 00:00:e8:18:5b:1d 
> media: 100baseTX status: no carrier
> supported media: autoselect 100baseTX <full-duplex> 100baseTX 10baseT/UTP <full-duplex>
> 10baseT/UTP none
> 
> Any other ideas?

Is there any reason why you're not letting it autodetect (which is
what it does by default, or with media autoselect)? Make sure it's
plugged in, make sure the link light is lit. Try to ping somebody
on the network (or run tcpdump on the interface). You can't just
sit there and look at it: you have to experiment.

-Bill




Re: Issues with xl0

1999-10-04 Thread Bill Paul

WARNING: the following reply contains Extreme Ranting which may be too
intense for young audiences. Those not wishing to experience Extreme 
Ranting should #define NO_EXTREME_RANTING.

Of all the gin joints in all the towns in all the world, Bryan Bursey 
had to walk into mine and say:

> I attempted to move from -STABLE to -CURRENT last night, but without any
> luck.  I decided to start with a current snapshot (19990928), but was
> unable to install using the floppies provided on releng3.freebsd.org.

#ifndef NO_EXTREME_RANTING
But the exact reason why you were unable to install is a secret, right?
Clearly the details of the failure would be of no use to anyone, so you
chose not to share them, yes?

How many times do I have to say it: "it didn't work," "it failed,"
"I couldn't make it do blah" and similar vague descriptions don't help
anybody. Don't start in with a vague statement about a problem and then
expect to be asked for more details later: give the details first! It
saves a lot of time!
#endif
 
> Thinking it might have been a floppy issue,

#ifndef NO_EXTREME_RANTING
It may as well have been an arthritis issue for all we know.
#endif

> I used my 3.3-STABLE floppies
> and simply changed the install options so that I'd get 4.0.  This worked
> until I restarted my machine at the end of the install.  It came back up
> ok, but I was again unable to connect to the network.

This is too vague. You're leaving out a ton of details, like: did you
even see the xl0 probe messages when the kernel booted? You know, basic stuff
that nobody else will know since we're not able to see over your shoulder.
 
> Can anyone tell me if there are known issues with the xl0 driver in 4.0,
> or if it has been superseded by another driver which works with 3Com
> 3C900B.

#ifndef NO_EXTREME_RANTING
No no no. *You* tell *us* if there are any issues! *You* tell *us* if
you're having any problems! And then *you* tell *us* in explicit detail
what they are! How hard is it to understand that!

No, there isn't any other driver. But because you didn't make even the
slightest effort to explain your problem, I can't even begin to help you.
#endif

You didn't specify which 3c900B card you have: there are several
of them with different media options:

- 3c900B-FL 10baseFL fiber-optic
- 3c900B-TPO 10baseT "Twisted Pair Only"
- 3c900B-TPC 10baseT and 10base2 "Twisted Pair and Coax"
- 3c900B-COMBO 10baseT, 10base2 and 10base5 (AUI)

#ifndef NO_EXTREME_RANTING
If you'd bothered to watch what happens when the kernel boots, you would
have been able to tell whether or not the 3c900B card was detected (and
I know it was in spite of your unwillingness to say so). You would also
have been able to tell what media was selected (10baseT, 10base5 or
10base2, depending a bit on exactly which model card you have, which you
also didn't tell us). Then had you bothered to rub two brain cells
together, you might have been able to tell if maybe the default media
selection read from the EEPROM was incorrect and possibly tried to use 
ifconfig to set it correctly.
#endif

If you have a TPO or FL adapter, then there's only one media choice, and
the driver should have selected it properly. If you have a TPC or a COMBO
adapter, then somebody may have fiddled with the 3C90XCFG.EXE utility and
selected the wrong default media in the EEPROM. The driver will only use
what the EEPROM says; it doesn't autoprobe. If you used the 3C90XCFG.EXE
utility to select the "auto" choice, then the driver will pick a reasonable
default and expect you to be clever enough to change it with ifconfig if
the choice is wrong. For example, for a COMBO card, it will choose 10baseT.
If you don't like 10baseT, you can do the following:

# ifconfig xl0 media 10base2/BNC
# ifconfig xl0 media 10base5/AUI

Or if you really want 10baseT:

# ifconfig xl0 media 10baseT/UTP

If you want to use this setting during the install, then enter the media
option command in the box that says "extra options to ifconfig" in the
TCP/IP configuration screen (i.e. "media 10base2/BNC").
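
The selection rule described above can be sketched like this. It is an
illustration under assumptions, not the real xl code: the enum names and
both functions are hypothetical, and the fallback for anything other than
the COMBO card is a guess (the text only says the COMBO gets 10baseT).

```c
#include <assert.h>
#include <string.h>

enum xl_model { XL_TPO, XL_TPC, XL_COMBO, XL_FL };
enum xl_media { XL_MEDIA_AUTO, XL_MEDIA_10BASET, XL_MEDIA_10BASE2,
                XL_MEDIA_10BASE5, XL_MEDIA_10BASEFL };

/*
 * The EEPROM default is trusted as-is; nothing is probed. Only the
 * "auto" setting makes the driver pick something itself.
 */
static enum xl_media pick_media(enum xl_model model, enum xl_media eeprom)
{
    assert(model >= XL_TPO && model <= XL_FL);
    if (eeprom != XL_MEDIA_AUTO)
        return eeprom;
    /* Assumed fallback: fiber for the FL card, 10baseT otherwise. */
    return (model == XL_FL) ? XL_MEDIA_10BASEFL : XL_MEDIA_10BASET;
}

/* ifconfig media keywords, limited to the ones quoted in the post. */
static const char *ifconfig_keyword(enum xl_media media)
{
    switch (media) {
    case XL_MEDIA_10BASET: return "10baseT/UTP";
    case XL_MEDIA_10BASE2: return "10base2/BNC";
    case XL_MEDIA_10BASE5: return "10base5/AUI";
    default:               return NULL;   /* not covered here */
    }
}
```

With a COMBO card and "auto" in the EEPROM, pick_media() returns 10baseT;
with a wrong default such as 10base2 stored in the EEPROM it returns that
value unchanged, which is the case where ifconfig is needed to override it.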

> Thanks for any help (or other random thoughts).

#ifndef NO_EXTREME_RANTING
You want random thoughts? Fine: I wish it would stop raining, I hope
the Mets make it to the playoffs, my shoes are too tight, you're ugly
and your mother dresses you funny.

I know what you're thinking: "why is he being so nasty?" Because I can't
stand it when people expect me to play the "minimum information" game,
and you are by no means the first. Some people may be able to read a 
chewing gum wrapper and divine the secrets of the universe, but I'm not 
one of them.
#endif

-Bill


Still waiting for xl driver reports

1999-10-10 Thread Bill Paul

A while back I posted a message here saying that I'd changed the xl driver
a bit to hopefully improve performance for 3c90xB and later adapters (i.e.
the "cyclone," "hurricane" and "tornado" chipsets). I asked for people
to report if the changes helped, hurt, made no difference or were totally 
broken.

So far not one person has said so much as a word to me on this subject.

I need feedback from people so that I know it's safe to merge this
stuff into -stable, so let's hear it already. It's been several weeks
since I made the changes. Surely there are people running -current
with 3Com 3c90xB cards.

To reiterate, this only concerns people with the following adapters:

- 3c905B-TX 10/100
- 3c905B-FX/SC fiber optic
- 3c905B-COMBO 10/100 plus BNC and AUI
- 3c905C-TX 10/100
- 3c980-TX server adapter
- 3c980B-TX server adapter
- 3c980C-TX server adapter
- 3cSOHO100-TX 10/100

To a lesser extent it also concerns people with these adapters (these
are 10Mbps only so the change isn't likely to be as noticeable):

- 3c900B-TPO twisted pair only
- 3c900B-TPC twisted pair and coax (BNC)
- 3c900B-COMBO twisted pair, AUI and BNC
- 3c900B-FL fiber optic

Ideally, the changes should provide slightly faster performance with
less CPU usage. Performance/CPU overhead comparisons with other cards
(in the same machine!) would be helpful as well as comparisons with
the same 3Com card using the older driver revision.

Things that would not be helpful include:

- Asking about an unrelated problem from 3.2-RELEASE, 3.3-RELEASE or
  3.3-STABLE.

- Telling me that your card isn't detected properly and not realizing
  that you have "plug and play OS" set to "yes" in your BIOS config.

- Asking about a completely different card. From a completely different
  manufacturer.

- Saying that you'll be happy to run some tests "as soon as you find
  some time." If you couldn't find the time by now, you never will.

- Giving me an excuse for not sending me any feedback earlier. I
  don't care if your dog got run over, your house was invaded by
  giant ants, your entire family contracted the bubonic plague or
  aliens stole your computer.

- Asking me how to set up a 3Com card in Linux. (Comparing the xl
  driver's performance with the Linux 3Com driver is acceptable,
  provided you run the comparison on the same hardware. Comparing
  a PIII 600MHz host running Linux to a PII 300MHz host running
  FreeBSD is not a fair comparison. Unless FreeBSD ends up being
  faster. :)


-Bill




Texas Chainsaw Monday

1999-10-20 Thread Bill Paul

Doing nightly build attempt for 4.0-19991020-CURRENT at Wed Oct 20 02:06:54 CDT 1999
Updating source tree...
? release.out
Making release...
Release build of 4.0-19991020-CURRENT was an abject failure.
[...]
===> sbin/tunefs
install -c -s -o root -g wheel -m 555   tunefs /vol2/release/sbin
===> sbin/umount
install -c -s -o root -g wheel -m 555   umount /vol2/release/sbin
===> sbin/vinum
install -c -s -o root -g wheel -m 555   vinum /vol2/release/sbin
===> sbin/kget
install -c -s -o root -g wheel -m 555   kget /vol2/release/sbin
===> sbin/mount_nwfs
install -c -s -o root -g wheel -m 555   mount_nwfs /vol2/release/sbin
install: mount_nwfs: No such file or directory
*** Error code 71
[...]

Can somebody please explain this to me? The fact that mount_nwfs doesn't
exist seems to indicate that compiling mount_nwfs failed. Yet if compiling
mount_nwfs failed, why didn't it stop at the compilation failure?

I suspect the answer has to do with some sort of obj directory problem,
but it's impossible to tell that based on the build report.

-Bill




Re: Intel PRO/1000 Gigabit driver

1999-10-21 Thread Bill Paul

Of all the gin joints in all the towns in all the world, John Reynolds~ 
had to walk into mine and say:
 
> A friend just passed this along:
> 
>   http://www.newsalert.com/bin/story?StoryId=Coa6pWbKbyte0mtu
> 
> Intel PRO/1000 Gigabit support for Linux. Source code too (non-GPL'ed very
> much like a BSD-ish license).

Just in case anyone is wondering, I refuse to create BSD drivers based
solely on information from Linux drivers. I don't want any damn Linux
source: I want the programming manual used to create the driver in
the first place. Why? If Intel is willing to release unencumbered 
Linux source, then they should be willing to release the manual as well.
If they're not willing, then I don't want anything to do with them.

> I don't have the technicals to understand how hard it would be to port, but
> the code is there for those who do! :)

You work for Intel yet claim technical ignorance? I dunno man... :)

-Bill




Re: Texas Chainsaw Monday

1999-10-22 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Boris Popov had 
to walk into mine and say:

> On Wed, 20 Oct 1999, Bill Paul wrote:
> 
> > install -c -s -o root -g wheel -m 555   mount_nwfs /vol2/release/sbin
> > install: mount_nwfs: No such file or directory
> 
>   Ok, it seems that I found why mount_nwfs failed to build: I used
> 'install' instead of ${INSTALL} in the libncp.

Unfortunately, this has not fixed the problem: the build report for
today (Oct 22) shows the same error.

*sigh*

-Bill





gdb slain in drive-by commit, film at 11

1999-01-16 Thread Bill Paul

Gdb has stopped working recently in -current. In a snapshot from
October 24th, it's fine. In a snapshot from November 15th (before
the great gcc switchover), it's hosed. I've been told it's hosed in
today's -current as well. Here are the symptoms:

tuba# uname -sr
FreeBSD 4.0-19991115-CURRENT
tuba# cat f.c
#include <stdio.h>
main()
{
printf("hello world\n");
}
tuba# cc -g f.c
tuba# gdb -q a.out
(gdb) run
Starting program: /tmp/a.out 
warning: find_solib: Can't read pathname for load map: Bad address

Segmentation fault (core dumped)
tuba#

A statically compiled executable works though:

tuba# cc -static -g f.c
tuba# gdb -q a.out
(gdb) run
Starting program: /tmp/a.out 
hello world

Program exited with code 014.
(gdb) 


Okay, 'fess up: who's the wise guy?

-Bill




Looking for testers...

1999-01-17 Thread Bill Paul
kernel message buffer?
  (I.e. does /sbin/dmesg show anything weird.)

If the interface operates normally and there are no errors, then I'd
like to know that as well. :)

Note: please don't ask me for a -stable version of this driver. I'm
posting this to -current for a reason. If you're not running -current,
then either set up a -current box or just sit back and enjoy the show.

-Bill




Re: Looking for testers...

1999-01-17 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Andrew Gallatin 
had to walk into mine and say:
 
> Bill Paul writes:
>  > For those who may not know, I've been tinkering with a new 'tulip clone'
>  > driver for various PCI ethernet cards. I'm attempting to combine support
>  > for several tulip like chipsets into a single driver in an attempt to
>  > reduce code bloat. I've gotten things to where I think they work okay,
>  > but I'm looking for testers who have FreeBSD-current running with the
>  > following PCI chipsets:
> 
> YES! Hurray!  You are my hero!  I have been suffering under the
> if_de driver which utterly fails to grok 100Mb full duplex on all my
> 21143 equipped alphas.  This news has made my week, my month!

Careful. I said that the only 21143 cards I have use MII transceivers.
I don't have any cards that use symbol mode and built-in NWAY. The
Macronix chips copy the 21143's built-in NWAY pretty closely and
they work pretty well, but I don't know how well it works with an
actual 21143. I don't know how DEC set up the ethernet in the alphas:
if they used an MII transceiver, then it should work okay, but if
not you could be in for trouble. I wish I didn't have to say that,
but I just don't have the hardware to test with.

-Bill




Update of if_dc driver

1999-11-25 Thread Bill Paul

Okay, I've had a couple of reports so far about the if_dc driver which
were mostly positive. I've also gotten some new hardware and did some
more testing and bug fixing:

- Fixed support for non-MII 10/100 cards based on the 21143 chip. This
  includes the DEC DE500-BA and the built-in 21143 ethernet on alpha
  machines. The DE500-BA is now being distributed by Cabletron.

- Changed dc_attach() so that if probing for an MII-based PHY fails on
  21143 cards, it will fail over to using the dcphy pseudo driver and
  SYM mode.

- Fixed a few minor problems with autonegotiation on Macronix and PNIC II
  cards.

- Simplified dc_pnic_rx_bug_war() a bit. Now we keep track of descriptor
  and mbuf indexes instead of pointers.

- Compiled KLD modules for both x86 and alpha platforms using
  gcc 2.95.2.
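
The dc_attach() fallback in the second item can be pictured roughly as
below. This is only a sketch of the decision, with made-up types and
function names; it does not reproduce the real attach code or the miibus
probing it relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum dc_chip { DC_CHIP_21143, DC_CHIP_OTHER };
enum dc_phy  { DC_PHY_MII, DC_PHY_DCPHY_SYM, DC_PHY_NONE };

/* Hypothetical description of a card, just enough for the decision. */
struct dc_card {
    enum dc_chip chip;
    bool has_mii_phy;     /* would an MII probe find a transceiver? */
};

/* Stand-in for probing the MII bus for a PHY. */
static bool probe_mii(const struct dc_card *card)
{
    return card->has_mii_phy;
}

/*
 * Try MII first. If that fails on a 21143, fall back to the dcphy
 * pseudo driver and SYM mode; other chips have no such fallback here.
 */
static enum dc_phy attach_phy(const struct dc_card *card)
{
    assert(card != NULL);
    if (probe_mii(card))
        return DC_PHY_MII;
    if (card->chip == DC_CHIP_21143)
        return DC_PHY_DCPHY_SYM;
    return DC_PHY_NONE;
}
```

A non-MII 21143 card (the DE500-BA case) ends up with DC_PHY_DCPHY_SYM,
while an MII-based 21143 card still gets DC_PHY_MII.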

The driver should work correctly now with most 21143 10/100 cards.
If anybody has an Adaptec, ZNYX or other multiport 21143 card, I'd
be interested to know how it works with these. I've tested it with
a D-Link DFE-570TX 4-port card and it seems to work well. Again, the
driver is at http://www.freebsd.org/~wpaul/dc.tar.gz. If you have
FreeBSD-current and a supported card, please give it a try and let
me know how it holds up. Supported cards include:

- Intel 21143 10/100 NICs (Kingston KNE100TX, DEC DE500-BA, D-Link
  DFE-570TX, Adaptec 6244 (I think), possibly ZNYX and others)
- Macronix 98713, 98713A, 98715A, 98725, LC82C115 PNIC II NICs
  (NDC SOHOware, LinkSys LNE100TX V2.0, CNet Pro120A, CNet Pro120B,
  SVEC PN102TX)
- ASIX AX88140A or AX88141 NICs (Alfa Inc. GFC2204, CNet Pro110B)
- ADMtek AL981 Comet or AN985 Centaur
- Davicom DM9102 NICs (Jaton Corporation XPressNet)
- Lite-On 82c168 and 82c169 NICs (LinkSys LNE100TX, Matrox FastNIC,
  Kingston KNE110TX, Netgear FA310-TX Rev D1, D2 or D3)

My goal is to try and get this driver into 4.0 as soon as possible so
I can use it as a replacement for the al, ax, dm, pn and mx drivers.
However, there's a small problem: the de driver already supports the
21143, although it does so poorly according to some people. We can't
have both drivers trying to support the same chip. I want to be able
to turn off 21143 support in if_de and let if_dc handle them, but I
don't want to annoy people who are using if_de with 21143 cards now
and not having any trouble. What do people think? Does anybody have
anything against me transferring support for the 21143 from if_de to
if_dc? Does anybody have a better idea? I'm open to suggestions.

-Bill




Last if_dc update before import

1999-12-01 Thread Bill Paul

I've made one more round of fixes to the if_dc driver, which I hope
will be the last before I import it into -current this coming weekend
so that I can get it in before the feature freeze. The changes are:

- Modified the behavior of dc_mii_tick() so that the DC_REDUCED_MII_POLL
  flag works for non-21143 chips that either don't have a 'link failure'
  bit in their status registers or have one which doesn't behave like
  the one in the 21143 (the 21143's link failure bit works for both
  10 and 100Mbps modes). The problem is that with MII-based NICs, we
  need to poll the PHY status register in order to monitor the link
  state, however this involves a lot of work for chips with a serial
  MDIO interface (all that bit banging) and I have observed that polling
  during periods of heavy receive activity sometimes causes RX CRC
  errors. With the 21143, we avoid polling unless the link fail bit
  goes up; for the other chips, we avoid polling if a) we have any
  outstanding packets on the TX queue and b) the RX state as shown by 
  CSR5 indicates that the receiver is not idle.

- Enabled DC_REDUCED_MII_POLL flag for all non-21143 chips that use
  bit-bang MDIO interface as a result of the previous change.

- Corrected dc_eeprom_idle() and dc_eeprom_getword() so that they
  pull the clock line low before setting chip select: this way we
  start out our transaction with the EEPROM with the clock low, which
  conforms with the timing diagrams in the 93C66 datasheet. This
  allows us to read the EEPROM on the ADMtek AN985 cards again: I
  noticed yesterday that this didn't work anymore even though it
  had worked when I tested the ADMtek cards with this driver some
  weeks ago. I suspect this is related to the gcc switch: it could
  be that the previous code worked by accident due to some peculiarity
  in egcs 1.1.2 which is no longer present in gcc 2.95.2.

  Reading the 93C64 EEPROMs on all the other cards still works as
  before.

- Disabled TX descriptor polling for the ADMtek cards. Descriptor
  polling means that the chip checks for packets in the TX ring itself
  rather than waiting for the host to issue a TX DMA start request.
  However some chips appear to constantly generate 'no TX buffer
  present' interrupts each time they poll and find the ring is empty.
  This causes unnecessary interrupt activity when the transmitter
  is idle.

- Wrote a man page.
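
The reduced polling rule from the first item can be sketched as a single
predicate. The struct and its field names are hypothetical, and reading
conditions (a) and (b) as each being enough on its own to skip a poll is
an assumption; this is not the body of dc_mii_tick().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state consulted on each tick. */
struct dc_tick_state {
    bool is_21143;        /* chip has a usable link failure bit */
    bool link_fail;       /* 21143 only: link failure bit is set */
    int  tx_outstanding;  /* packets still sitting on the TX queue */
    bool rx_idle;         /* receiver idle according to CSR5 */
};

/* Decide whether this tick should go and poll the PHY status register. */
static bool should_poll_phy(const struct dc_tick_state *s)
{
    assert(s->tx_outstanding >= 0);
    if (s->is_21143)
        return s->link_fail;          /* poll only once the link drops */
    /* Other chips: stay off the MDIO lines while traffic is moving. */
    return s->tx_outstanding == 0 && s->rx_idle;
}
```

In this model a busy non-21143 card never touches the bit-banged MDIO
interface, and a 21143 only does so after its link failure bit goes up.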

As usual, the tarball is at http://www.freebsd.org/~wpaul/dc.tar.gz.
I haven't been able to produce alpha KLD binaries since beast.cdrom.com
appears to be dead.

Note that I am still accepting bug/non-bug reports. I'm particularly
interested in hearing from people with PNIC 82c168 and 82c169 cards 
(LinkSys LNE100TX, NetGear FA310-TX Rev D1, D2, D3, Kingston KNE110TX, 
Matrox FastNIC, etc...) since I want to be certain that I've gotten
all of the various chip workarounds working right.

NOTE: once I add this driver to the tree, I intend to remove the
al, ax, dm, pn and mx drivers and man pages since they should no
longer be needed. Please remember to update your rc.conf files
accordingly! I will post an additional heads up when I finally do
the deed.

-Bill




HEADS UP: if_dc imported, al, ax, dm, pn and mx removed

1999-12-04 Thread Bill Paul

Heads up people: the if_dc driver and all its bits and pieces are now
in the tree and the al, ax, dm, pn and mx drivers have been removed.
People previously using these drivers need to update their /etc/rc.conf
files accordingly. Also note that if_dc should now handle 21143-based
NICs. If you were previously using if_de for a 21143 card, you may
see if_dc take over support for it depending on your kernel config.
If you have a 21143-based NIC which worked with if_de and *doesn't*
work with if_dc, please let me know ASAP. Most 10/100 NICs should
work fine: the only questionable ones are 10Mbps only versions. If
you have a NIC that doesn't work, please show me the output of
pciconf -l from your system when reporting a problem.

As usual, the place to complain is: [EMAIL PROTECTED]

-Bill




Re: Correction to /usr/src/sys/i386/conf/LINT for D-Link DFE-530TX+

1999-12-18 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:

> The D-Link DFE-530TX+ uses the 'rl' driver, not the 'vr' driver.  I
> don't know if there's a DFE-530TX (without the '+') so I'm leaving the
> entry for that in the 'vr' driver notes intact.

Both exist. The DFE-530TX is most definitely a VIA Rhine card and needs
the vr driver. I have one. I only recently learned of the existence of the 
DFE-530TX+, which uses the RealTek 8139 and needs the rl driver.

Yes, it's dumb to change the whole card design and do nothing to update
the model number except stick a "+" on the end, but that's how it goes. 
D-Link also has a habit of selling certain cards only in some markets.
For example, there apparently also exists a DFE-540TX card that uses a
Macronix chip; however, it was never sold in the U.S., only in Asia.

-Bill




General ata grousing

1999-12-22 Thread Bill Paul

In an earlier post on -hackers, I mentioned that attempting to kldload the
usb.ko module after the kernel had booted would panic the system. So far
I've managed to track this problem all the way down to
sys/i386/isa/intr_machdep.c:add_intrdesc(). The system crashes when the
uhci_pci module tries to set up an interrupt handler using bus_setup_intr().
I strongly suspect this is being caused by an unpleasant interaction with
the ata driver: just my luck, the ATA controller, USB UHCI controller and
power management happen to be implemented as subfunctions of the same PCI
device. (Note that having /boot/loader pre-load the usb module along with
the kernel does work.)

In my case, each function is assigned IRQ 11 by the BIOS. I would think that
each driver would register a handler for this IRQ using bus_alloc_resource()
and bus_setup_intr() with the RF_SHAREABLE flag. However from what I can
tell, the ATA driver isn't doing this in its PCI attach routine. I'm not 
sure why. What it is doing is very weird: it appears that it tries to call
inthand_add() directly in at least one part of the code. I'm nowhere near
understanding the whys and the wherefores for all this yet, but something
tells me this has to be related to the USB problem. By some special
magic, everything just happens to work right when the devices are probed 
at boot time (and of course, nobody thought to test any other case), but 
things break very badly when trying to load the usb.ko module *after* the 
system has booted.
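
To make the suspicion concrete, here is a toy userland model of a shared
interrupt line. It is only a sketch under stated assumptions: toy_irq,
toy_irq_add() and toy_irq_fire() are made-up names, not the newbus API,
and the real failure is a panic rather than a clean error return. The
point is just that once one function of a device claims the line
without allowing sharing, a driver that shows up later has nowhere to
hook in.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 4

typedef void (*toy_handler_t)(void *arg);

/* Hypothetical model of one IRQ line and its handler chain. */
struct toy_irq {
    toy_handler_t handler[MAX_HANDLERS];
    void *arg[MAX_HANDLERS];
    int count;
    int shareable;   /* set by the first registrant */
};

/* Returns 0 on success, -1 if the line cannot be shared or is full. */
static int
toy_irq_add(struct toy_irq *irq, toy_handler_t h, void *arg, int shareable)
{
    if (irq->count > 0 && (!irq->shareable || !shareable))
        return -1;
    if (irq->count >= MAX_HANDLERS)
        return -1;
    if (irq->count == 0)
        irq->shareable = shareable;
    irq->handler[irq->count] = h;
    irq->arg[irq->count] = arg;
    irq->count++;
    return 0;
}

/* Deliver one interrupt: every registered handler gets a look. */
static void
toy_irq_fire(struct toy_irq *irq)
{
    int i;

    for (i = 0; i < irq->count; i++)
        irq->handler[i](irq->arg[i]);
}

/* Example handler: counts how often it ran. */
static void
toy_count(void *arg)
{
    (*(int *)arg)++;
}
```

If both registrants ask for a shareable line, both handlers run on every
interrupt. If the first one registers exclusively, the second
toy_irq_add() call fails, which is roughly the situation a late-loaded
module would find itself in.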

I don't want to sound like an ungrateful wretch, unduly criticizing
someone else's code, especially at so late a date, but there are some 
other things that just seem like they really shouldn't be there:

- Platform dependencies. The inthand_add() thing I mentioned previously
  appears to be an x86-specific kludge, and there's an alpha kludge to
  go along with it. There should be some way to get rid of this.

- Magic numbers everywhere. I see lots of places where I/O and PCI config
  registers are being manipulated using just hard coded register offsets
  and bitmasks. Magic numbers are bad, -kay?

- Use of inb/outb instead of bus_space_read_X()/bus_space_write_X().
  My understanding is that bus_space_read_X()/bus_space_write_X() are
  the preferred way of doing register accesses. inb/outb and friends are
  deprecated.
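
For what it's worth, the last two points can be illustrated with a small
userland sketch. Everything here is hypothetical: the register layout,
the FAKE_* names and the fake_space_read_1()/fake_space_write_1()
helpers are stand-ins that only mimic the shape of offset-based
accessors, and none of it comes from the ata driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout for an imaginary controller. */
#define FAKE_REG_CTRL      0x00   /* control register offset */
#define FAKE_REG_STATUS    0x02   /* status register offset */
#define FAKE_CTRL_ENABLE   0x01   /* CTRL: enable the device */
#define FAKE_CTRL_INTR_EN  0x04   /* CTRL: enable interrupts */

/* Stand-in for a mapped register window; not a real bus_space handle. */
struct fake_space {
    uint8_t regs[16];
};

static uint8_t
fake_space_read_1(struct fake_space *sp, unsigned off)
{
    return sp->regs[off];
}

static void
fake_space_write_1(struct fake_space *sp, unsigned off, uint8_t val)
{
    sp->regs[off] = val;
}

/* Read-modify-write using named bits instead of literals like 0x04. */
static void
fake_enable_intr(struct fake_space *sp)
{
    uint8_t ctrl = fake_space_read_1(sp, FAKE_REG_CTRL);

    fake_space_write_1(sp, FAKE_REG_CTRL,
        ctrl | FAKE_CTRL_ENABLE | FAKE_CTRL_INTR_EN);
}
```

A reader can tell what fake_enable_intr() does without digging out a
datasheet, and because every access goes through one pair of accessors,
the platform-specific part lives in a single place.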

Anyway, I'm going to continue trying to hunt down the interrupt setup
problem once I get home tonight (nice thing about having a laptop for
a test box: you don't have to leave the test machine at work and frob
it remotely). If anyone has any insights, please feel free to share
them.

-Bill




Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-22 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
> I'm adding Bill Paul to the list specifically.
> 
> Hmm.  Now this is odd!  I think I may have found something!
> 
> All of my 'rl' driver cards fail this test:

Oh sure. Bet the farm on the absolute worst NIC on the whole damn planet,
why don't you. Why spend a few bucks on some nice 3c905B or 3c905C cards
and beat up on them when you can buy ten RealTek cards for a dollar. About
as reliable as a pair of tin cans and a piece of string, but gosh they
sure are cheap.

You'll have to wait until at least tomorrow before I can look into this,
since I won't be able to do any debugging until I throw my one and only
RealTek 8139 sample adapter into a machine and run some tests with it.

>   rl0:  irq 11 at device 3.0 on pci0
>   rl0: Ethernet address: 00:50:ba:d1:89:05
>   miibus0:  on rl0

pciconf -l would be nice here too (to see the PCI revision code).
 
> Methinks there is something going on with the 'rl' driver and/or
> the RealTek cards!

Gee, y'think? I don't suppose you ran any similar tests with, say,
one of those LinkSys cards you had the other day. Or maybe a 3Com card.
I mean, it's just a little anti-climactic, you know? I put all that
blood, sweat and tears into if_xl and if_dc, but do people do stress
tests with them to help me identify weaknesses? No, they pound on
the house of cards that is if_rl.

*sigh*

-Bill




Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-22 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
> (taking this off -current)
> 
> apollo# linktest -s 51 -f1 lander 1-51 byte payload -> errors
> lander# linktest -s 51 -f1 apollo
> 
> apollo# linktest -s 52 -f1 lander 52+ byte payload -> no errors
> lander# linktest -s 52 -f1 apollo
> 
> 
> You know, this kinda sounds like a jabber lockup.
> 
> Bill, are you following the *MINIMUM* ethernet frame size specification 
> for ethernet?

*sigh* No, I've been living on Mars since 1975 and we don't get IEEE spec
documents up here.

Yes, I know there's a minimum frame length of 60 bytes. And the rl_encap()
routine has the following code:

    /* Pad frames to at least 60 bytes. */
    if (m_head->m_pkthdr.len < RL_MIN_FRAMELEN) {
        m_head->m_pkthdr.len +=
            (RL_MIN_FRAMELEN - m_head->m_pkthdr.len);
        m_head->m_len = m_head->m_pkthdr.len;
    }

The RealTek doesn't autopad, so you have to handle it manually. You're
only allowed one DMA buffer per transmission, so outbound packets are
coalesced into a single mbuf cluster buffer in rl_encap(). A cluster
buffer is always 2K, and frames can never be larger than 1514 bytes, so
we know there'll always be plenty of room. In the case of frames less
than 60 bytes, I just bump up m_pkthdr.len and m_len. This adjusted
length gets used later in rl_start() when transmission is triggered.
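
The same idea can be sketched outside the kernel. This is a minimal
illustration, not the driver code: pad_frame() and MIN_FRAMELEN are
made-up names, and unlike the snippet above this version also zeroes
the pad bytes, which is an extra assumption on my part rather than
something the driver is shown doing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_FRAMELEN 60  /* minimum frame length without the 4-byte CRC */

/*
 * Pad a frame held in buf (capacity cap) up to MIN_FRAMELEN.
 * Returns the length to hand to the transmitter, or 0 if buf is too small.
 */
static size_t
pad_frame(unsigned char *buf, size_t len, size_t cap)
{
    if (len >= MIN_FRAMELEN)
        return len;
    if (cap < MIN_FRAMELEN)
        return 0;
    memset(buf + len, 0, MIN_FRAMELEN - len);  /* zero the pad bytes */
    return MIN_FRAMELEN;
}
```

With a 2K buffer the capacity check can never fail, which is the same
reason the driver can get away with simply bumping the lengths.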

Incidentally, you should be using tcpdump -n -e -i rl0 to measure the
actual frame length of failing and succeeding transmissions: that's
usually a much better indicator of what might be going wrong. You could
calculate it from the data buffer length, but I suck at math; I find it's
easier just to monitor the offending frames.

-Bill




Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Bill Paul

Okay, I patched if_rl.c in -current to fix the problem demonstrated by 
Matt's linktest program. The bug was actually on the receive side of the 
rl driver, not the transmit side. A packet can wrap from the end of the 
RX buffer back to the beginning, and in some cases these packets would 
get lost due to botched use of m_pullup(). I can run the linktest 
program now without losing any frames.

There's another way around this which is to allocate a whole mbuf
cluster when you know the packet is wrapped and bcopy the data manually
instead of using m_devget(), but I'm not sure I want to waste a whole
cluster just for that case.
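
For illustration, the manual-copy alternative boils down to two copies:
one up to the end of the ring and one from its start. The sketch below
uses plain memcpy() on ordinary buffers instead of mbuf clusters and
bcopy(), and ring_copy_out() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes starting at offset pos out of a ring of ring_len bytes
 * into a linear destination, following the wrap back to offset 0.
 * Assumes pos < ring_len and len <= ring_len.
 */
static void
ring_copy_out(const unsigned char *ring, size_t ring_len, size_t pos,
    unsigned char *dst, size_t len)
{
    size_t first = ring_len - pos;  /* bytes available before the wrap */

    if (len <= first) {
        memcpy(dst, ring + pos, len);
        return;
    }
    memcpy(dst, ring + pos, first);
    memcpy(dst + first, ring, len - first);
}
```

The destination has to be big enough for the whole frame up front,
which is where the cost of a dedicated cluster comes from.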

-Bill




Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Bill Paul

> [...] it goes back to normal... suddenly everything is working properly
> again.

And what happens if instead of auto, you use "ifconfig dc0 media 100baseTX
mediaopt full-duplex" to lock the media setting down? Or what happens if
you shut down and restart the X server?

-Bill




Re: Woa! May have found something - 'rl' driver and small packets (was Re: Odd TCP glitches in new currents)

1999-12-23 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Matthew Dillon 
had to walk into mine and say:
 
> I'm trying to narrow down the area enough that I can mess with the 
> driver myself and hopefully locate the problem, since it can't be
> reproduced easily.   I was hoping the magic number 64 could be
> related to something - and you have apparently been able to do that,
> which gives me a place to start anyway.   netstat shows the trigger
> to be the reception of 64 packets rather than the transmission, though.
> Is there anything at all about the number 64 that could be related to
> the receiver?

64 is also the number of descriptors/buffers in the RX ring. When you
fill up the RX ring, the chip is supposed to generate a 'no RX buffer
available' interrupt. The driver will check the RX ring for packets
when either an 'RX OK' or 'no RX buffers available' interrupt is
delivered, but you should be getting an 'RX OK' interrupt on every
received packet.

The datasheet for the PNIC II is at:

http://www.freebsd.org/~wpaul/Macronix/PNIC_II.PDF

This is the datasheet LinkSys gave me when they first came out with
the LNE100TX v2.0 board. It's very similar to the Macronix 98715A
datasheet.
 
> I'm pretty sure that the box was getting receive interrupts because
> every time I sent a packet to it from the outside systat -vm showed
> a PCI interrupt for the network device.  However 'netstat -in 1' did
> not show the statistics for the received packets until 64 had 
> accumulated.  It could be that the statistics are not being accumulated
> on a per-reception basis and that the receive packets are actually
> getting through, and that it's the transmit side which is broken.  I don't
> know the code well enough yet to make the determination.

The dc_rxeof() routine is what increments ifp->if_ipackets, so if
netstat -in doesn't show any change until after 64 packets have arrived,
then it isn't getting the 'RX OK' interrupts. But I promise you that I
have never seen a condition where 'RX OK' interrupts failed to arrive
even though 'no RX buffer available' interrupts did. The interrupt handler
re-enables interrupts just before it exits, so there should never be a
case where interrupts are turned off and never turned back on again.
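
The reported symptom can be reproduced in a toy model, which may help
when reasoning about it. This is a sketch under loud assumptions:
toy_nic, toy_receive() and toy_rxeof() are invented names, and the
model only captures the counting behavior, not how dc_rxeof() or the
interrupt handler actually work.

```c
#include <assert.h>

#define RX_RING_SIZE 64

struct toy_nic {
    int pending;      /* frames sitting in the ring, not yet harvested */
    long ipackets;    /* counter the stack sees */
    int rx_ok_works;  /* nonzero if per-frame 'RX OK' interrupts arrive */
};

/* Harvest everything in the ring, as an rxeof-style routine would. */
static void
toy_rxeof(struct toy_nic *n)
{
    n->ipackets += n->pending;
    n->pending = 0;
}

/* One frame arrives from the wire. */
static void
toy_receive(struct toy_nic *n)
{
    n->pending++;
    if (n->rx_ok_works || n->pending == RX_RING_SIZE)
        toy_rxeof(n);   /* 'RX OK', or 'no RX buffer available' */
}
```

With rx_ok_works set, the counter moves on every frame. With it clear,
nothing shows up until the 64th frame fills the ring, which matches
what netstat was said to show.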

-Bill

> I'll try that next time the problem occurs but I doubt it will have 
> any effect.  Changing the duplex mode does not appear to reset the port 
> whereas forcing the media to 'auto' does appear to reset the port.  This 
> is actually another problem (switches don't appear to pick up the duplex
> change if the port isn't reset), but not one I'm concerned with.

In general what you want to do is a) switch modes and b) reset the link
so that the guy on the other side re-senses the media. However both sides
can only agree on the duplex setting as the result of an NWAY autoneg
session: if you manually select 100baseTX full duplex, the link partner
can only sense the link speed (100Mbps as opposed to 10) but not the
duplex mode. The rule is that if you don't have NWAY but can sense the
link speed, you default to half duplex and let the operator manually
fix things if necessary (that's what operators are for). Of course this
only works if the switch has a management interface that allows you
to configure things like that. Some don't, which can make your life tough.
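
The fallback rule is small enough to write down. The following is only
a sketch of the rule as described above; resolve_link() and its
parameters are hypothetical and do not correspond to any function in
the miibus code.

```c
#include <assert.h>

enum duplex { DUPLEX_HALF, DUPLEX_FULL };

struct link_result {
    int speed_mbps;
    enum duplex duplex;
};

/*
 * partner_nway: nonzero if the link partner took part in autonegotiation.
 * sensed_speed: speed detected from the line (10 or 100).
 * nway_full:    negotiated duplex; only meaningful when partner_nway is set.
 */
static struct link_result
resolve_link(int partner_nway, int sensed_speed, int nway_full)
{
    struct link_result r;

    r.speed_mbps = sensed_speed;
    if (partner_nway)
        r.duplex = nway_full ? DUPLEX_FULL : DUPLEX_HALF;
    else
        r.duplex = DUPLEX_HALF;  /* no NWAY: speed is known, duplex is not */
    return r;
}
```

Calling resolve_link(0, 100, 1) models a switch facing a NIC forced to
100baseTX full duplex: it lands on 100Mbps half duplex, and the two
ends disagree until somebody fixes one of them by hand.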

I'm pretty sure the speed and duplex setting don't really have anything
to do with this particular problem though. I was just wondering why
renegotiating the media would have any effect. It's possible that
dc_init() may be called in there somewhere, which could be resetting
all of the driver state.

-Bill




Re: PCI-CardBus bridge + PCMCIA Lucent WaveLAN IEEE troubles

2000-04-14 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Michael I. 
Vasilenko had to walk into mine and say:
 
> pccardd[166]: Card "Lucent Technologies"("WaveLAN/IEEE") matched "Lucent 
>Technologies" ("WaveLAN/IEEE") 
> pccardd[166]: Using I/O addr 0x100, size 64 
> pccardd[166]: Setting config reg at offs 0x3e0 to 0x41, Reset time = 50 ms 
> pccardd[166]: Assigning I/O window 0, start 0x100, size 0x40 flags 0x5 
> /kernel: wi0:  at port 0x100-0x13f irq 7 slot 0 on pccard0
> /kernel: wi0: Ethernet address: 00:60:1d:f6:cc:5d  ^^^
> 
> and machine just hangs completely. 

You did disable the parallel port on this machine so that you can safely
use IRQ 7, right? And I don't mean "take the parallel port driver out
of the kernel config." I mean "go into the computer's BIOS setup screen
and turn the parallel port off."

-Bill




Re: NFS, rl0 and Alpha

2000-05-04 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Gary Jennejohn 
had to walk into mine and say:

> OK. Unfortunately, gdb core dumps when I try to analyze a crash dump
> with a debugging kernel :( Even worse, gdb core dumps when I try to
> run a debugging gdb in gdb to find out why gdb is core dumping when
> I try to debug a kernel with symbols :(( Wonderful.

I suspect this may have something to do with the way packets sometimes
wrap from the end of the RX buffer pool to the beginning. This might
result in fragmentation across multiple mbufs in some cases (I think).
If I squint hard enough, I can see a way for the data to end up misaligned
in one of the additional mbufs.

Try this patch. It's an untested hack (I don't have a RealTek card
in a test box right this second) but should fix the problem if it's
what I think it is.

-Bill

P.S.: Regardless, somebody should fix gdb.



*** if_rl.c.orig    Sat Apr 29 14:15:10 2000
--- if_rl.c Thu May  4 22:16:31 2000
***************
*** 913,919 ****
          goto fail;
      }
  
!     sc->rl_cdata.rl_rx_buf = contigmalloc(RL_RXBUFLEN + 32, M_DEVBUF,
          M_NOWAIT, 0, 0x, PAGE_SIZE, 0);
  
      if (sc->rl_cdata.rl_rx_buf == NULL) {
--- 911,917 ----
          goto fail;
      }
  
!     sc->rl_cdata.rl_rx_buf = contigmalloc(RL_RXBUFLEN + 1518, M_DEVBUF,
          M_NOWAIT, 0, 0x, PAGE_SIZE, 0);
  
      if (sc->rl_cdata.rl_rx_buf == NULL) {
***************
*** 1122,1129 ****
          wrap = (sc->rl_cdata.rl_rx_buf + RL_RXBUFLEN) - rxbufpos;
  
          if (total_len > wrap) {
              m = m_devget(rxbufpos - RL_ETHER_ALIGN,
!                 wrap + RL_ETHER_ALIGN, 0, ifp, NULL);
              if (m == NULL) {
                  ifp->if_ierrors++;
                  printf("rl%d: out of mbufs, tried to "
--- 1120,1132 ----
          wrap = (sc->rl_cdata.rl_rx_buf + RL_RXBUFLEN) - rxbufpos;
  
          if (total_len > wrap) {
+             /*
+              * Fool m_devget() into thinking we want to copy
+              * the whole buffer so we don't end up fragmenting
+              * the data.
+              */
              m = m_devget(rxbufpos - RL_ETHER_ALIGN,
!                 total_len + RL_ETHER_ALIGN, 0, ifp, NULL);
              if (m == NULL) {
                  ifp->if_ierrors++;
                  printf("rl%d: out of mbufs, tried to "
***************
*** 1132,1145 ****
                  m_adj(m, RL_ETHER_ALIGN);
                  m_copyback(m, wrap, total_len - wrap,
                      sc->rl_cdata.rl_rx_buf);
-                 if (m->m_len < sizeof(struct ether_header))
-                     m = m_pullup(m,
-                         sizeof(struct ether_header));
-                 if (m == NULL) {
-                     printf("rl%d: m_pullup failed",
-                         sc->rl_unit);
-                     ifp->if_ierrors++;
-                 }
              }
              cur_rx = (total_len - wrap + ETHER_CRC_LEN);
          } else {
--- 1135,1140 ----
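
The trick in the patch can be shown in userland, as a rough sketch
only. The assumptions: ordinary buffers stand in for the DMA ring and
the mbuf chain, memcpy() stands in for m_devget() and m_copyback(), the
RL_ETHER_ALIGN offset is left out, and ring_copy_overread() is a
made-up name. The ring is allocated with slack after its end so that
one linear copy of the whole frame stays in bounds; the bytes that came
from the slack are then overwritten with the real data from the start
of the ring.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * ring points at ring_len bytes of receive ring followed by at least
 * len bytes of slack, so reading past the ring end stays in bounds.
 * Assumes pos < ring_len and len <= ring_len.
 */
static void
ring_copy_overread(const unsigned char *ring, size_t ring_len, size_t pos,
    unsigned char *dst, size_t len)
{
    size_t wrap = ring_len - pos;   /* valid bytes before the ring end */

    /* Step 1: one linear copy; bytes past 'wrap' come from the slack. */
    memcpy(dst, ring + pos, len);

    /* Step 2: replace the slack bytes with the real wrapped tail. */
    if (len > wrap)
        memcpy(dst + wrap, ring, len - wrap);
}
```

The destination is sized for the full frame from the start, so nothing
has to be appended afterwards, at the cost of copying some garbage that
is immediately overwritten.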





Re: NFS, rl0 and Alpha

2000-05-05 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Gary Jennejohn 
had to walk into mine and say:

[...] 
> Yes, this patch fixes the problem. Thank you, Bill Paul !

*sigh* It figures. Ok, I applied the patch to -current and -stable. 
We now return you to your regularly scheduled program. Please drive
through.

-Bill




Looking for testers for if_dc patches

2000-05-30 Thread Bill Paul

Several people have reported problems with if_dc botching autonegotiation
on 21143 NICs with non-MII media, such as the DEC/Compaq DE500-BA and
the built-in 10/100 ethernet on some alphas. As my first official act
as a BSDi/WC employee, I sat down and tried to fix this. I produced
some patches for if_dc.c/if_dcreg.h and dcphy.c, which are sitting at
http://people.freebsd.org/~wpaul/dc_test. To apply them, do the following:

# cd /sys/pci
# patch < if_dc.patch
# cd /sys/dev/mii
# patch < dcphy.patch

These patches should work on either 4.0-STABLE or 5.0-CURRENT. (They
should also work on 4.0-RELEASE.) There are also some fixes for the
Macronix 98713A/98715/98715A and the LC82C115 PNIC II, which also
use the 21143-style NWAY interface.

Note that I still need to add code to properly set the LEDs on 21143
boards. I went after the autoneg problem first since it was somewhat
more pressing. In any event, please try these patches and report the
results to [EMAIL PROTECTED]

-Bill





Re: Looking for testers for if_dc patches

2000-05-30 Thread Bill Paul

> On Tue, May 30, 2000 at 12:28:25AM -0700, Bill Paul wrote:
> > Several people have reported problems with if_dc botching autonegotiation
> > on 21143 NICs with non-MII media, such as the DEC/Compaq DE500-BA and
> > the built-in 10/100 ethernet on some alphas. As my first official act
> > as a BSDi/WC employee, I sat down and tried to fix this. I produced
> > some patches for if_dc.c/if_dcreg.h and dcphy.c, which are sitting at
> > http://people.freebsd.org/~wpaul/dc_test. To apply them, do the following:
> 
> [...]
> cc -c -O -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing 
>-prototypes -Wpointer-arith -Winline -Wcast-qual  -fformat-extensions -ansi -g 
>-nostdinc -I- -I. -I../.. -I../../../include  -D_KERNEL -include opt_global.h -elf  
>-mno-fp-regs -ffixed-8 -Wa,-mev56  ../../pci/if_dc.c
> ../../pci/if_dc.c: In function `dc_init':
> ../../pci/if_dc.c:2697: structure has no member named `dc_flgs'
> *** Error code 1
> 
> Stop in /var/d7/src-2000-05-28/src/sys/compile/CICELY9.
> 
> This is on 5.0-CURRENT as of 28th May on alpha

Grrr. Typo on my part, sorry. It should be flags, not flgs. I just fixed
the patch file. You can download it again, or just correct the typo manually.

-Bill
 





Re: Looking for testers for if_dc patches

2000-06-01 Thread Bill Paul


> Hi Bill,
> 
> I applied your patches to -current without incidents. 
> 
> I have a testbox (Digital dual P6) that gives:
> 
> May 31 10:56:38 p6 /kernel: dc0:  port
[...]
> May 31 11:03:27 p6 /kernel: dc0: watchdog timeout
> 
> This box can also house an Alpha Miata MX5 mainboard, the Intel & Alpha
> boards use the same PCI riser card that also contains the 21143 chip. 
> The patches don't seem to help on this particular hardware. I will try
> to give the Alpha a spin too, later today. BTW: ifconfig-ing to use
> 10baseT/UTP does not help either. The media bulkhead is a 10baseT/10base2
> one. if_de has no problems:

Alright, hold it. Stop. Just to make sure I understand:

- There's one interface involved here
- It has a 21143 chip
- It has 10baseT and AUI ports
- It's supposed to be 10Mbps only

If this is all correct, then I'd like you to try the following:

- Run pciconf -l on this machine and obtain the PCI ID for this device.
  The device ID is the hex number after the "chip=" section in the output.
  For the sake of this example, let's say it's 0x12345678.

- Bring up /sys/dev/mii/dcphy.c in your favorite editor.

- Look for the following code in the dcphy_attach() routine:

    case COMPAQ_PRESARIO_ID:
        /* Example of how to only allow 10Mbps modes. */
        sc->mii_capabilities = BMSR_ANEG|BMSR_10TFDX|BMSR_10THDX;
        break;

- Add your PCI device ID like this:

    case COMPAQ_PRESARIO_ID:
    case 0x12345678:
        /* Example of how to only allow 10Mbps modes. */
        sc->mii_capabilities = BMSR_ANEG|BMSR_10TFDX|BMSR_10THDX;
        break;

One thing I discovered is that trying to enable 100Mbps autoneg on
a device that only has a 10Mbps port doesn't work. This broke the
support for the 10Mbps ethernet in certain Compaq Presario machines,
which is why I special-cased it. This will not make the AUI port
work (I need to add extra code for that) but if this is the same
problem as the Compaq, it should allow the 10baseT port to work.
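
As a standalone sketch of the special-casing idea: the BMSR_* values
below are the standard MII status register capability bits, but
caps_for_board() and the two EXAMPLE_* IDs are placeholders invented
for illustration, and the default branch is an assumption about what a
10/100 board would advertise rather than a copy of dcphy_attach().

```c
#include <assert.h>
#include <stdint.h>

/* Standard MII basic mode status register (BMSR) capability bits. */
#define BMSR_100TXFDX 0x4000
#define BMSR_100TXHDX 0x2000
#define BMSR_10TFDX   0x1000
#define BMSR_10THDX   0x0800
#define BMSR_ANEG     0x0008

/* Placeholder IDs for illustration only; not real board IDs. */
#define EXAMPLE_10MBPS_BOARD_A 0x12345678u
#define EXAMPLE_10MBPS_BOARD_B 0x9abcdef0u

static int
caps_for_board(uint32_t board_id)
{
    switch (board_id) {
    case EXAMPLE_10MBPS_BOARD_A:
    case EXAMPLE_10MBPS_BOARD_B:
        /* Only advertise 10Mbps modes on boards known to lack 100Mbps. */
        return BMSR_ANEG | BMSR_10TFDX | BMSR_10THDX;
    default:
        return BMSR_ANEG | BMSR_100TXFDX | BMSR_100TXHDX |
            BMSR_10TFDX | BMSR_10THDX;
    }
}
```

Keeping the list of exceptions in one switch means a newly discovered
10Mbps-only board costs one extra case label.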

Let me know if this has any effect.

-Bill





Re: Looking for testers for if_dc patches

2000-06-01 Thread Bill Paul

> > - There's one interface involved here
> 
> Correct.
> 
> > - It has a 21143 chip
> 
> Well, the de driver says 21142. The dc driver says 21143.

It's just a difference in chip revision, really.
 
> This one does not have AUI so that is not going to be a problem. What I do
> wonder, though, is what will happen if a 10/100Mbit bulkhead is installed on
> this machine. I don't expect the PCI ID to change (right?). I can pull
> the 10/100 bulkhead from my Miata GL to give this a try.

It would help if you could look at both of them and tell me what chips
are on them. The 21143 can do 10Mbps all by itself, but for 100Mbps
you'd need an extra transceiver. I've been working under the assumption
that they're just using the built-in 10baseT port on the 21143, but
it's possible they're using the GPIO bits to do some funny business
to switch the ports.
 
> In the meantime I gave your patch a quick try and I unfortunately don't
> see a change in behaviour. Still watchdog timeouts and no connection.
> 
> Question: I had expected dmesg and ifconfig to report 10Mbit only modes.
> They still show 100 as supported media in addition to the 10Mbit modes.

You have to be able to tell that the chip only supports 10Mbps modes.
The 21143 is a 100Mbps chip, and only in certain cases do people design
10Mbps-only NICs around it. The problem is that to know if you've got
only 10Mbps, you normally have to slog through the SROM info, however a
lot of card vendors get this wrong, so I don't even bother with it.
 
> There is something else that might interest you: when replacing a 10 Mbit
> only bulkhead with a 10/100 one you need to connect it to the PCI bulkhead
> with a different cable to a different connector (on the PCI bulkhead). The
> 10/100 one is silkscreened as MII. 

Then it probably has a 10/100 PHY on it. Assuming the driver can probe
it without having to flip any magic GPIO bits, it should work.

> Could this mean the driver sees a MII interface while in this particular
> setup the bulkhead is connected to something non-MII ? Wild guess maybe..

I'm sure it is non-MII. It's still supposed to work, however it's hard
to tell just what I'm supposed to do to make it happy from way over here.

-Bill 





Re: Looking for testers for if_dc patches

2000-06-02 Thread Bill Paul

> 
> For reference the ID reported is:
> 
> de0@pci0:3:0:  class=0x02 card=0x chip=0x00191011 rev=0x11 hdr=0x00

Hm, ok. First of all, I made a mistake in what I told you. The code in
dcphy.c checks the subsystem ID, not the device ID. The device ID is always
the same, since that identifies the 21143 chip, however the subsystem ID
can vary from board to board depending on the manufacturer's whims.
The odd thing is that the subsystem ID here is 0x (the "card="
value), however that doesn't rule out running our test.
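
For anyone reading pciconf -l output by hand: the "chip=" and "card="
values each pack two 16-bit IDs into one 32-bit number, with the vendor
ID in the low half. A minimal sketch of splitting them follows; the
helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of a pciconf "chip=" or "card=" value: the vendor ID. */
static uint16_t
pci_id_vendor(uint32_t id)
{
    return (uint16_t)(id & 0xffff);
}

/* High 16 bits: the device (or subsystem device) ID. */
static uint16_t
pci_id_device(uint32_t id)
{
    return (uint16_t)(id >> 16);
}
```

For the chip value quoted above, 0x00191011, this gives vendor 0x1011
and device 0x0019.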

So, go back to dcphy.c and do this:
 
    case COMPAQ_PRESARIO_ID:
    case 0x:
        /* Example of how to only allow 10Mbps modes. */
        sc->mii_capabilities = BMSR_ANEG|BMSR_10TFDX|BMSR_10THDX;
        break;

Let me know if this has any effect.

-Bill





Re: my dc now doesn't work

2001-10-11 Thread Bill Paul

> 
> After the last cvsup (changes from 29 of september) i've got dead
> dc (21143 based NIC).

You have to tell us _exactly_ what card you have. Find the manufacturer
and model info. Look on the box the card came in. Look at the card itself.
Show us the output from pciconf -l so we can see the PCI vendor/device ID
info. Yes, this information is important. Yes, I'm irritated that you
didn't provide it straight off. (But then nobody ever does. Guys? You
don't need me to ask you to provide this information. It's common sense.
It's staring you right in the face.)

> LEDs are dead, but card is successfully probed and
> attached, so i have device but can't use it. What should i send to help
> investigate this problem?

Knowing exactly what card this is will help. You can't debug this
problem: I'm going to have to figure out a way to test and debug this
myself, which is going to suck, as I no longer have an easy way to
do FreeBSD work now that Wind River has pulled the plug on the test
lab.

If you want to be really nice, you can arrange to have this machine
made accessible remotely (via an alternate network interface) and
let me tinker with it via ssh. Otherwise, you'll have to wait for
me to put together a test setup locally.

> Oct 11 11:57:37 ws-ilmar /boot/kernel.old/kernel: dc0: Ethernet address: 
>00:80:ad:90:b4:38
> Oct 11 11:57:37 ws-ilmar /boot/kernel.old/kernel: miibus0:  on dc0
> Oct 11 11:57:37 ws-ilmar /boot/kernel.old/kernel: dcphy0: interface> on miibus0
> Oct 11 11:57:37 ws-ilmar /boot/kernel.old/kernel: dcphy0:  10baseT, 10baseT-FDX, 
>100baseTX, 100baseTX-FDX, auto

I strongly suspect that the recent changes to the miibus code by jlemon
have hosed the dcphy driver, which is very sensitive. (You don't want to
know how long it took me to get it working halfway decently.)

-Bill




Re: newcard/cardbus instabilities

2001-03-22 Thread Bill Paul

> 
> That's a bit ugly.
> 
> > xl0: <3Com 3c575C Fast Etherlink XL> port 0x3000-0x307f mem
> > 0x4402-0x4403,0x44002480-0x440024ff,0x44002400-0x4400247f irq 10 at
> > device 0.0 on cardbus1
> > xl0: chip is in D6 power mode -- setting to D0
> 
> I'm a bit worried about this; "D6" doesn't really exist, so it's possible 
> that something is going wrong here.
> 
> Bill; you might have some better  ideas than I do.  Suggestions?

My suggestion? Chop out the power management stuff in xl_attach()
and see what happens. The xl driver is using the pci_get_powerstate()
and pci_set_powerstate() routines right now in order to check for PCI
NICs that have been forced into the D3 state by Windoze during shutdown.
However, those functions are internal to the PCI bus code, and I'm not
sure what will happen when you try to use them with devices that are
children of a cardbus bus.

So, edit /sys/pci/if_xl.c, find the xl_attach() function, and comment
out/#ifdef out/delete the section that checks the power state of the card.
Like Mike says, the D6 state is bogus.
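
To see why D6 cannot be a real answer: in the PCI power management
scheme the power state is a two-bit field in the low bits of the power
management control/status register, so only D0 through D3 can be
encoded. The sketch below assumes the state is derived from that
register; the macro and function names are made up, and whether the
cardbus path reads the right register at all is exactly the open
question.

```c
#include <assert.h>
#include <stdint.h>

#define PM_PSTATE_MASK 0x0003  /* PowerState field, bits 1:0 of PMCSR */

/* Extract the power state (0..3 for D0..D3) from a PMCSR value. */
static int
pm_state_from_pmcsr(uint16_t pmcsr)
{
    return pmcsr & PM_PSTATE_MASK;
}

/* A reported state is only plausible if it is D0 through D3. */
static int
pm_state_is_valid(int state)
{
    return state >= 0 && state <= 3;
}
```

A value of 6 can only come from an unmasked or unrelated read, which is
another reason to distrust that code path on cardbus.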

Unfortunately, I can't test this myself at the moment since I find myself
without a laptop. I might be able to coerce^Wconvince John Baldwin to
let me test this with his though.

-Bill




Fixing ypbind with TI-RPC

2001-03-26 Thread Bill Paul

Ok. Friday I sat down and tried to make the -m option to ypbind work
correctly using the new TI-RPC code. Unfortunately, my test machine
chose that day to eat itself. Even more unfortunately, it was an AMD
900MHz Thunderbird. Today, I started working on another box and managed
to get things to work, but there are some problems that still need
solving. I need some input to decide how to do this.

The problem is with the code in yp_ping.c. This module contains a special
version of clntudp_call() which has been modified in two ways:

1) If the XDR encode routine is specified as NULL, it skips the transmit
   portion of clntudp_call() and jumps straight to receiving and decoding
   the reply.
2) When processing a reply, the routine omits the check of the transaction
   ID, so that the reply will be processed even if its XID doesn't match
   the XID of the request that was last sent.

This is done so that we can send a bunch of YPPROC_DOMAIN_NONACK requests
to different servers, each with a different transaction ID, then wait to
see who replies first. Distinguishing the servers based on the XID gets
around the case where the server is multihomed and replies on an interface
other than the one where it received the original RPC. This is basically
an asynchronous RPC, where the request and response are handled separately
rather than in the context of a single clntudp_call().
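
The XID bookkeeping described above can be sketched roughly as follows. This is only an illustration, not the code in yp_ping.c: the struct ping_req type and the match_reply() helper are hypothetical names, and a real implementation would pull the XID out of the decoded RPC reply header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping: one entry per server we sent a request to. */
struct ping_req {
	uint32_t xid;		/* transaction ID used for this server's request */
	int	 server;	/* caller-defined index of the server */
};

/*
 * Return the index of the server whose request carried reply_xid,
 * or -1 if the XID matches none of the outstanding requests.
 * The reply's source address is deliberately ignored, so a
 * multihomed server answering from another interface still matches.
 */
static int
match_reply(const struct ping_req *reqs, size_t nreqs, uint32_t reply_xid)
{
	size_t i;

	for (i = 0; i < nreqs; i++) {
		if (reqs[i].xid == reply_xid)
			return (reqs[i].server);
	}
	return (-1);
}
```

The first reply for which match_reply() returns something other than -1 identifies the server to bind to.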

Anyway, now that we have the TI-RPC library, the magic clntudp_a_call()
routine needs to be changed to a clnt_dg_a_call(). Unfortunately, when I
tried to do this, I ran into a serious problem:

- The clnt_dg.c module has several module-wide lock variables which are
  shared between the create/call/destroy methods. Trying to set up a
  private call method won't work, because the lock variables are static,
  hence not exported from the clnt_dg.o object module. As a hack I created
  a separate clnt_dg.c module which I linked directly into a test ypbind
  binary, but this is not what I consider a proper solution.

Basically, I can't do things the way I did them with the older RPC code
because of the threading/mutex locks. Even building a separate clnt_dg.o
module with modifications was harder than it needed to be because the
clnt_dg.c code #includes special header files within the libc source
(src/lib/libc/include) which aren't available if you aren't building
the world.

The solution I'm leaning towards at the moment is adding the necessary
hacks to src/lib/libc/rpc/clnt_dg.c in such a way that they can be enabled
when needed with a special CLSET flag using clnt_control(). Then I can
rip out the custom call method code from yp_ping.c entirely. I'm a little
reluctant to do this since I was under the impression that creating a
custom method should still work, but it looks as if this problem is
endemic even to the original Sun TI-RPC code, not just us.

Comments? Questions? Pie?

-Bill




Re: Fixing ypbind with TI-RPC

2001-03-26 Thread Bill Paul

> > 
> > Why can't you just enable sigio on the reply socket, send all the
> > requests with a 0 timeout and then wait for a signal to either
> > interrupt the sending or to notify you when you complete sending?
> > 
> > Your solution seems awfully complex for what seems to be a simple
> > problem; doing a directed broadcast and taking the first answer you
> > get back.
> > 
> > Is the whole reason you need to do this because you're using the
> > xid to differentiate between the servers?

Once upon a time, a young coder needed to be able to send multiple
RPC requests and then wait for a single reply, instead of adhering to
the strict request/reply mechanism in the existing code. So he studied
the problem over many long nights. He fretted. He fussed. He tried not
to reinvent the wheel or cure cancer while he was at it. Eventually,
he made a couple of minor changes to a piece of existing code that did
exactly what he wanted.

The end.

> If that's true then you have a couple of options...
> 
> We could add a hack to our version of the rpc library to
> allow manipulation/query of the xid.  Or if the reply's source
> doesn't match any of the destinations you can remember it and
> send out another ping to that address.

You're already allowed to get/set the transaction ID using CLGET_XID
and CLSET_XID, and I intend to use this too. All I want to do is invoke
YPPROC_DOMAIN_NONACK on a bunch of servers and see which one replies
first, and I need to do it using unicasts because broadcasts won't get
forwarded across routers or point-to-point dialup links. I originally
created this monstrosity for NIS+, and decided it would be useful to
silence the people who insisted on putting their NIS clients on separate
networks from their NIS servers, and didn't want to set up NIS slaves
on the remote subnets like they were supposed to.

I'm including a patch to clnt_dg.c and clnt.h that adds the functionality
I need. It's quite small, and it's the path of least resistance in this
case, which is why I prefer it in spite of the brokenness it seeks to
suppress. Oh, there's also a fix for a bug in here. At least,
I think it's a bug. The clnt_dg_call() routine increments the transaction
ID before transmitting a request. It assumes that the XID is a 32-bit
value at a certain position in the block to be sent, and simply does a
cast to a u_int32_t and an in-place increment. The problem is, this value
is actually in network byte order, so to increment it properly, you need
to ntohl() it first, increment, then htonl() it back. Of course, if you
believe Sun, all the world's a SPARC, so for them it doesn't matter. This
really only becomes a problem if you actually use the CLGET_XID and
CLSET_XID control codes on a UDP client handle: the code in clnt_dg_control()
does the proper byte swapping, but clnt_dg_call() doesn't.
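
For illustration, here is a standalone sketch of the byte-order-safe increment. It is not the clnt_dg.c code itself: xid_increment() is a hypothetical helper, and it uses memcpy() rather than the pointer cast so that it does not depend on the buffer's alignment.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/*
 * Increment a 32-bit XID stored in network byte order at the start
 * of buf, and return the new value in host byte order.
 */
static uint32_t
xid_increment(unsigned char *buf)
{
	uint32_t net, host;

	memcpy(&net, buf, sizeof(net));		/* raw on-the-wire value */
	host = ntohl(net) + 1;			/* convert, then increment */
	net = htonl(host);			/* back to network order */
	memcpy(buf, &net, sizeof(net));
	return (host);
}
```

With the bytes 00 00 00 ff in the buffer, this leaves 00 00 01 00 behind on any host. A raw in-place increment on a little-endian machine would instead produce 01 00 00 ff, which is a different XID on the wire than the one CLGET_XID reports.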

I'm not positive I'm doing the right thing here, but without this fix,
my newly hacked __yp_ping() routine produces some weird results.

-Bill


*** clnt_dg.c.orig  Mon Mar 26 21:17:00 2001
--- clnt_dg.c   Mon Mar 26 21:21:08 2001
***************
*** 126,131 ****
--- 126,132 ----
          char            *cu_outbuf;
          u_int           cu_recvsz;      /* recv size */
          struct pollfd   pfdp;
+         int             cu_async;
          char            cu_inbuf[1];
  };
  
***************
*** 238,243 ****
--- 239,245 ----
          cu->cu_total.tv_usec = -1;
          cu->cu_sendsz = sendsz;
          cu->cu_recvsz = recvsz;
+         cu->cu_async = FALSE;
          (void) gettimeofday(&now, NULL);
          call_msg.rm_xid = __RPC_GETXID(&now);
          call_msg.rm_call.cb_prog = program;
***************
*** 312,317 ****
--- 314,320 ----
          socklen_t fromlen, inlen;
          ssize_t recvlen = 0;
          int rpc_lock_value;
+         u_int32_t xid;
  
          sigfillset(&newmask);
          thr_sigsetmask(SIG_SETMASK, &newmask, &mask);
***************
*** 336,347 ****
  
  call_again:
          xdrs = &(cu->cu_outxdrs);
          xdrs->x_op = XDR_ENCODE;
          XDR_SETPOS(xdrs, cu->cu_xdrpos);
          /*
           * the transaction is the first thing in the out buffer
           */
!         (*(u_int32_t *)(void *)(cu->cu_outbuf))++;
          if ((! XDR_PUTINT32(xdrs, &proc)) ||
              (! AUTH_MARSHALL(cl->cl_auth, xdrs)) ||
              (! (*xargs)(xdrs, argsp))) {
--- 339,357 ----
  
  call_again:
          xdrs = &(cu->cu_outxdrs);
+         if (cu->cu_async == TRUE && xargs == NULL)
+                 goto get_reply;
          xdrs->x_op = XDR_ENCODE;
          XDR_SETPOS(xdrs, cu->cu_xdrpos);
          /*
           * the transaction is the first thing in the out buffer
+          * XXX Yes, and it's in network byte order, so we should to
+          * be careful when we increment it, shouldn't we.
           */
!         xid = ntohl(*(u_int32_t *)(void *)(cu->cu_outbuf));
!         xid++;
!         *(u_int32_t *)(void *)(cu->cu_outbuf) = htonl(xid);
! 
          if ((! XDR_PUTINT32(xdrs, &proc)) ||
   

Need reviewers for busdma changes to ethernet driver

2001-08-09 Thread Bill Paul

Hi folks:

Well, after threatening to do it for a long time, I finally sat down and
converted one of my ethernet drivers to use the bus_dma API so that I
no longer have to do things like call contigmalloc() and/or vtophys()
directly. The changes I made are to the driver in -current, and the
new code is at:

http://www.freebsd.org/~wpaul/SiS/busdma

I have tested this driver on FreeBSD/x86 using a NatSemi DP83815
card (the Netgear FA312TX) and it seems to work fine for me. However,
I'm not 100% certain I used the busdma API properly in all cases.
If anyone with a busdma clue would care to look over the code and
see whether everything looks more or less legal, I would appreciate it. My
main concern is whether I'm using bus_dmamap_load() and bus_dmamap_unload()
correctly (i.e. such that I'm not leaking any resources).

Unless anyone raises serious objections, I would like to commit this
code ASAP (the last test I really need to do is make sure it works
correctly on an alpha).

-Bill

=
-Bill Paul(510) 749-2329 | Senior Engineer, Master of Unix-Fu
 [EMAIL PROTECTED] | Wind River Systems
=
"I like zees guys. Zey are fonny guys. Just keel one of zem." -- The 3 Amigos
=




Where to put new bus_dmamap_load_mbuf() code

2001-08-20 Thread Bill Paul

Okay, I decided today to write a bus_dmamap_load_mbuf() routine to
make it a little easier to convert the PCI NIC drivers to use the
busdma API. It's not the same as the NetBSD code. There are four
new functions:

bus_dmamap_load_mbuf()
bus_dmamap_unload_mbuf()
bus_dmamap_sync_mbuf()
bus_dmamap_destroy_mbuf()

This is more or less in keeping with the existing API, except the new
routines work exclusively on mbuf lists. The thing I need to figure
out now is where to put the code. The current suggestion from jhb is
to create the following two new files:

sys/kern/kern_busdma.c
sys/sys/busdma.h

The functions are machine-independent, so they shouldn't be in
sys/<arch>/<arch>/busdma_machdep.c. I mean, they could go there, but
that would just result in code duplication. If somebody has a better
suggestion, now's the time to speak up. Please let's avoid creating
another bikeshed over this.

Current code snapshot resides at:

http://www.freebsd.org/~wpaul/busdma

There's also a modified version of the Adaptec "starfire" driver there
which uses the new routines. I'm running this version of the driver on
a test box in the lab right now.

-Bill




Re: Where to put new bus_dmamap_load_mbuf() code

2001-08-20 Thread Bill Paul

> 
> Another thing- maybe I'm confused- but I still don't see why you want to
> require the creating of a map each time you want to load an mbuf
> chain. Wouldn't it be better and more efficient to let the driver decide when
> and where the map is created and just use the common code for loads/unloads?

Ever hear the phrase "you get what you pay for?" The API isn't all that
clear, and we don't have a man page or document that describes in detail
how to use it properly. Rather than whining about that, I decided to
tinker with it and Use The Source, Luke (tm). This is the result.

My understanding is that you need a dmamap for every buffer that you want
to map into bus space. Each mbuf has a single data buffer associated with
it (either the data area in the mbuf itself, or external storage). We're
not allowed to make assumptions about where these buffers are. Also, a
single ethernet frame can be fragmented across multiple mbufs in a list.

So unless I'm mistaken, for each mbuf in an mbuf list, what we
have to do is this:

- create a bus_dmamap_t for the data area in the mbuf using
  bus_dmamap_create()
- do the physical to bus mapping with bus_dmamap_load()
- call bus_dmamap_sync() as needed (might handle copying if bounce
  buffers are required)
- perform the DMA operation itself
- do post-DMA sync as needed (again, might require bounce copying)
- call bus_dmamap_unload() to un-do the bus mapping (which might free
  bounce buffers if some were allocated by bus_dmamap_load())
- destroy the bus_dmamap_t

One memory region, one DMA map. It seems to me that you can't use a
single dmamap for multiple memory buffers, unless you make certain
assumptions about where in physical memory those buffers reside, and
I thought the idea of busdma was to provide a consistent, opaque API
so that you would not have to make any assumptions.

Now if I've gotten any of this wrong, please tell me how I should be
doing it. Remember to show all work. I don't give partial credit, nor
do I grade on a curve.

> > Yay!
> > 
> > The current suggestion is fine except that each platform might have a more
> > efficient, or even required, actual h/w mechanism for mapping mbufs.

It might, but right now, it doesn't. All I have to work with is the
existing API. I'm not here to stick my fingers in it and change it all
around. I just want to add a bit of code on top of it so that I don't
have to go through quite so many contortions when I use the API in
network adapter drivers.
 
> > I'd also be a little concerned with the way you're overloading stuff into mbuf
> > itself- but I'm a little shakier on this.

I thought about this. Like it says in the comments, at the device driver
level, you're almost never going to be using some of the pointers in the
mbuf header. On the RX side, *we* (i.e. the driver) are allocating the
mbufs, so we can do whatever the heck we want with them until such time
as we hand them off to ether_input(), and by then we will have put things
back the way they were. For the TX side, by the time we get the mbufs
off the send queue, we always know we're going to have just an mbuf list
(and not an mbuf chain), and we're going to toss the mbufs once we're done
with them, so we can trample on certain things that we know don't matter
to the OS or network stack anymore.

The alternatives are:

- Allocate some extra space in the DMA descriptor structures for the
  necessary bus_dmamap_t pointers. This is tricky with this particular
  NIC, and a little awkward.
- Allocate my own private arrays of bus_dmamap_t that mirror the DMA
  rings. This is yet more memory I need to allocate and free at device
  attach and detach time.

I've got space in the mbuf header. It's not being used. It's right
where I need it. Why not take advantage of it?

> > Finally- why not make this an inline?

Er... because that idea offended my delicate sensibilities? :)

-Bill




Re: Where to put new bus_dmamap_load_mbuf() code

2001-08-22 Thread Bill Paul


> >Maybe, but bus_dmamap_load() only lets you map one buffer at a time.
> >I want to map a bunch of little buffers, and the API doesn't let me
> >do that. And I don't want to change the API, because that would mean
> >modifying busdma_machdep.c on each platform, which is a hell that I
> >would rather avoid.
> 
> bus_dmamap_load() is only one part of the API.  bus_dmamap_load_mbuf
> or bus_dmamap_load_uio are also part of the API.  They just don't happen
> to be implemented yet. 8-)  Perhaps there should be an MD primitive
> that knows how to append to a mapping?  This would allow you to write
> an MI loop that does exactly what you want.

Any one of those ideas would be just fine. I eagerly await their
realization. :)
 
> >It's a separate list. The driver is responsible for allocating the
> >head of the list, then it hands it to bus_dmamap_list_alloc() along
> >with the required dma tag. bus_dmamap_list_alloc() then calls
> >bus_dmamap_create() to populate the list. The driver doesn't have
> >to manipulate the list itself, until time comes to destroy it.
> 
> Okay, but does this mean that bus_dmamap_load_mbuf no longer takes
> a dmamap?  Drivers may want to allocate/manage the dmamaps in a
> different way.

Yes, bus_dmamap_load_mbuf() accepts a dma tag, the head of the
dmamap list, an mbuf, a segment array and a segment count. The
driver allocates the segment array with a certain number of
members. It passes the array and segment count to bus_dmamap_load_mbuf(),
which treats the segment count as the maximum number of segments
that it can return to the caller. Once all the mappings have been
done, it updates the segment count to indicate how many segments
were actually needed. Then the driver transfers the info from
the segment array into its DMA descriptor structures and kicks
off the DMA operation.
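
The in/out segment count convention described above might look something like this in miniature. The types and the load_list() function are hypothetical stand-ins, not the real bus_dmamap_load_mbuf() (which also takes a dma tag and the map list), and the sketch assumes every buffer maps to exactly one segment.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-in for a bus DMA segment (address/length pair). */
struct seg {
	unsigned long	addr;
	size_t		len;
};

/* Hypothetical stand-in for one buffer in an mbuf list. */
struct buf {
	unsigned long	 addr;
	size_t		 len;
	struct buf	*next;
};

/*
 * On entry *nsegs is the capacity of segs[]; on success it is updated
 * to the number of entries actually filled in. Returns 0, or EFBIG if
 * the list needs more segments than the caller provided.
 */
static int
load_list(const struct buf *b, struct seg *segs, int *nsegs)
{
	int used = 0;

	for (; b != NULL; b = b->next) {
		if (b->len == 0)
			continue;	/* empty buffers need no segment */
		if (used == *nsegs)
			return (EFBIG);
		segs[used].addr = b->addr;
		segs[used].len = b->len;
		used++;
	}
	*nsegs = used;
	return (0);
}
```

After a successful call the driver would copy segs[0] through segs[*nsegs - 1] into its DMA descriptors.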

Once the device signals the transfer is done, the driver calls
bus_dmamap_unload_mbuf() and bus_dmamap_destroy_mbuf() to unload
the maps and return them to the map list for later use. It isn't
until the driver calls bus_dmamap_list_destroy() that the dmamaps
are actually released and the list free()ed.

-Bill




Re: Where to put new bus_dmamap_load_mbuf() code

2001-08-22 Thread Bill Paul

> >My understanding is that you need a dmamap for every buffer that you want
> >to map into bus space.
> 
> You need one dmamap for each independently manageable mapping.  A
> single mapping may result in a long list of segments, regardless
> of whether you have a single KVA buffer or multiple KVA buffers
> that might contribute to the mapping.

Yes yes, I understand that. But that's only if you want to map
a buffer that's larger than PAGE_SIZE bytes, like, say, a 64K
buffer being sent to a disk controller. What I want to make sure
everyone understands here is that I'm not typically dealing with
buffers this large: instead I have lots of small buffers that are
smaller than PAGE_SIZE bytes. A single mbuf alone is only 256
bytes, of which only a fraction is used for data. An mbuf cluster
buffer is usually only 2048 bytes. Transmitted packets are typically
fragmented across 2 or 3 mbufs: the first mbuf contains the header,
and the other two contain data. (Or the first one contains part
of the header, the second one contains additional header data,
and the third contains data -- whatever.) At most I will have 1500
bytes of data to send, which is less than PAGE_SIZE, and that 1500
bytes will be fragmented across a bunch of smaller buffers that
are also smaller than PAGE_SIZE. Therefore I will not have one
dmamap with multiple segments: I will have a bunch of dmamaps
with one segment each.

(I can hear somebody out there saying: "What about jumbo frames?"
Yes, with jumbo frames, I will have 9K buffers to deal with, and
in that case, you could have one dmamap with several segments, and
I am taking this into account with the updated code I've written.)
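
To put numbers on the size argument above, here is a small sketch that counts how many pages a buffer touches. It assumes 4K pages (as on i386) and that no two pages are physically adjacent, so the page count is only an upper bound on the number of segments; SKETCH_PAGE_SIZE and pages_spanned() are names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4K pages */

/*
 * Number of pages touched by a buffer of len bytes starting at addr.
 * If no two pages are physically contiguous, this is also the number
 * of segments a single dmamap would need for the buffer.
 */
static size_t
pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return (0);
	first = addr / SKETCH_PAGE_SIZE;
	last = (addr + len - 1) / SKETCH_PAGE_SIZE;
	return ((size_t)(last - first + 1));
}
```

A 2048-byte cluster that starts on a 2048-byte boundary always lands in one page, while a page-aligned 9K jumbo buffer touches three. A 2048-byte buffer that starts near the end of a page would touch two, which is why the alignment assumption matters.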

> >So unless I'm mistaken, for each mbuf in an mbuf list, what we
> >have to do is this:
> >
> >- create a bus_dmamap_t for the data area in the mbuf using
> >  bus_dmamap_create()
> 
> Creating a dmamap, depending on the architecture, could be expensive.
> You really want to create them in advance (or pool them), with at most
> one dmamap per concurrent transaction you support in your driver.

The only problem here is that I can't really predict how many transactions
will be going at one time. I will have at least RX_DMA_RING maps (one for
each mbuf in the RX DMA ring), and some fraction of TX_DMA_RING maps.
I could have the TX DMA ring completely filled with packets waiting
to be DMA'ed and transmitted, or I may have only one entry in the ring
currently in use. So I guess I have to allocate RX_DMA_RING + TX_DMA_RING
dmamaps in order to be safe.

> >- do the physical to bus mapping with bus_dmamap_load()
> 
> bus_dmamap_load() only understands how to map a single buffer.
> You will have to pull pieces of bus_dmamap_load into a new
> function (or create inlines for common bits) to do this
> correctly.  The algorithm goes something like this:
> 
>   foreach mbuf in the mbuf chain to load
>   /*
>* Parse this contiguous piece of KVA into
>* its bus space regions.
>*/
>   foreach "bus space" discontiguous region
>   if (too_many_segs)
>   return (error);
>   Add new S/G element
> 
> With the added complications of deferring the mapping if we're
> out of space, issuing the callback, etc.

Why can't I just call bus_dmamap_load() multiple times, once for
each mbuf in the mbuf list?

(Note: for the record, an mbuf list usually contains one packet
fragmented across multiple mbufs. An mbuf chain contains several
mbuf lists, linked together via the m_nextpkt pointer in the
header of the first mbuf in each list. By the time we get to
the device driver, we always have mbuf lists only.)
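
The list/chain distinction in the note above can be shown with a toy structure. struct toy_mbuf is a stand-in invented for this sketch; only the m_next and m_nextpkt link names are borrowed from the real mbuf header.

```c
#include <stddef.h>

/* Toy stand-in for struct mbuf: only the link fields matter here. */
struct toy_mbuf {
	struct toy_mbuf	*m_next;	/* next fragment of the same packet */
	struct toy_mbuf	*m_nextpkt;	/* first mbuf of the next packet */
	size_t		 m_len;		/* bytes of data in this fragment */
};

/* Total data bytes in one packet: walk the list via m_next. */
static size_t
packet_len(const struct toy_mbuf *m)
{
	size_t total = 0;

	for (; m != NULL; m = m->m_next)
		total += m->m_len;
	return (total);
}

/* Number of packets queued: walk the chain via m_nextpkt. */
static size_t
packet_count(const struct toy_mbuf *m)
{
	size_t n = 0;

	for (; m != NULL; m = m->m_nextpkt)
		n++;
	return (n);
}
```

A driver that is handed one packet at a time only ever needs the first kind of walk.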

> Chances are you are going to use the map again soon, so destroying
> it on every transaction is a waste.

Ok, I spent some more time on this. I updated the code at:

http://www.freebsd.org/~wpaul/busdma

The changes are:

- Tried to account for the case where an mbuf data region is larger
  than a page, i.e. when we have an mbuf with a 9K external buffer
  attached for use as a jumbo ethernet frame.
- Added routines to allocate a chunk of maps in a singly linked list,
  from which the other routines can grab them as needed. The driver
  attach routine calls bus_dmamap_list_init() with the max number of
  dmamaps that it will need, then the detach routine calls
  bus_dmamap_list_destroy() to nuke them when the driver is unloaded.
  The bus_dmamap_load_mbuf() routine uses the pre-allocated dmamaps
  from the list and bus_dmamap_destroy_mbuf() returns them to the
  list when the transaction is completed.
- Updated the modified if_sf driver to use the new code.
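
The pre-allocated map list in the second item above is essentially a free list. A rough user-space sketch of the idea follows, using malloc() in place of bus_dmamap_create(); every name in it (struct map_ent, struct map_pool, pool_init() and friends) is hypothetical and not taken from the posted code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool entry; `map` stands in for a bus_dmamap_t. */
struct map_ent {
	struct map_ent	*next;
	int		 map;
};

struct map_pool {
	struct map_ent	*free;	/* singly linked list of idle entries */
};

/* Detach time: release every idle entry. */
static void
pool_destroy(struct map_pool *p)
{
	while (p->free != NULL) {
		struct map_ent *e = p->free;

		p->free = e->next;
		free(e);
	}
}

/* Attach time: pre-allocate count entries. Returns 0, or -1 on failure. */
static int
pool_init(struct map_pool *p, int count)
{
	int i;

	p->free = NULL;
	for (i = 0; i < count; i++) {
		struct map_ent *e = malloc(sizeof(*e));

		if (e == NULL) {
			pool_destroy(p);
			return (-1);
		}
		e->map = i;
		e->next = p->free;
		p->free = e;
	}
	return (0);
}

/* Per transaction: take an idle entry, or NULL if none are left. */
static struct map_ent *
pool_get(struct map_pool *p)
{
	struct map_ent *e = p->free;

	if (e != NULL) {
		p->free = e->next;
		e->next = NULL;
	}
	return (e);
}

/* Transaction complete: hand the entry back for reuse. */
static void
pool_put(struct map_pool *p, struct map_ent *e)
{
	e->next = p->free;
	p->free = e;
}
```

The point of the arrangement is that the expensive create and destroy steps happen once per attach/detach, while each transaction only pays for a pointer swap.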

Again, I've got this code running on the test box in the lab, so it's
correct inasmuch 

Re: Where to put new bus_dmamap_load_mbuf() code

2001-08-22 Thread Bill Paul


> The fact that the data is less than a page in size matters little
> to the bus dma concept.  In other words, how is this packet presented
> to the hardware?  Does it care that all of the component pieces are
> < PAGE_SIZE in length?  Probably not.  It just wants the list of
> address/length pairs that compose that packet and there is no reason
> that each chunk needs to have it own, and potentially expensive, dmamap.

Maybe, but bus_dmamap_load() only lets you map one buffer at a time.
I want to map a bunch of little buffers, and the API doesn't let me
do that. And I don't want to change the API, because that would mean
modifying busdma_machdep.c on each platform, which is a hell that I
would rather avoid.

> >Why can't I just call bus_dmamap_load() multiple times, once for
> >each mbuf in the mbuf list?
> 
> Due to the cost of the dmamaps, the cost of which is platform and
> bus-dma implementation dependent - e.g. could be a 1-1 mapping to
> a hardware resource.  Consider the case of having a full TX and RX
> ring in your driver.  Instead of #TX*#RX dmamaps, you will now have
> three or more times that number.
> 
> There is also the issue of coalescing the discontiguous chunks if
> there are too many chunks for your driver to handle.  Bus dma is
> supposed to handle that for you (the x86 implementation doesn't
> yet, but it should) but it can't if it doesn't understand the segment
> limit per transaction.  You've hidden that from bus dma by using a
> map per segment.

Ok, a slightly different question: what happens if I call
bus_dmamap_load() more than once with different buffers but with
the same dmamap?

> >(Note: for the record, an mbuf list usually contains one packet
> >fragmented across multiple mbufs. An mbuf chain contains several
> >mbuf lists, linked together via the m_nextpkt pointer in the
> >header of the first mbuf in each list. By the time we get to
> >the device driver, we always have mbuf lists only.)
> 
> Okay, so I haven't written a network driver yet, but you got the idea,
> right? 8-)

Just don't get 3c509 and 3c905 mixed up and we'll be fine. :)

> >- Added routines to allocate a chunk of maps in a singly linked list,
> >  from which the other routines can grab them as needed.
> 
> Are these hung off the dma tag or something?  dmamaps may hold settings
> that are peculiar to the device that allocated them, so they cannot
> be shared with other clients of bus_dmamap_load_mbuf.

It's a separate list. The driver is responsible for allocating the
head of the list, then it hands it to bus_dmamap_list_alloc() along
with the required dma tag. bus_dmamap_list_alloc() then calls
bus_dmamap_create() to populate the list. The driver doesn't have
to manipulate the list itself, until time comes to destroy it.

-Bill




Re: cvs commit: src/sys/pci if_dc.c

2000-09-03 Thread Bill Paul

> Hello Bill,
> 
> After the following commit, my system fails to connect to the network.
> If I back it out, it seems to work again. Any comments appreciated.

No no no. *You* are the one who's supposed to make the comments.
Like exactly what card do you have (make/model)? Exactly what speed
and duplex mode are you using? (10mbps? 100mbps? full duplex? half
duplex?) What ifconfig command do you use to bring up the interface?
What kind of hub/switch/whatever is the card connected to? You know,
all the stuff that I can't figure out for myself because I can't see
your computer from way over here.

-Bill 





Re: cvs commit: src/sys/pci if_dc.c

2000-09-05 Thread Bill Paul

> Hello Bill,
> 
> I'm sorry about that. Here's some information that I can gather:
> 1. The Intel 21143 chip is integrated in a NEC VersaPro notebook PC.
>    No LEDs to indicate network activity are available.
> 
> 2. It is connected to 10BaseT Hub (HP 28688B) at half duplex.

Ok, two more things:

- Show me the output of pciconf -l.
- Is this supposed to be a 10/100 interface or just 10mbps?

-Bill 





Re: "Driver Floppy" implementation (Re: "make release" breakage - dokern.sh patch 2)

2000-11-01 Thread Bill Paul

>I'm not sure whether the problem of loading secondary usb modules is a
>problem in 4.x but it is easy to try.

>Boot a machine without usb support compiled in. after login, kldload
>usb, then the miibus and then the if_aue modules. If that works, you
>should be ok.

>I cannot test this at the moment as I don't have a STABLE box (will
>have once the first RC comes out of JKH factories).

I usually do the following:

# kldload usb   (probes USB controllers)
# kldload miibus
# kldload if_aue
# usbd -f /dev/usb0

If the device has already been plugged in, starting usbd will cause
it to be probed/attached by the aue driver. If not, it will be detected
when it's plugged in later.

-Bill





Re: if_rl.c broken ? Realtek 8139 not longer recognised.

2000-11-01 Thread Bill Paul

> Hi,
> 
> I have a realtek ethernet card. The normal dmesg is this:
> 
> rl0:  port 0xb400-0xb4ff mem 0xd900-0xd9ff irq 10 
>at device 11.0 on pci0
> rl0: Ethernet address: 00:e0:7d:7d:cd:35
> miibus0:  on rl0
> rlphy0:  on miibus0
> rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> 
> With the change to 8bit wide eeprom reads instead of the 6bit wide reads, the message
> is now:
> 
> rl0:  port 0xb400-0xb4ff mem 0xd900-0xd9ff irq 10 
>at device 11.0 on pci0
> rl0: Ethernet address: 00:e0:7d:7d:cd:35
> rl0: unknown device ID: 4a7
> 
> I changed if_rl.c to confirm that it really is the 6/8 bit change:

Just fixed this. It should be 0x8129 that we compare with, not 8129.
Sorry about that. Note that the cardbus hacks aren't in -stable yet
so it wasn't affected.
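
For anyone wondering how a missing 0x makes that much difference, here is a tiny sketch of the comparison. The is_8129() helper and the macro name are assumptions made for this example and are not meant to reproduce the actual if_rl.c code.

```c
#include <stdint.h>

/* Assumed name for the example; the value is the ID written in hex. */
#define RT_DEVICEID_8129	0x8129

/* Returns 1 if the 16-bit device ID read from the EEPROM is the 8129's. */
static int
is_8129(uint16_t did)
{
	return (did == RT_DEVICEID_8129);
}
```

Decimal 8129 is 0x1fc1, so a comparison against the bare number 8129 can never match a device that reports 0x8129.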

-Bill 





Re: vx driver patch

2000-11-06 Thread Bill Paul

> Someone (I can't find who in my records, please let me know if it was
> you so I can credit you in the commit message) sent out patches to
> make the vx driver not use the pci compat shims.  I just found it in
> my home directory, applied it, tweaked things very minorly and it
> builds and boots.  Trouble is, I don't have a vortex to test with.  It
> also appears that there is no driver maintainer at this time, so I
> thought I'd send it here.

Unfortunately, there are a couple of problems with this patch. Somebody
tried copying the EISA attachment code too closely: there's only one
I/O space that needs to be allocated (the pci_io allocation is bogus).
The IRQ allocation needs the RF_SHAREABLE flag or it will blow up in
the case where the IRQ is shared with another device. Also, the vx
driver still uses the ugly hack of statically allocated softc structs.

I was working on this in the office the other day and just got done
testing it. I have patches to fix all of this, plus make it use the
bus_space_*() stuff instead of inb/outb/etc, plus allow it to be compiled
as a KLD. The only thing I didn't do was implement detach routines,
which means the driver can be loaded as a KLD, but not unloaded.
The driver should also build on the alpha.

I'll commit the changes to -current shortly.

-Bill





Getting at cardbus CIS data from inside drivers

2000-11-20 Thread Bill Paul

Okay. Recently, David O'Brien handed me an Intel 10/100 Cardbus NIC,
which uses the 21143-PB chip. It's a non-MII card (has a Quality Semi
symbol PHY). Unfortunately, it looks like Intel has taken a few shortcuts
with this card: the serial EEPROM doesn't contain any useful information.
Instead, the MAC address and, I presume, the GPIO programming info is
stored in the CIS. When the card is inserted, the cardbus code prints
out several 'Function Extension' lines, one of which contains the MAC
address. The problem is, there's no way for me to obtain this info
from inside the driver, unless I map the expansion ROM directly and
grovel through the CIS myself, which I don't want to do.

I have the card working at the moment using a couple of ugly cheats:
I programmed the MAC address in manually using ifconfig dc0 ether blah,
and I brute forced the GPIO settings so that all of the pins are
configured as outputs and are forced to 1's. This seems to be enough
to activate the transceiver, and I can exchange traffic. (I'm composing
this e-mail with it right now.) The LED programming is still off though:
both LEDs are lit green, and stay on regardless of link indication or
speed.

Is there any support planned for externalizing the CIS info somehow,
i.e. by providing bus methods to call the CIS parsing routines? Another
way to do it would be to pass the info down to the child device using
ivars. I would imagine that there's similar support for this in Windows,
otherwise Intel's driver wouldn't work.

-Bill





Any people with 3c905CX cards out there?

2001-01-12 Thread Bill Paul

3Com has yet another revision of the Tornado chipset floating around out
there on newer 3c905C adapters. Supposedly, these are marked as 3c905CX
and have become available within the last couple of months. I've seen
some noise on the Linux mailing lists that seems to indicate that some
driver mods were necessary due to reset timing differences introduced
in the new chipset, however I haven't been able to get my hands on one
of these cards yet so I don't know whether or not there are also problems
with FreeBSD. Nobody has reported any yet, but it would be nice to
confirm the issue one way or the other. If someone has one of these
cards and is using it with the xl driver, I'd be interested to know
how well (or how badly) it's working.

-Bill





Re: Network/ARP problem? Maybe pn driver?

1999-01-29 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Christopher 
Masto had to walk into mine and say:

> On Fri, Jan 29, 1999 at 02:52:07PM -0800, Bill Fenner wrote:
> > Can you run a "tcpdump arp" on the machine that is having the problem,
> > as well?  This could help to determine if it's a driver problem (e.g.
> > if the replies don't show up) or an ARP problem (e.g. if the replies
> > do show up but arp doesn't use them).
> 
> Good idea.
> 
> Hmm.  Running tcpdump seems to make the problem go away.  The ARP
> replies show up and immediately appear in the table.  Clue.

You should have tried that first.

There's something I'd like you to try for me. (Don't delay in trying
this; I've had a long string of people who appear suddenly, complain
about a problem of some sort, then vanish before I can extract enough
information from them to find a solution.)

I was menaced by a bug in the PNIC's receive DMA operation which, according 
to all my tests, only appeared in promiscuous mode. I devised a workaround,
however it's only enabled when the IFF_PROMISC flag is set on the
interface. Running tcpdump (without the -p flag) places the interface
in promiscuous mode and enables the workaround. Given what you've said,
it's possible that we need to enable the workaround all the time, not
just in promiscuous mode.

Do the following for me:

- Bring up /sys/pci/if_pn.c in your favorite editor.
- Locate the pn_rxeof() function and find the following code:

#ifdef PN_PROMISC_BUG_WAR 
/*
 * XXX The PNIC seems to have a bug that manifests
 * when the promiscuous mode bit is set: we have to
 * watch for it and work around it.
 */
if (sc->pn_promisc_war && ifp->if_flags & IFF_PROMISC) {
[...]

- Change the if() clause so that it looks like this:

if (sc->pn_promisc_war /*&& ifp->if_flags & IFF_PROMISC*/) {

  (In other words, comment out the test for the IFF_PROMISC flag.)

This will enable the workaround all the time and allow the receiver bug 
to be detected and handled properly.

Compile a new kernel with this change and see if the problem persists.
Report back your findings (one way or the other) so that I'll know if
I should modify the code in the repository.
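
To make the effect of the change concrete, here is a minimal standalone
sketch of the two conditions. It is not the driver code:
SKETCH_IFF_PROMISC and both function names are invented for the
illustration, and war_enabled stands in for sc->pn_promisc_war.

```c
/* Illustrative stand-in for the interface flag; not taken from <net/if.h>. */
#define SKETCH_IFF_PROMISC 0x100

/* Original test: run the workaround only while the interface is promiscuous. */
static int
workaround_active_old(int war_enabled, int if_flags)
{
	return war_enabled && (if_flags & SKETCH_IFF_PROMISC);
}

/* Modified test: run the workaround whenever the driver has enabled it. */
static int
workaround_active_new(int war_enabled, int if_flags)
{
	(void)if_flags;		/* the flag is no longer consulted */
	return war_enabled != 0;
}
```

With the old test, an interface that is not in promiscuous mode never
takes the workaround path, which would fit the symptom going away only
while tcpdump is running.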

-Bill

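-- 
=============================================================================
-Bill Paul (212) 854-6020 | System Manager, Master of Unix-Fu
Work: wp...@ctr.columbia.edu | Center for Telecommunications Research
Home:  wp...@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
 "It is not I who am crazy; it is I who am mad!" - Ren Hoek, "Space Madness"
=============================================================================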



CAP port and non-IP multicast

1999-01-30 Thread Bill Paul

Somebody wrote me recently to tell me they were having trouble getting
the Columbia Appletalk package to work with a PCI ethernet card. Looking
through both the Columbia Appletalk code and the kernel, I think the
problem is a general one not necessarily related to a given ethernet
driver. I'm not sure what the proper fix is though.

The CAP code contains a module called cap60/support/ethertalk/bpfiltp.c
which contains library support code for libcap when the package is
built with EtherTalk Phase 2 support. As the name implies, it works
with BPF, but it also contains the pi_addmulti() routine. The aarpd
program uses this function to join the 09:00:07:ff:ff:ff multicast
group. Since this is not an IP multicast group, you have to specify
something besides AF_INET as the family when using SIOCADDMULTI to
join.

The question is, what should this something else be. In 2.2.x, you
have to use AF_UNSPEC, but in 3.x and up, you have to use AF_LINK.
The CAP port uses AF_UNSPEC in both cases, which is incorrect if
you're building the port on a 3.0 (or 4.0) host.

What's the right way to fix this? There are really two possibilities:
1) change bpfiltp.c so that it conditionally uses AF_UNSPEC or AF_LINK
depending on the OS release on which the port is being compiled, or
2) change sys/net/if_ethersubr.c so that it treats AF_UNSPEC and
AF_LINK the same. I expect changing the CAP code would be the more
'politically correct' approach, but it doesn't seem unreasonable to
allow backwards compatibility in the kernel code either.
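
For reference, the AF_UNSPEC form of the request that the 2.2.x-era code
expects can be sketched roughly as below. This is an illustration only,
not the CAP source: the helper names and SKETCH_ETHER_ADDR_LEN are made
up, the actual SIOCADDMULTI ioctl call is left out, and sa_len (on
systems that have it) is simply left at zero by the memset.

```c
#include <string.h>
#include <sys/socket.h>

#define SKETCH_ETHER_ADDR_LEN 6	/* hypothetical name for the sketch */

/* Nonzero if the low bit of the first octet (the group bit) is set. */
static int
is_ether_multicast(const unsigned char *addr)
{
	return (addr[0] & 1) == 1;
}

/*
 * Fill a generic sockaddr the way an AF_UNSPEC-style caller would:
 * family AF_UNSPEC, raw ethernet address at the start of sa_data.
 * Returns 0 on success, -1 if the address is not a multicast address.
 */
static int
fill_unspec_mcast(struct sockaddr *sa, const unsigned char *addr)
{
	if (!is_ether_multicast(addr))
		return -1;
	memset(sa, 0, sizeof(*sa));
	sa->sa_family = AF_UNSPEC;
	memcpy(sa->sa_data, addr, SKETCH_ETHER_ADDR_LEN);
	return 0;
}
```

The AF_LINK form is laid out differently (it is a struct sockaddr_dl), so
this helper should not be reused for that case simply by swapping the
family constant.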


-Bill



Re: Even more interesting NFS problems..

1999-01-31 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Jordan K.
Hubbard had to walk into mine and say:

> > Why is it clearly broken?  proto=tcp,vers=3 is what is in 3.0-RELEASE,
> > Amd in 3.0 works for many.  I won't defend that the new Amd works the
> > best with us, but then neither did the old Amd.
> 
> Erm, I haven't tried it between 3.0 and 3.0 boxes because all my test
> environments currently involve one of each (4.0 and 3.0), but I can
> certainly say that in none of these test environments does amd work at
> all.  On freefall, for example, it's really simple to demonstrate the
> error.  First, we start amd:
> 
> # amd -a /net -c 1800 -k i386 -d freebsd.org -l syslog /host /etc/amd.map

Err... On all of the machines where I use amd, I don't use -l syslog.
I use -l /tmp/.automsg (or some other filename that lusers aren't likely
to trip over). You get _MUCH_ more information this way. I strongly
suggest trying this and observing the results when you try to automount
something.

I've found that am-utils is much more verbose than previous versions of
amd so you may not want to leave it that way permanently, but you can't
beat it for troubleshooting.

-Bill



SIOCADDMULTI doesn't work, proposed fix

1999-01-31 Thread Bill Paul

After experimenting some more, I've come to the conclusion that trying to
manually add a non-IP ethernet multicast address doesn't work properly.
The ether_resolvemulti() function assumes that addresses will be specified as
either AF_LINK or AF_INET; if the family is AF_LINK, it assumes that
a struct sockaddr_dl will be used. However, the user is supposed to
pass the address using a struct ifreq, and struct ifreq uses struct
sockaddr, not struct sockaddr_dl.

The original code in 2.2.x expected a struct sockaddr with a family
of AF_UNSPEC. This no longer works in 3.0, which breaks compatibility.
Among other things, the Columbia Appletalk port doesn't work because
of this.

As an aside, the equal() macro in /sys/net/if.c does a bcmp() using
sa_len as the length of the data to check, but doesn't account for
the possibility of sa_len being 0 (this makes it always return true,
which can yield false positives).
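
The zero-length case is easy to demonstrate outside the kernel. The
sketch below is a stand-in rather than the kernel macro: it uses memcmp()
in place of bcmp(), takes the length as an explicit argument instead of
reading sa_len, and the function names are invented.

```c
#include <string.h>

/* Unguarded comparison, in the spirit of the original macro. */
static int
equal_unguarded(const void *a1, const void *a2, size_t len)
{
	return memcmp(a1, a2, len) == 0;
}

/* Guarded comparison: a zero length never counts as a match. */
static int
equal_guarded(const void *a1, const void *a2, size_t len)
{
	return len != 0 && memcmp(a1, a2, len) == 0;
}
```

Comparing zero bytes of two unrelated buffers reports a match in the
unguarded version and a mismatch in the guarded one, which is the
behavior the if.c change is after.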

The patches included with this post change /sys/net/if.c and
/sys/net/if_ethersubr.c so that adding a multicast address with
SIOCADDMULTI using a struct sockaddr and AF_UNSPEC works again. I would 
like Those Who Know More Than I (tm) to review these changes and offer 
criticisms and comments.

These patches are against 3.0-RELEASE but should apply to -current
and -stable as well.

-Bill


*** if.c.orig	Sun Jan 31 17:13:01 1999
--- if.c	Sun Jan 31 17:10:36 1999
***************
*** 186,192 ****
  	register struct ifaddr *ifa;
  
  #define	equal(a1, a2) \
! 	(bcmp((caddr_t)(a1), (caddr_t)(a2), ((struct sockaddr *)(a1))->sa_len) == 0)
  	for (ifp = ifnet.tqh_first; ifp; ifp = ifp->if_link.tqe_next)
  	    for (ifa = ifp->if_addrhead.tqh_first; ifa;
  		 ifa = ifa->ifa_link.tqe_next) {
--- 186,193 ----
  	register struct ifaddr *ifa;
  
  #define	equal(a1, a2) \
! 	(((struct sockaddr *)(a1))->sa_len && \
! 	 bcmp((caddr_t)(a1), (caddr_t)(a2), ((struct sockaddr *)(a1))->sa_len) == 0)
  	for (ifp = ifnet.tqh_first; ifp; ifp = ifp->if_link.tqe_next)
  	    for (ifa = ifp->if_addrhead.tqh_first; ifa;
  		 ifa = ifa->ifa_link.tqe_next) {
***************
*** 636,642 ****
  			return EOPNOTSUPP;
  
  		/* Don't let users screw up protocols' entries. */
! 		if (ifr->ifr_addr.sa_family != AF_LINK)
  			return EINVAL;
  
  		if (cmd == SIOCADDMULTI) {
--- 637,644 ----
  			return EOPNOTSUPP;
  
  		/* Don't let users screw up protocols' entries. */
! 		if (ifr->ifr_addr.sa_family != AF_LINK &&
! 		    ifr->ifr_addr.sa_family != AF_UNSPEC)
  			return EINVAL;
  
  		if (cmd == SIOCADDMULTI) {
*** if_ethersubr.c.orig	Sun Jan 31 17:13:07 1999
--- if_ethersubr.c	Sun Jan 31 17:00:54 1999
***************
*** 778,783 ****
--- 778,800 ----
  	u_char *e_addr;
  
  	switch(sa->sa_family) {
+ 	case AF_UNSPEC:
+ 		e_addr = (u_char *)&sa->sa_data;
+ 		if ((e_addr[0] & 1) != 1)
+ 			return EADDRNOTAVAIL;
+ 		MALLOC(sdl, struct sockaddr_dl *, sizeof *sdl, M_IFMADDR,
+ 		       M_WAITOK);
+ 		sdl->sdl_len = sizeof *sdl;
+ 		sdl->sdl_family = AF_LINK;
+ 		sdl->sdl_index = ifp->if_index;
+ 		sdl->sdl_type = IFT_ETHER;
+ 		sdl->sdl_nlen = 0;
+ 		sdl->sdl_alen = ETHER_ADDR_LEN;
+ 		sdl->sdl_slen = 0;
+ 		e_addr = LLADDR(sdl);
+ 		bcopy((char *)&sa->sa_data, (char *)e_addr, ETHER_ADDR_LEN);
+ 		*llsa = (struct sockaddr *)sdl;
+ 		return 0;
  	case AF_LINK:
  		/*
  		 * No mapping needed. Just check that it's a valid MC address.



Re: SIOCADDMULTI doesn't work, proposed fix

1999-01-31 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Garrett Wollman
had to walk into mine and say:

> <  said:
> 
> > a struct sockaddr_dl will be used. However, the user is supposed to
> > pass the address using a struct ifreq, and struct ifreq uses struct
> > sockaddr, not struct sockaddr_dl.
> 
> This is called ``poor man's inheritance''.
> 
> I believe it is an error for any code to use AF_UNSPEC for any purpose
> other than masks (where it makes sense since the address family is
> normally not included in the mask).  A `sockaddr_dl', while by default
> longer than a `sockaddr', in this case will fit within the structure
> allotted.
> 
> In the future, I fully expect that `sockaddr' will be of maximal
> length (we need this for IPv6).

There's still one small problem: the code as it stands now can return
success and still not update the multicast filter. If you pass a structure
with AF_LINK as the family but with the length set to 0, if_addmulti()
falsely detects that the entry already matches an existing one and
returns success (the equal() macro equates to a bcmp(), which tries
to compare 0 bytes worth of data and returns success). In my opinion,
this is a bug: either the equal() macro should return false, or the
zero length field should be detected by a sanity check and the function
should return EINVAL.
 
> > The patches included with this post change /sys/net/if.c and
> > /sys/net/if_ethersubr.c so that adding a multicast address with
> > SIOCADDMULTI using a struct sockaddr and AF_UNSPEC works again. I would 
> > like Those Who Know More Than I (tm) to review these changes and offer 
> > criticisms and comments.
> 
> There are two things which should be done here.
> 
> First, the kernel AppleTalk code should be fixed to join the necessary
> multicast groups when an interface is first configured for AppleTalk.
> (By preference the AARP implementation should be entirely in the
> kernel as well, but that's more of a challenge.)  Second, the generic
> ether_resolvemulti function should be enhanced to know about AppleTalk
> multicast addresses.

The Columbia Appletalk code is not the same as netatalk: it's implemented
entirely in user space and uses BPF as well as manually joining multicast
groups. The existing Columbia Appletalk port, which works on 2.2.x, uses
SIOCADDMULTI with a family of AF_UNSPEC. I rifled through a bunch of
man pages in 3.0-RELEASE trying to find the Right Way To Do This (tm)
but came up empty. If the right way to do this is to cast the struct 
sockaddr to a struct sockaddr_dl and use AF_LINK, then this should be
documented somewhere. (If it is documented and I missed it, feel free
to slap me around and point me in the right direction.)
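
The size argument behind the cast can be checked with a little
arithmetic. The structure below is a hand-written mirror modeled on the
4.4BSD-style sockaddr_dl layout, and the 16-byte figure for struct
sockaddr is likewise an assumption of the sketch; neither is copied from
the 3.0 headers, and every sketch_/SKETCH_ name is invented.

```c
#include <stddef.h>

/* Illustrative mirror of a 4.4BSD-style link-level sockaddr. */
struct sketch_sockaddr_dl {
	unsigned char	sdl_len;	/* total length of the structure */
	unsigned char	sdl_family;	/* AF_LINK */
	unsigned short	sdl_index;	/* interface index, if any */
	unsigned char	sdl_type;	/* interface type */
	unsigned char	sdl_nlen;	/* interface name length */
	unsigned char	sdl_alen;	/* link-level address length */
	unsigned char	sdl_slen;	/* link-layer selector length */
	char		sdl_data[12];	/* name, then address, then selector */
};

#define SKETCH_SOCKADDR_SIZE	16	/* assumed size of struct sockaddr */
#define SKETCH_ETHER_ADDR_LEN	6

/* Bytes needed for a nameless entry carrying one ethernet address. */
static size_t
dl_bytes_needed(void)
{
	return offsetof(struct sketch_sockaddr_dl, sdl_data) +
	    SKETCH_ETHER_ADDR_LEN;
}
```

Under these assumptions the header takes 8 bytes and a bare ethernet
address 6 more, so the 14 bytes that matter fit in a 16-byte struct
sockaddr even though the full structure is larger.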

-Bill



Re: problem with vr0

1999-02-02 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Chia-liang Kao
had to walk into mine and say:
 
> I have problem with my newly bought D-link DFE530TX on my -current
> (which is very new).
> 
> I have tested my NIC using the master/slave mode program came with my
> NIC with my room mate. And the results show the NIC work correctly.
>
> The most strange thing is that I can see the ethernet address of the
> other ip, see the following information. But I can't get the interface
> to work at all.

AG!!! I really don't want to get mad at you personally, but this
is really starting to annoy me. Virtually every time anybody reports a
problem, the only thing they ever say is "it doesn't work." WHAT DOESN'T
WORK EXACTLY!?! Describe the problem(s)!! Show us examples!! Show us
error messages!! Does it catch fire?! Does it spit pea soup at you and
speak in tongues?! Does it lie around the house all day and refuse to
cut its hair and get a job!? WHAT!!

You have not explained exactly what is going wrong. You have not 
explained what it is that you're trying to do which isn't working.
You have not explained how you came to the conclusion that the card
"isn't working." Show us what happens if you type 'ping 192.168.100.1'. 
Don't attempt to paraphrase the error messages: quote them exactly.
Does ping not illustrate the problem accurately? Fine: choose another
example and show us the results.

-Bill



Re: problem with vr0

1999-02-02 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Chia-liang Kao
had to walk into mine and say:
 
> I'm really sorry about this, Bill. I'll be very careful and make a
> recheck before sending out problem report next time. And really thank
> you for shouting at me instead of leaving my problem alone.

Shouting is my specialty. I get a lot of practice.
 
> I did a `ping 192.168.100.1', and there is no response and no messages
> at all. I think the most interesting part of this is that I can see
> both of the lights on the hub blinking when I ping 192.168.100.1;
> while only the light of the other side blinks when he pings me.

What kind of hub is this?

> So we're starting to doubt the problem is the receiving function of my
> side.  And we test again with `trafshow'. Then I found he does receive
> my packet and replies when I ping him, while I can only see the
> packets I sent out but no packets from his side.
> 
> But sometimes it works for a tiny second, like the following:
> 
> # traceroute i1
> traceroute to i1 (192.168.100.1), 30 hops max, 40 byte packets
>  1  i1 (192.168.100.1)  0.720 ms * *

Are you using any unusual networking tricks, like network address 
translation or firewalling or IP aliasing? People tend to forget to 
mention things like that. There are some things I'm curious about:

- What does netstat -in show? Are there any input errors? Are there
  any input packets? (If the input packet counter keeps incrementing
  then it has to be receiving something.)

- Do you see any suspicious messages when you do a dmesg to look at the
  kernel message buffer? The vr driver should report receive errors if
  it encounters any.

- If you run tcpdump on the vr0 interface (tcpdump -n -e -i vr0) can
  you see the traffic from the other host? Try the following:

# arp -d 192.168.100.1
# tcpdump -n -e -i vr0 &
# ping -c 5 192.168.100.1

  Show us the output.

-Bill



Re: problem with vr0

1999-02-03 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Chia-liang Kao
had to walk into mine and say:

> * From: Bill Paul 
> * Date: Wed, 3 Feb 1999 00:24:27 -0500 (EST)
> *  
> * > I did a `ping 192.168.100.1', and there is no response and no messages
> * > at all. I think the most interesting part of this is that I can see
> * > both of the lights on the hub blinking when I ping 192.168.100.1;
> * > while only the light of the other side blinks when he pings me.
> * 
> * What kind of hub is this?
> 
> It's a nonaccredited 5-port 10Base-T hub which we used to connect
> outside world via another interface (my de0 and his ed0). And when
> we're trying to use this hub for internal connection only via both of
> our newly bought dfe530s, we're in trouble.

Whoa whoa. Wait a minute; stop right there. Let me see if I understand
this. You have a 5 port hub. One port has the connection that links you
to the outside world (it goes to your router/switch/whatever). A
second port connects to your machine at de0. A third port connects to
your roommate's machine on his ed0.

And you have your vr0 interface and your roommate's vr0 interface both 
connected to this _same_ hub as well? (See, this is why I yell: I can see
how somebody might try this and not think that it might cause problems.
If I was right there looking at your systems I could probably spot this
immediately, but it was only blind luck that you happened to mention
it now, otherwise I could have spent months going back and forth with
you via e-mail before finally dragging this piece of information out
of you.)

Uhm. I dunno. That doesn't seem right somehow. It adds another variable
that has to be accounted for. The problem here is that when one of you 
sends a packet, it will end up a) delivered to _two_ interfaces on the
target host and b) it will be echoed back to the other interface on the
source host. Remember: an ordinary hub just retransmits whatever it hears 
on one port to every other port. Given that you don't seem to be
experiencing any transmit or receive errors on the vr0 interface, I get 
the feeling that this configuration may be contributing to the problem 
somehow.

You need to do one of three things to test to see if this is your problem:

- Obtain (purchase/borrow/steal) a second hub, and connect all the 
  192.168.100 interfaces to it all by themselves.

- Connect your vr0 interface to your roommate's vr0 interface directly 
  using a crossover cable. (A crossover cable has the transmit and receive
  pairs reversed on one end.)

- Temporarily unplug your de0 interface and his ed0 interface from the
  hub and leave just the vr0 interfaces plugged in. Use arp -d to remove
  each other's ARP entries from your respective ARP caches so that we
  start fresh. If you can successfully ping each other via the 
  192.168.100 interfaces and exchange traffic, then you have found the 
  problem. (This is the easiest test, and it doesn't cost anything. :)

If you had a _switch_ instead of a hub, then your configuration would
probably work because a switch will only deliver traffic to one port
(the port where the interface with the destination ethernet address is
attached) instead of all ports. (Except for broadcasts and multicasts,
without extra configuration.)

At least, that's my suspicion.
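
The difference between the two devices can be modeled in a few lines.
This is a toy sketch with invented names (hub_forward, switch_forward,
SKETCH_NPORTS), not a description of any particular product; a negative
dst_port is used to stand for a broadcast, a multicast, or a destination
the switch has not learned yet.

```c
#define SKETCH_NPORTS 5

/*
 * Hub: a frame arriving on in_port is repeated on every other port.
 * Sets deliver[i] to 1 for each port that sees the frame and returns
 * the number of such ports.
 */
static int
hub_forward(int in_port, int deliver[SKETCH_NPORTS])
{
	int i, n = 0;

	for (i = 0; i < SKETCH_NPORTS; i++) {
		deliver[i] = (i != in_port);
		n += deliver[i];
	}
	return n;
}

/*
 * Switch: a unicast frame goes only to the port where the destination
 * address was learned (dst_port). A negative dst_port is flooded to
 * every other port, just as a hub would do.
 */
static int
switch_forward(int in_port, int dst_port, int deliver[SKETCH_NPORTS])
{
	int i, n = 0;

	if (dst_port < 0)
		return hub_forward(in_port, deliver);
	for (i = 0; i < SKETCH_NPORTS; i++) {
		deliver[i] = (i == dst_port && i != in_port);
		n += deliver[i];
	}
	return n;
}
```

On a 5-port hub every frame is seen on the four other ports, while the
switch model hands a learned unicast frame to exactly one port.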

> * - What does netstat -in show? Are there any input errors? Are there
> *   any input packets? (If the input packet counter keeps incrementing
> *   then it has to be receiving something.)
> * 
> 
> There are some Ipkts but very few as you can see in the following.
> 
> # netstat -in
> Name  Mtu   Network   AddressIpkts IerrsOpkts Oerrs  Coll
> de0   150000.80.c8.46.1e.d4   313987 3411118717   185  2651
> de0   1500  140.112.240/2 140.112.240.59313987 3411118717   185  2651
> vr0   150000.80.c8.ef.82.09   16 015804 0 0
> vr0   1500  192.168.100   192.168.100.2 16 015804 0 0

Hm... No transmit or receive errors. I wonder what all the output traffic is
though.
 
> * - If you run tcpdump on the vr0 interface (tcpdump -n -e -i vr0) can
> *   you see the traffic from the other host? Try the following:
> * 
> * # arp -d 192.168.100.1
> * # tcpdump -n -e -i vr0 &
> * # ping -c 5 192.168.100.1
> * 
> *   Show us the output.
> 
> PING 192.168.100.1 (192.168.100.1): 56 data bytes
> 14:32:35.481753 0:80:c8:ef:82:9 ff:ff:ff:ff:ff:ff 0806 60: arp who-has 
> 192.168.100.1 tell 192.168.100.2
> 14:32:36.486348 0:80:c8:ef:82:9 ff:ff:ff:ff:ff:ff 0806 60: arp who-has 
> 192.168.100.1 tell 192.168.100.2
> 14:32:36.486561 0:80:c8:ef:3c:3f 0:80:c8:ef:82:9 0806 60: arp reply 
> 192.168.100.1 is-at 0:80:c8:ef:3c:3f
> 14:32:36.486625 0:80:c8:ef:82:9 0:80:c8:ef:3c:3f 08

Re: problem with vr0

1999-02-03 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Chia-liang Kao
had to walk into mine and say:

> * And you have your vr0 interface and your roommate's vr0 interface both 
> * connected to this _same_ hub as well? (See, this is why I yell: I can see
> * how somebody might try this and not think that it might cause problems.
> * If I was right there looking at your systems I could probably spot this
> * immediately, but it was only blind luck that you happened to mention
> * it now, otherwise I could have spent months going back and forth with
> * you via e-mail before finally dragging this piece of information out
> * of you.)
> 
> Certainly not, sorry that I didn't specify precisely. I meant we used
> the hub very well connecting us and the outside world, and then we
> decided to use the hub for internal connection only. So the hub is now
> connecting our vr0's and nothing else. (Of course, the power adapter
> is connected. :)

Ah, okay. My bad. It sure looked like you were saying you had everything
attached to the same hub.

> We even swapped our cards and the result (the ping/trafshow test) is the same.
> 
> Also, the vr0 currently on my box was originally his, and he used the
> card to connect outside world in the past. Shouldn't be a kernel
> issue, since I have tried to get it right by booting his kernel.

What kind of machine/CPU does your friend have?
 
> Anyway, I'll try the first two tests tomorrow. (Ya, you know it, I'll
> steal one.)
> 
> * > vr0   150000.80.c8.ef.82.09  16 015804 0 0
> * > vr0   1500  192.168.100   192.168.100.216 015804 0 0
> * 
> * Hm... No transmit or receive errors. I wonder what all the output traffic is
> * though.
> 
> When I ping him, he can receive my packets and replies, while I can't
> get his reply. I think that's where the output packets came from. (ie
> the icmp outgoing packets when I ping him). And `netstat -in' on his box
> shows the input and output packets on vr0 are nearly identical.

Hm. I have some more questions:

- In your first posting, you mentioned this:
  vr0:  rev 0x06 int a irq 12 on pci0.19.0
  IRQ 12 is normally used by the mouse (if you have a PS/2 mouse). Do you
  have a mouse or PS/2 mouse port on this machine? (I suspect you don't but
  I have to ask.)

- How many PCI bus slots does your machine have?

- Have you tried putting the vr0 card in a different slot? Have you tried
  putting it in the slot where the de0 card is now?

- What PCI chipset do you have? The test machine in which I currently have
  my sample VIA Rhine card installed is an Intel Pentium 200 system that
  says the following:

  chip0  rev 1 on pci0:0:0
  chip1  rev 1 on pci0:7:0
  chip2  rev 0 on pci0:7:1
  [...]
  vr0  rev 6 int a irq 9 on pci0:15:0 
  vr0: Ethernet address: 00:a0:0c:c0:01:e7
  vr0: autoneg complete, no carrier

- Can you show me the output of the following:

  pciconf -r pci0:19:0 0xc

  I want to see what the latency timer setting looks like.

This may be something to do with your particular PCI chipset or motherboard;
unfortunately, I have only Intel systems here so it's hard to duplicate
your exact setup.

-Bill




Re: problem with vr0

1999-02-03 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Chia-liang Kao
had to walk into mine and say:
[chop]

> * - Can you show me the output of the following:
> * 
> *   pciconf -r pci0:19:0 0xc
> * 
> *   I want to see what the latency timer setting looks like.
> It shows `0x2008'

Hmm... Alright, I have a patch I'd like you to try. I don't know that
this will really have an effect, but I'm curious to see what it does.
If this doesn't work, then the only other thing I can think of is if
you can give me login access to your machine so that I can try some
experiments.
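
As an aside, the 0x2008 value decodes as follows, assuming the standard
PCI configuration header layout in which the 32-bit word at offset 0x0c
holds the cache line size in its low byte and the latency timer in the
next byte. The function names are invented for this sketch, and unlike
the patch below, bump_latency() preserves the upper 16 bits of the word.

```c
#include <stdint.h>

/* Latency timer: bits 8-15 of the 32-bit word at config offset 0x0c. */
static unsigned int
pci_latency(uint32_t reg)
{
	return (reg & 0xFF00u) >> 8;
}

/* Cache line size: bits 0-7 of the same word. */
static unsigned int
pci_cacheline(uint32_t reg)
{
	return reg & 0xFFu;
}

/*
 * Raise the latency timer to 64 (0x40) if it is currently lower,
 * leaving the other three bytes of the word untouched.
 */
static uint32_t
bump_latency(uint32_t reg)
{
	if (pci_latency(reg) < 64)
		reg = (reg & ~(uint32_t)0xFF00u) | 0x4000u;
	return reg;
}
```

For 0x2008 this gives a latency timer of 0x20 (32) and a cache line size
of 8, so the "lat < 64" branch would be taken and the word would become
0x4008.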

Anyway, to apply the patch, do the following:

- Save this message to /tmp/vr.patch (or something similar).
- Become root.
- Type the following:

# cd /sys/pci
# patch < /tmp/vr.patch

- Compile a new kernel and boot it.

Let me know if this has any effect on the card's behavior.

-Bill


*** ../CVSWORK/sys_pci/if_vr.c	Mon Feb  1 16:25:52 1999
--- if_vr.c	Wed Feb  3 16:11:24 1999
***************
*** 899,905 ****
  	vm_offset_t		pbase, vbase;
  #endif
  	u_char			eaddr[ETHER_ADDR_LEN];
! 	u_int32_t		command;
  	struct vr_softc		*sc;
  	struct ifnet		*ifp;
  	int			media = IFM_ETHER|IFM_100_TX|IFM_FDX;
--- 899,905 ----
  	vm_offset_t		pbase, vbase;
  #endif
  	u_char			eaddr[ETHER_ADDR_LEN];
! 	u_int32_t		command, lat;
  	struct vr_softc		*sc;
  	struct ifnet		*ifp;
  	int			media = IFM_ETHER|IFM_100_TX|IFM_FDX;
***************
*** 988,993 ****
--- 988,1002 ----
  		goto fail;
  	}
  
+ 	/* bump up the latency timer a little */
+ 	command = pci_conf_read(config_id, VR_PCI_LATENCY_TIMER);
+ 	lat = (command & 0xFF00) >> 8;
+ 	if (lat < 64) {
+ 		command &= 0x00FF;
+ 		command |= 0x4000;
+ 		pci_conf_write(config_id, VR_PCI_LATENCY_TIMER, command);
+ 	}
+ 
  	/* Reset the adapter. */
  	vr_reset(sc);
  
***************
*** 1675,1680 ****
--- 1684,1692 ----
  
  	VR_CLRBIT(sc, VR_TXCFG, VR_TXCFG_TX_THRESH);
  	VR_SETBIT(sc, VR_TXCFG, VR_TXTHRESH_STORENFWD);
+ 
+ 	/* Adjust configuration a little */
+ 	CSR_WRITE_2(sc, VR_BCR0, 0x0006);
  
  	/* Init circular RX list. */
  	if (vr_list_rx_init(sc) == ENOBUFS) {



Re: NIS woes

1999-02-07 Thread Bill Paul

Of all the gin joints in all the towns in all the world, Dag-Erling
Smorgrav had to walk into mine and say:

> I have a very simple NIS configuration at home: niobe is the server
> and luna, my scratch box, is the client. Niobe runs 4.0-CURRENT, and
> luna runs 3.0-RELEASE until 'make world' finishes on niobe so I can
> make installworld over NFS. In addition to being the NIS and NFS
> server, niobe is also its own NIS client, and I have no trouble at all
> looking up NIS maps on niobe.
> 
> Luna, however, seems absolutely allergic to NIS. Everything is
> configured correctly as far as I can see
[chop]

Sure, that's what they all say. The N in NIS stands for Network. This
means that you should be concentrating your diagnostic efforts on the
network. Are you using an insane amount of IP aliases? Did you try
to run tcpdump on the interface that connects the two machines together?
From both sides? Are your netmasks correct?

Did you check to see if 'domainname' returns the correct information?

> Running the server in debug mode shows absolutely no activity of any
> kind from luna. There's nothing wrong with the network connection
> (LPIP);

I don't believe you. Like I said: run tcpdump on both sides. See if
you actually have traffic pertaining to NIS travelling between the
two machines.

What the hell is LPIP anyway?
 
> Any suggestions?

Tcpdump, tcpdump and more tcpdump.

-Bill


