This is an intriguing problem that certainly goes against
everything I know about how pf and bridging are supposed to work on
OpenBSD. Anyhow, I have come up with some things that might help you
ascertain what is going on with the firewall. I reread your initial
emails and follow ups to ensure that I didn't miss anything. Keep in
mind, I am not a Sparc fan or user, so if your problem is only present
on a Sparc, then I will not be of much help.
Yes, I'm getting the feeling that what I'm seeing is "not normal." As
I've said, I have a suspicion that it's due to the le[dma] SBUS
interfaces not having their own MAC address, and that somehow getting
confused at the bridge level. I'm thinking about getting a QFE to test
this out.
1) Have you tried assigning static arp for testing purposes? (arp
-s IPAddress MAC permanent)
I don't think this is going to help, but it may lead you to the root
of the problem if there is a problem with the arp requests or if
something is changing these requests (the firewall, perhaps).
I don't understand how this can help, because you can't assign the
static arp to a specific interface. `arp -a` shows:
? (192.168.1.9) at 00:0a:95:79:cb:8a on le0
? (192.168.1.130) at 00:05:02:13:50:98 on le0
Which is correct for the first, but not the second, which ought to be
on le2. If I add it manually, it appears immediately on le0. I
suspect this might be because le2 has no IP address, so le0 gets picked
out of the routing tables. `netstat -r` shows the route to both
machines to be on le0, even though tcpdump shows pings ONLY on the
correct interface for each.
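
As a rough way to keep track of this mismatch while testing, the `arp -a` output can be checked against where each host is actually attached. This is only a sketch: the two sample lines are the ones quoted above, the expected-interface table comes from the tcpdump observations, and the names `EXPECTED_IFACE` and `arp_mismatches` are made up for illustration.

```python
import re

# Two lines copied from the `arp -a` output quoted above.
ARP_OUTPUT = """\
? (192.168.1.9) at 00:0a:95:79:cb:8a on le0
? (192.168.1.130) at 00:05:02:13:50:98 on le0
"""

# Where each host is physically attached, according to tcpdump.
EXPECTED_IFACE = {"192.168.1.9": "le0", "192.168.1.130": "le2"}

# Matches "(ip) at mac on iface" as printed in the lines above.
ARP_LINE = re.compile(
    r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-f:]+) on (?P<iface>\w+)"
)


def arp_mismatches(arp_text, expected):
    """Return (ip, reported_iface, expected_iface) for misplaced entries."""
    mismatches = []
    for line in arp_text.splitlines():
        m = ARP_LINE.search(line)
        if m is None:
            continue  # skip lines that are not arp entries
        ip, iface = m.group("ip"), m.group("iface")
        want = expected.get(ip)
        if want is not None and iface != want:
            mismatches.append((ip, iface, want))
    return mismatches


for ip, got, want in arp_mismatches(ARP_OUTPUT, EXPECTED_IFACE):
    print(f"{ip}: arp says {got}, expected {want}")
```

With the sample data this reports only 192.168.1.130, which matches what is described above: the entry for .9 is right, the one for .130 is not.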
So, what this tells me is that pf, arp, and netstat are reporting
things that are garbled at layer 3.
`tcpdump` is NOT confused about what is happening, because the traffic
it sees is on the actual network. That is, if I run tcpdump on le0,
tcpdump on le2, and tcpdump on the other two machines, everything seems
to be working 100% as expected. When I ping 192.168.1.9, tcpdump on
le0 sees it, on le2 does not; on 192.168.1.9 sees it, and on
192.168.1.130 does not. Similar for pinging 192.168.1.130. So, the
good news is, at layer 2, OpenBSD is NOT confused about where traffic
is supposed to go, or where it comes from.
Layer 3 is wrong, Layer 2 is correct, and bridge is confused and
doesn't tag properly.
2) Have you assigned what MACs are permitted on the interfaces?
(flush; learn le0 static le0 MAC; discover le0; etc)
If the LAN host and WLAN host are only permitted on their respective
interfaces on the bridge, it could resolve your problem of them being
logged on the wrong interface.
I can add the MAC address of a machine on le2 as a static entry in
brconfig (after flushing), and it then does not re-learn it as le0.
arp still learns it as le0. However, doing this, pinging 192.168.1.130
STILL matches rules for out on le0 (though it's on le2, and bridge has
a static address for it on le2).
3) Have you looked at the packets using a span port? (addspan intx in
bridgename.bridge0 file)
The span port might give you more insight as to how the packets are
flowing inside the firewall. This could help you troubleshoot why the
packets are being logged (apparently) on the wrong interfaces.
I'm not sure what this does. brconfig(8) says the port must not be a
member of the bridge. So if I take le1 off my external network for
testing purposes, and add it as a spanport to bridge0 using "addspan
le1," then all frames received on bridge0 will be retransmitted on le1?
Okay, what am I looking for that I haven't already seen by running
tcpdump on the client machines? I know for a fact that traffic to and
from 192.168.1.130 does not appear on the network connected to le0 (and
similar for .9 / le2). If I tcpdump on le1 while it's a spanport, will
tcpdump tell me whether the traffic looks to have gone to le0 or le2?
These SBUS cards all having the same MAC makes link-level headers
meaningless, too.
4) Have you put a rule pass out on le0 and rule pass out on le2 in
your bridgename.bridge0 file?
This could be the factor affecting why you aren't "seeing" the packets
pass both in and out each interface.
Yes, my simplest-case rules are:
pass in log quick on le0 keep state
pass in log quick on le2 keep state
pass out log quick on le0 keep state
pass out log quick on le2 keep state
And a more complex set is:
pass in log-all quick on le0 tagged t_wan keep state
pass in log-all quick on le2 tagged t_lan keep state
pass in log-all quick on le2 tagged t_wan keep state
pass in log-all quick on le0 tagged t_lan keep state
pass out log-all quick on $lan from any to any tagged t_wan keep state
pass out log-all quick on $lan from any to any tagged t_lan keep state
pass out log-all quick on $wap from any to any tagged t_wan keep state
pass out log-all quick on $wap from any to any tagged t_lan keep state
pass in log-all quick on le0 keep state
pass in log-all quick on le2 keep state
pass out log-all quick on le0 keep state
pass out log-all quick on le2 keep state
Even with keeping state and logging all, I never see more than 2 rules
get logged against when pinging to and from the OpenBSD box itself
(though bridge-level tagging doesn't always work for these, and I wind
up hitting the non-"tagged" rules some of the time).
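
To make the fall-through to the untagged rules concrete, here is a toy model of how the more complex rule set is walked. It is a sketch under stated assumptions, not pf itself: every rule is `quick`, so the first rule whose direction, interface, and tag all match wins; state lookups are ignored; rules are numbered from 0; and `$lan` = le0 and `$wap` = le2 is a guess, since the macros are not defined in this message.

```python
# ASSUMPTION: $lan is le0 and $wap is le2 (the macros are not shown above).
LAN, WAP = "le0", "le2"

# (direction, interface, required tag or None), in rule-set order.
RULES = [
    ("in", "le0", "t_wan"),
    ("in", "le2", "t_lan"),
    ("in", "le2", "t_wan"),
    ("in", "le0", "t_lan"),
    ("out", LAN, "t_wan"),
    ("out", LAN, "t_lan"),
    ("out", WAP, "t_wan"),
    ("out", WAP, "t_lan"),
    ("in", "le0", None),
    ("in", "le2", None),
    ("out", "le0", None),
    ("out", "le2", None),
]


def first_match(direction, iface, tag=None):
    """Return the index of the first rule the packet matches, or None."""
    for number, (r_dir, r_iface, r_tag) in enumerate(RULES):
        if r_dir != direction or r_iface != iface:
            continue
        # A "tagged" rule only matches a packet carrying that exact tag.
        if r_tag is not None and r_tag != tag:
            continue
        return number
    return None


print(first_match("out", "le0", "t_lan"))  # tagged packet: rule 5
print(first_match("out", "le0"))           # untagged packet: rule 10
```

In this model, a packet that leaves the bridge without a tag can only land on the last four rules, which is the behaviour described above when bridge-level tagging fails.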
If this doesn't help, I'm sorry because I can't think of anything else
besides going line for line through the source code.
I'm considering it.
If it were me, I'd just say hell with it and assign a different subnet
range on the different interfaces and use rules only to get packets
across it. You could NAT between them so they look like they are on
the same segment, but this might not accomplish what you are trying to
do.
Well, I'm open to other suggestions. Is there a (built-in) way I can
forward 224.x.x.x traffic from one subnet to the other without using a
bridge?
One option is to just further restrict access to the OpenBSD box from
the LAN to the level I'd want to restrict it to for the WLAN. Another
is to see if using NICs that support different MAC addresses fixes the
problem. A third is to go line-by-line through the source until I find
out what's happening.
In either case, I wish you luck and would love to hear a solution if
you reach one.
Believe me, if I ever figure it out, I'll post it with as many keywords
as possible. No one should have to ever do this again. :-)
Thanks,
JMF