Re: So hey let's talk about this nftables ordering situation.

Robin Lee Powell Mon, 24 Feb 2025 22:58:57 -0800

On Mon, Feb 24, 2025 at 04:25:58PM -0500, Laine Stump wrote:
> On 2/21/25 7:02 PM, robinleepow...@gmail.com wrote:
> > So I, like many other people, have hit problems with nftables ordering, as 
> > has been discussed on this mailing list MANY TIMES.
> > 
> > This whole thing seemed ridiculous so I asked the nftables people about 
> > what one is *supposed* to do in this situation.  It turns out that the 
> > standard solution is for libvirt's nftables rules to set a packet mark 
> > (there's a collision possibility here, but it's a 32-bit integer, so if you pick 
> > one at random it shouldn't be a problem) and then the user adds a rule to 
> > exclude packets with that mark from any reject rules they might have, or 
> > explicitly accept marked packets in their own chains, or whatever.
> 
> Was the discussion on a public forum somewhere? I'd like to look at exactly
> what they said.

Yep!  Sorry, thought I linked to it, oops.  
https://lore.kernel.org/netfilter/132daf73-668f-4321-8945-c809db227...@redhat.com/T/#t

> > It's not *as nice* as the iptables situation, but having documentation that 
> > says "if you're using nftables make sure that packets with mark 79892 are 
> > accepted in all your chains" is quite straightforward compared to the 
> > current situation of "LOL good luck".  (I'm not blaming anyone there; the 
> > current situation is impossible for libvirt to navigate and it's not 
> > anyone's fault.)
> 
> It does still require that the other utilities know this secret number, and
> agree to "anti-reject" it as we've requested, though. Also doesn't this
> require that libvirt's table is processed first, before the other utilities'
> tables? Otherwise, if the other tables are traversed before libvirt has a
> chance to mark the packet with the special number, they won't get the
> signal, so they'll reject the traffic. So we would have to set our table at a
> higher priority, but then what if someone else sets their table with an
> *even higher* priority? (e.g. firewalld has "priority filter + 10" for its
> forwarding rules, so we could make ours "priority filter + 20", but what if,
> e.g., docker decided to make theirs "priority filter + 50"?) (Yes, that's all
> a rhetorical question. I guess in the end everything like this that we do
> will chip away a bit more at the list of people who encounter problems; it
> will never reach 0, but it will at least get closer :-))

Yep, those are all real concerns.  :sigh:

> Aside from that, libvirt's nftables rules are default accept, and it has no
> rules looking at traffic that is destined for the host, only for forwarded
> traffic that is going *through* the host, mainly with the intent of
> rejecting stuff it doesn't like. So are you/they suggesting that this
> forwarded traffic be marked with the special "libvirt code"? Or that we
> should also add back rules that match input DNS/DHCP/TFTP  on the
> libvirt-created bridges, and have them both accept and mark those packets?

I think it'd have to be the latter to actually work.
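
Roughly what I'm picturing, as an untested sketch. The table and
chain names, the "filter - 10" priority, and the port list are all my
own assumptions, not anything libvirt ships today; 79892 is just the
example number from above. The libvirt-side chain needs a lower
priority number than the user's chain so the mark is set before the
user's rules see the packet.

```nftables
# Hypothetical libvirt-side table: mark and accept guest-to-host
# DNS/DHCP/TFTP arriving on libvirt bridges.
table inet libvirt_sketch {
    chain guest_input {
        # Lower priority value = evaluated earlier on the same hook.
        type filter hook input priority filter - 10; policy accept;
        iifname "virbr*" udp dport { 53, 67, 69 } meta mark set 79892 accept
        iifname "virbr*" tcp dport 53 meta mark set 79892 accept
    }
}

# Hypothetical user-side table: honour the mark before any reject rules.
table inet my_firewall {
    chain input {
        type filter hook input priority filter; policy drop;
        meta mark 79892 accept
        # the user's existing accept/reject rules follow here
    }
}
```

The accept in the first chain only ends evaluation of that chain; the
packet still goes through my_firewall, which is why the second table
has to check the mark.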

> > If y'all don't like that, what's working excellently for me is adding 
> > `iifname "virbr*" accept` to my rule chain.  FWIW.
> 
> Just keep in mind that "iifname" has to fetch the name of the interface and
> do a string comparison for each packet, while "iif" just does a quick
> comparison of ifindex, which I think is already saved away in the skb. (Of
> course wildcards aren't possible in that case, but if you have just a couple
> of libvirt networks it's still more efficient to have a rule using "iif" for
> each interface.)

The reason I have to use iifname is that at the time my rules are
loaded, the virbr interfaces *don't exist*.  Like I actually have no
choice; it won't work the other way, unless I'm badly missing
something.
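
To make that concrete, here's a sketch of the difference (the table
name is made up, and I'm assuming the ruleset is loaded at boot before
libvirt creates its bridges):

```nftables
table inet my_firewall {
    chain input {
        type filter hook input priority filter; policy drop;

        # iifname compares the interface name for each packet, so this
        # loads fine even when no virbr* interface exists yet.
        iifname "virbr*" accept

        # By contrast, iif is resolved to an ifindex when the ruleset
        # is loaded, so a rule like the one below is rejected at load
        # time if virbr0 does not exist yet:
        #   iif "virbr0" accept
    }
}
```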

> > It was very hard to navigate through this situation because there's no 
> > documentation that this problem even exists.
> 
> Yeah, that's my fault. When I added the nftables backend, I forgot to update
> https://libvirt.org/firewall.html (which is in docs/firewall.rst in the
> libvirt sources). (Also, ever since I wrote the code, I keep remembering
> that I should do that, but only when I'm in the middle of something else, and
> somehow I haven't managed to even write it down on a list.)

No attack intended; FOSS work is hard.  :)

> > My suggestion is to describe the situation at 
> > https://libvirt.org/firewall.html and suggest the virbr* fix, and down the 
> > road maybe look at this mark thing.
> 
> That's kind of a broad solution though - libvirt's rules only reject
> specific traffic between libvirt-created bridges (and incoming traffic from
> outside a bridge's direct connects in the case of forward mode='nat').
> Anywhere they allow traffic, they allow *all* of it. The real problematic
> stuff is traffic between the guests and the host (the rules we've had for
> iptables that are absent in nftables are those to allow inbound DNS, DHCP,
> and TFTP that are arriving on a virbr* interface, and destined for the
> host). If you allow *all* traffic for virbr*, then you're leaving the host
> wide open to all traffic from any guests (since libvirt's own rules are
> default accept). I think the suggestion needs to be more than just "allow
> all incoming on virbr*".

That's fair; I suppose we could post something equivalent to the old
iptables rules?
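
For illustration, a user-side version might look roughly like this.
It's an untested sketch, not a copy of libvirt's actual iptables
rules: the table name is made up, and the ports are just the usual
ones for DNS (53), DHCP (67) and TFTP (69).

```nftables
table inet my_firewall {
    chain input {
        type filter hook input priority filter; policy drop;

        # Only allow the guest-to-host services libvirt itself provides,
        # instead of accepting everything that arrives on virbr*.
        iifname "virbr*" udp dport { 53, 67, 69 } accept
        iifname "virbr*" tcp dport 53 accept

        # the rest of the user's existing rules follow here
    }
}
```

That keeps the host closed to everything else coming from the guests,
which I think addresses the "wide open" concern.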

> > I'd like to help.  I'm happy to write up issues for this, and I'm happy to 
> > write the updates to the firewall docs; just tell me what you'd like me to 
> > do.
> 
> firewall.rst should really be a shortened intro that links to the current
> firewall.html for iptables (maybe renaming it "iptables.rst/html"?), and to
> a new nftables.rst/html for information about nftables (including an
> explanation of the "many tables, all must resolve to 'accept'" problem).
> 
> Since I've never gotten around to it in spite of wanting it done, I'd
> certainly be happy to review an update done by anyone else :-)

Acknowledged.  :)