On 04.08.2014 15:58, Luigi Rizzo wrote:
On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov
wrote:
On 02.08.2014 12:33, Alexander V. Chernikov wrote:
On 02.08.2014 10:33, Luigi Rizzo wrote:
On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
<melif...@freebsd.org <mailto:melif...@freebsd.org>> wrote:
Hello all.
I'm currently working on to enhance ipfw in some areas.
The most notable (and user-visible) change is named
table support.
The other one is support for different lookup algorithms
for different
key types.
For example, new ipfw permits writing this:
ipfw table tb1 create type cidr
ipfw add allow ip from table(tl1) to any
ipfw add allow ip from any lookup dst-ip tb1
ipfw table if1 create type iface
ipfw add skipto tablearg ip from any to any via table(if1)
or even this:
ipfw table fl1 create type
flow:src-ip,proto,dst-ip,dst-port
ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444
ipfw add allow ip from any to any flow table(fl1)
all these changes fully preserve backward compatibility.
(actually tables needs now to be created before use and
their type needs
to match with opcode used, but new ipfw(8) performs
auto-creation
for cidr tables).
There is another thing I'm going to change and I'm not
sure I can keep
the same compatibility level.
Table values, from one point of view, can be classified
to the following
types:
- skipto argument
- fwd argument (*)
- link to another object (nat, pipe, queue)
- plain u32 (not bound to any object)
(divert/tee,netgraph,tag/utag,limit)
There are the following reasons why I think it is
necessary to implement
explicit table values typing (like tables):
- Implementing fwd tablearg for IPv6 hosts requires
indirection table
- Converting nat/pipe instance ids to names renders
values unusable
- retiring old hack with storing saved pointer of found
object/rule
inside rule w/o proper locking
- making faster skipto
??????i don't buy the idea that you need typed arguments
for all the cases above. Maybe the case that
may make sense is the fwd argument (and in the future
something else).
We already discussed, i think, the fact that now it
is legal to have references to non existing things
(skipto, pipes etc.) implemented as u32.
Removing that would break configurations.
It depends on actual implementation. This can be preserved by
auto-creating necessary objects in kernel and/or in userspace, so
we can (and should) avoid breaking in this particular way.
Can you please explain your vision on values another time?
As far as I understand, you're not against it in general, but the
details matter:
* IP address can be one of the types (it won't break much, and
we can
simply skip that one for MFC)
* what about typing for nat/pipes ? we're not going to convert
their ids
to names? (or maybe you can suggest other non-disruptive way?)
* everything else is type "u32"
Correct, I am mostly concerned about the details, not on the
general concept.
To summarize the discussion Alexander and I had about converting
identifiers from numbers to arbitrary strings (this is partly
related
to the values stored in tables, but I think we should have a
coherent
behaviour)
1. CURRENTLY ipfw uses numeric identifiers in a small range (16
bits or less)
for rules, pipes, queues, tables, probably nat instances.
2. CURRENTLY, in all the above contexts, it is legal to reference a
non existing object (rule, pipe, table names, etc.),
and the kernel will do something reasonable, namely jump to the
next rule, drop traffic for non existing pipes, and so on.
3. of course we want to preserve backward compatibility both for
the ioctl interface, and for user configurations.
4. The in-kernel representation of identifiers is not visible to
users,
so we can use a numeric representation in the kernel for
identifiers.
Strings like "12345" are converted with atoi() or the like,
whereas for other identifiers or numbers outside of the 2^16
range
the kernel manages a translation table, allocating new numeric
identifiers if a new string appears.
This permits backward compatibility for old rulesets, and
does not
impact performance because the translation table is only
used during rules additions or deletion.
Yes. However this requires either holding either (1) 2 pointers
(old&new
arrays), or (2) 65k+ index array, or (3) chained hash table.
(1) would require additional pointers for each subsystem (and some
additional management),
(2) will definitely upset embedded guys and
(3) is worse in terms of performance
With this in mind, i think we should follow a similar approach for
objects stored in tables, hence
if an u32 value was available in the past, it must be
available also in the new implementation.
The issue with tables is that some convoluted configuration could
use the same table to reference pipes _and_ rules _and_ perhaps
other things represented as numbers (the former is not too strange,
if i have a large configuration i might place sections at rules
12000, 13000, 14000... and associate pipes with the same numberic
identifier to each block of rules).
Typed table values would clearly disturb backward compatibility
in the above configurations. However it should not be difficult
to accept arbitrary strings as the values stored in tables, and
then store multiple representations as appropriate, including:
Well, I've thought about thas one. It may be an option, but the
details
are not so promising (below)
- the string representation, unconditionally
- for names that can be resolved by DNS, the ipv6 and ipv4
address(es)
associated with them. ipfw already translates hostnames in rules
so this is POLA
I'm not happy what ipfw(8) is doing instead of translation. The
proper
way would be not simply using first AF_INET answer but saving ALL
IPv4+IPv6 records inside rule (and some more tracking should be done
afterwards, but that's totally different story). Additionally, I'm
unsure if we really need next-hop value expressed as hostname (how
can
we deal with multiple addresses and diffrent AFs?). We may store
strings
(and I think we should do it) but I'm unsure about this particular
option of interpreting them.
- for other strings, a u32 from the translation table as previously
indicated
- and for numeric values, the u32 representation (truncated if
needed,
according to whatever is the existing behaviour)
- <add other representations if needed>
If we cannot generate an u32 we will put some value (e.g. 0)
that hopefully will not cause confusion.
As far as I understand, we accept some string "s" as table value
inside
the kernel, than, we have some logic that says:
oh, dummynet pipe has the same name "s"s, oh, nat entity with name
"s"
has just been created, let's save indices.
That would require additional indirection table like:
index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ]
( so we will have 2-level indirection table for fwd if we do IPv6)
We can optimize this if we use "same name -> same kidx" approach
regardless of kernel object we're refering to. That might require
some
more memory, but that's OK from my point of view.
So we end up with
int [ skipto idx | fwd idx | obj idx ]
idx "0" is special value which means the same as 2.CURRENT
That looks better, but still way to complex.
I do care about compatibility, but it's hard to improve things
without
changing.
I'd like to propose the following:
* Split values into 3 types ("ip|nexthop", "number", "object")
* Do not insist on object existence, use value "0" to mimic 2.CURRENT
behavior.
* Retain full compatibility by introducing special value type
"legacy"
which matches any type and is backed by given indirection table.
* Issue warning in ipfw(8) binary on all auto-created tables that
auto-creation is legacy and this behavior will be dropped in next
major
release (e.g. 11.0)
* Save this behavior in MFC but drop "legacy" tables in head after a
month after actual MFC.
That do you think?
If we do it this way, we should be able to preserve backward
compatibility _and_ add features that people may need.
cheers
luigi