Re: ipfw named objects, table values and syntax change

Alexander V. Chernikov Tue, 19 Aug 2014 10:37:41 -0700

On 19.08.2014 20:06, Dmitry Selivanov wrote:

19.08.2014 17:50, Alexander V. Chernikov wrote:
On 15.08.2014 19:20, Alexander V. Chernikov wrote:
On 15.08.2014 18:19, Dmitry Selivanov wrote:
15.08.2014 17:25, Alexander V. Chernikov wrote:
On 08.08.2014 16:11, Dmitry Selivanov wrote:
04.08.2014 23:51, Alexander V. Chernikov wrote:
On 04.08.2014 15:58, Luigi Rizzo wrote:
On Mon, Aug 04, 2014 at 01:44:26PM +0400, Alexander V. Chernikov wrote:
On 02.08.2014 12:33, Alexander V. Chernikov wrote:
On 02.08.2014 10:33, Luigi Rizzo wrote:
On Fri, Aug 1, 2014 at 11:08 PM, Alexander V. Chernikov
<melif...@freebsd.org> wrote:

      Hello all.

      I'm currently working on enhancing ipfw in some areas.
      The most notable (and user-visible) change is named table support.
      The other one is support for different lookup algorithms for different
      key types.

      For example, new ipfw permits writing this:

      ipfw table tb1 create type cidr
      ipfw add allow ip from table(tb1) to any
      ipfw add allow ip from any lookup dst-ip tb1

      ipfw table if1 create type iface
      ipfw add skipto tablearg ip from any to any via table(if1)
      or even this:
      ipfw table fl1 create type flow:src-ip,proto,dst-ip,dst-port
      ipfw table fl1 add 10.0.0.5,tcp,10.0.0.6,80 4444
      ipfw add allow ip from any to any flow table(fl1)

      All these changes fully preserve backward compatibility.
      (Actually, tables now need to be created before use and their type
      needs to match the opcode used, but the new ipfw(8) performs
      auto-creation for cidr tables.)
      There is another thing I'm going to change, and I'm not sure I can keep
      the same compatibility level.
      Table values, from one point of view, can be classified into the
      following types:

      - skipto argument
      - fwd argument (*)
      - link to another object (nat, pipe, queue)
      - plain u32 (not bound to any object)
      (divert/tee,netgraph,tag/utag,limit)
      There are the following reasons why I think it is necessary to implement
      explicit table value typing (like tables):

      - Implementing fwd tablearg for IPv6 hosts requires an indirection table
      - Converting nat/pipe instance ids to names renders values unusable
      - Retiring the old hack of storing a saved pointer to the found
        object/rule inside the rule w/o proper locking
      - Making skipto faster


i don't buy the idea that you need typed arguments
for all the cases above. Maybe the case that
may make sense is the fwd argument (and in the future
something else).
We already discussed, i think, the fact that now it
is legal to have references to non existing things
(skipto, pipes etc.) implemented as u32.
Removing that would break configurations.
It depends on the actual implementation. This can be preserved by
auto-creating the necessary objects in the kernel and/or in userspace, so
we can (and should) avoid breaking in this particular way.
Can you please explain your vision on values another time?
As far as I understand, you're not against it in general, but the
details matter:
* IP address can be one of the types (it won't break much, and we can
simply skip that one for MFC)
* what about typing for nat/pipes? we're not going to convert their ids
to names? (or maybe you can suggest another non-disruptive way?)
* everything else is type "u32"
Correct, I am mostly concerned about the details, not the general concept.
To summarize the discussion Alexander and I had about converting
identifiers from numbers to arbitrary strings (this is partly related
to the values stored in tables, but I think we should have a coherent
behaviour)
1. CURRENTLY ipfw uses numeric identifiers in a small range (16 bits or less)
    for rules, pipes, queues, tables, probably nat instances.
2. CURRENTLY, in all the above contexts, it is legal to reference a
    non-existing object (rule, pipe, table names, etc.),
    and the kernel will do something reasonable, namely jump to the
    next rule, drop traffic for non-existing pipes, and so on.

3. of course we want to preserve backward compatibility both for
    the ioctl interface, and for user configurations.
4. The in-kernel representation of identifiers is not visible to users,
    so we can use a numeric representation in the kernel for identifiers.
    Strings like "12345" are converted with atoi() or the like,
    whereas for other identifiers or numbers outside of the 2^16 range
    the kernel manages a translation table, allocating new numeric
    identifiers if a new string appears.
    This permits backward compatibility for old rulesets, and does not
    impact performance because the translation table is only
    used during rule additions or deletions.
Yes. However, this requires holding either (1) 2 pointers (old & new
arrays), or (2) a 65k+ index array, or (3) a chained hash table.
(1) would require additional pointers for each subsystem (and some
additional management),
(2) will definitely upset embedded guys and
(3) is worse in terms of performance.
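
The translation scheme from point 4 can be sketched roughly as follows.
This is a minimal Python illustration, not ipfw code: the class name
NameTranslator, the method lookup_or_alloc and the choice to hand out new
indices starting at 2^16 are assumptions made for the example. It uses a
plain hash map, which is closest in spirit to option (3) above.

```python
class NameTranslator:
    """Hypothetical name -> numeric index translation table."""

    LEGACY_LIMIT = 1 << 16  # numeric names below 2^16 keep their own value

    def __init__(self):
        self._by_name = {}                   # string identifier -> allocated index
        self._next_free = self.LEGACY_LIMIT  # assumption: dynamic indices start at 2^16

    def lookup_or_alloc(self, name):
        # Old-style numeric identifiers map to themselves (the atoi() case).
        if name.isascii() and name.isdigit() and int(name) < self.LEGACY_LIMIT:
            return int(name)
        # Any other string (or an out-of-range number) gets a fresh index
        # the first time it is seen and the same index afterwards.
        if name not in self._by_name:
            self._by_name[name] = self._next_free
            self._next_free += 1
        return self._by_name[name]
```

The table is consulted only when a rule or object is added or removed, so
the per-packet path would still see plain integers.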
With this in mind, i think we should follow a similar approach for
objects stored in tables, hence

    if an u32 value was available in the past, it must be
    available also in the new implementation.

The issue with tables is that some convoluted configuration could
use the same table to reference pipes _and_ rules _and_ perhaps
other things represented as numbers (the former is not too strange,
if i have a large configuration i might place sections at rules
12000, 13000, 14000... and associate pipes with the same numeric
identifier to each block of rules).

Typed table values would clearly disturb backward compatibility
in the above configurations. However it should not be difficult
to accept arbitrary strings as the values stored in tables, and
then store multiple representations as appropriate, including:
Well, I've thought about that one. It may be an option, but the details
are not so promising (below)
- the string representation, unconditionally
- for names that can be resolved by DNS, the ipv6 and ipv4 address(es)
   associated with them. ipfw already translates hostnames in rules
   so this is POLA
I'm not happy with what ipfw(8) is doing instead of translation. The proper
way would be not simply using the first AF_INET answer but saving ALL
IPv4+IPv6 records inside the rule (and some more tracking should be done
afterwards, but that's a totally different story). Additionally, I'm
unsure if we really need a next-hop value expressed as a hostname (how can
we deal with multiple addresses and different AFs?). We may store strings
(and I think we should do it) but I'm unsure about this particular
option of interpreting them.
- for other strings, a u32 from the translation table as previously
   indicated
- and for numeric values, the u32 representation (truncated if needed,
   according to whatever is the existing behaviour)
- <add other representations if needed>
If we cannot generate a u32 we will put some value (e.g. 0)
that hopefully will not cause confusion.
As far as I understand, we accept some string "s" as a table value inside
the kernel, then we have some logic that says:
oh, a dummynet pipe has the same name "s"; oh, a nat entity with name "s"
has just been created, let's save the indices.

That would require additional indirection table like:

index | [ skipto idx | nat idx | pipe idx | queue idx | fwd index ]
( so we will have 2-level indirection table for fwd if we do IPv6)

We can optimize this if we use the "same name -> same kidx" approach
regardless of the kernel object we're referring to. That might require some
more memory, but that's OK from my point of view.

So we end up with
int [ skipto idx | fwd idx | obj idx ]

idx "0" is special value which means the same as 2.CURRENT

That looks better, but still way too complex.
I do care about compatibility, but it's hard to improve things without
changing.

I'd like to propose the following:
* Split values into 3 types ("ip|nexthop", "number", "object")
* Do not insist on object existence, use value "0" to mimic 2.CURRENT
behavior.
* Retain full compatibility by introducing a special value type "legacy"
   which matches any type and is backed by the given indirection table.
* Issue a warning in the ipfw(8) binary on all auto-created tables that
auto-creation is legacy and this behavior will be dropped in the next major
release (e.g. 11.0)
* Keep this behavior in MFC but drop "legacy" tables in head a
month after the actual MFC.

What do you think?
If we do it this way, we should be able to preserve backward
compatibility _and_ add features that people may need.

cheers
luigi
Here is my idea: tablearg should contain more than one value. I think
getting several values from one table lookup is faster than several table
lookups with one value.
Let tablearg be not just uint32, but an array with different value types
inside it.
There are some use cases where we might need a 2-level value lookup (e.g.
an algo returning an index into an index table where the actual data
reside) and each data item can really be up to 64 bytes long. The problem
is in the actual partitioning and compatibility.
For example I have many such rules:
allow src-ip 1.2.3.4 MAC any 11:22:33:44:55:66 recv vlan1234 dst-ip 1.1.1.1
Sorry, what task are you solving by using given rules?
Small ISP, clients have static IPs with MAC authorization. The src iface
must be checked to prevent IP spoofing. Dst-IP is sometimes used for
p2p channels.
These rules can be replaced with such a construction:
allow src-ip table(1) MAC any tablearg[1] recv tablearg[2] dst-ip tablearg[3]
But I don't think indexing by value is a good idea. I think
index==starting byte is a better way:
allow src-ip table(1) MAC any tablearg:0 recv tablearg:6 dst-ip tablearg:32
where MAC's 6 bytes are from 0 to 5 in tablearg; the iface string is
from 6 until \0, but less than 26 bytes; and IPv4's 4 bytes
are from 32 to 35.
So we need to create table for it:
table 1 set MAC:0 string:6:26 ip:32
table 1 add 1.2.3.4 11:22:33:44:55:66 vlan1234 1.1.1.1
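
The byte layout described above (MAC at 0..5, string from 6, IPv4 at
32..35) can be illustrated with a short Python sketch. It only shows how
such a value could be packed and read back. The function names pack_value
and unpack_value are hypothetical, and the 64-byte buffer size is an
assumption borrowed from the value size mentioned elsewhere in this thread.

```python
import socket

VALUE_LEN = 64  # assumed total size of one table value, in bytes


def pack_value(mac, iface, ip):
    """Lay out MAC at bytes 0..5, a NUL-terminated string at 6..31, IPv4 at 32..35."""
    buf = bytearray(VALUE_LEN)
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC must be 6 bytes")
    buf[0:6] = mac_bytes
    name = iface.encode("ascii")
    if len(name) >= 26:
        raise ValueError("string must be shorter than 26 bytes")
    buf[6:6 + len(name)] = name        # the rest of the field stays \0
    buf[32:36] = socket.inet_aton(ip)  # 4 bytes, network byte order
    return bytes(buf)


def unpack_value(buf):
    """Read the three fields back from their fixed starting offsets."""
    mac = ":".join(f"{b:02x}" for b in buf[0:6])
    iface = buf[6:32].split(b"\0", 1)[0].decode("ascii")
    ip = socket.inet_ntoa(buf[32:36])
    return mac, iface, ip
```

With this layout "tablearg:6" would mean "the field starting at byte 6",
independent of how many fields precede it.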

String can be used both for iface and comment.
Other possible value types:
uint16 for nat, pipe, skipto and other 2-bytes actions
IPv4 4 bytes
CIDRv4 5 bytes
IPv6 16 bytes
CIDRv6 17 bytes
table_id 2 bytes - link to another table
Well, it seems we have enough space to store most of these; however, the
problems seem to remain the same: typing and compatibility.
When you're creating a new table (or it is auto-created), which value
types should be assumed? All of them?
Default - as usual, uint32.
I can't see a "uint32" value in the list you have specified before.
I'll rephrase:
what value types (from the list above or similar) should ipfw(8) or the
kernel fill in case of a "default" table?
(And once again, what should we print as value?)
Please think about
a) old ipfw binaries
b) new ipfw binaries using exactly the same ruleset they are already
using (with, for example, both "skipto tablearg" and "fwd tablearg"
tables).
At that time I meant the default table "header" is "ip:0" (in my context).
It would be completely compatible with old ipfw tables.
I've increased kernel<>userland 'struct tentry' value field to 64 bytes.
It looks like we were talking about slightly different things.
Let me try to explain the problem I'm stuck with:

We may take the road you've suggested, it looks OK:

* by default tables are created with an "all-values" mask.
* ipfw(8) treats the default "ipfw table X add Y val" input, where the
value is a u32 number, as input data for each type specified in
all-values, without returning an error
* for a non-default mask, value data should be validated.
e.g. if we have a table with valtype="skipto,nat,pipe,ip4,ip6" and
"100" as input -> it turns into "100,100,0.0.0.0,::".
I don't fully understand. One "100" value for all valtypes? Then "100"
can't be equal to "0.0.0.0" and "::". Or did you mean "100,100,0,0" as input?

We have to handle the case when a user with _unmodified_ scripts tries to
use the new ipfw (either with the new binary or the old one).

The goal is not to throw error and break everything, of course.

If we have a value with valtype="skipto,ip6" and "100" as input ->
error, while a valid one would be "100,2a01::1:111", for example.
I'm unsure how one should be able to update a _specific_ value (e.g.
update the nat id or skipto arg), but that's not the problem.
Maybe a new command would help, like "ipfw table X set Y newval".
The problem arises if we start talking about using names for
nat/pipe/queue ids instead of numbers.
If we have nat instances "nat1", "11" and "23", and one specifies
"44" as part of a value, the logic starts to be complex:
we either require nat "44" to exist (and I'm unsure if we can
auto-create it *) or start doing complex stuff like tracking all
those non-existing objects:
e.g. add some special record somewhere that we're waiting for nat
instance "44" to be created, then auto-update the given value with its
kernel index,
then do something reasonable if the nat "44" instance is destroyed (OK,
a nat instance can't be destroyed, but a pipe can)
... and we have to do the same for pipes/queues and any following
kernel object.
Or we have to require the user to reference existing objects only (create
explicitly before use). This one makes things easier in code, but
requires users to change their scripts.
It looks like there is no consensus on that point.
The user can destroy an object after creating the table. I think of it
this way: "no object - no packet (explicitly deny)". No need to check
object existence.

Yes, but even this behavior has to be supported by the kernel:
Let me explain in more detail:
user calls -> ipfw nat "23" iface ...

The kernel sees the string "23", which is not the name of any existing nat
instance, so it creates one and allocates a new kernel index for it (let
it be 1).
The same for "nat1" -> 2 and "11" -> 3. Kernel indexes are purely
internal and cannot be referenced by userland.


So, when you enter "44" inside new value, the following happens:
1) some special object binding name "44" and value of record X is created

2) the nat instance list is searched to see if "44" is an existing name. If
an entry is found, its kernel index is saved to "value"; 0 is saved otherwise.

3) If the nat entry is destroyed, we have to walk all entries and set their
appropriate parts back to 0 (otherwise some other entry may use this
index later, leading to packets being aliased to another nat instance.
The "show" command would print incorrect values, too).
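
The bookkeeping in steps 1) to 3) can be sketched as below. This is a toy
Python model written only to show the amount of tracking involved. The
class NatBindings and its methods are hypothetical names, and the model
never reuses a kernel index, which a real implementation probably would.

```python
class NatBindings:
    """Toy model of name-based nat references stored in table values."""

    def __init__(self):
        self._kidx = {}   # nat name -> kernel index (internal only)
        self._next = 1    # 0 is reserved for "no such instance"
        self.values = {}  # table key -> [nat name, kernel index]

    def create_nat(self, name):
        if name not in self._kidx:
            self._kidx[name] = self._next
            self._next += 1
        kidx = self._kidx[name]
        # Late binding: values that were waiting for this name get updated.
        for value in self.values.values():
            if value[0] == name:
                value[1] = kidx
        return kidx

    def add_value(self, key, nat_name):
        # Steps 1 and 2: remember the name, store its index or 0 if unknown.
        self.values[key] = [nat_name, self._kidx.get(nat_name, 0)]

    def destroy_nat(self, name):
        # Step 3: walk all values and reset stale references to 0.
        self._kidx.pop(name, None)
        for value in self.values.values():
            if value[0] == name:
                value[1] = 0
```

Every other kernel object type (pipe, queue, ...) would need the same
three hooks, which is the maintenance cost discussed here.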

This can be done (and we have to write code for each type of kernel
object, e.g. one for nat, one for pipe/queue, etc.), but it requires a lot
of code which we would have to support forever.
I like the idea of enforcing hard bindings (with, maybe, some
intermediate period of compatible behavior for MFC).

* Maybe auto-creation is not so tricky and we should try to evaluate
it.
What should `ipfw table X list` show as "value" field ?
I added table "header" in this line:
table 1 set MAC:0 string:6:26 ip:32
I don't think that the user should be able to set any offsets in
userland. Exact offsets of a variable of a given type need to be
enforced by the kernel,
so you may specify that you want "mac" and "ip" as values for a given
table, but not lengths or offsets.
Does your way allow using strings (e.g. iface or comments)?

I'm not sure what you're going to do with interfaces as values.

Comments - per value or per table entry? I can think about it, but probably
not all algorithms will support that functionality.

So `ipfw table X list` should show something like this:
---table(0)---
1.2.3.4/32 11:22:33:44:55:66 vlan1234 1.1.1.1
We can also add a "header" description to the output (with or without
an additional parameter - depends on compatibility needs) like this:
---table(0)--- addr MAC iface IPv4
How should ipfw(8) treat "add 1.1.1.1 0" input?
It should look at the table "header" and return an error message like
"Value doesn't match table header"
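
A header check of this kind could look roughly like the following Python
sketch. The type names "mac", "string" and "ip", the CHECKS table and the
function check_values are all hypothetical. Only the error text and the
26-byte string limit are taken from this thread.

```python
import ipaddress
import re

_MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def _is_ip4(text):
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


# One validator per (hypothetical) header type name.
CHECKS = {
    "mac": lambda text: bool(_MAC_RE.match(text)),
    "string": lambda text: 0 < len(text) < 26,
    "ip": _is_ip4,
}


def check_values(header, values):
    """Raise ValueError unless every value matches its header type."""
    if len(header) != len(values) or not all(
        CHECKS[kind](value) for kind, value in zip(header, values)
    ):
        raise ValueError("Value doesn't match table header")
```

For a table whose header is mac, string, ip, the input "add 1.1.1.1 0"
carries a single value "0" and would be rejected by this check.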
What will happen if we want to add another type field to this
list? (MAC address or Infiniband MAC address, for example).
I don't think there is any sense in mixing both MAC[6] and MAC[20]
values in 1 table. It is easier to create 2 tables with different
"headers".
For Infiniband we can add another type: MAC20 (or something like
this). Or we can use the "MAC" type like the string type (see above):
MAC:6:25 (1st and last bytes, or 1st and length).
Table value length can be set, for example, with a loader tunable
like net.inet.ip.fw.table_value_length.
Even with the default uint32 value length we can get 2 uint16 values
or 4 uint8 values; this can help in some configurations.
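
The point about reusing a single uint32 can be shown with a few lines of
Python. The helper name split_u32 and the little-endian byte order are
assumptions made for the example, since the thread does not say how the
value would be split.

```python
import struct


def split_u32(value):
    """Return the same 4 bytes viewed as two uint16 and as four uint8 values."""
    raw = struct.pack("<I", value)  # assumed little-endian layout
    halves = struct.unpack("<2H", raw)
    octets = struct.unpack("<4B", raw)
    return halves, octets
```

For example, a value of 0x00640005 would read as the uint16 pair (5, 100),
which could carry, say, a pipe number and a skipto target in one entry.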
This way is more complex, but much more flexible. It's like the
netgraph subsystem.
I think it suits both Alexander's and Luigi's requests.


_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
