Hello,

The new code looks much better. It's huge leap forward.  I don't mind if this
big diff will get committed. Hrvoje did test the whole diff. Trying to split it
to smaller changes might bring in code which is not tested enough.  We don't
know how individual pieces work independently.

I've got some questions/comments which are perhaps worth to sort out
before commit.

the diff touches those files:

    net/if.c
        - no objections/comments

    net/ifq.c
        - no objections/comments

    net/netisr.h
        - no objections/comments

    net/pf_norm.c
        - no objections/comments

    net/pf_ioctl.c
        - no objections/comments

    net/pfvar.h
        - no objections/comments

    net/pfvar_priv.h
        - no objections/comments

    netinet/ip_ipsp.h
        - no objections/comments

    netinet/in_proto.c
        - just wondering why we rename/introducing pfsync_input4().
          is there a plan to make pfsync(4) to work over IPv6?

    net/pf.c
        state expiration is overhauled, new code looks better.
        once new code will settle down we might need to revisit 
        'set timeout interval' description in pf.conf(5),
        because if I understand the new code right, the state
        expiration timer runs every second now. We might also
        consider to expose `pf_purge_states_collect` as a new
        tuning knob to control state expiration. Anyway this
        can be always done as a follow up change.

        - function  pf_state_export() line 1319 comment XXX
            1319         if (READ_ONCE(st->sync_defer) != NULL) /* XXX */
            1320                 sp->state_flags |= htons(PFSTATE_ACK);
          what does it stand for?

        - function pf_purge_expired_states(): handling of static
          variable `cur` is unsafe w.r.t. pf_state_list. However
          the same issue is present in current. I think there is
          a diff floating around which makes cur member of
          pf_state_list structure.

    net/if_pfsync.h
        new if_pfsycnc changes codes for sync_state. This makes me wonder
        if we should also update PFSYNC_VERSION. Just to avoid some
        undesired/undefined behavior if only one firewall node in cluster gets
        upgraded.

    net/if_pfsync.c
        the diff currently uses two slices (PFSYNC_NSLICES). is there a plan to
        scale it up?  the slice can be simply viewed as a kind of task. IMO the
        number of slices can be aligned with number of cpu cores. Or is this
        too simplified? I'm just trying to get some hints on how to further
        tune performance.

        I noticed there are few dances with NET_LOCK() (UNLOCK()/LOCK())
        in up() and down() operations. I feel this comes from fact we
        use NET_LOCK() to also protect state of pfsync(4) itself. I wounder
        if we can make our life easier here with dedicated rw-lock similar
        to pfioctl_rw we have in pf(4). The dedicated rw-lock would make
        sure there is no other ioctl() operation intervening with us.
        this can be certainly left for follow up change.

        - I think lines 102-108 and 110 can be removed:
         102 #if 0
         103 #define DPRINTF(_fmt...) do {                           \
         104         printf("%s[%u]: ", __func__, __LINE__);         \
         105         printf(_fmt);                                   \
         106         printf("\n");                                   \
         107 } while (0)
         108 #else

        - function pfsync_clone_create(), line 384:
         381                 pool_init(&pfsync_deferrals_pool,
         382                     sizeof(struct pfsync_deferral), 0,
         383                     IPL_MPFLOOR, 0, "pfdefer", NULL);
         384                 /* pool_cache_init(&pfsync_deferrals_pool); */
          does it mean pool_cache_init() is not ready/well tested yet?

        - function pfsync_clone_create(), comments /* grumble ... */
          on lines 401, and 405, what is that? Are those related
          to line 440: /* stupid NET_LOCK */ is it about using time out
          to deal with NET_LOCK()? like avoid recursion on NET_LOCK()?

        - function pfsync_sendout() line 1539, comment /* XXX */
          what does it stand for?

        - function pfsync_clear_states() is it complete? it seems
          to me it has no effect (apart from exercise with reference)

I have not spot anything else.

thanks and
regards
sashan


Reply via email to