On Wednesday, November 26, 2014 08:19:36 PM Alfred Perlstein wrote:
> Author: alfred
> Date: Wed Nov 26 20:19:36 2014
> New Revision: 275136
> URL: https://svnweb.freebsd.org/changeset/base/275136
> 
> Log:
>   Make igb and ixgbe check tunables at probe time.
> 
>   This allows one to make a kernel module to tune the
>   number of queues before the driver loads.
> 
>   This is needed so that a module at SI_SUB_CPU can set
>   tunables for these drivers to take.  Otherwise getenv
>   is called too early by the TUNABLE macros.
> 
>   Reviewed by: smh
>   Phabric: https://reviews.freebsd.org/D1149

The explicit TUNABLE_INT_FETCH strikes me as the wrong approach and hackish.  
That is, each time you want to frob a tunable like this you will have to go 
add a bunch of code to explicitly re-read the tunable, etc.  Instead, you 
could have simply changed your helper module to use kernel_sysctlbyname() to 
set hw.igb.num_queues.  That would have required only a single character 
change to make the SYSCTL read/write instead of read/only for each tunable in 
question.  To be completely future proof (i.e. to handle loading the module in 
question post-boot), your module could both do setenv() and 
kernel_sysctlbyname().  That seems to be a more extensible approach in terms 
of allowing more of these to be changed in the future without having to do a 
manual bypass of the existing tunable infrastructure for each tunable.  I 
would much prefer that you revert this part and change the relevant tunables 
to CTLFLAG_RWTUN and update your out-of-tree module to use 
kernel_sysctlbyname().

Also, if you didn't run this by the Intel folks (e.g. jfv@) that might have 
been nice as a courtesy.

Also, please use the existing resource_int_value() that uses 'hint.igb.0.foo'
instead of inventing a different wheel (device_get_int()).  This is what all 
the other drivers in the tree that provide per-instance tunables use.  You
could perhaps have device_get_int() as a wrapper around resource_int_value(),
but we should use the existing convention, not invent a conflicting one.

> Modified:
>   head/sys/dev/e1000/if_igb.c
>   head/sys/dev/ixgbe/ixgbe.c
>   head/sys/kern/subr_bus.c
>   head/sys/sys/bus.h
> 
> Modified: head/sys/dev/e1000/if_igb.c
> ============================================================================
> == --- head/sys/dev/e1000/if_igb.c    Wed Nov 26 18:03:25 2014        
> (r275135) 
+++
> head/sys/dev/e1000/if_igb.c   Wed Nov 26 20:19:36 2014        (r275136) @@ 
-188,6
> +188,7 @@ static char *igb_strings[] = {
>  /*********************************************************************
>   *  Function prototypes
>   *********************************************************************/
> +static int   igb_per_unit_num_queues(SYSCTL_HANDLER_ARGS);
>  static int   igb_probe(device_t);
>  static int   igb_attach(device_t);
>  static int   igb_detach(device_t);
> @@ -493,6 +494,11 @@ igb_attach(device_t dev)
>           OID_AUTO, "nvm", CTLTYPE_INT|CTLFLAG_RW, adapter, 0,
>           igb_sysctl_nvm_info, "I", "NVM Information");
> 
> +        SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev),
> +                        SYSCTL_CHILDREN(device_get_sysctl_tree(dev)),
> +                     OID_AUTO, "num_queues", CTLTYPE_INT | CTLFLAG_RD,
> +                     adapter, 0, igb_per_unit_num_queues, "I", "Number of 
> Queues");
> +

You don't need igb_per_unit_num_queues().
SYSCTL_ADD_INT(..., &adapter->num_queues) should have been used instead.

> @@ -2831,6 +2837,7 @@ igb_setup_msix(struct adapter *adapter)
>  {
>       device_t        dev = adapter->dev;
>       int             bar, want, queues, msgs, maxqueues;
> +     int             n_queues;
> 
>       /* tuneable override */
>       if (igb_enable_msix == 0)
> @@ -2858,8 +2865,18 @@ igb_setup_msix(struct adapter *adapter)
>               goto msi;
>       }
> 
> -     /* Figure out a reasonable auto config value */
> -     queues = (mp_ncpus > (msgs-1)) ? (msgs-1) : mp_ncpus;
> +     n_queues = 0;
> +     /* try more specific tunable, then global, then finally default to boot
> time tunable if set. */ +     if (device_getenv_int(dev, "num_queues",
> &n_queues) != 0) {
> +             device_printf(dev, "using specific tunable num_queues=%d", 
> n_queues);
> +     } else if (TUNABLE_INT_FETCH("hw.igb.num_queues", &n_queues) != 0) {
> +             if (igb_num_queues != n_queues) {
> +                     device_printf(dev, "using global tunable 
> hw.igb.num_queues=%d",
> n_queues); +                  igb_num_queues = n_queues;
> +             }
> +     } else {
> +             n_queues = igb_num_queues;
> +     }
> 
>  #ifdef       RSS
>       /* If we're doing RSS, clamp at the number of RSS buckets */
> @@ -2867,10 +2884,12 @@ igb_setup_msix(struct adapter *adapter)
>               queues = rss_getnumbuckets();
>  #endif
> 
> -
> -     /* Manual override */
> -     if (igb_num_queues != 0)
> -             queues = igb_num_queues;
> +     if (n_queues != 0) {
> +             queues = n_queues;
> +     } else {
> +             /* Figure out a reasonable auto config value */
> +             queues = (mp_ncpus > (msgs-1)) ? (msgs-1) : mp_ncpus;
> +     }

This moves the mp_ncpus auto config below the RSS bits, so now you
are ignoring the RSS value.  You probably should have left the
mp_ncpus line where it was.

One old bug here is that igb checked the tunable value twice.  It
should only check it once at the end of all the auto-tuning.

>       /* Sanity check based on HW */
>       switch (adapter->hw.mac.type) {
> @@ -2893,12 +2912,17 @@ igb_setup_msix(struct adapter *adapter)
>                       maxqueues = 1;
>                       break;
>       }
> -     if (queues > maxqueues)
> +     if (queues > maxqueues) {
> +             device_printf(adapter->dev, "requested %d queues, but max for 
> this
> adapter is %d\n", +               queues, maxqueues);
>               queues = maxqueues;
> -
> -     /* Manual override */
> -     if (igb_num_queues != 0)
> -             queues = igb_num_queues;
> +     } else if (queues == 0) {
> +             queues = 1;
> +     } else if (queues < 0) {
> +             device_printf(adapter->dev, "requested %d queues, but min for 
> this
> adapter is %d\n", +               queues, 1);
> +             queues = 1;
> +     }

So here is where I would check the tunable to allow it to override.  Assuming 
you revert the TUNABLE_INT_FETCH business then this only needs:

        /* Manual override */
        if (device_get_int(&n_queues) != 0)
                n_queues = igb_num_queues;
        if (n_queues != 0) {
                if (n_queues > maxqueues || n_queues < 0)
                        /* whine */
                else
                        queues = n_queues;
        }

-- 
John Baldwin
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to