date:20200917

On Thu, Sep 17, 2020 at 01:02:15PM +0800, Wong Vee Khee wrote:
> From: "Tan, Tee Min" 
>
> For driver open(), rtnl_lock is acquired by network stack but not in the
> resume(). Therefore, we introduce lock_acquired boolean to control when
> to use rtnl_lock|unlock() within stmmac_hw_setup().

This doesn't really make sense. If a function needs the lock held, the
caller is supposed to take it, and the function should have a proper lockdep
annotation inside instead of this conditional lock/unlock.

Thanks
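
A minimal user-space sketch of the pattern suggested above: the caller takes
the lock and the callee only checks that it is held. Everything prefixed
demo_ is hypothetical; the boolean flag stands in for what lockdep tracks in
the kernel, where the check would be something like ASSERT_RTNL() or
lockdep_assert_held(). This is single-threaded illustration only, not the
stmmac code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock state; lockdep tracks this in the kernel. */
static bool demo_lock_held;
static int demo_real_num_queues;

static void demo_lock(void)   { demo_lock_held = true; }
static void demo_unlock(void) { demo_lock_held = false; }

/*
 * Must be called with the demo lock held. The function never takes the
 * lock itself; it only checks the precondition (kernel analogue:
 * ASSERT_RTNL() or lockdep_assert_held()).
 */
static int demo_hw_setup(int queues)
{
	if (!demo_lock_held)
		return -1;	/* precondition violated */
	demo_real_num_queues = queues;
	return 0;
}

/* Resume-like path: nobody holds the lock yet, so this caller takes it. */
static int demo_resume(int queues)
{
	int ret;

	demo_lock();
	ret = demo_hw_setup(queues);
	demo_unlock();
	return ret;
}
```

With this shape there is no lock_acquired parameter: an open-like path that
already runs under the lock calls demo_hw_setup() directly, and a resume-like
path wraps the call as demo_resume() does.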

>
> Fixes: 686cff3d7022 ("net: stmmac: Fix incorrect location to set 
> real_num_rx|tx_queues")
>

Extra line.

> Signed-off-by: Tan, Tee Min 
> ---
>  drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
> b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index df2c74bbfcff..22e6a3defa78 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -2607,7 +2607,8 @@ static void stmmac_safety_feat_configuration(struct 
> stmmac_priv *priv)
>   *  0 on success and an appropriate (-)ve integer as defined in errno.h
>   *  file on failure.
>   */
> -static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
> +static int stmmac_hw_setup(struct net_device *dev, bool init_ptp,
> +bool lock_acquired)
>  {
>   struct stmmac_priv *priv = netdev_priv(dev);
>   u32 rx_cnt = priv->plat->rx_queues_to_use;
> @@ -2715,9 +2716,15 @@ static int stmmac_hw_setup(struct net_device *dev, 
> bool init_ptp)
>   }
>
>   /* Configure real RX and TX queues */
> + if (!lock_acquired)
> + rtnl_lock();
> +
>   netif_set_real_num_rx_queues(dev, priv->plat->rx_queues_to_use);
>   netif_set_real_num_tx_queues(dev, priv->plat->tx_queues_to_use);
>
> + if (!lock_acquired)
> + rtnl_unlock();
> +
>   /* Start the ball rolling... */
>   stmmac_start_all_dma(priv);
>
> @@ -2804,7 +2811,7 @@ static int stmmac_open(struct net_device *dev)
>   goto init_error;
>   }
>
> - ret = stmmac_hw_setup(dev, true);
> + ret = stmmac_hw_setup(dev, true, true);
>   if (ret < 0) {
>   netdev_err(priv->dev, "%s: Hw setup failed\n", __func__);
>   goto init_error;
> @@ -5238,7 +5245,7 @@ int stmmac_resume(struct device *dev)
>
>   stmmac_clear_descriptors(priv);
>
> - stmmac_hw_setup(ndev, false);
> + stmmac_hw_setup(ndev, false, false);
>   stmmac_init_coalesce(priv);
>   stmmac_set_rx_mode(ndev);
>
> --
> 2.17.0
>

Re: [PATCH net-next 6/6] net: hns3: use napi_consume_skb() when cleaning tx desc

2020-09-17 Thread Yunsheng Lin

On 2020/9/16 16:38, Eric Dumazet wrote:
> On Wed, Sep 16, 2020 at 10:33 AM Saeed Mahameed  wrote:
>>
>> On Tue, 2020-09-15 at 15:04 +0800, Yunsheng Lin wrote:
>>> On 2020/9/15 13:09, Saeed Mahameed wrote:
>>>> On Mon, 2020-09-14 at 20:06 +0800, Huazhong Tan wrote:
>>>>> From: Yunsheng Lin
>>>>>
>>>>> Use napi_consume_skb() to batch consuming skb when cleaning
>>>>> tx desc in NAPI polling.
>>>>>
>>>>> Signed-off-by: Yunsheng Lin
>>>>> Signed-off-by: Huazhong Tan
>>>>> ---
>>>>>  drivers/net/ethernet/hisilicon/hns3/hns3_enet.c| 27
>>>>> +++-
>>>>> --
>>>>>  drivers/net/ethernet/hisilicon/hns3/hns3_enet.h|  2 +-
>>>>>  drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c |  4 ++--
>>>>>  3 files changed, 17 insertions(+), 16 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>>>>> b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>>>>> index 4a49a76..feeaf75 100644
>>>>> --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>>>>> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
>>>>> @@ -2333,10 +2333,10 @@ static int hns3_alloc_buffer(struct
>>>>> hns3_enet_ring *ring,
>>>>>  }
>>>>>
>>>>>  static void hns3_free_buffer(struct hns3_enet_ring *ring,
>>>>> -  struct hns3_desc_cb *cb)
>>>>> +  struct hns3_desc_cb *cb, int budget)
>>>>>  {
>>>>>   if (cb->type == DESC_TYPE_SKB)
>>>>> - dev_kfree_skb_any((struct sk_buff *)cb->priv);
>>>>> + napi_consume_skb(cb->priv, budget);
>>>>
>>>> This code can be reached from hns3_lb_clear_tx_ring() below which is
>>>> your loopback test and called with non-zero budget, I am not sure you
>>>> are allowed to call napi_consume_skb() with non-zero budget outside
>>>> napi context, perhaps the cb->type for loopback test is different in lb
>>>> test case ? Idk.. , please double check other code paths.
>>>
>>> Yes, loopback test may call napi_consume_skb() with non-zero budget
>>> outside
>>> napi context. Thanks for pointing out this case.
>>>
>>> How about add the below WARN_ONCE() in napi_consume_skb() to catch
>>> this
>>> kind of error?
>>>
>>> WARN_ONCE(!in_serving_softirq(), "napi_consume_skb() is called with
>>> non-zero budget outside napi context");
>>>
>>
>> Cc: Eric
>>
>> I don't know, need to check performance impact.
>> And in_serving_softirq() doesn't necessarily mean in napi
>> but looking at _kfree_skb_defer(), i think it shouldn't care if napi or
>> not as long as it runs in soft irq it will push the skb to that
>> particular cpu napi_alloc_cache, which should be fine.

Yes, we only need to ensure that _kfree_skb_defer() runs in atomic context.

And it seems NAPI polling can run in thread context with BH disabled in the
patch below, so checking in_softirq() should be more future-proof?

* in_softirq()   - We have BH disabled, or are processing softirqs

net: add support for threaded NAPI polling
https://www.mail-archive.com/netdev@vger.kernel.org/msg348491.html


>>
>> Maybe instead of the WARN_ONCE just remove the budget condition and
>> replace it with
>>
>> if (!in_serving_softirq())
>>   dev_consume_skb_any(skb);

Yes, that is a good idea: _kfree_skb_defer() is only called in softirq or
BH-disabled context, and dev_consume_skb_any(skb) is called in any other
context, so driver authors do not need to worry about the calling context of
napi_consume_skb().

>>
> 
> I think we need to keep costs small.
> 
> So lets add a CONFIG_DEBUG_NET option so that developers can add
> various DEBUG_NET() clauses.

Do you mean something like the below?

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 157e024..61a6a62 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -5104,6 +5104,15 @@ do { 
\
 })
 #endif

+#if defined(CONFIG_DEBUG_NET)
+#define DEBUG_NET_WARN(condition, format...)   \
+   do {                                        \
+   WARN(condition, ##__VA_ARGS__);             \
+   } while (0)
+#else
+#define DEBUG_NET_WARN(condition, format...)
+#endif
+
 /*
  * The list of packet types we will receive (as opposed to discard)
  * and the routines to invoke.
diff --git a/net/Kconfig b/net/Kconfig
index 3831206..f59ea4b 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -473,3 +473,9 @@ config HAVE_CBPF_JIT
 # Extended BPF JIT (eBPF)
 config HAVE_EBPF_JIT
bool
+
+config DEBUG_NET
+   bool
+   depends on DEBUG_KERNEL
+   help
+ Say Y here to add some extra checks and diagnostics to networking.
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bfd7483..10547db 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -904,6 +904,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
return;
}

+   DEBUG_NET_WARN(!in_serving_softirq(),
+   "napi_consume_skb() is called with non-zero budget 
outside softirq
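
For illustration, a user-space sketch of a debug-only warning macro in the
spirit of the proposal above. All DEMO_/demo_ names are hypothetical, fprintf()
stands in for the kernel's WARN(), and the macro uses the standard C99
`...`/`__VA_ARGS__` form. It is a sketch of the idea under those assumptions,
not the macro that was eventually merged.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_DEBUG_NET 1	/* hypothetical switch, standing in for CONFIG_DEBUG_NET */

static int demo_warn_count;	/* lets callers observe how many warnings fired */

#if DEMO_DEBUG_NET
#define DEMO_NET_WARN(condition, ...)				\
	do {							\
		if (condition) {				\
			demo_warn_count++;			\
			fprintf(stderr, __VA_ARGS__);		\
		}						\
	} while (0)
#else
/* Compiled out: the condition is not evaluated at all. */
#define DEMO_NET_WARN(condition, ...) do { } while (0)
#endif

/* Hypothetical consumer: warn when a non-zero budget is used outside softirq. */
static void demo_consume(int in_softirq, int budget)
{
	DEMO_NET_WARN(budget && !in_softirq,
		      "demo_consume() called with budget %d outside softirq\n",
		      budget);
}
```

Because the disabled variant expands to an empty statement, production builds
pay nothing for the check, which is the cost concern raised in the thread.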

Re: [patch] freeaddrinfo.3: memory leaks in freeaddrinfo examples

2020-09-17 Thread Michael Kerrisk (man-pages)

[CC += beej, to alert the author about the memory leaks 
in the network programming guide]

Hello Marko,

> On Thu, Sep 17, 2020 at 7:42 AM Michael Kerrisk (man-pages) <
> mtk.manpa...@gmail.com> wrote:
> 
>> Hi Marko,
>>
>> On Thu, 17 Sep 2020 at 07:34, Marko Hrastovec 
>> wrote:
>>>
>>> Hi,
>>>
>>> examples in freeaddrinfo.3 have a memory leak, which is replicated in
>> many real world programs copying an example from manual pages. The two
>> examples should have different order of lines, which is done in the
>> following patch.
>>>
>>> diff --git a/man3/getaddrinfo.3 b/man3/getaddrinfo.3
>>> index c9a4b3e43..4d383bea0 100644
>>> --- a/man3/getaddrinfo.3
>>> +++ b/man3/getaddrinfo.3
>>> @@ -711,13 +711,13 @@ main(int argc, char *argv[])
>>>  close(sfd);
>>>  }
>>>
>>> +freeaddrinfo(result);   /* No longer needed */
>>> +
>>>  if (rp == NULL) {   /* No address succeeded */
>>>  fprintf(stderr, "Could not bind\en");
>>>  exit(EXIT_FAILURE);
>>>  }
>>>
>>> -freeaddrinfo(result);   /* No longer needed */
>>> -
>>>  /* Read datagrams and echo them back to sender */
>>>
>>>  for (;;) {
>>> @@ -804,13 +804,13 @@ main(int argc, char *argv[])
>>>  close(sfd);
>>>  }
>>>
>>> +freeaddrinfo(result);   /* No longer needed */
>>> +
>>>  if (rp == NULL) {   /* No address succeeded */
>>>  fprintf(stderr, "Could not connect\en");
>>>  exit(EXIT_FAILURE);
>>>  }
>>>
>>> -freeaddrinfo(result);   /* No longer needed */
>>> -
>>>  /* Send remaining command\-line arguments as separate
>>> datagrams, and read responses from server */
>>>
>>
>> When you say "memory leak", do you mean that something like valgrind
>> complains? I mean, strictly speaking, there is no memory leak that I
>> can see that is fixed by that patch, since the if-branches that the
>> freeaddrinfo() calls are shifted above terminates the process in each
>> case.
>
> you are right about terminating the process. However, people copy that
> example and put the code in function changing "exit" to "return". There are
> a bunch of examples like that here https://beej.us/guide/bgnet/html/#poll,
> for instance.

Oh -- I see what you mean.

> That error bothered me when reading the network programming
> guide https://beej.us/guide/bgnet/html/. Then I looked for information
> elsewhere:
> -
> https://stackoverflow.com/questions/6712740/valgrind-reporting-that-getaddrinfo-is-leaking-memory
> -
> https://stackoverflow.com/questions/15690303/server-client-sockets-freeaddrinfo3-placement
> And finally, I checked manual pages and saw where these errors come from.
> 
> When you change that to a function and return without doing freeaddrinfo,
> that is a memory leak. I believe an example should show good programming
> practices. Relying on exiting and clearing the memory in that case is not
> such a case. In my opinion, these examples lead people to make mistakes in
> their programs.

Yes, I can buy that argument. I've applied your patch.

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
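
To make the point in this thread concrete, here is a small sketch of the
man-page example turned into a function that returns instead of exiting.
The helper name demo_bind_udp() is hypothetical, and the numeric-only hint
flags are an assumption chosen so the sketch does no DNS lookups. The key
detail is that freeaddrinfo() runs before the success/failure decision, so
no return path leaks the list.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a UDP socket to a numeric host and port.
 * Returns the socket descriptor, or -1 on failure.
 */
static int demo_bind_udp(const char *host, const char *service)
{
	struct addrinfo hints, *result, *rp;
	int sfd = -1;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_DGRAM;
	hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;	/* no DNS lookups */

	if (getaddrinfo(host, service, &hints, &result) != 0)
		return -1;	/* nothing was allocated, nothing to free */

	for (rp = result; rp != NULL; rp = rp->ai_next) {
		sfd = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
		if (sfd == -1)
			continue;
		if (bind(sfd, rp->ai_addr, rp->ai_addrlen) == 0)
			break;	/* success */
		close(sfd);
		sfd = -1;
	}

	freeaddrinfo(result);	/* freed on every path before returning */

	return sfd;		/* -1 if no address succeeded */
}
```

A caller would use it as `int fd = demo_bind_udp("127.0.0.1", "0");` and
close the descriptor when done.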

Re: [PATCH] ptp: mark symbols static where possible

On Thu, Sep 17, 2020 at 10:25:08AM +0800, Herrington wrote:
> We get 1 warning when building kernel with W=1:
> drivers/ptp/ptp_pch.c:182:5: warning: no previous prototype for 
> ‘pch_ch_control_read’ [-Wmissing-prototypes]
>  u32 pch_ch_control_read(struct pci_dev *pdev)
> drivers/ptp/ptp_pch.c:193:6: warning: no previous prototype for 
> ‘pch_ch_control_write’ [-Wmissing-prototypes]
>  void pch_ch_control_write(struct pci_dev *pdev, u32 val)
> drivers/ptp/ptp_pch.c:201:5: warning: no previous prototype for 
> ‘pch_ch_event_read’ [-Wmissing-prototypes]
>  u32 pch_ch_event_read(struct pci_dev *pdev)
> drivers/ptp/ptp_pch.c:212:6: warning: no previous prototype for 
> ‘pch_ch_event_write’ [-Wmissing-prototypes]
>  void pch_ch_event_write(struct pci_dev *pdev, u32 val)
> drivers/ptp/ptp_pch.c:220:5: warning: no previous prototype for 
> ‘pch_src_uuid_lo_read’ [-Wmissing-prototypes]
>  u32 pch_src_uuid_lo_read(struct pci_dev *pdev)
> drivers/ptp/ptp_pch.c:231:5: warning: no previous prototype for 
> ‘pch_src_uuid_hi_read’ [-Wmissing-prototypes]
>  u32 pch_src_uuid_hi_read(struct pci_dev *pdev)
> drivers/ptp/ptp_pch.c:242:5: warning: no previous prototype for 
> ‘pch_rx_snap_read’ [-Wmissing-prototypes]
>  u64 pch_rx_snap_read(struct pci_dev *pdev)
> drivers/ptp/ptp_pch.c:259:5: warning: no previous prototype for 
> ‘pch_tx_snap_read’ [-Wmissing-prototypes]
>  u64 pch_tx_snap_read(struct pci_dev *pdev)
> drivers/ptp/ptp_pch.c:300:5: warning: no previous prototype for 
> ‘pch_set_station_address’ [-Wmissing-prototypes]
>  int pch_set_station_address(u8 *addr, struct pci_dev *pdev)
>
> Signed-off-by: Herrington 
> ---
>  drivers/ptp/ptp_pch.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)

This file is a total mess.

>
> diff --git a/drivers/ptp/ptp_pch.c b/drivers/ptp/ptp_pch.c
> index ce10ecd41ba0..8db2d1893577 100644
> --- a/drivers/ptp/ptp_pch.c
> +++ b/drivers/ptp/ptp_pch.c
> @@ -179,7 +179,7 @@ static inline void pch_block_reset(struct pch_dev *chip)
>   iowrite32(val, (&chip->regs->control));
>  }
>
> -u32 pch_ch_control_read(struct pci_dev *pdev)
> +static u32 pch_ch_control_read(struct pci_dev *pdev)
>  {
>   struct pch_dev *chip = pci_get_drvdata(pdev);
>   u32 val;
> @@ -190,7 +190,7 @@ u32 pch_ch_control_read(struct pci_dev *pdev)
>  }
>  EXPORT_SYMBOL(pch_ch_control_read);

This function is not used and can be deleted.

>
> -void pch_ch_control_write(struct pci_dev *pdev, u32 val)
> +static void pch_ch_control_write(struct pci_dev *pdev, u32 val)
>  {
>   struct pch_dev *chip = pci_get_drvdata(pdev);
>
> @@ -198,7 +198,7 @@ void pch_ch_control_write(struct pci_dev *pdev, u32 val)
>  }
>  EXPORT_SYMBOL(pch_ch_control_write);


This function is in use (incorrectly) by
drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c

Your patch will break it.

I didn't check other functions, but assume they are broken too.

Thanks
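
As a rough user-space illustration of the distinction being made here: a
function that another file calls needs external linkage plus a prototype in
a shared header, and only file-local helpers should become static. All
demo_ names are hypothetical and the prototypes are placed inline where a
header would normally provide them.

```c
#include <assert.h>
#include <stdint.h>

/* Would normally live in a shared header, e.g. a hypothetical demo_pch.h. */
uint32_t demo_ch_control_read(void);
void demo_ch_control_write(uint32_t val);

static uint32_t demo_control_reg;	/* file-local state, correctly static */

/* Used by other translation units: keep external linkage, add a prototype. */
uint32_t demo_ch_control_read(void)
{
	return demo_control_reg;
}

void demo_ch_control_write(uint32_t val)
{
	demo_control_reg = val;
}

/* Only used in this file: this is the kind of symbol that should be static. */
static uint32_t demo_toggle_bit0(void)
{
	demo_ch_control_write(demo_ch_control_read() ^ 1u);
	return demo_ch_control_read();
}
```

Declaring the prototypes silences -Wmissing-prototypes without hiding the
symbols from their users; marking them static would instead break any other
file that links against them.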

Re: [PATCH 2/2] crypto: ccree - add custom cache params from DT file

2020-09-17 Thread Gilad Ben-Yossef

hmm...

On Wed, Sep 16, 2020 at 4:48 PM kernel test robot  wrote:
>
> url:
> https://github.com/0day-ci/linux/commits/Gilad-Ben-Yossef/add-optional-cache-params-from-DT/20200916-152151
> base:   
> https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git 
> master
> config: arm64-randconfig-r015-20200916 (attached as .config)
> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
> 9e3842d60351f986d77dfe0a94f76e4fd895f188)
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install arm64 cross compiling tool for clang build
> # apt-get install binutils-aarch64-linux-gnu
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
> All warnings (new ones prefixed by >>):
>
> >> drivers/crypto/ccree/cc_driver.c:120:18: warning: result of comparison of 
> >> constant 18446744073709551615 with expression of type 'u32' (aka 'unsigned 
> >> int') is always false [-Wtautological-constant-out-of-range-compare]
>cache_params |= FIELD_PREP(mask, val);
>^
>include/linux/bitfield.h:94:3: note: expanded from macro 'FIELD_PREP'
>__BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");\
>^~~
>include/linux/bitfield.h:52:28: note: expanded from macro 
> '__BF_FIELD_CHECK'
>BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull, \
>~^~~~
>include/linux/build_bug.h:39:58: note: expanded from macro 
> 'BUILD_BUG_ON_MSG'
>#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
>~^~~
>include/linux/compiler_types.h:319:22: note: expanded from macro 
> 'compiletime_assert'
>_compiletime_assert(condition, msg, __compiletime_assert_, 
> __COUNTER__)
>
> ^~~
>include/linux/compiler_types.h:307:23: note: expanded from macro 
> '_compiletime_assert'
>__compiletime_assert(condition, msg, prefix, suffix)
>~^~~
>include/linux/compiler_types.h:299:9: note: expanded from macro 
> '__compiletime_assert'
>if (!(condition))   \
>  ^

I am unable to understand this warning. It looks like it is
complaining about a FIELD_PREP sanity check that is always false, which
makes sense since we're using a constant.

Can anyone enlighten me if I've missed something?

Thanks,
Gilad



-- 
Gilad Ben-Yossef
Chief Coffee Drinker

values of β will give rise to dom!
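
One possible reading of the warning, sketched in plain C: the expansion shown
in the report compares the mask against `(typeof(_reg))~0ull`, and FIELD_PREP()
passes 0ULL as _reg, so a 32-bit mask is compared against the largest 64-bit
value, which can never be exceeded. The demo_ function names are hypothetical
and the explicit casts are only there to keep the sketch quiet; this is an
interpretation, not a statement about how the driver should be changed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical reduction of the range check: does the mask exceed the
 * largest value representable in the register type? Here the register
 * type is unsigned long long, as with the 0ULL passed by FIELD_PREP().
 */
static int demo_mask_too_wide_for_u64(uint32_t mask)
{
	return (unsigned long long)mask > ~0ull;	/* can never be true */
}

/* The same check against an 8-bit register type can actually fire. */
static int demo_mask_too_wide_for_u8(uint32_t mask)
{
	return mask > (uint8_t)~0ull;
}
```

The check is meaningful for narrow register types and degenerates to a
constant false for a 64-bit one, which matches the "always false" wording
of the compiler diagnostic.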

Re: [PATCH 6/6] Bluetooth: Add MGMT command for controller capabilities

2020-09-17 Thread kernel test robot

Hi Daniel,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bluetooth-next/master]
[also build test WARNING on next-20200916]
[cannot apply to net-next/master net/master v5.9-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Daniel-Winkler/Bluetooth-Add-new-MGMT-interface-for-advertising-add/20200917-042141
base:   
https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git 
master
config: x86_64-randconfig-s022-20200917 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.2-201-g24bdaac6-dirty
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)

   net/bluetooth/mgmt.c:3647:29: sparse: sparse: restricted __le16 degrades to 
integer
   net/bluetooth/mgmt.c:4104:9: sparse: sparse: cast to restricted __le32
>> net/bluetooth/mgmt.c:4386:27: sparse: sparse: incorrect type in assignment 
>> (different base types) @@ expected restricted __le16 [usertype] type @@  
>>got int @@
>> net/bluetooth/mgmt.c:4386:27: sparse: expected restricted __le16 
>> [usertype] type
>> net/bluetooth/mgmt.c:4386:27: sparse: got int

# 
https://github.com/0day-ci/linux/commit/171d4465b1f2811c76267c2f0acbcd0f77b5e99a
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Daniel-Winkler/Bluetooth-Add-new-MGMT-interface-for-advertising-add/20200917-042141
git checkout 171d4465b1f2811c76267c2f0acbcd0f77b5e99a
vim +4386 net/bluetooth/mgmt.c

  4360  
  4361  static int read_controller_cap(struct sock *sk, struct hci_dev *hdev,
  4362 void *data, u16 len)
  4363  {
  4364  u8 i = 0;
  4365  
  4366  /* This command will return its data in TVL format. Currently 
we only
  4367   * wish to include LE tx power parameters, so this struct can 
be given
  4368   * a fixed size as data types are not changing.
  4369   */
  4370  struct {
  4371  struct mgmt_tlv entry;
  4372  __s8 value;
  4373  } __packed cap[2];
  4374  
  4375  BT_DBG("request for %s", hdev->name);
  4376  memset(cap, 0, sizeof(cap));
  4377  
  4378  hci_dev_lock(hdev);
  4379  
  4380  /* Append LE tx power bounds */
  4381  cap[i].entry.type = MGMT_CAP_LE_TX_PWR_MIN;
  4382  cap[i].entry.length = sizeof(__s8);
  4383  cap[i].value = hdev->min_le_tx_power;
  4384  i++;
  4385  
> 4386  cap[i].entry.type = MGMT_CAP_LE_TX_PWR_MAX;
  4387  cap[i].entry.length = sizeof(__s8);
  4388  cap[i].value = hdev->max_le_tx_power;
  4389  i++;
  4390  
  4391  hci_dev_unlock(hdev);
  4392  
  4393  return mgmt_cmd_complete(sk, hdev->id, 
MGMT_OP_READ_CONTROLLER_CAP,
  4394   MGMT_STATUS_SUCCESS, cap, sizeof(cap));
  4395  }
  4396  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip
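
The sparse report above says a plain int was assigned to a field declared as
restricted __le16. In the kernel the usual remedy is an explicit conversion
such as cpu_to_le16(); whether that is the right fix for this patch is for
the author to judge. The following user-space sketch, with hypothetical
demo_ helpers, shows what such a conversion does to the bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Store a CPU-order value as little-endian bytes, independent of host endianness. */
static void demo_put_le16(uint8_t out[2], uint16_t cpu_val)
{
	out[0] = (uint8_t)(cpu_val & 0xff);	/* low byte first */
	out[1] = (uint8_t)(cpu_val >> 8);	/* high byte second */
}

/* Read a little-endian 16-bit value back into CPU order. */
static uint16_t demo_get_le16(const uint8_t in[2])
{
	return (uint16_t)(in[0] | (in[1] << 8));
}
```

On a little-endian host the conversion is a no-op at run time, which is why
such bugs tend to be caught by sparse rather than by testing.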

Re: [PATCH] net: dsa: mt7530: Add some return-value checks

2020-09-17 Thread Landen Chao

Hi Alex,

Thanks for your review and fixing.
On Thu, 2020-09-17 at 03:50 +0800, Alex Dewar wrote:
[..]
> 
> If it is not expected that these functions will throw errors (i.e.
> because the parameters passed will always be correct), we could dispense
> with the use of EINVAL errors and just use BUG*() macros instead. Let me
> know if you'd rather I fix things up in that way.
The CPU port setting is passed in via the DTS, so using EINVAL to catch an
unexpected setting is fine.
> 
> Best,
> Alex
> 
>  drivers/net/dsa/mt7530.c | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
> index 61388945d316..157d0a01faae 100644
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -945,10 +945,14 @@ static int
>  mt753x_cpu_port_enable(struct dsa_switch *ds, int port)
>  {
>   struct mt7530_priv *priv = ds->priv;
> + int ret;
>  
>   /* Setup max capability of CPU port at first */
> - if (priv->info->cpu_port_config)
> - priv->info->cpu_port_config(ds, port);
> + if (priv->info->cpu_port_config) {
> + ret = priv->info->cpu_port_config(ds, port);
> + if (ret)
> + return ret;
> + }
How about checking the return value in the caller functions, mt7530_setup()
and mt7531_setup(), too?
	if (dsa_is_cpu_port(ds, i)) {
		ret = mt753x_cpu_port_enable(ds, i);
		if (ret)
			return ret;
	} else {
[..]
regards,
landen
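
A simplified sketch of the propagation chain discussed in this reply: the
optional callback reports an error, the enable function forwards it, and the
setup loop forwards it again. The demo_ structures and the rule that only
port 6 is valid are assumptions made for the example; they do not describe
the mt7530 driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct demo_switch {
	int num_ports;
	int cpu_port;
	int (*cpu_port_config)(struct demo_switch *sw, int port);	/* optional */
};

static int demo_cpu_port_config(struct demo_switch *sw, int port)
{
	(void)sw;
	return port == 6 ? 0 : -EINVAL;	/* only port 6 is valid in this sketch */
}

static int demo_cpu_port_enable(struct demo_switch *sw, int port)
{
	int ret;

	if (sw->cpu_port_config) {
		ret = sw->cpu_port_config(sw, port);
		if (ret)
			return ret;
	}
	return 0;
}

/* The caller checks the result too, so the error reaches setup's caller. */
static int demo_setup(struct demo_switch *sw)
{
	int i, ret;

	for (i = 0; i < sw->num_ports; i++) {
		if (i != sw->cpu_port)
			continue;
		ret = demo_cpu_port_enable(sw, i);
		if (ret)
			return ret;
	}
	return 0;
}
```

Without the check in demo_setup(), the -EINVAL from the callback would be
computed and then silently dropped one level up.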

[PATCH] ath10k: qmi: Skip host capability request for Xiaomi Poco F1

2020-09-17 Thread Amit Pundir

Workaround to get WiFi working on Xiaomi Poco F1 (sdm845)
phone. We get a non-fatal QMI_ERR_MALFORMED_MSG_V01 error
message in ath10k_qmi_host_cap_send_sync(), but we can still
bring up WiFi services successfully on AOSP if we ignore it.

We suspect either the host cap is not implemented or there
may be firmware specific issues. Firmware version is
QC_IMAGE_VERSION_STRING=WLAN.HL.2.0.c3-00257-QCAHLSWMTPLZ-1

qcom,snoc-host-cap-8bit-quirk didn't help. If I use this
quirk, then the host capability request does get accepted,
but we run into fatal "msa info req rejected" error and
WiFi interface doesn't come up.

Attempts are being made to debug the failure reasons but no
luck so far. Hence this device specific workaround instead
of checking for QMI_ERR_MALFORMED_MSG_V01 error message.
Tried ath10k/WCN3990/hw1.0/wlanmdsp.mbn from the upstream
linux-firmware project but it didn't help and neither did
building board-2.bin file from stock bdwlan* files.

This workaround will be removed once we have a viable fix.
Thanks to postmarketOS guys for catching this.

Signed-off-by: Amit Pundir 
---
Device-tree for Xiaomi Poco F1(Beryllium) got merged in
qcom/arm64-for-5.10 last week
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git/commit/?id=77809cf74a8c

 drivers/net/wireless/ath/ath10k/qmi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/qmi.c 
b/drivers/net/wireless/ath/ath10k/qmi.c
index 0dee1353d395..37c5350eb8b1 100644
--- a/drivers/net/wireless/ath/ath10k/qmi.c
+++ b/drivers/net/wireless/ath/ath10k/qmi.c
@@ -651,7 +651,8 @@ static int ath10k_qmi_host_cap_send_sync(struct ath10k_qmi 
*qmi)
 
/* older FW didn't support this request, which is not fatal */
if (resp.resp.result != QMI_RESULT_SUCCESS_V01 &&
-   resp.resp.error != QMI_ERR_NOT_SUPPORTED_V01) {
+   resp.resp.error != QMI_ERR_NOT_SUPPORTED_V01 &&
+   !of_machine_is_compatible("xiaomi,beryllium")) { /* Xiaomi Poco F1 
workaround */
ath10k_err(ar, "host capability request rejected: %d\n", 
resp.resp.error);
ret = -EINVAL;
goto out;
-- 
2.7.4

[RFC PATCH] bpf: Fix potential call bpf_link_free() in atomic context

2020-09-17 Thread Muchun Song

The in_atomic() macro cannot always detect atomic context. In particular,
it cannot know about held spinlocks in non-preemptible kernels. Although
no caller currently invokes bpf_link_put() while holding a spinlock, to be
on the safe side we can rule this out for the future.

Signed-off-by: Muchun Song 
---
 kernel/bpf/syscall.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 178c147350f5..6347be0a5c82 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2345,12 +2345,8 @@ void bpf_link_put(struct bpf_link *link)
if (!atomic64_dec_and_test(&link->refcnt))
return;
 
-   if (in_atomic()) {
-   INIT_WORK(&link->work, bpf_link_put_deferred);
-   schedule_work(&link->work);
-   } else {
-   bpf_link_free(link);
-   }
+   INIT_WORK(&link->work, bpf_link_put_deferred);
+   schedule_work(&link->work);
 }
 
 static int bpf_link_release(struct inode *inode, struct file *filp)
-- 
2.20.1
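
A toy model of the "always defer" approach in this RFC: dropping the last
reference only queues the object, and the actual free runs later from a
context that may sleep. The demo_ names are hypothetical, the pending list
stands in for a work queue, and the refcount is a plain int, so this is a
single-threaded sketch rather than anything resembling the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object with an embedded work item. */
struct demo_link {
	int refcnt;
	int freed;			/* set by the deferred free */
	struct demo_link *next_work;	/* pending-work list linkage */
};

static struct demo_link *demo_pending;	/* stand-in for a work queue */

static void demo_link_free(struct demo_link *link)
{
	link->freed = 1;	/* real code would release resources and may sleep */
}

/* Safe from any context: dropping the last reference only queues work. */
static void demo_link_put(struct demo_link *link)
{
	if (--link->refcnt != 0)
		return;
	link->next_work = demo_pending;
	demo_pending = link;
}

/* Runs later in a context where sleeping is allowed; returns items processed. */
static int demo_run_pending(void)
{
	int n = 0;

	while (demo_pending) {
		struct demo_link *link = demo_pending;

		demo_pending = link->next_work;
		demo_link_free(link);
		n++;
	}
	return n;
}
```

The trade-off is that every final put now pays the cost of a deferred
callback, even when the caller could have freed the object directly.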

Re: [oss-drivers] [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-17 Thread Simon Horman

On Wed, Sep 09, 2020 at 01:06:39PM -0700, Joe Perches wrote:
> fallthrough to a separate case/default label break; isn't very readable.
> 
> Convert pseudo-keyword fallthrough; statements to a simple break; when
> the next label is case or default and the only statement in the next
> label block is break;
> 
> Found using:
> 
> $ grep-2.5.4 -rP --include=*.[ch] -n 
> "fallthrough;(\s*(case\s+\w+|default)\s*:\s*){1,7}break;" *
> 
> Miscellanea:
> 
> o Move or coalesce a couple label blocks above a default: block.
> 
> Signed-off-by: Joe Perches 

...

> diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c 
> b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c
> index 252fe06f58aa..1d5b87079104 100644
> --- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c
> +++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp6000_pcie.c
> @@ -345,7 +345,7 @@ static int matching_bar(struct nfp_bar *bar, u32 tgt, u32 
> act, u32 tok,
>   baract = NFP_CPP_ACTION_RW;
>   if (act == 0)
>   act = NFP_CPP_ACTION_RW;
> - fallthrough;
> + break;
>   case NFP_PCIE_BAR_PCIE2CPP_MapType_FIXED:
>   break;
>   default:

This is a cascading fall-through handling all map types.
I don't think this change improves readability.

...
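
To illustrate what a cascading fall-through looks like, here is a generic
sketch in which each case does its own step and then continues into the next
one, ending in a case that only breaks. The enum and the bit values are
hypothetical and are not taken from the nfp driver.

```c
#include <assert.h>

/* Hypothetical map types, ordered from most to least general. */
enum demo_map_type {
	DEMO_MAP_GENERAL,
	DEMO_MAP_BULK,
	DEMO_MAP_TARGET,
	DEMO_MAP_FIXED
};

/*
 * Each case relaxes one more field and then continues into the next,
 * more specific case. Returns a bitmask of the fields that were relaxed.
 */
static unsigned int demo_relaxed_fields(enum demo_map_type type)
{
	unsigned int relaxed = 0;

	switch (type) {
	case DEMO_MAP_GENERAL:
		relaxed |= 0x4;
		/* fall through */
	case DEMO_MAP_BULK:
		relaxed |= 0x2;
		/* fall through */
	case DEMO_MAP_TARGET:
		relaxed |= 0x1;
		/* fall through */
	case DEMO_MAP_FIXED:
		break;
	}
	return relaxed;
}
```

Replacing only the last fall-through with break would leave the behaviour
unchanged but make the final step look different from the others, which is
the readability objection raised above.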

[PATCH 3/3] docs: bpf: ringbuf.rst: fix a broken cross-reference

2020-09-17 Thread Mauro Carvalho Chehab

Sphinx warns about a broken cross-reference:

Documentation/bpf/ringbuf.rst:194: WARNING: Unknown target name: 
"bench_ringbufs.c".

It seems that the original idea was to add a reference for this file:

tools/testing/selftests/bpf/benchs/bench_ringbufs.c

However, this won't work, as that file is not part of the
documentation output dir. It would be possible to use
an extension like intersphinx to make external
references point to some website (like kernel.org)
where the file is stored, but currently we don't use it.

It would also be possible to include this file as a
literal include, placing it inside Documentation/bpf.

For now, let's take the simplest approach: just drop
the "_" markup at the end of the reference. This
should solve the warning, and it sounds quite obvious
that the file to see is in the kernel tree.

Signed-off-by: Mauro Carvalho Chehab 
---
 Documentation/bpf/ringbuf.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/bpf/ringbuf.rst b/Documentation/bpf/ringbuf.rst
index 4d4f3bcb1477..6a615cd62bda 100644
--- a/Documentation/bpf/ringbuf.rst
+++ b/Documentation/bpf/ringbuf.rst
@@ -197,7 +197,7 @@ a self-pacing notifications of new data being availability.
 being available after commit only if consumer has already caught up right up to
 the record being committed. If not, consumer still has to catch up and thus
 will see new data anyways without needing an extra poll notification.
-Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbufs.c_) show that
+Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbufs.c) show that
 this allows to achieve a very high throughput without having to resort to
 tricks like "notify only every Nth sample", which are necessary with perf
 buffer. For extreme cases, when BPF program wants more manual control of
-- 
2.26.2

[PATCH 0/3] Additional doc warning fixes for issues at next-20200915

2020-09-17 Thread Mauro Carvalho Chehab

There are a couple of new warnings introduced in linux-next.

This small patch series addresses them.

The complete series addressing (almost) all doc warnings is at:

https://git.linuxtv.org/mchehab/experimental.git/log/?h=doc-fixes

I'll keep rebasing such tree until we get rid of all doc warnings upstream,
hopefully in time for Kernel 5.10.

Mauro Carvalho Chehab (3):
  docs: kasan.rst: add two missing blank lines
  mm: pagemap.h: fix two kernel-doc markups
  docs: bpf: ringbuf.rst: fix a broken cross-reference

 Documentation/bpf/ringbuf.rst | 2 +-
 Documentation/dev-tools/kasan.rst | 2 ++
 include/linux/pagemap.h   | 8 
 3 files changed, 7 insertions(+), 5 deletions(-)

-- 
2.26.2

Re: resolve_btfids breaks kernel cross-compilation

On Wed, Sep 16, 2020 at 02:47:33PM -0500, Seth Forshee wrote:
> The requirement to build resolve_btfids whenever CONFIG_DEBUG_INFO_BTF
> is enabled breaks some cross builds. For example, when building a 64-bit
> powerpc kernel on amd64 I get:
> 
>  Auto-detecting system features:
>  ...libelf: [ on  ]
>  ...  zlib: [ on  ]
>  ...   bpf: [ OFF ]
>  
>  BPF API too old
>  make[6]: *** [Makefile:295: bpfdep] Error 1
> 
> The contents of tools/bpf/resolve_btfids/feature/test-bpf.make.output:
> 
>  In file included from 
> /home/sforshee/src/u-k/unstable/tools/arch/powerpc/include/uapi/asm/bitsperlong.h:11,
>   from /usr/include/asm-generic/int-ll64.h:12,
>   from /usr/include/asm-generic/types.h:7,
>   from /usr/include/x86_64-linux-gnu/asm/types.h:1,
>   from 
> /home/sforshee/src/u-k/unstable/tools/include/linux/types.h:10,
>   from 
> /home/sforshee/src/u-k/unstable/tools/include/uapi/linux/bpf.h:11,
>   from test-bpf.c:3:
>  
> /home/sforshee/src/u-k/unstable/tools/include/asm-generic/bitsperlong.h:14:2: 
> error: #error Inconsistent word size. Check asm/bitsperlong.h
> 14 | #error Inconsistent word size. Check asm/bitsperlong.h
>|  ^
> 
> This is because tools/arch/powerpc/include/uapi/asm/bitsperlong.h sets
> __BITS_PER_LONG based on the predefined compiler macro __powerpc64__,
> which is not defined by the host compiler. What can we do to get cross
> builds working again?

could you please share the command line and setup?

thanks,
jirka
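
A small sketch of the mismatch described in the report, under the assumption
that the arch header picks the word size purely from a target-only predefined
macro. The DEMO_/demo_ names are hypothetical; the real headers use
__BITS_PER_LONG and a preprocessor #error rather than a run-time comparison.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical copy of the arch header's logic: pick the word size from a
 * target-only predefined macro. A host compiler for another architecture
 * does not define it, so the 32-bit branch is taken.
 */
#ifdef __powerpc64__
#define DEMO_BITS_PER_LONG 64
#else
#define DEMO_BITS_PER_LONG 32
#endif

/* What the compiler actually uses for long on the machine running the tool. */
static int demo_actual_long_bits(void)
{
	return (int)(sizeof(long) * CHAR_BIT);
}

/* Mirrors the consistency check that ends in "#error Inconsistent word size". */
static int demo_word_size_consistent(void)
{
	return DEMO_BITS_PER_LONG == demo_actual_long_bits();
}
```

Built with a 64-bit host compiler that is not a powerpc64 one,
demo_word_size_consistent() reports a mismatch, which corresponds to the
build failure quoted above.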

[PATCH net-next 4/8] devlink: Support get and set state of port function

devlink port function can be in active or inactive state.
Allow users to get and set port function's state.

Example of a PCI SF port which supports a port function:
Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device
$ devlink port show
netdevsim/netdevsim10/0: type eth netdev eni10np1 flavour physical port 1 
splittable false

$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0
$ devlink port add netdevsim/netdevsim10/11 flavour pcisf pfnum 0 sfnum 44
$ devlink port show netdevsim/netdevsim10/11
netdevsim/netdevsim10/11: type eth netdev eni10npf0sf44 flavour pcisf 
controller 0 pfnum 0 sfnum 44 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port function set netdevsim/netdevsim10/11 hw_addr 00:11:22:33:44:55 
state active

$ devlink port show netdevsim/netdevsim10/11 -jp
{
"port": {
"netdevsim/netdevsim10/11": {
"type": "eth",
"netdev": "eni10npf0sf44",
"flavour": "pcisf",
"controller": 0,
"pfnum": 0,
"sfnum": 44,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:11:22:33:44:55",
"state": "active"
}
}
}
}

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 include/net/devlink.h| 20 ++
 include/uapi/linux/devlink.h |  6 +++
 net/core/devlink.c   | 77 +++-
 3 files changed, 101 insertions(+), 2 deletions(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index ebab2c0360d0..500c22835686 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1200,6 +1200,26 @@ struct devlink_ops {
int (*port_function_hw_addr_set)(struct devlink *devlink, struct 
devlink_port *port,
 const u8 *hw_addr, int hw_addr_len,
 struct netlink_ext_ack *extack);
+   /**
+* @port_function_state_get: Port function's state get function.
+*
+* Should be used by device drivers to report the state of a function 
managed
+* by the devlink port. Driver should return -EOPNOTSUPP if it doesn't 
support port
+* function handling for a particular port.
+*/
+   int (*port_function_state_get)(struct devlink *devlink, struct 
devlink_port *port,
+  enum devlink_port_function_state *state,
+  struct netlink_ext_ack *extack);
+   /**
+* @port_function_state_set: Port function's state set function.
+*
+* Should be used by device drivers to set the state of a function 
managed
+* by the devlink port. Driver should return -EOPNOTSUPP if it doesn't 
support port
+* function handling for a particular port.
+*/
+   int (*port_function_state_set)(struct devlink *devlink, struct 
devlink_port *port,
+  enum devlink_port_function_state state,
+  struct netlink_ext_ack *extack);
/**
 * @port_new: Port add function.
 *
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 09c41b9ce407..8e513f1cd638 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -518,9 +518,15 @@ enum devlink_resource_unit {
 enum devlink_port_function_attr {
DEVLINK_PORT_FUNCTION_ATTR_UNSPEC,
DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, /* binary */
+   DEVLINK_PORT_FUNCTION_ATTR_STATE,   /* u8 */
 
__DEVLINK_PORT_FUNCTION_ATTR_MAX,
DEVLINK_PORT_FUNCTION_ATTR_MAX = __DEVLINK_PORT_FUNCTION_ATTR_MAX - 1
 };
 
+enum devlink_port_function_state {
+   DEVLINK_PORT_FUNCTION_STATE_INACTIVE,
+   DEVLINK_PORT_FUNCTION_STATE_ACTIVE,
+};
+
 #endif /* _UAPI_LINUX_DEVLINK_H_ */
diff --git a/net/core/devlink.c b/net/core/devlink.c
index d152489e48da..c82098cb75da 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -87,6 +87,9 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(devlink_hwerr);
 
 static const struct nla_policy 
devlink_function_nl_policy[DEVLINK_PORT_FUNCTION_ATTR_MAX + 1] = {
[DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR] = { .type = NLA_BINARY },
+   [DEVLINK_PORT_FUNCTION_ATTR_STATE] =
+   NLA_POLICY_RANGE(NLA_U8, DEVLINK_PORT_FUNCTION_STATE_INACTIVE,
+DEVLINK_PORT_FUNCTION_STATE_ACTIVE),
 };
 
 static LIST_HEAD(devlink_list);
@@ -595,6 +598,40 @@ devlink_port_function_hw_addr_fill(struct devlink 
*devlink, const struct devlink
return 0;
 }
 
+static bool devlink_port_function_state_valid(u8 state)
+{
+   return state == DEVLINK_PORT_FUNCTION_STATE_INACTIVE ||
+  state == DEVLINK_PORT_FUNCTION_STATE_ACTIVE;
+}
+
+static int devlink_port_function_state_fill(struct devlink *devlink, const

[PATCH net-next 0/8] devlink: Add SF add/delete devlink ops

Hi Dave, Jakub,

Similar to a PCI VF, a PCI SF represents a portion of the device.
A PCI SF is represented using a new devlink port flavour.

This short series implements a small part of the RFC described in detail at [1]
and [2].

This series:
(a) Extends devlink core to expose a new devlink port flavour 'pcisf'
(b) Exposes a new user interface to add/delete a devlink port
(c) Extends the netdevsim driver to simulate PCI PF and SF ports
(d) Adds a port function state attribute

Patch summary:
Patch-1 Extends devlink to expose new PCI SF port flavour
Patch-2 Extends devlink to let user add, delete devlink port
Patch-3 Prepares code to handle multiple port function attributes
Patch-4 Extends devlink to let user get, set function state
Patch-5 Extends netdevsim driver to simulate PCI PF ports
Patch-6 Extends netdevsim driver to simulate hw_addr get/set
Patch-7 Extends netdevsim driver to simulate function state get/set
Patch-8 Extends netdevsim driver to simulate PCI SF ports

[1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/
[2] https://marc.info/?l=linux-netdev&m=15855592851&w=2

Parav Pandit (8):
  devlink: Introduce PCI SF port flavour and port attribute
  devlink: Support add and delete devlink port
  devlink: Prepare code to fill multiple port function attributes
  devlink: Support get and set state of port function
  netdevsim: Add support for add and delete of a PCI PF port
  netdevsim: Simulate get/set hardware address of a PCI port
  netdevsim: Simulate port function state for a PCI port
  netdevsim: Add support for add and delete PCI SF port

 drivers/net/netdevsim/Makefile|   3 +-
 drivers/net/netdevsim/dev.c   |  14 +
 drivers/net/netdevsim/netdevsim.h |  32 ++
 drivers/net/netdevsim/port_function.c | 498 ++
 include/net/devlink.h |  75 
 include/uapi/linux/devlink.h  |  13 +
 net/core/devlink.c| 230 ++--
 7 files changed, 840 insertions(+), 25 deletions(-)
 create mode 100644 drivers/net/netdevsim/port_function.c

-- 
2.26.2

[PATCH net-next 2/8] devlink: Support add and delete devlink port

Extend the devlink interface to let the user add and delete a port.
Connect user requests to the driver so that it can add or delete
such a port in the device.

The devlink instance lock is not held when the driver routines are
invoked. This lets the driver register and unregister several devlink
objects (port, health reporter, resource, etc.) using existing
devlink APIs.
It also helps to use the port registration code uniformly during
driver unload and during port deletion initiated by the user.

Examples of add, show and delete commands:
Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device

$ devlink port show netdevsim/netdevsim10/0
netdevsim/netdevsim10/0: type eth netdev eni10np1 flavour physical port 1 
splittable false

$ devlink port add netdevsim/netdevsim10 flavour pcipf pfnum 0

$ devlink port show netdevsim/netdevsim10/1
netdevsim/netdevsim10/1: type eth netdev eni10npf0 flavour pcipf controller 0 
pfnum 0 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port show netdevsim/netdevsim10/1 -jp
{
"port": {
"netdevsim/netdevsim10/1": {
"type": "eth",
"netdev": "eni10npf0",
"flavour": "pcipf",
"controller": 0,
"pfnum": 0,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:00:00:00:00:00",
"state": "inactive"
}
}
}
}

$ devlink port del netdevsim/netdevsim10/1

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 include/net/devlink.h | 38 
 net/core/devlink.c| 67 +++
 2 files changed, 105 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 1edb558125b0..ebab2c0360d0 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -142,6 +142,17 @@ struct devlink_port {
struct mutex reporters_lock; /* Protects reporter_list */
 };
 
+struct devlink_port_new_attrs {
+   enum devlink_port_flavour flavour;
+   unsigned int port_index;
+   u32 controller;
+   u32 sfnum;
+   u16 pfnum;
+   u8 port_index_valid:1,
+  controller_valid:1,
+  sfnum_valid:1;
+};
+
 struct devlink_sb_pool_info {
enum devlink_sb_pool_type pool_type;
u32 size;
@@ -1189,6 +1200,33 @@ struct devlink_ops {
int (*port_function_hw_addr_set)(struct devlink *devlink, struct 
devlink_port *port,
 const u8 *hw_addr, int hw_addr_len,
 struct netlink_ext_ack *extack);
+   /**
+* @port_new: Port add function.
+*
+* Should be used by device driver to let caller add new port of a 
specified flavour
+* with optional attributes.
+* Driver should return -EOPNOTSUPP if it doesn't support port addition 
of a specified
+* flavour or specified attributes. Driver should set extack error 
message in case of fail
+* to add the port.
+* devlink core does not hold a devlink instance lock when this 
callback is invoked.
+* Driver must ensure synchronization when adding or deleting a port. 
Driver must
+* register a port with devlink core.
+*/
+   int (*port_new)(struct devlink *devlink, const struct 
devlink_port_new_attrs *attrs,
+   struct netlink_ext_ack *extack);
+   /**
+* @port_del: Port delete function.
+*
+* Should be used by device driver to let caller delete port which was 
previously created
+* using port_new() callback.
+* Driver should return -EOPNOTSUPP if it doesn't support port deletion.
+* Driver should set extack error message in case of fail to delete the 
port.
+* devlink core does not hold a devlink instance lock when this 
callback is invoked.
+* Driver must ensure synchronization when adding or deleting a port. 
Driver must
+* register a port with devlink core.
+*/
+   int (*port_del)(struct devlink *devlink, unsigned int port_index,
+   struct netlink_ext_ack *extack);
 };
 
 static inline void *devlink_priv(struct devlink *devlink)
diff --git a/net/core/devlink.c b/net/core/devlink.c
index fada660fd515..e93730065c57 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -991,6 +991,57 @@ static int devlink_nl_cmd_port_unsplit_doit(struct sk_buff 
*skb,
return devlink_port_unsplit(devlink, port_index, info->extack);
 }
 
+static int devlink_nl_cmd_port_new_doit(struct sk_buff *skb, struct genl_info 
*info)
+{
+   struct netlink_ext_ack *extack = info->extack;
+   struct devlink_port_new_attrs new_attrs = {};
+   struct devlink *devlink = info->user_ptr[0];
+
+   if (!info->attrs[DEVLINK_ATTR_PORT_FLAVOUR] ||
+   !info->attrs[DEVLINK_ATTR_PORT_

[PATCH net-next 5/8] netdevsim: Add support for add and delete of a PCI PF port

Simulate PCI PF ports. Allow user to create one or more PCI PF ports.

Examples:

Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device
$ devlink port show
netdevsim/netdevsim10/0: type eth netdev eni10np1 flavour physical port 1 
splittable false

Add and show devlink port of flavour 'pcipf' for PF number 0.

$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0

$ devlink port show netdevsim/netdevsim10/10
netdevsim/netdevsim10/10: type eth netdev eni10npf0 flavour pcipf controller 0 
pfnum 0 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

Delete the newly added devlink port.
$ devlink port del netdevsim/netdevsim10/10

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 drivers/net/netdevsim/Makefile|   3 +-
 drivers/net/netdevsim/dev.c   |  10 +
 drivers/net/netdevsim/netdevsim.h |  19 ++
 drivers/net/netdevsim/port_function.c | 337 ++
 4 files changed, 368 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/netdevsim/port_function.c

diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
index ade086eed955..e69e895af62c 100644
--- a/drivers/net/netdevsim/Makefile
+++ b/drivers/net/netdevsim/Makefile
@@ -3,7 +3,8 @@
 obj-$(CONFIG_NETDEVSIM) += netdevsim.o
 
 netdevsim-objs := \
-   netdev.o dev.o ethtool.o fib.o bus.o health.o udp_tunnels.o
+   netdev.o dev.o ethtool.o fib.o bus.o health.o udp_tunnels.o \
+   port_function.o
 
 ifeq ($(CONFIG_BPF_SYSCALL),y)
 netdevsim-objs += \
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index 32f339fedb21..e3b81c8b5125 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -884,6 +884,8 @@ static const struct devlink_ops nsim_dev_devlink_ops = {
.trap_group_set = nsim_dev_devlink_trap_group_set,
.trap_policer_set = nsim_dev_devlink_trap_policer_set,
.trap_policer_counter_get = nsim_dev_devlink_trap_policer_counter_get,
+   .port_new = nsim_dev_devlink_port_new,
+   .port_del = nsim_dev_devlink_port_del,
 };
 
 #define NSIM_DEV_MAX_MACS_DEFAULT 32
@@ -1017,6 +1019,8 @@ static int nsim_dev_reload_create(struct nsim_dev 
*nsim_dev,
  nsim_dev->ddir,
  nsim_dev,
&nsim_dev_take_snapshot_fops);
+
+   nsim_dev_port_function_enable(nsim_dev);
return 0;
 
 err_health_exit:
@@ -1050,6 +1054,7 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
nsim_dev->max_macs = NSIM_DEV_MAX_MACS_DEFAULT;
nsim_dev->test1 = NSIM_DEV_TEST1_DEFAULT;
spin_lock_init(&nsim_dev->fa_cookie_lock);
+   nsim_dev_port_function_init(nsim_dev);
 
dev_set_drvdata(&nsim_bus_dev->dev, nsim_dev);
 
@@ -1097,6 +1102,7 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
if (err)
goto err_bpf_dev_exit;
 
+   nsim_dev_port_function_enable(nsim_dev);
devlink_params_publish(devlink);
devlink_reload_enable(devlink);
return 0;
@@ -1131,6 +1137,9 @@ static void nsim_dev_reload_destroy(struct nsim_dev 
*nsim_dev)
 
if (devlink_is_reload_failed(devlink))
return;
+
+   /* Disable and destroy any user created devlink ports */
+   nsim_dev_port_function_disable(nsim_dev);
debugfs_remove(nsim_dev->take_snapshot);
nsim_dev_port_del_all(nsim_dev);
nsim_dev_health_exit(nsim_dev);
@@ -1155,6 +1164,7 @@ void nsim_dev_remove(struct nsim_bus_dev *nsim_bus_dev)
  ARRAY_SIZE(nsim_devlink_params));
devlink_unregister(devlink);
devlink_resources_unregister(devlink, NULL);
+   nsim_dev_port_function_exit(nsim_dev);
devlink_free(devlink);
 }
 
diff --git a/drivers/net/netdevsim/netdevsim.h 
b/drivers/net/netdevsim/netdevsim.h
index 0c86561e6d8d..aec3c4d5fda7 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -213,6 +213,16 @@ struct nsim_dev {
bool ipv4_only;
u32 sleep;
} udp_ports;
+   struct {
+   refcount_t refcount; /* refcount along with disable_complete 
serializes
+ * port operations with port function 
disablement
+ * during driver unload.
+ */
+   struct completion disable_complete;
+   struct list_head head;
+   struct ida ida;
+   struct ida pfnum_ida;
+   } port_functions;
 };
 
 static inline struct net *nsim_dev_net(struct nsim_dev *nsim_dev)
@@ -283,3 +293,12 @@ struct nsim_bus_dev {
 
 int nsim_bus_init(void);
 void nsim_bus_exit(void);
+
+void nsim_dev_port_function_init(struct nsim_dev *nsim_dev);
+void nsim_dev_port_function_exit(struc

[PATCH net-next 8/8] netdevsim: Add support for add and delete PCI SF port

Simulate PCI SF ports. Allow user to create one or more PCI SF ports.

Examples:

Create a PCI PF and PCI SF port.
Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device
$ devlink port show
netdevsim/netdevsim10/0: type eth netdev eni10np1 flavour physical port 1 
splittable false

$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0
$ devlink port add netdevsim/netdevsim10/11 flavour pcisf pfnum 0 sfnum 44
$ devlink port show netdevsim/netdevsim10/11
netdevsim/netdevsim10/11: type eth netdev eni10npf0sf44 flavour pcisf 
controller 0 pfnum 0 sfnum 44 external true splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port function set netdevsim/netdevsim10/11 hw_addr 00:11:22:33:44:55 
state active

$ devlink port show netdevsim/netdevsim10/11 -jp
{
"port": {
"netdevsim/netdevsim10/11": {
"type": "eth",
"netdev": "eni10npf0sf44",
"flavour": "pcisf",
"controller": 0,
"pfnum": 0,
"sfnum": 44,
"external": true,
"splittable": false,
"function": {
"hw_addr": "00:11:22:33:44:55",
"state": "active"
}
}
}
}

Delete the newly added devlink port.
$ devlink port del netdevsim/netdevsim10/11

Add devlink port of flavour 'pcisf' where port index and sfnum are
auto assigned by driver.
$ devlink port add netdevsim/netdevsim10 flavour pcisf controller 0 pfnum 0
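
The explicit versus auto-assigned sfnum behaviour described above can be
modelled in user space roughly as follows. This is only a sketch under
assumptions, not the netdevsim code: struct id_pool, id_alloc_range(),
sfnum_alloc(), sfnum_free() and SKETCH_MAX_IDS are hypothetical names that
imitate how the diff below uses ida_alloc_range() for a user-supplied sfnum
and ida_alloc() otherwise.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_MAX_IDS 64	/* arbitrary limit chosen for this sketch */

/* Hypothetical stand-in for a kernel IDA: one flag per ID. */
struct id_pool {
	bool used[SKETCH_MAX_IDS];
};

/* Allocate the lowest free ID in [min, max]; return the ID or -ENOSPC. */
int id_alloc_range(struct id_pool *pool, unsigned int min, unsigned int max)
{
	unsigned int id;

	if (max >= SKETCH_MAX_IDS)
		max = SKETCH_MAX_IDS - 1;
	for (id = min; id <= max; id++) {
		if (!pool->used[id]) {
			pool->used[id] = true;
			return (int)id;
		}
	}
	return -ENOSPC;
}

/*
 * If the user supplied an sfnum, only that exact number is acceptable;
 * otherwise pick any free number.
 */
int sfnum_alloc(struct id_pool *pool, bool sfnum_valid, unsigned int sfnum)
{
	if (sfnum_valid)
		return id_alloc_range(pool, sfnum, sfnum);
	return id_alloc_range(pool, 0, SKETCH_MAX_IDS - 1);
}

/* Release a previously allocated number. */
void sfnum_free(struct id_pool *pool, unsigned int sfnum)
{
	if (sfnum < SKETCH_MAX_IDS)
		pool->used[sfnum] = false;
}
```

With this model, asking twice for sfnum 44 fails the second time, while a
request without an sfnum simply receives the lowest free number.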

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 drivers/net/netdevsim/netdevsim.h |  1 +
 drivers/net/netdevsim/port_function.c | 95 +--
 2 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/drivers/net/netdevsim/netdevsim.h 
b/drivers/net/netdevsim/netdevsim.h
index 0ea9705eda38..c70782e444d5 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -222,6 +222,7 @@ struct nsim_dev {
struct list_head head;
struct ida ida;
struct ida pfnum_ida;
+   struct ida sfnum_ida;
} port_functions;
 };
 
diff --git a/drivers/net/netdevsim/port_function.c 
b/drivers/net/netdevsim/port_function.c
index 01587b54f0e0..3a90de50b152 100644
--- a/drivers/net/netdevsim/port_function.c
+++ b/drivers/net/netdevsim/port_function.c
@@ -13,10 +13,12 @@ struct nsim_port_function {
unsigned int port_index;
enum devlink_port_flavour flavour;
u32 controller;
+   u32 sfnum;
u16 pfnum;
struct nsim_port_function *pf_port; /* Valid only for SF port */
u8 hw_addr[ETH_ALEN];
u8 state; /* enum devlink_port_function_state */
+   int refcount; /* Counts how many sf ports are attached to this pf 
port. */
 };
 
 void nsim_dev_port_function_init(struct nsim_dev *nsim_dev)
@@ -25,10 +27,13 @@ void nsim_dev_port_function_init(struct nsim_dev *nsim_dev)
INIT_LIST_HEAD(&nsim_dev->port_functions.head);
ida_init(&nsim_dev->port_functions.ida);
ida_init(&nsim_dev->port_functions.pfnum_ida);
+   ida_init(&nsim_dev->port_functions.sfnum_ida);
 }
 
 void nsim_dev_port_function_exit(struct nsim_dev *nsim_dev)
 {
+   WARN_ON(!ida_is_empty(&nsim_dev->port_functions.sfnum_ida));
+   ida_destroy(&nsim_dev->port_functions.sfnum_ida);
WARN_ON(!ida_is_empty(&nsim_dev->port_functions.pfnum_ida));
ida_destroy(&nsim_dev->port_functions.pfnum_ida);
WARN_ON(!ida_is_empty(&nsim_dev->port_functions.ida));
@@ -119,9 +124,24 @@ nsim_devlink_port_function_alloc(struct nsim_dev *dev, 
const struct devlink_port
goto fn_ida_err;
port->pfnum = ret;
break;
+   case DEVLINK_PORT_FLAVOUR_PCI_SF:
+   if (attrs->sfnum_valid)
+   ret = ida_alloc_range(&dev->port_functions.sfnum_ida, 
attrs->sfnum,
+ attrs->sfnum, GFP_KERNEL);
+   else
+   ret = ida_alloc(&dev->port_functions.sfnum_ida, 
GFP_KERNEL);
+   if (ret < 0)
+   goto fn_ida_err;
+   port->sfnum = ret;
+   port->pfnum = attrs->pfnum;
+   break;
default:
break;
};
+   /* refcount_t is not needed as port is protected by 
port_functions.mutex.
+* This count is to keep track of how many SF ports are attached a PF 
port.
+*/
+   port->refcount = 1;
return port;
 
 fn_ida_err:
@@ -137,6 +157,9 @@ static void nsim_devlink_port_function_free(struct nsim_dev 
*dev, struct nsim_po
case DEVLINK_PORT_FLAVOUR_PCI_PF:
ida_simple_remove(&dev->port_functions.pfnum_ida, port->pfnum);
break;
+   case DEVLINK_PORT_FLAVOUR_PCI_SF:
+   ida_simple_remove(&dev->port_functions.sfnum_ida, port->sfnum);
+   break;

[PATCH net-next 3/8] devlink: Prepare code to fill multiple port function attributes

Prepare the code to fill zero or more optional port function attributes.
A subsequent patch makes use of this to fill more port function
attributes.
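
The pattern this prepares for can be sketched in user space as below. It is
a simplified model under assumptions, not the devlink code: struct
msg_sketch, attr_get_fn, attr_fill(), nest_fill() and the three sample
getters are hypothetical. The idea it illustrates is the one in the diff:
each optional attribute has its own fill helper, -EOPNOTSUPP is not treated
as an error, and the nest is kept only if at least one attribute was added
and nothing failed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a netlink message; it only counts attributes. */
struct msg_sketch {
	int nattrs;
};

/* Hypothetical getter: returns 0, -EOPNOTSUPP or another negative errno. */
typedef int (*attr_get_fn)(int *value);

/* Sample getters used to exercise the sketch. */
int get_ok(int *value) { *value = 1; return 0; }
int get_unsupported(int *value) { (void)value; return -EOPNOTSUPP; }
int get_fail(int *value) { (void)value; return -EINVAL; }

/* Fill one optional attribute; a missing getter or -EOPNOTSUPP is fine. */
int attr_fill(struct msg_sketch *msg, attr_get_fn get, bool *msg_updated)
{
	int value;
	int err;

	if (!get)
		return 0;
	err = get(&value);
	if (err == -EOPNOTSUPP)
		return 0;
	if (err)
		return err;
	msg->nattrs++;		/* models nla_put() */
	*msg_updated = true;
	return 0;
}

/* Fill all attributes; *keep tells whether the nest is ended or cancelled. */
int nest_fill(struct msg_sketch *msg, const attr_get_fn *getters, size_t n,
	      bool *keep)
{
	bool msg_updated = false;
	int saved = msg->nattrs;
	int err = 0;
	size_t i;

	for (i = 0; i < n && !err; i++)
		err = attr_fill(msg, getters[i], &msg_updated);

	*keep = !err && msg_updated;
	if (!*keep)
		msg->nattrs = saved;	/* models nla_nest_cancel() */
	return err;
}
```

Adding another optional attribute then only means adding one more fill
helper, without touching the end/cancel decision.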

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 net/core/devlink.c | 53 +-
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index e93730065c57..d152489e48da 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -570,6 +570,31 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg,
return 0;
 }
 
+static int
+devlink_port_function_hw_addr_fill(struct devlink *devlink, const struct 
devlink_ops *ops,
+  struct devlink_port *port, struct sk_buff 
*msg,
+  struct netlink_ext_ack *extack, bool 
*msg_updated)
+{
+   u8 hw_addr[MAX_ADDR_LEN];
+   int hw_addr_len;
+   int err;
+
+   if (!ops->port_function_hw_addr_get)
+   return 0;
+
+   err = ops->port_function_hw_addr_get(devlink, port, hw_addr, 
&hw_addr_len, extack);
+   if (err) {
+   if (err == -EOPNOTSUPP)
+   return 0;
+   return err;
+   }
+   err = nla_put(msg, DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, hw_addr_len, 
hw_addr);
+   if (err)
+   return err;
+   *msg_updated = true;
+   return 0;
+}
+
 static int
 devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct devlink_port 
*port,
   struct netlink_ext_ack *extack)
@@ -577,36 +602,16 @@ devlink_nl_port_function_attrs_put(struct sk_buff *msg, 
struct devlink_port *por
struct devlink *devlink = port->devlink;
const struct devlink_ops *ops;
struct nlattr *function_attr;
-   bool empty_nest = true;
-   int err = 0;
+   bool msg_updated = false;
+   int err;
 
function_attr = nla_nest_start_noflag(msg, DEVLINK_ATTR_PORT_FUNCTION);
if (!function_attr)
return -EMSGSIZE;
 
ops = devlink->ops;
-   if (ops->port_function_hw_addr_get) {
-   int hw_addr_len;
-   u8 hw_addr[MAX_ADDR_LEN];
-
-   err = ops->port_function_hw_addr_get(devlink, port, hw_addr, 
&hw_addr_len, extack);
-   if (err == -EOPNOTSUPP) {
-   /* Port function attributes are optional for a port. If 
port doesn't
-* support function attribute, returning -EOPNOTSUPP is 
not an error.
-*/
-   err = 0;
-   goto out;
-   } else if (err) {
-   goto out;
-   }
-   err = nla_put(msg, DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, 
hw_addr_len, hw_addr);
-   if (err)
-   goto out;
-   empty_nest = false;
-   }
-
-out:
-   if (err || empty_nest)
+   err = devlink_port_function_hw_addr_fill(devlink, ops, port, msg, 
extack, &msg_updated);
+   if (err || !msg_updated)
nla_nest_cancel(msg, function_attr);
else
nla_nest_end(msg, function_attr);
-- 
2.26.2

[PATCH net-next 6/8] netdevsim: Simulate get/set hardware address of a PCI port

Allow users to get/set hardware address for the PCI port.

The example below creates one devlink port, queries it, and sets its
hardware address.

Example of a PCI PF port which supports a port function hw_addr set:
Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device
$ devlink port show
netdevsim/netdevsim10/0: type eth netdev eni10np1 flavour physical port 1 
splittable false

$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0
$ devlink port show netdevsim/netdevsim10/10
netdevsim/netdevsim10/10: type eth netdev eni10npf0 flavour pcipf controller 0 
pfnum 0 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port function set netdevsim/netdevsim10/10 hw_addr 00:11:22:33:44:55

$ devlink port show netdevsim/netdevsim10/10 -jp
{
"port": {
"netdevsim/netdevsim10/10": {
"type": "eth",
"netdev": "eni10npf0",
"flavour": "pcipf",
"controller": 0,
"pfnum": 0,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:11:22:33:44:55"
}
}
}
}

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 drivers/net/netdevsim/dev.c   |  2 ++
 drivers/net/netdevsim/netdevsim.h |  6 
 drivers/net/netdevsim/port_function.c | 44 +++
 3 files changed, 52 insertions(+)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index e3b81c8b5125..ef2e293f358b 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -886,6 +886,8 @@ static const struct devlink_ops nsim_dev_devlink_ops = {
.trap_policer_counter_get = nsim_dev_devlink_trap_policer_counter_get,
.port_new = nsim_dev_devlink_port_new,
.port_del = nsim_dev_devlink_port_del,
+   .port_function_hw_addr_get = nsim_dev_port_function_hw_addr_get,
+   .port_function_hw_addr_set = nsim_dev_port_function_hw_addr_set,
 };
 
 #define NSIM_DEV_MAX_MACS_DEFAULT 32
diff --git a/drivers/net/netdevsim/netdevsim.h 
b/drivers/net/netdevsim/netdevsim.h
index aec3c4d5fda7..8dc8f4e5dcd8 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -302,3 +302,9 @@ int nsim_dev_devlink_port_new(struct devlink *devlink, 
const struct devlink_port
  struct netlink_ext_ack *extack);
 int nsim_dev_devlink_port_del(struct devlink *devlink, unsigned int port_index,
  struct netlink_ext_ack *extack);
+int nsim_dev_port_function_hw_addr_get(struct devlink *devlink, struct 
devlink_port *port,
+  u8 *hw_addr, int *hw_addr_len,
+  struct netlink_ext_ack *extack);
+int nsim_dev_port_function_hw_addr_set(struct devlink *devlink, struct 
devlink_port *port,
+  const u8 *hw_addr, int hw_addr_len,
+  struct netlink_ext_ack *extack);
diff --git a/drivers/net/netdevsim/port_function.c 
b/drivers/net/netdevsim/port_function.c
index 9a1634898c7d..0053f6f6d530 100644
--- a/drivers/net/netdevsim/port_function.c
+++ b/drivers/net/netdevsim/port_function.c
@@ -15,6 +15,7 @@ struct nsim_port_function {
u32 controller;
u16 pfnum;
struct nsim_port_function *pf_port; /* Valid only for SF port */
+   u8 hw_addr[ETH_ALEN];
 };
 
 void nsim_dev_port_function_init(struct nsim_dev *nsim_dev)
@@ -335,3 +336,46 @@ void nsim_dev_port_function_disable(struct nsim_dev 
*nsim_dev)
nsim_devlink_port_function_free(nsim_dev, port);
}
 }
+
+static struct nsim_port_function *nsim_dev_to_port_function(struct nsim_dev 
*nsim_dev,
+   struct devlink_port 
*dl_port)
+{
+   if (nsim_dev_port_index_internal(nsim_dev, dl_port->index))
+   return ERR_PTR(-EOPNOTSUPP);
+   return container_of(dl_port, struct nsim_port_function, dl_port);
+}
+
+int nsim_dev_port_function_hw_addr_get(struct devlink *devlink, struct 
devlink_port *dl_port,
+  u8 *hw_addr, int *hw_addr_len,
+  struct netlink_ext_ack *extack)
+{
+   struct nsim_dev *nsim_dev = devlink_priv(devlink);
+   struct nsim_port_function *port;
+
+   port = nsim_dev_to_port_function(nsim_dev, dl_port);
+   if (IS_ERR(port))
+   return PTR_ERR(port);
+
+   memcpy(hw_addr, port->hw_addr, ETH_ALEN);
+   *hw_addr_len = ETH_ALEN;
+   return 0;
+}
+
+int nsim_dev_port_function_hw_addr_set(struct devlink *devlink, struct 
devlink_port *dl_port,
+  const u8 *hw_addr, int hw_addr_len,
+  struct netlink_ext_ack *extack)
+{
+   struct nsim_dev *nsim_dev = devlink_priv(devlink);

[PATCH net-next 7/8] netdevsim: Simulate port function state for a PCI port

Simulate port function state of a PCI port.
This enables users to get and set the state of the PCI port function.
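
A rough user-space model of the get/set behaviour is sketched below. The
names (struct port_sketch, enum fn_state_sketch, port_state_get(),
port_state_set()) are hypothetical and this is not the netdevsim
implementation. It assumes, as the hw_addr callbacks of the previous patch
do, that ports not created by the user return -EOPNOTSUPP. The -EINVAL
check is an extra assumption of the sketch; in this series the valid range
is enforced by the netlink policy in devlink core.

```c
#include <errno.h>
#include <stdbool.h>

/* Mirrors the two states the series defines: inactive and active. */
enum fn_state_sketch {
	FN_STATE_INACTIVE,
	FN_STATE_ACTIVE,
};

struct port_sketch {
	bool internal;		/* true for ports not created by the user */
	unsigned char state;	/* holds an enum fn_state_sketch value */
};

/* Report the function state of a user-created port. */
int port_state_get(const struct port_sketch *port, enum fn_state_sketch *state)
{
	if (port->internal)
		return -EOPNOTSUPP;
	*state = (enum fn_state_sketch)port->state;
	return 0;
}

/* Store a new function state after a basic range check. */
int port_state_set(struct port_sketch *port, enum fn_state_sketch state)
{
	if (port->internal)
		return -EOPNOTSUPP;
	if (state != FN_STATE_INACTIVE && state != FN_STATE_ACTIVE)
		return -EINVAL;
	port->state = (unsigned char)state;
	return 0;
}
```

A newly added port starts out inactive, which matches the
"state inactive" output shown before the first set command below.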

Example of a PCI PF port which supports a port function:
Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device
$ devlink port show
netdevsim/netdevsim10/0: type eth netdev eni10np1 flavour physical port 1 
splittable false

$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0

$ devlink port show netdevsim/netdevsim10/10
netdevsim/netdevsim10/10: type eth netdev eni10npf0 flavour pcipf controller 0 
pfnum 0 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port function set netdevsim/netdevsim10/10 hw_addr 00:11:22:33:44:55 
state active

$ devlink port show netdevsim/netdevsim10/10 -jp
{
"port": {
"netdevsim/netdevsim10/10": {
"type": "eth",
"netdev": "eni10npf0",
"flavour": "pcipf",
"controller": 0,
"pfnum": 0,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:11:22:33:44:55",
"state": "active"
}
}
}
}

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 drivers/net/netdevsim/dev.c   |  2 ++
 drivers/net/netdevsim/netdevsim.h |  6 ++
 drivers/net/netdevsim/port_function.c | 30 +++
 3 files changed, 38 insertions(+)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index ef2e293f358b..ec1e5dc74be1 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -888,6 +888,8 @@ static const struct devlink_ops nsim_dev_devlink_ops = {
.port_del = nsim_dev_devlink_port_del,
.port_function_hw_addr_get = nsim_dev_port_function_hw_addr_get,
.port_function_hw_addr_set = nsim_dev_port_function_hw_addr_set,
+   .port_function_state_get = nsim_dev_port_function_state_get,
+   .port_function_state_set = nsim_dev_port_function_state_set,
 };
 
 #define NSIM_DEV_MAX_MACS_DEFAULT 32
diff --git a/drivers/net/netdevsim/netdevsim.h 
b/drivers/net/netdevsim/netdevsim.h
index 8dc8f4e5dcd8..0ea9705eda38 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -308,3 +308,9 @@ int nsim_dev_port_function_hw_addr_get(struct devlink 
*devlink, struct devlink_p
 int nsim_dev_port_function_hw_addr_set(struct devlink *devlink, struct 
devlink_port *port,
   const u8 *hw_addr, int hw_addr_len,
   struct netlink_ext_ack *extack);
+int nsim_dev_port_function_state_get(struct devlink *devlink, struct 
devlink_port *port,
+enum devlink_port_function_state *state,
+struct netlink_ext_ack *extack);
+int nsim_dev_port_function_state_set(struct devlink *devlink, struct 
devlink_port *port,
+enum devlink_port_function_state state,
+struct netlink_ext_ack *extack);
diff --git a/drivers/net/netdevsim/port_function.c 
b/drivers/net/netdevsim/port_function.c
index 0053f6f6d530..01587b54f0e0 100644
--- a/drivers/net/netdevsim/port_function.c
+++ b/drivers/net/netdevsim/port_function.c
@@ -16,6 +16,7 @@ struct nsim_port_function {
u16 pfnum;
struct nsim_port_function *pf_port; /* Valid only for SF port */
u8 hw_addr[ETH_ALEN];
+   u8 state; /* enum devlink_port_function_state */
 };
 
 void nsim_dev_port_function_init(struct nsim_dev *nsim_dev)
@@ -196,6 +197,7 @@ static int nsim_devlink_port_function_add(struct devlink 
*devlink, struct nsim_d
 
list_add(&port->list, &nsim_dev->port_functions.head);
 
+   port->state = DEVLINK_PORT_FUNCTION_STATE_INACTIVE;
err = devlink_port_register(devlink, &port->dl_port, port->port_index);
if (err)
goto reg_err;
@@ -379,3 +381,31 @@ int nsim_dev_port_function_hw_addr_set(struct devlink 
*devlink, struct devlink_p
memcpy(port->hw_addr, hw_addr, ETH_ALEN);
return 0;
 }
+
+int nsim_dev_port_function_state_get(struct devlink *devlink, struct 
devlink_port *dl_port,
+enum devlink_port_function_state *state,
+struct netlink_ext_ack *extack)
+{
+   struct nsim_dev *nsim_dev = devlink_priv(devlink);
+   struct nsim_port_function *port;
+
+   port = nsim_dev_to_port_function(nsim_dev, dl_port);
+   if (IS_ERR(port))
+   return PTR_ERR(port);
+   *state = port->state;
+   return 0;
+}
+
+int nsim_dev_port_function_state_set(struct devlink *devlink, struct 
devlink_port *dl_port,
+enum devlink_port_function_state state,
+

[PATCH net-next 1/8] devlink: Introduce PCI SF port flavour and port attribute

A PCI sub-function (SF) represents a portion of the device, similar
to a PCI VF.

In an eswitch, a PCI SF may have a port, which is normally represented
using a representor netdevice.
To give better visibility of the eswitch port, its association with the SF,
and its representor netdevice, introduce a PCI SF port flavour.

When the devlink port flavour is PCI SF, fill in the PCI SF attributes of the
port.

Extend port name creation with a PCI PF and SF number scheme on a best
effort basis, so that vendor drivers can skip defining their own
scheme.
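
As an illustration of such a naming scheme, here is a minimal user-space
sketch. It assumes the name takes the form pf<N>sf<M>, which is what the
netdev name eni10npf0sf44 in the example below suggests. sf_port_name() is
a hypothetical helper, not a devlink function.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build a "pf<N>sf<M>" style name from the PF and SF numbers.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails.
 */
int sf_port_name(char *buf, size_t len, unsigned int pfnum, unsigned int sfnum)
{
	int n = snprintf(buf, len, "pf%usf%u", pfnum, sfnum);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

For pfnum 0 and sfnum 44 this yields "pf0sf44". A driver that is happy
with that form would not need its own naming callback.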

An example view of a PCI SF port.

$ devlink port show netdevsim/netdevsim10/2
netdevsim/netdevsim10/2: type eth netdev eni10npf0sf44 flavour pcisf controller 
0 pfnum 0 sfnum 44 external false splittable false
  function:
hw_addr 00:00:00:00:00:00

$ devlink port show netdevsim/netdevsim10/2 -jp
{
"port": {
"netdevsim/netdevsim10/2": {
"type": "eth",
"netdev": "eni10npf0sf44",
"flavour": "pcisf",
"controller": 0,
"pfnum": 0,
"sfnum": 44,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:00:00:00:00:00"
}
}
}
}

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 include/net/devlink.h| 17 +
 include/uapi/linux/devlink.h |  7 +++
 net/core/devlink.c   | 37 
 3 files changed, 61 insertions(+)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 48b1c1ef1ebd..1edb558125b0 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -83,6 +83,20 @@ struct devlink_port_pci_vf_attrs {
u8 external:1;
 };
 
+/**
+ * struct devlink_port_pci_sf_attrs - devlink port's PCI SF attributes
+ * @controller: Associated controller number
+ * @pf: Associated PCI PF number for this port.
+ * @sf: Associated PCI SF for of the PCI PF for this port.
+ * @external: when set, indicates if a port is for an external controller
+ */
+struct devlink_port_pci_sf_attrs {
+   u32 controller;
+   u16 pf;
+   u32 sf;
+   u8 external:1;
+};
+
 /**
  * struct devlink_port_attrs - devlink port object
  * @flavour: flavour of the port
@@ -104,6 +118,7 @@ struct devlink_port_attrs {
struct devlink_port_phys_attrs phys;
struct devlink_port_pci_pf_attrs pci_pf;
struct devlink_port_pci_vf_attrs pci_vf;
+   struct devlink_port_pci_sf_attrs pci_sf;
};
 };
 
@@ -1230,6 +1245,8 @@ void devlink_port_attrs_pci_pf_set(struct devlink_port 
*devlink_port, u32 contro
   u16 pf, bool external);
 void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, u32 
controller,
   u16 pf, u16 vf, bool external);
+void devlink_port_attrs_pci_sf_set(struct devlink_port *devlink_port, u32 
controller,
+  u16 pf, u32 sf, bool external);
 int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
u32 size, u16 ingress_pools_count,
u16 egress_pools_count, u16 ingress_tc_count,
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 631f5bdf1707..09c41b9ce407 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -195,6 +195,11 @@ enum devlink_port_flavour {
  * port that faces the PCI VF.
  */
DEVLINK_PORT_FLAVOUR_VIRTUAL, /* Any virtual port facing the user. */
+
+   DEVLINK_PORT_FLAVOUR_PCI_SF, /* Represents eswitch port
+ * for the PCI SF. It is an internal
+ * port that faces the PCI SF.
+ */
 };
 
 enum devlink_param_cmode {
@@ -462,6 +467,8 @@ enum devlink_attr {
 
DEVLINK_ATTR_PORT_EXTERNAL, /* u8 */
DEVLINK_ATTR_PORT_CONTROLLER_NUMBER,/* u32 */
+
+   DEVLINK_ATTR_PORT_PCI_SF_NUMBER,/* u32 */
/* add new attributes above here, update the policy in devlink.c */
 
__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index e5b71f3c2d4d..fada660fd515 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -539,6 +539,15 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg,
if (nla_put_u8(msg, DEVLINK_ATTR_PORT_EXTERNAL, 
attrs->pci_vf.external))
return -EMSGSIZE;
break;
+   case DEVLINK_PORT_FLAVOUR_PCI_SF:
+   if (nla_put_u32(msg, DEVLINK_ATTR_PORT_CONTROLLER_NUMBER,
+   attrs->pci_sf.controller) ||
+   nla_put_u16(msg, DEVLINK_ATTR_PORT_PCI_PF_NUMBER, 
attrs->pci_sf.pf) ||
+   nla_put_u32(msg, DEVLINK_ATTR_PORT_PCI_SF_NUMBER, 
attrs->pci_sf.sf))
+

Re: [PATCH net-next] selftests: mptcp: interpret \n as a new line

2020-09-17 Thread Paolo Abeni

On Wed, 2020-09-16 at 15:13 +0200, Matthieu Baerts wrote:
> In case of errors, this message was printed:
> 
>   (...)
>   # read: Resource temporarily unavailable
>   #  client exit code 0, server 3
>   # \nnetns ns1-0-BJlt5D socket stat for 10003:
>   (...)
> 
> Obviously, the idea was to add a new line before the socket stat and not
> print "\nnetns".
> 
> Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN")
> Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
> Signed-off-by: Matthieu Baerts 

Acked-by: Paolo Abeni

Re: [PATCH bpf-next] selftests/bpf: Fix stat probe in d_path test

On Wed, Sep 16, 2020 at 06:45:31PM -0700, Alexei Starovoitov wrote:
> On Wed, Sep 16, 2020 at 01:24:16PM +0200, Jiri Olsa wrote:
> > Some kernel builds might inline vfs_getattr call within fstat
> > syscall code path, so fentry/vfs_getattr trampoline is not called.
> > 
> > Alexei suggested [1] we should use security_inode_getattr instead,
> > because it's less likely to get inlined.
> > 
> > Adding security_inode_getattr to the d_path allowed list and
> > switching the stat trampoline to security_inode_getattr.
> > 
> > Adding flags that indicate trampolines were called and failing
> > the test if any of them got missed, so it's easier to identify
> > the issue next time.
> > 
> > [1] 
> > https://lore.kernel.org/bpf/caadnvqj0fchopqnwm+deppyij-movveg_trefyrhdabtceu...@mail.gmail.com/
> > Fixes: e4d1af4b16f8 ("selftests/bpf: Add test for d_path helper")
> > Signed-off-by: Jiri Olsa 
> > ---
> >  kernel/trace/bpf_trace.c| 1 +
> >  tools/testing/selftests/bpf/prog_tests/d_path.c | 6 ++
> >  tools/testing/selftests/bpf/progs/test_d_path.c | 9 -
> >  3 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index b2a5380eb187..1001c053ebb3 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1122,6 +1122,7 @@ BTF_ID(func, vfs_truncate)
> >  BTF_ID(func, vfs_fallocate)
> >  BTF_ID(func, dentry_open)
> >  BTF_ID(func, vfs_getattr)
> > +BTF_ID(func, security_inode_getattr)
> >  BTF_ID(func, filp_close)
> >  BTF_SET_END(btf_allowlist_d_path)
> 
> I think it's concealing the problem instead of fixing it.
> bpf is difficult to use for many reasons. Let's not make it harder.
> The users will have a very hard time debugging why vfs_getattr bpf probe
> is not called in all cases.
> Let's replace:
> vfs_truncate -> security_path_truncate
> vfs_fallocate -> security_file_permission
> vfs_getattr -> security_inode_getattr
> 
> For dentry_open also add security_file_open.
> dentry_open and filp_close are in their own files,
> so unlikely to be inlined.

ok

> Ideally resolve_btfids would parse dwarf info and check
> whether any of the funcs in allowlist were inlined.
> That would be more reliable, but not pretty to drag libdw
> dependency into resolve_btfids.

hm, we could add some check to perf|bpftrace that would
show you all the places where the function is called from and
whether it was inlined or is a regular call, so the user is aware
of what probe calls to expect

> 
> >  
> > diff --git a/tools/testing/selftests/bpf/prog_tests/d_path.c 
> > b/tools/testing/selftests/bpf/prog_tests/d_path.c
> > index fc12e0d445ff..f507f1a6fa3a 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/d_path.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/d_path.c
> > @@ -120,6 +120,12 @@ void test_d_path(void)
> > if (err < 0)
> > goto cleanup;
> >  
> > +   if (CHECK(!bss->called_stat || !bss->called_close,
> 
> +1 to KP's comment.

ok

thanks,
jirka

Re: [PATCH v3] mptcp: Fix unsigned 'max_seq' compared with zero in mptcp_data_queue_ofo

2020-09-17 Thread Paolo Abeni

On Thu, 2020-09-17 at 09:12 +0800, Ye Bin wrote:
> Fixes coccicheck warning:
> net/mptcp/protocol.c:164:11-18: WARNING: Unsigned expression compared with 
> zero: max_seq > 0
> 
> Fixes: ab174ad8ef76 ("mptcp: move ooo skbs into msk out of order queue")
> Reported-by: Hulk Robot 
> Signed-off-by: Ye Bin 
> ---
>  net/mptcp/protocol.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index ef0dd2f23482..386cd4e60250 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -157,11 +157,12 @@ static void mptcp_data_queue_ofo(struct mptcp_sock 
> *msk, struct sk_buff *skb)
>   struct rb_node **p, *parent;
>   u64 seq, end_seq, max_seq;
>   struct sk_buff *skb1;
> + int space;
>  
>   seq = MPTCP_SKB_CB(skb)->map_seq;
>   end_seq = MPTCP_SKB_CB(skb)->end_seq;
> - max_seq = tcp_space(sk);
> - max_seq = max_seq > 0 ? max_seq + msk->ack_seq : msk->ack_seq;
> + space = tcp_space(sk);
> + max_seq = space > 0 ? space + msk->ack_seq : msk->ack_seq;
>  
>   pr_debug("msk=%p seq=%llx limit=%llx empty=%d", msk, seq, max_seq,
>RB_EMPTY_ROOT(&msk->out_of_order_queue));

Thank you for addressing our feedback!

Acked-by: Paolo Abeni
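
The coccicheck warning is easier to see in isolation. Below is a minimal
userspace sketch, assuming (as the signed `space` variable in the patch
suggests) that the space value can be negative. The helper names
limit_buggy() and limit_fixed() are hypothetical and exist only for
illustration; they are not part of net/mptcp/protocol.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Old shape: the signed value is stored in a u64 first, so "> 0" is true
 * for every non-zero value, including a converted negative one.
 */
static uint64_t limit_buggy(int space, uint64_t ack_seq)
{
	uint64_t max_seq = (uint64_t)space;	/* -1 becomes 0xffffffffffffffff */

	return max_seq > 0 ? max_seq + ack_seq : ack_seq;
}

/* New shape: test the sign while the value is still an int. */
static uint64_t limit_fixed(int space, uint64_t ack_seq)
{
	return space > 0 ? (uint64_t)space + ack_seq : ack_seq;
}

void limit_self_check(void)
{
	/* Positive space: both shapes agree. */
	assert(limit_buggy(10, 1000) == 1010);
	assert(limit_fixed(10, 1000) == 1010);

	/* Negative space: the old shape wraps around, the new one clamps. */
	assert(limit_buggy(-1, 1000) == 999);
	assert(limit_fixed(-1, 1000) == 1000);
}
```

With a negative space the old shape produces a limit below ack_seq because
the addition wraps; the new shape falls back to ack_seq.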

Re: resolve_btfids breaks kernel cross-compilation

On Thu, Sep 17, 2020 at 10:04:55AM +0200, Jiri Olsa wrote:
> On Wed, Sep 16, 2020 at 02:47:33PM -0500, Seth Forshee wrote:
> > The requirement to build resolve_btfids whenever CONFIG_DEBUG_INFO_BTF
> > is enabled breaks some cross builds. For example, when building a 64-bit
> > powerpc kernel on amd64 I get:
> > 
> >  Auto-detecting system features:
> >  ...libelf: [ on  ]
> >  ...  zlib: [ on  ]
> >  ...   bpf: [ OFF ]
> >  
> >  BPF API too old
> >  make[6]: *** [Makefile:295: bpfdep] Error 1
> > 
> > The contents of tools/bpf/resolve_btfids/feature/test-bpf.make.output:
> > 
> >  In file included from 
> > /home/sforshee/src/u-k/unstable/tools/arch/powerpc/include/uapi/asm/bitsperlong.h:11,
> >   from /usr/include/asm-generic/int-ll64.h:12,
> >   from /usr/include/asm-generic/types.h:7,
> >   from /usr/include/x86_64-linux-gnu/asm/types.h:1,
> >   from 
> > /home/sforshee/src/u-k/unstable/tools/include/linux/types.h:10,
> >   from 
> > /home/sforshee/src/u-k/unstable/tools/include/uapi/linux/bpf.h:11,
> >   from test-bpf.c:3:
> >  
> > /home/sforshee/src/u-k/unstable/tools/include/asm-generic/bitsperlong.h:14:2:
> >  error: #error Inconsistent word size. Check asm/bitsperlong.h
> > 14 | #error Inconsistent word size. Check asm/bitsperlong.h
> >|  ^
> > 
> > This is because tools/arch/powerpc/include/uapi/asm/bitsperlong.h sets
> > __BITS_PER_LONG based on the predefined compiler macro __powerpc64__,
> > which is not defined by the host compiler. What can we do to get cross
> > builds working again?
> 
> could you please share the command line and setup?

I just reproduced.. checking on fix

jirka
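
For readers following along, the failure mode Seth describes can be
sketched in a few lines of userspace C. This is only an illustration under
stated assumptions: claimed_bits_for_powerpc() is a hypothetical stand-in
for a target header that keys its word size on __powerpc64__, and
word_size_consistent() mimics the kind of check that produces the
"Inconsistent word size" error. Neither name exists in the tree.

```c
#include <assert.h>
#include <limits.h>

/* What a header keyed on a target-only macro would claim (hypothetical). */
static unsigned int claimed_bits_for_powerpc(void)
{
#ifdef __powerpc64__
	return 64;
#else
	return 32;	/* a non-powerpc host compiler lands here */
#endif
}

/* Returns 1 when the claimed word size matches the compiler in use. */
static int word_size_consistent(unsigned int claimed_bits)
{
	assert(claimed_bits == 32 || claimed_bits == 64);
	return claimed_bits == sizeof(long) * CHAR_BIT;
}

/* Combines the two: does the host compiler agree with the target header? */
int host_agrees_with_powerpc_header(void)
{
	return word_size_consistent(claimed_bits_for_powerpc());
}
```

On a 64-bit non-powerpc host the stand-in claims 32 bits while long is 64
bits wide, so the check fails, which matches the shape of the reported
error when a host tool picks up the target's headers.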

[PATCH v3] arm64: bpf: Fix branch offset in JIT

2020-09-17 Thread Ilias Apalodimas

Running the eBPF test_verifier leads to random errors looking like this:

[ 6525.735488] Unexpected kernel BRK exception at EL1
[ 6525.735502] Internal error: ptrace BRK handler: f2000100 [#1] SMP
[ 6525.741609] Modules linked in: nls_utf8 cifs libdes libarc4 dns_resolver 
fscache binfmt_misc nls_ascii nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd 
aes_ce_cipher ghash_ce gf128mul efi_pstore sha2_ce sha256_arm64 sha1_ce evdev 
efivars efivarfs ip_tables x_tables autofs4 btrfs blake2b_generic xor xor_neon 
zstd_compress raid6_pq libcrc32c crc32c_generic ahci xhci_pci libahci xhci_hcd 
igb libata i2c_algo_bit nvme realtek usbcore nvme_core scsi_mod t10_pi netsec 
mdio_devres of_mdio gpio_keys fixed_phy libphy gpio_mb86s7x
[ 6525.787760] CPU: 3 PID: 7881 Comm: test_verifier Tainted: GW 
5.9.0-rc1+ #47
[ 6525.796111] Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS 
build #1 Jun  6 2020
[ 6525.804812] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
[ 6525.810390] pc : bpf_prog_c3d01833289b6311_F+0xc8/0x9f4
[ 6525.815613] lr : bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c
[ 6525.820832] sp : 8000130cbb80
[ 6525.824141] x29: 8000130cbbb0 x28: 
[ 6525.829451] x27: 05ef6fcbf39b x26: 
[ 6525.834759] x25: 8000130cbb80 x24: 800011dc7038
[ 6525.840067] x23: 8000130cbd00 x22: 0008f624d080
[ 6525.845375] x21: 0001 x20: 800011dc7000
[ 6525.850682] x19:  x18: 
[ 6525.855990] x17:  x16: 
[ 6525.861298] x15:  x14: 
[ 6525.866606] x13:  x12: 
[ 6525.871913] x11: 0001 x10: 800a660c
[ 6525.877220] x9 : 800010951810 x8 : 8000130cbc38
[ 6525.882528] x7 :  x6 : 009864cfa881
[ 6525.887836] x5 : 00ff x4 : 002880ba1a0b3e9f
[ 6525.893144] x3 : 0018 x2 : 800a4374
[ 6525.898452] x1 : 000a x0 : 0009
[ 6525.903760] Call trace:
[ 6525.906202]  bpf_prog_c3d01833289b6311_F+0xc8/0x9f4
[ 6525.911076]  bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c
[ 6525.915957]  bpf_dispatcher_xdp_func+0x14/0x20
[ 6525.920398]  bpf_test_run+0x70/0x1b0
[ 6525.923969]  bpf_prog_test_run_xdp+0xec/0x190
[ 6525.928326]  __do_sys_bpf+0xc88/0x1b28
[ 6525.932072]  __arm64_sys_bpf+0x24/0x30
[ 6525.935820]  el0_svc_common.constprop.0+0x70/0x168
[ 6525.940607]  do_el0_svc+0x28/0x88
[ 6525.943920]  el0_sync_handler+0x88/0x190
[ 6525.947838]  el0_sync+0x140/0x180
[ 6525.951154] Code: d4202000 d4202000 d4202000 d4202000 (d4202000)
[ 6525.957249] ---[ end trace cecc3f93b14927e2 ]---

The reason is the offset[] creation and later usage, while building
the eBPF body. The code currently omits the first instruction, since
build_insn() will increase our ctx->idx before saving it.
That was fine up until bounded eBPF loops were introduced. After that
introduction, offset[0] must be the offset of the end of prologue which
is the start of the 1st insn, while offset[n] holds the
offset of the end of n-th insn.

When "taken loop with back jump to 1st insn" test runs, it will
eventually call bpf2a64_offset(-1, 2, ctx). Since negative indexing is
permitted, the current outcome depends on the value stored in
ctx->offset[-1], which has nothing to do with our array.
If the value happens to be 0, the tests will work. If not, this error
triggers.

commit 7c2e988f400e ("bpf: fix x64 JIT code generation for jmp to 1st insn")
fixed an identical bug on x86 when eBPF bounded loops were introduced.

So let's fix it by creating the ctx->offset[] differently. Track the
beginning of instruction and account for the extra instruction while
calculating the arm instruction offsets.

Fixes: 2589726d12a1 ("bpf: introduce bounded loops")
Reported-by: Naresh Kamboju 
Reported-by: Jiri Olsa 
Co-developed-by: Jean-Philippe Brucker 
Signed-off-by: Jean-Philippe Brucker 
Co-developed-by: Yauheni Kaliuta 
Signed-off-by: Yauheni Kaliuta 
Signed-off-by: Ilias Apalodimas 
---
Changes since v1: 
 - Added Co-developed-by, Reported-by and Fixes tags correctly
 - Describe the expected context of ctx->offset[] in comments
Changes since v2:
 - Drop the change of behavior for 16-byte eBPF instructions. This won't
 currently cause any problems and can go in on a different patch
 - simplify bpf2a64_offset()

 arch/arm64/net/bpf_jit_comp.c | 43 +--
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index f8912e45be7a..ef9f1d5e989d 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -143,14 +143,17 @@ static inline void emit_addr_mov_i64(const int reg, const 
u64 val,
}
 }
 
-static inline int bpf2a64_offset(int bpf_to, int bpf_from,
+static inline int bpf2a64_offset(int bpf_insn, int off,
 const struct jit_ctx *ctx)
 {
-   int t

Re: [PATCH net-next] net/packet: Fix a comment about mac_header

2020-09-17 Thread Willem de Bruijn

On Wed, Sep 16, 2020 at 8:54 PM Xie He  wrote:
>
> 1. Change all "dev->hard_header" to "dev->header_ops"
>
> 2. On receiving incoming frames when header_ops == NULL:
>
> The comment only says what is wrong, but doesn't say what is right.
> This patch changes the comment to make it clear what is right.
>
> 3. On transmitting and receiving outgoing frames when header_ops == NULL:
>
> The comment explains that the LL header will be later added by the driver.
>
> However, I think it's better to simply say that the LL header is invisible
> to us. This phrasing is better from a software engineering perspective,
> because this makes it clear that what happens in the driver should be
> hidden from us and we should not care about what happens internally in the
> driver.
>
> 4. On resuming the LL header (for RAW frames) when header_ops == NULL:
>
> The comment says we are "unlikely" to restore the LL header.
>
> However, we should say that we are "unable" to restore it.
> It's not possible (rather than not likely) to restore it, because:
>
> 1) There is no way for us to restore because the LL header internally
> processed by the driver should be invisible to us.
>
> 2) In function packet_rcv and tpacket_rcv, the code only tries to restore
> the LL header when header_ops != NULL.
>
> Cc: Willem de Bruijn 
> Signed-off-by: Xie He 

Acked-by: Willem de Bruijn

Re: [PATCH v3] arm64: bpf: Fix branch offset in JIT

2020-09-17 Thread Will Deacon

On Thu, Sep 17, 2020 at 11:49:25AM +0300, Ilias Apalodimas wrote:
> Running the eBPF test_verifier leads to random errors looking like this:
> 
> [ 6525.735488] Unexpected kernel BRK exception at EL1
> [ 6525.735502] Internal error: ptrace BRK handler: f2000100 [#1] SMP
> [ 6525.741609] Modules linked in: nls_utf8 cifs libdes libarc4 dns_resolver 
> fscache binfmt_misc nls_ascii nls_cp437 vfat fat aes_ce_blk crypto_simd 
> cryptd aes_ce_cipher ghash_ce gf128mul efi_pstore sha2_ce sha256_arm64 
> sha1_ce evdev efivars efivarfs ip_tables x_tables autofs4 btrfs 
> blake2b_generic xor xor_neon zstd_compress raid6_pq libcrc32c crc32c_generic 
> ahci xhci_pci libahci xhci_hcd igb libata i2c_algo_bit nvme realtek usbcore 
> nvme_core scsi_mod t10_pi netsec mdio_devres of_mdio gpio_keys fixed_phy 
> libphy gpio_mb86s7x
> [ 6525.787760] CPU: 3 PID: 7881 Comm: test_verifier Tainted: GW   
>   5.9.0-rc1+ #47
> [ 6525.796111] Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS 
> build #1 Jun  6 2020
> [ 6525.804812] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
> [ 6525.810390] pc : bpf_prog_c3d01833289b6311_F+0xc8/0x9f4
> [ 6525.815613] lr : bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c
> [ 6525.820832] sp : 8000130cbb80
> [ 6525.824141] x29: 8000130cbbb0 x28: 
> [ 6525.829451] x27: 05ef6fcbf39b x26: 
> [ 6525.834759] x25: 8000130cbb80 x24: 800011dc7038
> [ 6525.840067] x23: 8000130cbd00 x22: 0008f624d080
> [ 6525.845375] x21: 0001 x20: 800011dc7000
> [ 6525.850682] x19:  x18: 
> [ 6525.855990] x17:  x16: 
> [ 6525.861298] x15:  x14: 
> [ 6525.866606] x13:  x12: 
> [ 6525.871913] x11: 0001 x10: 800a660c
> [ 6525.877220] x9 : 800010951810 x8 : 8000130cbc38
> [ 6525.882528] x7 :  x6 : 009864cfa881
> [ 6525.887836] x5 : 00ff x4 : 002880ba1a0b3e9f
> [ 6525.893144] x3 : 0018 x2 : 800a4374
> [ 6525.898452] x1 : 000a x0 : 0009
> [ 6525.903760] Call trace:
> [ 6525.906202]  bpf_prog_c3d01833289b6311_F+0xc8/0x9f4
> [ 6525.911076]  bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c
> [ 6525.915957]  bpf_dispatcher_xdp_func+0x14/0x20
> [ 6525.920398]  bpf_test_run+0x70/0x1b0
> [ 6525.923969]  bpf_prog_test_run_xdp+0xec/0x190
> [ 6525.928326]  __do_sys_bpf+0xc88/0x1b28
> [ 6525.932072]  __arm64_sys_bpf+0x24/0x30
> [ 6525.935820]  el0_svc_common.constprop.0+0x70/0x168
> [ 6525.940607]  do_el0_svc+0x28/0x88
> [ 6525.943920]  el0_sync_handler+0x88/0x190
> [ 6525.947838]  el0_sync+0x140/0x180
> [ 6525.951154] Code: d4202000 d4202000 d4202000 d4202000 (d4202000)
> [ 6525.957249] ---[ end trace cecc3f93b14927e2 ]---
> 
> The reason is the offset[] creation and later usage, while building
> the eBPF body. The code currently omits the first instruction, since
> build_insn() will increase our ctx->idx before saving it.
> That was fine up until bounded eBPF loops were introduced. After that
> introduction, offset[0] must be the offset of the end of prologue which
> is the start of the 1st insn, while offset[n] holds the
> offset of the end of n-th insn.
> 
> When "taken loop with back jump to 1st insn" test runs, it will
> eventually call bpf2a64_offset(-1, 2, ctx). Since negative indexing is
> permitted, the current outcome depends on the value stored in
> ctx->offset[-1], which has nothing to do with our array.
> If the value happens to be 0, the tests will work. If not, this error
> triggers.
> 
> commit 7c2e988f400e ("bpf: fix x64 JIT code generation for jmp to 1st insn")
> fixed an identical bug on x86 when eBPF bounded loops were introduced.
> 
> So let's fix it by creating the ctx->offset[] differently. Track the
> beginning of instruction and account for the extra instruction while
> calculating the arm instruction offsets.
> 
> Fixes: 2589726d12a1 ("bpf: introduce bounded loops")
> Reported-by: Naresh Kamboju 
> Reported-by: Jiri Olsa 
> Co-developed-by: Jean-Philippe Brucker 
> Signed-off-by: Jean-Philippe Brucker 
> Co-developed-by: Yauheni Kaliuta 
> Signed-off-by: Yauheni Kaliuta 
> Signed-off-by: Ilias Apalodimas 

Acked-by: Will Deacon 

Catalin -- do you want to take this as a fix?

Will
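
The indexing issue described in the commit message can be illustrated
outside the JIT. The sketch below is not the code from the patch (the diff
is not shown in full here); it assumes a hypothetical table start[] with
one entry per BPF instruction plus one, where start[0] is the end of the
prologue, and it assumes the branch is the last native instruction emitted
for a BPF instruction. The names branch_disp() and back_jump_example() are
made up for the example.

```c
#include <assert.h>

/*
 * start[i] = index of the first native instruction emitted for BPF
 * instruction i; start[n] = index just past the last one. start[0] is
 * therefore the end of the prologue.
 */
static int branch_disp(const int *start, int n, int insn, int off)
{
	/* BPF jump offsets are relative to the next instruction. */
	int target = insn + 1 + off;

	assert(insn >= 0 && insn < n);
	assert(target >= 0 && target <= n);	/* never reads start[-1] */

	/* The branch itself sits at start[insn + 1] - 1 (assumption). */
	return start[target] - (start[insn + 1] - 1);
}

int back_jump_example(void)
{
	/* 3 native insns of prologue, then BPF insns of size 1, 2 and 1. */
	static const int start[] = { 3, 4, 6, 7 };

	/* BPF insn 2 jumps back to BPF insn 0, so off = -3. */
	return branch_disp(start, 3, 2, -3);
}
```

Because the table carries an entry for the start of the first instruction,
a back jump to the first instruction resolves to a valid index rather than
to whatever happens to sit before the array.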

[PATCH rdma-next v2 0/3] Fix in-kernel active_speed type

From: Leon Romanovsky 

Changelog:
v2:
 * Changed the WARN_ON cast to a saturated value when returning active_speed
   to the user.
v1: https://lore.kernel.org/linux-rdma/20200902074503.743310-1-l...@kernel.org
 * Changed patch #1 to fix memory corruption to help with bisect. No
   change in series, because the added code is changed anyway in patch
   #3.
v0:
 * https://lore.kernel.org/linux-rdma/20200824105826.1093613-1-l...@kernel.org



IBTA declares speed as 16 bits, but kernel stores it in u8. This series
fixes in-kernel declaration while keeping external interface intact.

Thanks

Aharon Landau (3):
  net/mlx5: Refactor query port speed functions
  RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
  RDMA: Fix link active_speed size

 .../infiniband/core/uverbs_std_types_device.c |  3 +-
 drivers/infiniband/core/verbs.c   |  2 +-
 drivers/infiniband/hw/bnxt_re/bnxt_re.h   |  2 +-
 drivers/infiniband/hw/hfi1/verbs.c|  2 +-
 drivers/infiniband/hw/mlx5/main.c | 41 +++
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c   |  2 +-
 drivers/infiniband/hw/qedr/verbs.c|  2 +-
 drivers/infiniband/hw/qib/qib.h   |  6 +--
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h   |  2 +-
 .../mellanox/mlx5/core/ipoib/ethtool.c| 31 ++
 .../net/ethernet/mellanox/mlx5/core/port.c| 23 ++-
 include/linux/mlx5/port.h | 15 +--
 include/rdma/ib_verbs.h   |  4 +-
 13 files changed, 47 insertions(+), 88 deletions(-)

--
2.26.2
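
As a side note on the v2 changelog entry, the difference between a plain
cast and a saturated value can be sketched as follows. This is a userspace
illustration only: the helper name speed_to_u8_saturated() and the choice
of clamping to 0xff are assumptions, not taken from the series.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 16-bit speed to what fits in a legacy 8-bit field. */
static uint8_t speed_to_u8_saturated(uint16_t active_speed)
{
	return active_speed > UINT8_MAX ? UINT8_MAX : (uint8_t)active_speed;
}

void speed_self_check(void)
{
	/* Values that fit are passed through unchanged. */
	assert(speed_to_u8_saturated(0x20) == 0x20);

	/* A plain cast silently drops the high byte ... */
	assert((uint8_t)0x100 == 0);

	/* ... while the saturated variant reports the largest value. */
	assert(speed_to_u8_saturated(0x100) == 0xff);
}
```

A truncating cast can turn a new, faster speed into 0 or into an unrelated
older speed code; saturating keeps the reported value at the top of the
8-bit range instead.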

[PATCH mlx5-next v2 2/3] RDMA/mlx5: Delete duplicated mlx5_ptys_width enum

From: Aharon Landau 

Combine two identical enums to avoid duplication.

Signed-off-by: Aharon Landau 
Reviewed-by: Michael Guralnik 
Signed-off-by: Leon Romanovsky 
---
 drivers/infiniband/hw/mlx5/main.c | 20 ++-
 .../mellanox/mlx5/core/ipoib/ethtool.c|  8 
 include/linux/mlx5/port.h |  8 
 3 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c 
b/drivers/infiniband/hw/mlx5/main.c
index ca33ff4b1d5e..545f23d27660 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1179,32 +1179,24 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
return 0;
 }
 
-enum mlx5_ib_width {
-   MLX5_IB_WIDTH_1X= 1 << 0,
-   MLX5_IB_WIDTH_2X= 1 << 1,
-   MLX5_IB_WIDTH_4X= 1 << 2,
-   MLX5_IB_WIDTH_8X= 1 << 3,
-   MLX5_IB_WIDTH_12X   = 1 << 4
-};
-
 static void translate_active_width(struct ib_device *ibdev, u16 active_width,
   u8 *ib_width)
 {
struct mlx5_ib_dev *dev = to_mdev(ibdev);
 
-   if (active_width & MLX5_IB_WIDTH_1X)
+   if (active_width & MLX5_PTYS_WIDTH_1X)
*ib_width = IB_WIDTH_1X;
-   else if (active_width & MLX5_IB_WIDTH_2X)
+   else if (active_width & MLX5_PTYS_WIDTH_2X)
*ib_width = IB_WIDTH_2X;
-   else if (active_width & MLX5_IB_WIDTH_4X)
+   else if (active_width & MLX5_PTYS_WIDTH_4X)
*ib_width = IB_WIDTH_4X;
-   else if (active_width & MLX5_IB_WIDTH_8X)
+   else if (active_width & MLX5_PTYS_WIDTH_8X)
*ib_width = IB_WIDTH_8X;
-   else if (active_width & MLX5_IB_WIDTH_12X)
+   else if (active_width & MLX5_PTYS_WIDTH_12X)
*ib_width = IB_WIDTH_12X;
else {
mlx5_ib_dbg(dev, "Invalid active_width %d, setting width to 
default value: 4x\n",
-   (int)active_width);
+   active_width);
*ib_width = IB_WIDTH_4X;
}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c 
b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
index 17f5be801d2f..cac8f085b16d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
@@ -130,14 +130,6 @@ static int mlx5i_flash_device(struct net_device *netdev,
return mlx5e_ethtool_flash_device(priv, flash);
 }
 
-enum mlx5_ptys_width {
-   MLX5_PTYS_WIDTH_1X  = 1 << 0,
-   MLX5_PTYS_WIDTH_2X  = 1 << 1,
-   MLX5_PTYS_WIDTH_4X  = 1 << 2,
-   MLX5_PTYS_WIDTH_8X  = 1 << 3,
-   MLX5_PTYS_WIDTH_12X = 1 << 4,
-};
-
 static inline int mlx5_ptys_width_enum_to_int(enum mlx5_ptys_width width)
 {
switch (width) {
diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h
index 4d33ae0c2d97..23edd2db4803 100644
--- a/include/linux/mlx5/port.h
+++ b/include/linux/mlx5/port.h
@@ -125,6 +125,14 @@ enum mlx5e_connector_type {
MLX5E_CONNECTOR_TYPE_NUMBER,
 };
 
+enum mlx5_ptys_width {
+   MLX5_PTYS_WIDTH_1X  = 1 << 0,
+   MLX5_PTYS_WIDTH_2X  = 1 << 1,
+   MLX5_PTYS_WIDTH_4X  = 1 << 2,
+   MLX5_PTYS_WIDTH_8X  = 1 << 3,
+   MLX5_PTYS_WIDTH_12X = 1 << 4,
+};
+
 #define MLX5E_PROT_MASK(link_mode) (1 << link_mode)
 #define MLX5_GET_ETH_PROTO(reg, out, ext, field)   \
(ext ? MLX5_GET(reg, out, ext_##field) :\
-- 
2.26.2
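
For reference, the if/else chain in translate_active_width() picks the
lowest set bit of the width mask and falls back to 4x when no known bit is
set. A minimal userspace sketch of that behaviour follows; the bit values
are copied from the enum in the patch, while the PTYS_WIDTH_* names without
the MLX5_ prefix, the helper width_mask_to_lanes() and the use of lane
counts as return values are hypothetical.

```c
#include <assert.h>

/* Bit positions as in the mlx5_ptys_width enum from the patch. */
enum ptys_width {
	PTYS_WIDTH_1X  = 1 << 0,
	PTYS_WIDTH_2X  = 1 << 1,
	PTYS_WIDTH_4X  = 1 << 2,
	PTYS_WIDTH_8X  = 1 << 3,
	PTYS_WIDTH_12X = 1 << 4,
};

/* Lowest set bit wins; an unknown or empty mask defaults to 4 lanes. */
static int width_mask_to_lanes(unsigned int mask)
{
	if (mask & PTYS_WIDTH_1X)
		return 1;
	if (mask & PTYS_WIDTH_2X)
		return 2;
	if (mask & PTYS_WIDTH_4X)
		return 4;
	if (mask & PTYS_WIDTH_8X)
		return 8;
	if (mask & PTYS_WIDTH_12X)
		return 12;
	return 4;
}

void width_self_check(void)
{
	assert(width_mask_to_lanes(PTYS_WIDTH_8X) == 8);
	assert(width_mask_to_lanes(PTYS_WIDTH_1X | PTYS_WIDTH_12X) == 1);
	assert(width_mask_to_lanes(0) == 4);
}
```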

[PATCH mlx5-next v2 1/3] net/mlx5: Refactor query port speed functions

From: Aharon Landau 

The functions mlx5_query_port_link_width_oper and
mlx5_query_port_ib_proto_oper are always called together, so combine them
into a new function called mlx5_query_port_oper to avoid duplication.

Since mlx5i_get_port_settings is the same as mlx5_query_port_oper,
remove it as well.

According to the IB spec, link_width_oper and ib_proto_oper should be u16
rather than u8 as currently written, so cast for now in preparation for the
cross-RDMA patch that will fix the type for all drivers in the RDMA subsystem.

Fixes: ada68c31ba9c ("net/mlx5: Introduce a new header file for physical port 
functions")
Signed-off-by: Aharon Landau 
Reviewed-by: Michael Guralnik 
Signed-off-by: Leon Romanovsky 
---
 drivers/infiniband/hw/mlx5/main.c | 27 ++-
 .../mellanox/mlx5/core/ipoib/ethtool.c| 23 +++-
 .../net/ethernet/mellanox/mlx5/core/port.c| 23 +++-
 include/linux/mlx5/port.h |  7 +++--
 4 files changed, 25 insertions(+), 55 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/main.c 
b/drivers/infiniband/hw/mlx5/main.c
index bfa8b6b3c681..ca33ff4b1d5e 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -326,8 +326,8 @@ void mlx5_ib_put_native_port_mdev(struct mlx5_ib_dev 
*ibdev, u8 port_num)
spin_unlock(&port->mp.mpi_lock);
 }
 
-static int translate_eth_legacy_proto_oper(u32 eth_proto_oper, u8 
*active_speed,
-  u8 *active_width)
+static int translate_eth_legacy_proto_oper(u32 eth_proto_oper,
+  u16 *active_speed, u8 *active_width)
 {
switch (eth_proto_oper) {
case MLX5E_PROT_MASK(MLX5E_1000BASE_CX_SGMII):
@@ -384,7 +384,7 @@ static int translate_eth_legacy_proto_oper(u32 
eth_proto_oper, u8 *active_speed,
return 0;
 }
 
-static int translate_eth_ext_proto_oper(u32 eth_proto_oper, u8 *active_speed,
+static int translate_eth_ext_proto_oper(u32 eth_proto_oper, u16 *active_speed,
u8 *active_width)
 {
switch (eth_proto_oper) {
@@ -436,7 +436,7 @@ static int translate_eth_ext_proto_oper(u32 eth_proto_oper, 
u8 *active_speed,
return 0;
 }
 
-static int translate_eth_proto_oper(u32 eth_proto_oper, u8 *active_speed,
+static int translate_eth_proto_oper(u32 eth_proto_oper, u16 *active_speed,
u8 *active_width, bool ext)
 {
return ext ?
@@ -457,6 +457,7 @@ static int mlx5_query_port_roce(struct ib_device *device, 
u8 port_num,
bool put_mdev = true;
u16 qkey_viol_cntr;
u32 eth_prot_oper;
+   u16 active_speed;
u8 mdev_port_num;
bool ext;
int err;
@@ -490,9 +491,12 @@ static int mlx5_query_port_roce(struct ib_device *device, 
u8 port_num,
props->active_width = IB_WIDTH_4X;
props->active_speed = IB_SPEED_QDR;
 
-   translate_eth_proto_oper(eth_prot_oper, &props->active_speed,
+   translate_eth_proto_oper(eth_prot_oper, &active_speed,
 &props->active_width, ext);
 
+   WARN_ON_ONCE(active_speed & ~0xFF);
+   props->active_speed = (u8)active_speed;
+
props->port_cap_flags |= IB_PORT_CM_SUP;
props->ip_gids = true;
 
@@ -1183,8 +1187,8 @@ enum mlx5_ib_width {
MLX5_IB_WIDTH_12X   = 1 << 4
 };
 
-static void translate_active_width(struct ib_device *ibdev, u8 active_width,
- u8 *ib_width)
+static void translate_active_width(struct ib_device *ibdev, u16 active_width,
+  u8 *ib_width)
 {
struct mlx5_ib_dev *dev = to_mdev(ibdev);
 
@@ -1277,7 +1281,7 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, 
u8 port,
u16 max_mtu;
u16 oper_mtu;
int err;
-   u8 ib_link_width_oper;
+   u16 ib_link_width_oper;
u8 vl_hw_cap;
 
rep = kzalloc(sizeof(*rep), GFP_KERNEL);
@@ -1310,16 +1314,13 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, 
u8 port,
if (props->port_cap_flags & IB_PORT_CAP_MASK2_SUP)
props->port_cap_flags2 = rep->cap_mask2;
 
-   err = mlx5_query_port_link_width_oper(mdev, &ib_link_width_oper, port);
+   err = mlx5_query_ib_port_oper(mdev, &ib_link_width_oper,
+ (u16 *)&props->active_speed, port);
if (err)
goto out;
 
translate_active_width(ibdev, ib_link_width_oper, &props->active_width);
 
-   err = mlx5_query_port_ib_proto_oper(mdev, &props->active_speed, port);
-   if (err)
-   goto out;
-
mlx5_query_port_max_mtu(mdev, &max_mtu, port);
 
props->max_mtu = mlx5_mtu_to_ib_mtu(max_mtu);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c 
b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
index 1eef66ee849e..17f5be801d2f 100644
--- a/drivers/net/ethern

Re: resolve_btfids breaks kernel cross-compilation

On Thu, Sep 17, 2020 at 10:38:12AM +0200, Jiri Olsa wrote:
> On Thu, Sep 17, 2020 at 10:04:55AM +0200, Jiri Olsa wrote:
> > On Wed, Sep 16, 2020 at 02:47:33PM -0500, Seth Forshee wrote:
> > > The requirement to build resolve_btfids whenever CONFIG_DEBUG_INFO_BTF
> > > is enabled breaks some cross builds. For example, when building a 64-bit
> > > powerpc kernel on amd64 I get:
> > > 
> > >  Auto-detecting system features:
> > >  ...libelf: [ on  ]
> > >  ...  zlib: [ on  ]
> > >  ...   bpf: [ OFF ]
> > >  
> > >  BPF API too old
> > >  make[6]: *** [Makefile:295: bpfdep] Error 1
> > > 
> > > The contents of tools/bpf/resolve_btfids/feature/test-bpf.make.output:
> > > 
> > >  In file included from 
> > > /home/sforshee/src/u-k/unstable/tools/arch/powerpc/include/uapi/asm/bitsperlong.h:11,
> > >   from /usr/include/asm-generic/int-ll64.h:12,
> > >   from /usr/include/asm-generic/types.h:7,
> > >   from /usr/include/x86_64-linux-gnu/asm/types.h:1,
> > >   from 
> > > /home/sforshee/src/u-k/unstable/tools/include/linux/types.h:10,
> > >   from 
> > > /home/sforshee/src/u-k/unstable/tools/include/uapi/linux/bpf.h:11,
> > >   from test-bpf.c:3:
> > >  
> > > /home/sforshee/src/u-k/unstable/tools/include/asm-generic/bitsperlong.h:14:2:
> > >  error: #error Inconsistent word size. Check asm/bitsperlong.h
> > > 14 | #error Inconsistent word size. Check asm/bitsperlong.h
> > >|  ^
> > > 
> > > This is because tools/arch/powerpc/include/uapi/asm/bitsperlong.h sets
> > > __BITS_PER_LONG based on the predefined compiler macro __powerpc64__,
> > > which is not defined by the host compiler. What can we do to get cross
> > > builds working again?
> > 
> > could you please share the command line and setup?
> 
> I just reproduced.. checking on fix

I still need to check on few things, but patch below should help

we might have a problem for cross builds with different endianness
than the host because libbpf does not support reading BTF data
with different endianness, and we get:

  BTFIDS  vmlinux
libbpf: non-native ELF endianness is not supported

jirka


---
diff --git a/tools/bpf/resolve_btfids/Makefile 
b/tools/bpf/resolve_btfids/Makefile
index a88cd4426398..d3c818b8d8d3 100644
--- a/tools/bpf/resolve_btfids/Makefile
+++ b/tools/bpf/resolve_btfids/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 include ../../scripts/Makefile.include
+include ../../scripts/Makefile.arch
 
 ifeq ($(srctree),)
 srctree := $(patsubst %/,%,$(dir $(CURDIR)))
@@ -29,6 +30,7 @@ endif
 AR   = $(HOSTAR)
 CC   = $(HOSTCC)
 LD   = $(HOSTLD)
+ARCH = $(HOSTARCH)
 
 OUTPUT ?= $(srctree)/tools/bpf/resolve_btfids/

Re: [PATCH] ptp: mark symbols static where possible

2020-09-17 Thread kernel test robot

Hi Herrington,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net/master]
[also build test ERROR on net-next/master linus/master v5.9-rc5 next-20200916]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Herrington/ptp-mark-symbols-static-where-possible/20200917-103557
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 
d5d325eae7823c85eedabf05f78f9cd574fe832b
config: riscv-allyesconfig (attached as .config)
compiler: riscv64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=riscv 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L306':
>> pch_gbe_main.c:(.text+0x2a04): undefined reference to `pch_ch_control_write'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L287':
   pch_gbe_main.c:(.text+0x2a3c): undefined reference to `pch_ch_control_write'
>> riscv64-linux-ld: pch_gbe_main.c:(.text+0x2a76): undefined reference to `pch_ch_control_write'
   riscv64-linux-ld: pch_gbe_main.c:(.text+0x2ab2): undefined reference to `pch_ch_control_write'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L0 ':
>> pch_gbe_main.c:(.text+0x2ad6): undefined reference to `pch_set_station_address'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L295':
>> pch_gbe_main.c:(.text+0x2b1c): undefined reference to `pch_ch_event_write'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L0 ':
>> pch_gbe_main.c:(.text+0x44ea): undefined reference to `pch_ch_event_read'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L447':
>> pch_gbe_main.c:(.text+0x468e): undefined reference to `pch_tx_snap_read'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L450':
   pch_gbe_main.c:(.text+0x46ae): undefined reference to `pch_ch_event_write'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L508':
   pch_gbe_main.c:(.text+0x522c): undefined reference to `pch_ch_event_read'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L509':
>> pch_gbe_main.c:(.text+0x5254): undefined reference to `pch_src_uuid_lo_read'
>> riscv64-linux-ld: pch_gbe_main.c:(.text+0x5266): undefined reference to `pch_src_uuid_hi_read'
   riscv64-linux-ld: drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o: in function `.L515':
>> pch_gbe_main.c:(.text+0x540e): undefined reference to `pch_rx_snap_read'
>> riscv64-linux-ld: pch_gbe_main.c:(.text+0x545c): undefined reference to `pch_ch_event_write'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: resolve_btfids breaks kernel cross-compilation

On Thu, Sep 17, 2020 at 11:14:08AM +0200, Jiri Olsa wrote:
> On Thu, Sep 17, 2020 at 10:38:12AM +0200, Jiri Olsa wrote:
> > On Thu, Sep 17, 2020 at 10:04:55AM +0200, Jiri Olsa wrote:
> > > On Wed, Sep 16, 2020 at 02:47:33PM -0500, Seth Forshee wrote:
> > > > The requirement to build resolve_btfids whenever CONFIG_DEBUG_INFO_BTF
> > > > is enabled breaks some cross builds. For example, when building a 64-bit
> > > > powerpc kernel on amd64 I get:
> > > > 
> > > >  Auto-detecting system features:
> > > >  ...libelf: [ on  ]
> > > >  ...  zlib: [ on  ]
> > > >  ...   bpf: [ OFF ]
> > > >  
> > > >  BPF API too old
> > > >  make[6]: *** [Makefile:295: bpfdep] Error 1
> > > > 
> > > > The contents of tools/bpf/resolve_btfids/feature/test-bpf.make.output:
> > > > 
> > > >  In file included from /home/sforshee/src/u-k/unstable/tools/arch/powerpc/include/uapi/asm/bitsperlong.h:11,
> > > >                   from /usr/include/asm-generic/int-ll64.h:12,
> > > >                   from /usr/include/asm-generic/types.h:7,
> > > >                   from /usr/include/x86_64-linux-gnu/asm/types.h:1,
> > > >                   from /home/sforshee/src/u-k/unstable/tools/include/linux/types.h:10,
> > > >                   from /home/sforshee/src/u-k/unstable/tools/include/uapi/linux/bpf.h:11,
> > > >                   from test-bpf.c:3:
> > > >  /home/sforshee/src/u-k/unstable/tools/include/asm-generic/bitsperlong.h:14:2: error: #error Inconsistent word size. Check asm/bitsperlong.h
> > > >     14 | #error Inconsistent word size. Check asm/bitsperlong.h
> > > >        |  ^
> > > > 
> > > > This is because tools/arch/powerpc/include/uapi/asm/bitsperlong.h sets
> > > > __BITS_PER_LONG based on the predefined compiler macro __powerpc64__,
> > > > which is not defined by the host compiler. What can we do to get cross
> > > > builds working again?
> > > 
> > > could you please share the command line and setup?
> > 
> > I just reproduced.. checking on fix
> 
> I still need to check on a few things, but the patch below should help
> 
> we might have a problem for cross builds with different endianness
> than the host because libbpf does not support reading BTF data
> with different endianness, and we get:
> 
>   BTFIDS  vmlinux
> libbpf: non-native ELF endianness is not supported
> 
> jirka
> 
> 
> ---
> diff --git a/tools/bpf/resolve_btfids/Makefile b/tools/bpf/resolve_btfids/Makefile
> index a88cd4426398..d3c818b8d8d3 100644
> --- a/tools/bpf/resolve_btfids/Makefile
> +++ b/tools/bpf/resolve_btfids/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  include ../../scripts/Makefile.include
> +include ../../scripts/Makefile.arch
>  
>  ifeq ($(srctree),)
>  srctree := $(patsubst %/,%,$(dir $(CURDIR)))
> @@ -29,6 +30,7 @@ endif
>  AR   = $(HOSTAR)
>  CC   = $(HOSTCC)
>  LD   = $(HOSTLD)
> +ARCH = $(HOSTARCH)
>  
>  OUTPUT ?= $(srctree)/tools/bpf/resolve_btfids/
>  

and I realized we can have CONFIG_DEBUG_INFO_BTF without
CONFIG_BPF, so we also need the fix below for such cases

jirka


---
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index e6e2d9e5ff48..8a990933a690 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -343,7 +343,10 @@ vmlinux_link vmlinux "${kallsymso}" ${btf_vmlinux_bin_o}
 # fill in BTF IDs
 if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
 info BTFIDS vmlinux
-${RESOLVE_BTFIDS} vmlinux
+if [ -z "${CONFIG_BPF}" ]; then
+  no_fail=--no-fail
+fi
+${RESOLVE_BTFIDS} $no_fail vmlinux
 fi
 
 if [ -n "${CONFIG_BUILDTIME_TABLE_SORT}" ]; then
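
A minimal sketch of the word-size mismatch reported at the top of this thread, for readers who want to see the failure mode in isolation. It is an illustration only: `arch_bits_per_long()` is a hypothetical stand-in for the macro test in the powerpc bitsperlong.h header, and none of these function names exist in the kernel tree.

```c
#include <limits.h>

/*
 * Hypothetical stand-in for the arch header: it picks the word size from a
 * target-specific predefined macro (modelled here as a flag), the way the
 * powerpc header keys off __powerpc64__.
 */
static int arch_bits_per_long(int target_macro_defined)
{
	return target_macro_defined ? 64 : 32;
}

/* The word size of the compiler that actually builds the tool. */
static int compiler_bits_per_long(void)
{
	return (int)(sizeof(long) * CHAR_BIT);
}

/* Models the "Inconsistent word size" check: both views must agree. */
static int word_size_consistent(int target_macro_defined)
{
	return arch_bits_per_long(target_macro_defined) ==
	       compiler_bits_per_long();
}
```

On an LP64 host, `word_size_consistent(0)` models the reported case: the host compiler does not predefine the target macro, the header falls back to 32, and the check fails. Building the tool with ARCH set to the host architecture, as the Makefile patch above does, avoids selecting the target's header in the first place.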

Re: [PATCH] net: phy: realtek: fix rtl8211e rx/tx delay config

2020-09-17 Thread Serge Semin

Hello Willy,
Thanks for the patch. My comments are below.

I've Cc'ed the U-Boot/FreeBSD folks, who might also be interested in the solution
you've provided.

On Thu, Sep 17, 2020 at 09:47:33AM +0800, Willy Liu wrote:
> RGMII RX Delay and TX Delay settings will not be applied if the Force TX RX Delay
> Control bit is not set.
> Register bit for configuration pins:
> 13 = force Tx RX Delay controlled by bit12 bit11
> 12 = Tx Delay
> 11 = Rx Delay

This is very useful information, but it somewhat contradicts what we currently
know about that magical register. The current code in U-Boot configures the
delays by means of other bits:
https://elixir.bootlin.com/u-boot/v2020.10-rc4/source/drivers/net/phy/realtek.c

Could you provide a full register layout, so we'd know for sure what that
register really does and finally close the question for good?

> 
> Fixes: f81dadbcf7fd ("net: phy: realtek: Add rtl8211e rx/tx delays config")
> Signed-off-by: Willy Liu 
> ---
>  drivers/net/phy/realtek.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
>  mode change 100644 => 100755 drivers/net/phy/realtek.c
> 
> diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
> old mode 100644
> new mode 100755
> index 95dbe5e..3fddd57
> --- a/drivers/net/phy/realtek.c
> +++ b/drivers/net/phy/realtek.c
> @@ -32,9 +32,9 @@
>  #define RTL8211F_TX_DELAY    BIT(8)
>  #define RTL8211F_RX_DELAY    BIT(3)
>  

> -#define RTL8211E_TX_DELAY        BIT(1)
> -#define RTL8211E_RX_DELAY        BIT(2)
> -#define RTL8211E_MODE_MII_GMII   BIT(3)
> +#define RTL8211E_CTRL_DELAY      BIT(13)
> +#define RTL8211E_TX_DELAY        BIT(12)
> +#define RTL8211E_RX_DELAY        BIT(11)

So, what do BIT(1) and BIT(2) control then? Could you explain?

>  
>  #define RTL8201F_ISR 0x1e
>  #define RTL8201F_IER 0x13
> @@ -249,13 +249,13 @@ static int rtl8211e_config_init(struct phy_device *phydev)
>   val = 0;
>   break;
>   case PHY_INTERFACE_MODE_RGMII_ID:
> - val = RTL8211E_TX_DELAY | RTL8211E_RX_DELAY;
> + val = RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY | RTL8211E_RX_DELAY;
>   break;
>   case PHY_INTERFACE_MODE_RGMII_RXID:
> - val = RTL8211E_RX_DELAY;
> + val = RTL8211E_CTRL_DELAY | RTL8211E_RX_DELAY;
>   break;
>   case PHY_INTERFACE_MODE_RGMII_TXID:
> - val = RTL8211E_TX_DELAY;
> + val = RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY;
>   break;
>   default: /* the rest of the modes imply leaving delays as is. */
>   return 0;
> @@ -265,9 +265,8 @@ static int rtl8211e_config_init(struct phy_device *phydev)
>* 0xa4 extension page (0x7) layout. It can be used to disable/enable
>* the RX/TX delays otherwise controlled by RXDLY/TXDLY pins. It can
>* also be used to customize the whole configuration register:

> -  * 8:6 = PHY Address, 5:4 = Auto-Negotiation, 3 = Interface Mode Select,
> -  * 2 = RX Delay, 1 = TX Delay, 0 = SELRGV (see original PHY datasheet
> -  * for details).
> +  * 13 = Force Tx RX Delay controlled by bit12 bit11,
> +  * 12 = RX Delay, 11 = TX Delay

Here you've removed the register layout description and replaced it with info
on just three bits. So from now on the text above doesn't really correspond to
what follows.

I might have forgotten something, but AFAIR the state of that register's bits mapped
well to what was available on the corresponding external pins. So if you've got
sacred knowledge of what configs are really hidden behind that register, please
open it up. This in-code comment would be a good place to provide the full
register description.

-Sergey

>*/
>   oldpage = phy_select_page(phydev, 0x7);
>   if (oldpage < 0)
> @@ -277,7 +276,8 @@ static int rtl8211e_config_init(struct phy_device *phydev)
>   if (ret)
>   goto err_restore_page;
>  
> - ret = __phy_modify(phydev, 0x1c, RTL8211E_TX_DELAY | RTL8211E_RX_DELAY,
> + ret = __phy_modify(phydev, 0x1c, RTL8211E_CTRL_DELAY
> +| RTL8211E_TX_DELAY | RTL8211E_RX_DELAY,
>  val);
>  
>  err_restore_page:
> -- 
> 1.9.1
>
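
To make the proposed bit layout easier to discuss, here is a minimal userspace sketch of the value computation in the patch quoted above. The bit positions are the ones the patch claims (13 = force, 12 = TX, 11 = RX) and are not confirmed by a datasheet in this thread; `enum rgmii_mode` and `reg_modify()` are hypothetical stand-ins for `phy_interface_t` and the read-modify-write done by `__phy_modify()`.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Bit positions as claimed by the patch under review (unconfirmed). */
#define RTL8211E_CTRL_DELAY BIT(13)
#define RTL8211E_TX_DELAY   BIT(12)
#define RTL8211E_RX_DELAY   BIT(11)
#define RTL8211E_DELAY_MASK \
	(RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY | RTL8211E_RX_DELAY)

/* Hypothetical stand-in for the RGMII variants of phy_interface_t. */
enum rgmii_mode { RGMII, RGMII_ID, RGMII_RXID, RGMII_TXID };

/* Mirrors the switch in the quoted hunk: plain RGMII writes 0. */
static uint16_t rtl8211e_delay_bits(enum rgmii_mode mode)
{
	switch (mode) {
	case RGMII_ID:
		return RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY | RTL8211E_RX_DELAY;
	case RGMII_RXID:
		return RTL8211E_CTRL_DELAY | RTL8211E_RX_DELAY;
	case RGMII_TXID:
		return RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY;
	default:
		return 0;
	}
}

/* Read-modify-write: clear the masked bits, then set the new ones. */
static uint16_t reg_modify(uint16_t old, uint16_t mask, uint16_t set)
{
	return (uint16_t)((old & ~mask) | set);
}
```

Note that with this mask, plain RGMII also clears the force bit, so under the patch's own description the delays would fall back to whatever the RXDLY/TXDLY pins select.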

Re: [PATCH net-next] net/packet: Fix a comment about mac_header

2020-09-17 Thread Xie He

On Thu, Sep 17, 2020 at 1:51 AM Willem de Bruijn
 wrote:
>
> Acked-by: Willem de Bruijn 

Thank you, Willem!

Re: [PATCH bpf-next v5 2/8] bpf: verifier: refactor check_attach_btf_id()

2020-09-17 Thread Toke Høiland-Jørgensen

Andrii Nakryiko  writes:

>>
>> +int bpf_check_attach_target(struct bpf_verifier_log *log,
>> +   const struct bpf_prog *prog,
>> +   const struct bpf_prog *tgt_prog,
>> +   u32 btf_id,
>> +   struct btf_func_model *fmodel,
>> +   long *tgt_addr,
>> +   const char **tgt_name,
>> +   const struct btf_type **tgt_type);
>
> So this is obviously an abomination of a function signature,
> especially for one exported to other files.
>
> One candidate to remove would be tgt_type, which is supposed to be a
> derivative of target BTF (vmlinux or tgt_prog->btf) + btf_id,
> **except** (and that's how I found the bug below), in case of
> fentry/fexit programs attaching to "conservative" BPF functions, in
> which case what's stored in aux->attach_func_proto is different from
> what is passed into btf_distill_func_proto. So that's a bug already
> (you'll return NULL in some cases for tgt_type, while it has to always
> be non-NULL).

Okay, looked at this in more detail, and I don't think the refactored
code is doing anything different from the pre-refactor version?

Before we had this:

	if (tgt_prog && conservative) {
		prog->aux->attach_func_proto = NULL;
		t = NULL;
	}

and now we just have

	if (tgt_prog && conservative)
		t = NULL;

in bpf_check_attach_target(), which gets returned as tgt_type and
subsequently assigned to prog->aux->attach_func_proto.

> But related to that is fmodel. It seems like bpf_check_attach_target()
> has no interest in fmodel itself and is just passing it from
> btf_distill_func_proto(). So I was about to suggest dropping fmodel
> and calling btf_distill_func_proto() outside of
> bpf_check_attach_target(), but given the conservative + fentry/fexit
> quirk, it's probably going to be more confusing.
>
> So with all this, I suggest dropping the tgt_type output param
> altogether and let callers do a `btf__type_by_id(tgt_prog ?
> tgt_prog->aux->btf : btf_vmlinux, btf_id);`. That will both fix the
> bug and will make this function's signature just a tad bit less
> horrible.

Thought about this, but the logic also does a few transformations of the
type itself, e.g., this for bpf_trace_raw_tp:

	tname += sizeof(prefix) - 1;
	t = btf_type_by_id(btf, t->type);
	if (!btf_type_is_ptr(t))
		/* should never happen in valid vmlinux build */
		return -EINVAL;
	t = btf_type_by_id(btf, t->type);
	if (!btf_type_is_func_proto(t))
		/* should never happen in valid vmlinux build */
		return -EINVAL;

so to catch this we really do have to return the type from the function
as well.

I do agree that the function signature is a tad on the long side, but I
couldn't think of any good way of making it smaller. I considered
replacing the last two return values with a boolean 'save' parameter,
that would just make it save the values directly in prog->aux; but I
actually find it easier to reason about a function that is strictly
checking things and returning the result, instead of 'sometimes modify'
semantics...
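
A minimal sketch of the design choice argued for above: a function that strictly checks and reports its results through output parameters, with the caller deciding what to store. Every name here is a hypothetical stand-in (`struct func_proto` for `struct btf_type`, `struct prog_aux` for the program's aux data); it is not the verifier code.

```c
#include <stddef.h>

struct func_proto { int nargs; };	/* stand-in for struct btf_type */

struct prog_aux {
	const struct func_proto *attach_func_proto;
	const char *attach_func_name;
};

/*
 * Pure check: validates its inputs and reports through out-params.
 * It never touches program state, so it has no 'sometimes modify' mode.
 */
static int check_attach_target(const struct func_proto *candidate,
			       const char *name, int conservative,
			       const char **tgt_name,
			       const struct func_proto **tgt_type)
{
	if (!candidate || !name)
		return -1;
	*tgt_name = name;
	/* The "conservative" case reports no prototype at all. */
	*tgt_type = conservative ? NULL : candidate;
	return 0;
}

/* The caller owns the decision to save the results. */
static int attach(struct prog_aux *aux, const struct func_proto *candidate,
		  const char *name, int conservative)
{
	const struct func_proto *type;
	const char *tname;
	int err;

	err = check_attach_target(candidate, name, conservative, &tname, &type);
	if (err)
		return err;
	aux->attach_func_name = tname;
	aux->attach_func_proto = type;
	return 0;
}
```

In this shape the NULL prototype for the conservative case reaches the stored field via the out-param, which is the equivalence described earlier in this message.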

-Toke

Re: [PATCH nf-next v3 3/3] netfilter: Introduce egress hook

2020-09-17 Thread Laura García Liébana

Hi Daniel,

On Tue, Sep 15, 2020 at 12:02 AM Daniel Borkmann  wrote:
>
> On 9/14/20 1:29 PM, Laura García Liébana wrote:
> > On Fri, Sep 11, 2020 at 6:28 PM Daniel Borkmann wrote:
> >> On 9/11/20 9:42 AM, Laura García Liébana wrote:
> >>> On Tue, Sep 8, 2020 at 2:55 PM Daniel Borkmann wrote:
>  On 9/5/20 7:24 AM, Lukas Wunner wrote:
> > On Fri, Sep 04, 2020 at 11:14:37PM +0200, Daniel Borkmann wrote:
> >> On 9/4/20 6:21 PM, Lukas Wunner wrote:
>  [...]
> >> The tc queueing layer which is below is not the tc egress hook; the
> >> latter is for filtering/mangling/forwarding or helping the lower tc
> >> queueing layer to classify.
> >
> > People want to apply netfilter rules on egress, so either we need an
> > egress hook in the xmit path or we'd have to teach tc to filter and
> > > mangle based on netfilter rules.  The former seemed more straight-forward
> > > to me but I'm happy to pursue other directions.
> 
>  I would strongly prefer something where nf integrates into existing tc hook,
>  not only due to the hook reuse which would be better, but also to allow for a
>  more flexible interaction between tc/BPF use cases and nf, to name one
> >>>
> >>> That sounds good but I'm afraid that it would take too much back and
> >>> forth discussions. We'll really appreciate it if this small patch can
> >>> be unblocked and then rethink the refactoring of ingress/egress hooks
> >>> that you commented in another thread.
> >>
> >> I'm not sure whether your comment was serious or not, but nope, this needs
> >> to be addressed as mentioned as otherwise this use case would regress. It
> >
> > This patch doesn't break anything. The tc redirect use case that you
> > just commented on is the expected behavior and the same will happen
> > with ingress. To be consistent, in the case that someone requires both
> > hooks, another tc redirect would be needed in the egress path. If you
> > mean to bypass the nf egress if tc redirect in ingress is used, that
> > would lead in a huge security concern.
>
> I'm not sure I parse what you're saying above ... today it is possible and
> perfectly fine to e.g. redirect to a host-facing veth from tc ingress which
> then goes into container. Only traffic that goes up the host stack is seen
> by nf ingress hook in that case. Likewise, reply traffic can be redirected
> from host-facing veth to phys dev for xmit w/o any netfilter interference.
> This means netfilter in host ns really only sees traffic to/from host as
> intended. This is fine today, however, if 3rd party entities (e.g. distro
> side) start pushing down rules on the two nf hooks, then these use cases will
> break on the egress one due to this asymmetric layering violation. Hence my
> ask that this needs to be configurable from a control plane perspective so
> that both use cases can live next to each other w/o breakage. Most trivial

Why should it be symmetric? Fast paths create "asymmetric
layering" all the time. For example, a packet that hits XDP goes to user
space, bypassing ingress, but the response will hit egress. So the
"breakage" is already there.

Also, we're here to create mechanisms, not policies that distros have to follow.

> one I can think of is (aside from the fact to refactor the hooks and improve
> their performance) a flag e.g. for skb that can be set from tc/BPF layer to
> bypass the nf hooks. Basically a flexible opt-in so that existing use-cases
> can be retained w/o breakage. This is one option with what I meant in my
> earlier mail.

No comment.

>
> >> is one thing for you wanting to remove tc / BPF from your application stack
> >> as you call it, but not at the cost of breaking others.
> >
> > I'm not intended to remove tc / BPF from my application stack as I'm
> > not using it and, as I explained in past emails, it can't be used for
> > my use cases.
> >
> > In addition, let's review your NACK reasons:
> >
> > This reverts the following commits:
> >
> >   8537f78647c0 ("netfilter: Introduce egress hook")
> >   5418d3881e1f ("netfilter: Generalize ingress hook")
> >   b030f194aed2 ("netfilter: Rename ingress hook include file")
> >
> > From the discussion in [0], the author's main motivation to add a hook
> > in fast path is for an out of tree kernel module, which is a red flag
> > to begin with. Other mentioned potential use cases like NAT{64,46}
> > is on future extensions w/o concrete code in the tree yet. Revert as
> > suggested [1] given the weak justification to add more hooks to critical
> > fast-path.
> >
> >   [0] https://lore.kernel.org/netdev/cover.1583927267.git.lu...@wunner.de/
> >   [1] https://lore.kernel.org/netdev/20200318.011152.72770718915606186.da...@davemloft.net/
> >
> > It has been explained already that there are more use cases that
> > require this hook in nf, not only for future developments or out of
> > tree modules.
>
> Sure, aside f

Re: [EXT] Re: [net-next PATCH 0/2] Introduce mbox tracepoints for Octeontx2

2020-09-17 Thread sundeep subbaraya

On Thu, Sep 17, 2020 at 11:34 AM Jiri Pirko  wrote:
>
> Wed, Sep 16, 2020 at 07:19:36PM CEST, sundeep.l...@gmail.com wrote:
> >On Wed, Sep 16, 2020 at 4:04 PM Jiri Pirko  wrote:
> >>
> >> Mon, Sep 07, 2020 at 12:59:45PM CEST, sundeep.l...@gmail.com wrote:
> >> >Hi Jakub,
> >> >
> >> >On Sat, Sep 5, 2020 at 2:07 AM Jakub Kicinski  wrote:
> >> >>
> >> >> On Fri, 4 Sep 2020 12:29:04 + Sunil Kovvuri Goutham wrote:
> >> >> > > >No, there are 3 drivers registering to 3 PCI device IDs and there can
> >> >> > > >be many instances of the same devices. So there can be 10's of instances of
> >> >> > > AF, PF and VFs.
> >> >> > >
> >> >> > > So you can still have per-pci device devlink instance and use the tracepoint
> >> >> > > Jakub suggested.
> >> >> > >
> >> >> >
> >> >> > Two things
> >> >> > - As I mentioned above, there is a Crypto driver which uses the same mbox APIs
> >> >> >   which is in the process of upstreaming. There also we would need trace points.
> >> >> >   Not sure registering to devlink just for the sake of tracepoint is proper.
> >> >> >
> >> >> > - The devlink trace message is like this
> >> >> >
> >> >> >TRACE_EVENT(devlink_hwmsg,
> >> >> >  . . .
> >> >> > TP_printk("bus_name=%s dev_name=%s driver_name=%s incoming=%d type=%lu buf=0x[%*phD] len=%zu",
> >> >> >   __get_str(bus_name), __get_str(dev_name),
> >> >> >   __get_str(driver_name), __entry->incoming, __entry->type,
> >> >> >   (int) __entry->len, __get_dynamic_array(buf), __entry->len)
> >> >> >);
> >> >> >
> >> >> >Whatever debug message we want as output doesn't fit into this.
> >> >>
> >> >> Make use of the standard devlink tracepoint wherever applicable, and you
> >> >> can keep your extra ones if you want (as long as Jiri don't object).
> >> >
> >> >Sure and noted. I have tried to use devlink tracepoints and since it
> >> >could not fit our purpose I used these.
> >>
> >> Why exactly the existing TP didn't fit your need?
> >>
> >Existing TP has provision to dump skb and trace error strings with
> >error code but
> >we are trying to trace the entire mailbox flow of the AF/PF and VF
> >drivers. In particular
> >we trace the below:
> >message allocation with message id and size at initiator.
> >number of messages sent and total size.
> >check message requester id, response id and response code after
> >reply is received.
> >interrupts happened on behalf of mailboxes in the entire process
> >with source and receiver of interrupt along with isr status.
> >error like initiator timeout waiting for response.
> >  All the above are relevant and are required for Octeontx2 only hence
> >used own tracepoints.
>
> You can still use devlink_hwmsg for the actual data exchanged between
> the driver and hw. For the rest, you can have driver-specific TPs.
>
>
I totally get your point, and adding devlink to our drivers is a work in progress
since we got a similar comment from Jakub for a previous patch:
https://www.mail-archive.com/netdev@vger.kernel.org/msg341414.html
All the errors in the drivers will be turned into devlink TPs in the future.
This patchset is a bit different since it traces the mailbox message state machine
at a low level and does not even trace the message data exchanged between
driver and hw.

Thanks,
Sundeep

> >
> >Thanks,
> >Sundeep
> >
> >> >
> >> >Thanks,
> >> >Sundeep

[PATCH net-next RFC v2 1/3] devlink: Wrap trap related lists a trap_lists struct

Bundle the trap related lists: trap_list, trap_group_list and
trap_policer_list in a dedicated struct. This will be handy in the
coming patches in the set introducing traps in devlink port context.
With trap_lists, code reuse is much simpler.

Signed-off-by: Aya Levin 
---
Changelog:
v1->v2:
Patch 1: Encapsulate only the traps lists for future code reuse. Don't
try to reuse the traps ops.
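
A minimal userspace sketch of why wrapping the lists helps reuse, assuming simplified types: a hand-rolled singly linked list stands in for `struct list_head`, and `struct device_ctx` / `struct port_ctx` are hypothetical stand-ins for `struct devlink` / `struct devlink_port`. The point is only that one lookup helper can serve any owner that embeds the same lists struct.

```c
#include <stddef.h>
#include <string.h>

struct trap_item {
	const char *name;
	struct trap_item *next;
};

/* The bundled lists; group and policer lists are elided in this sketch. */
struct trap_lists {
	struct trap_item *trap_list;
};

/* Two different owners embed the same lists struct. */
struct device_ctx {
	int id;
	struct trap_lists trap_lists;
};

struct port_ctx {
	int index;
	struct trap_lists trap_lists;
};

/* Takes the lists, not the owner, so device and port code can share it. */
static struct trap_item *trap_item_lookup(struct trap_lists *lists,
					  const char *name)
{
	struct trap_item *item;

	for (item = lists->trap_list; item; item = item->next)
		if (!strcmp(item->name, name))
			return item;
	return NULL;
}
```

A caller would pass `&dev.trap_lists` or `&port.trap_lists` to the same helper, which is the pattern the signature changes in the diff below follow.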

 include/net/devlink.h |  10 +++--
 net/core/devlink.c| 109 ++
 2 files changed, 63 insertions(+), 56 deletions(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index eaec0a8cc5ef..f11e09097e44 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -22,6 +22,12 @@
 
 struct devlink_ops;
 
+struct devlink_trap_lists {
+   struct list_head trap_list;
+   struct list_head trap_group_list;
+   struct list_head trap_policer_list;
+};
+
 struct devlink {
struct list_head list;
struct list_head port_list;
@@ -33,9 +39,7 @@ struct devlink {
struct list_head reporter_list;
struct mutex reporters_lock; /* protects reporter_list */
struct devlink_dpipe_headers *dpipe_headers;
-   struct list_head trap_list;
-   struct list_head trap_group_list;
-   struct list_head trap_policer_list;
+   struct devlink_trap_lists trap_lists;
const struct devlink_ops *ops;
struct xarray snapshot_ids;
struct device *dev;
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 19037f114307..fde6f2c5c409 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -6158,11 +6158,11 @@ struct devlink_trap_item {
 };
 
 static struct devlink_trap_policer_item *
-devlink_trap_policer_item_lookup(struct devlink *devlink, u32 id)
+devlink_trap_policer_item_lookup(struct devlink_trap_lists *trap_lists, u32 id)
 {
struct devlink_trap_policer_item *policer_item;
 
-   list_for_each_entry(policer_item, &devlink->trap_policer_list, list) {
+   list_for_each_entry(policer_item, &trap_lists->trap_policer_list, list) {
if (policer_item->policer->id == id)
return policer_item;
}
@@ -6171,11 +6171,11 @@ devlink_trap_policer_item_lookup(struct devlink *devlink, u32 id)
 }
 
 static struct devlink_trap_item *
-devlink_trap_item_lookup(struct devlink *devlink, const char *name)
+devlink_trap_item_lookup(struct devlink_trap_lists *trap_lists, const char *name)
 {
struct devlink_trap_item *trap_item;
 
-   list_for_each_entry(trap_item, &devlink->trap_list, list) {
+   list_for_each_entry(trap_item, &trap_lists->trap_list, list) {
if (!strcmp(trap_item->trap->name, name))
return trap_item;
}
@@ -6184,7 +6184,7 @@ devlink_trap_item_lookup(struct devlink *devlink, const char *name)
 }
 
 static struct devlink_trap_item *
-devlink_trap_item_get_from_info(struct devlink *devlink,
+devlink_trap_item_get_from_info(struct devlink_trap_lists *trap_lists,
struct genl_info *info)
 {
struct nlattr *attr;
@@ -6193,7 +6193,7 @@ devlink_trap_item_get_from_info(struct devlink *devlink,
return NULL;
attr = info->attrs[DEVLINK_ATTR_TRAP_NAME];
 
-   return devlink_trap_item_lookup(devlink, nla_data(attr));
+   return devlink_trap_item_lookup(trap_lists, nla_data(attr));
 }
 
 static int
@@ -6352,10 +6352,10 @@ static int devlink_nl_cmd_trap_get_doit(struct sk_buff *skb,
struct sk_buff *msg;
int err;
 
-   if (list_empty(&devlink->trap_list))
+   if (list_empty(&devlink->trap_lists.trap_list))
return -EOPNOTSUPP;
 
-   trap_item = devlink_trap_item_get_from_info(devlink, info);
+   trap_item = devlink_trap_item_get_from_info(&devlink->trap_lists, info);
if (!trap_item) {
NL_SET_ERR_MSG_MOD(extack, "Device did not register this trap");
return -ENOENT;
@@ -6392,7 +6392,7 @@ static int devlink_nl_cmd_trap_get_dumpit(struct sk_buff *msg,
if (!net_eq(devlink_net(devlink), sock_net(msg->sk)))
continue;
mutex_lock(&devlink->lock);
-   list_for_each_entry(trap_item, &devlink->trap_list, list) {
+   list_for_each_entry(trap_item, &devlink->trap_lists.trap_list, list) {
if (idx < start) {
idx++;
continue;
@@ -6468,10 +6468,10 @@ static int devlink_nl_cmd_trap_set_doit(struct sk_buff *skb,
struct devlink_trap_item *trap_item;
int err;
 
-   if (list_empty(&devlink->trap_list))
+   if (list_empty(&devlink->trap_lists.trap_list))
return -EOPNOTSUPP;
 
-   trap_item = devlink_trap_item_get_from_info(devlink, info);
+   trap_item = devlink_trap_item_get_from_info(&devlink->trap_lists, info);
if (!trap_item) {
N

[PATCH net-next RFC v2 2/3] devlink: Add devlink traps under devlink_ports context

There are some cases where we would like to trap dropped packets only
for a single port on a device without affecting the others. For that
purpose:
- Add trap lists and trap ops to devlink_port
- Add corresponding trap API to manage traps
- Add matching netlink commands

Signed-off-by: Aya Levin 
---
Changelog:
v1->v2:
Add traps lock in devlink_port
Add devlink_port ops and in it, add the trap ops
Add support only for traps and exclude groups and policer
Add separate netlink commands for port trap get and set
Allow trap registration without a corresponding group
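
A minimal sketch of the per-port ops idea, assuming simplified types: each port carries its own ops pointer, so a trap action can be changed on one port without touching the others. `struct port`, `struct port_ops` and the helper names are hypothetical stand-ins for the `devlink_port` and `devlink_port_ops` definitions in the diff below, not the API itself.

```c
#include <errno.h>
#include <stddef.h>

enum trap_action { TRAP_ACTION_DROP, TRAP_ACTION_TRAP };

struct port;

/* Per-port callbacks, in the spirit of struct devlink_port_ops. */
struct port_ops {
	int (*trap_action_set)(struct port *port, enum trap_action action);
};

struct port {
	const struct port_ops *ops;
	enum trap_action action;
};

/* Core-side dispatch: ports without the callback do not support traps. */
static int port_trap_action_set(struct port *port, enum trap_action action)
{
	if (!port->ops || !port->ops->trap_action_set)
		return -EOPNOTSUPP;
	return port->ops->trap_action_set(port, action);
}

/* Driver-side callback for this sketch: just record the action. */
static int demo_trap_action_set(struct port *port, enum trap_action action)
{
	port->action = action;
	return 0;
}

static const struct port_ops demo_port_ops = {
	.trap_action_set = demo_trap_action_set,
};
```

Setting the action through one `struct port` leaves every other port's state as it was, which is the isolation the commit message asks for.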

 include/net/devlink.h|  44 ++
 include/uapi/linux/devlink.h |   5 +
 net/core/devlink.c   | 346 +--
 3 files changed, 382 insertions(+), 13 deletions(-)

diff --git a/include/net/devlink.h b/include/net/devlink.h
index f11e09097e44..93eb7033ce00 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -21,6 +21,8 @@
 #include 
 
 struct devlink_ops;
+struct devlink_port_ops;
+
 
 struct devlink_trap_lists {
struct list_head trap_list;
@@ -129,6 +131,9 @@ struct devlink_port {
struct delayed_work type_warn_dw;
struct list_head reporter_list;
struct mutex reporters_lock; /* Protects reporter_list */
+   struct mutex trap_lists_lock;
+   struct devlink_trap_lists trap_lists;
+   const struct devlink_port_ops *ops;
 };
 
 struct devlink_sb_pool_info {
@@ -1177,6 +1182,35 @@ struct devlink_ops {
 struct netlink_ext_ack *extack);
 };
 
+struct devlink_port_ops {
+   /**
+* @trap_init: Trap initialization function.
+*
+* Should be used by device drivers to initialize the trap in the
+* underlying device. Drivers should also store the provided trap
+* context, so that they could efficiently pass it to
+* devlink_trap_report() when the trap is triggered.
+*/
+   int (*trap_init)(struct devlink_port *devlink,
+const struct devlink_trap *trap, void *trap_ctx);
+   /**
+* @trap_fini: Trap de-initialization function.
+*
+* Should be used by device drivers to de-initialize the trap in the
+* underlying device.
+*/
+
+   void (*trap_fini)(struct devlink_port *devlink_port,
+ const struct devlink_trap *trap, void *trap_ctx);
+   /**
+* @trap_action_set: Trap action set function.
+*/
+   int (*trap_action_set)(struct devlink_port *devlink_port,
+  const struct devlink_trap *trap,
+  enum devlink_trap_action action,
+  struct netlink_ext_ack *extack);
+};
+
 static inline void *devlink_priv(struct devlink *devlink)
 {
BUG_ON(!devlink);
@@ -1220,6 +1254,8 @@ int devlink_port_register(struct devlink *devlink,
  struct devlink_port *devlink_port,
  unsigned int port_index);
 void devlink_port_unregister(struct devlink_port *devlink_port);
+void devlink_port_set_ops(struct devlink_port *devlink_port,
+ const struct devlink_port_ops *ops);
 void devlink_port_type_eth_set(struct devlink_port *devlink_port,
   struct net_device *netdev);
 void devlink_port_type_ib_set(struct devlink_port *devlink_port,
@@ -1429,6 +1465,14 @@ void
 devlink_trap_policers_unregister(struct devlink *devlink,
 const struct devlink_trap_policer *policers,
 size_t policers_count);
+int devlink_port_traps_register(struct devlink_port *devlink_port,
+   const struct devlink_trap *traps,
+   size_t traps_count, void *priv);
+void devlink_port_traps_unregister(struct devlink_port *devlink_port,
+  const struct devlink_trap *traps,
+  size_t traps_count);
+void devlink_port_trap_report(struct devlink_port *devlink_port, struct sk_buff *skb,
+                             void *trap_ctx, const struct flow_action_cookie *fa_cookie);
 
 #if IS_ENABLED(CONFIG_NET_DEVLINK)
 
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 40d35145c879..401ad93dab67 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -122,6 +122,11 @@ enum devlink_command {
DEVLINK_CMD_TRAP_POLICER_NEW,
DEVLINK_CMD_TRAP_POLICER_DEL,
 
+   DEVLINK_CMD_PORT_TRAP_GET,  /* can dump */
+   DEVLINK_CMD_PORT_TRAP_SET,
+   DEVLINK_CMD_PORT_TRAP_NEW,
+   DEVLINK_CMD_PORT_TRAP_DEL,
+
/* add new commands above here */
__DEVLINK_CMD_MAX,
DEVLINK_CMD_MAX = __DEVLINK_CMD_MAX - 1
diff --git a/net/core/devlink.c b/net/core/devlink.c
index fde6f2c5c409..438bd88c2c1b 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -6

[PATCH net-next RFC v2 0/3] Add devlink traps in devlink port context

Implement support for devlink traps on a per-port basis. Dropped
packets in the RX flow are related to the Ethernet port and thus
should be in port context. Traps per device should trap global
configuration which may cause drops. Devlink traps are regarded as a
debug mode. Using traps per port enables debugging that doesn't affect
other ports on a device.

Patchset:
Patch 1: Refactors devlink trap for easier code re-use in the coming
patches
Patch 2: Adds devlink traps under devlink ports context. In a nutshell
it allows enabling/disabling a trap on all related ports which
registered this trap.
Patch 3: Demonstrates a use of devlink traps in port context in the
mlx5 Ethernet driver.

Changelog:
Minor changes in cover letter
v1->v2:
Patch 1: 
-Gather only the traps lists for future code reuse. Don't
 try to reuse the traps ops.
Patch 2:
-Add traps lock in devlink_port
-Add devlink_port ops and in it, add the trap ops
-Add support only for traps and exclude groups and policy
-Add separate netlink commands for port trap get and set
-Allow trap registration without a corresponding group
Patch 3: removed
Patch 4:
-Is now patch 3
-Minor changes in trap's definition
-Adjustments to trap API and ops

Aya Levin (3):
  devlink: Wrap trap related lists a trap_lists struct
  devlink: Add devlink traps under devlink_ports context
  net/mlx5e: Add devlink trap to catch oversize packets

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h   |   2 +
 drivers/net/ethernet/mellanox/mlx5/core/en/traps.c |  38 ++
 drivers/net/ethernet/mellanox/mlx5/core/en/traps.h |  14 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  48 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c|  11 +-
 include/net/devlink.h  |  54 ++-
 include/uapi/linux/devlink.h   |   5 +
 net/core/devlink.c | 453 ++---
 9 files changed, 556 insertions(+), 71 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/traps.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/traps.h

-- 
2.14.1

[PATCH net-next RFC v2 3/3] net/mlx5e: Add devlink trap to catch oversize packets

From: Aya Levin 

Register MTU error trap to allow visibility of oversize packets. Display
a naive use of devlink trap in devlink port context.

Signed-off-by: Aya Levin 
---
Changelog:
v1->v2:
-Minor changes in trap's definition
-Adjustments to trap API and ops
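
A rough sketch of what an oversize-packet trap decision could look like on the receive side, under stated assumptions. The driver's actual RX-path change is not visible in this part of the thread, so `struct port_rx`, `max_frame_len` and `classify_frame()` are hypothetical; only the `trap_oversize` flag name comes from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

enum rx_verdict { RX_PASS, RX_DROP, RX_TRAP };

/* Hypothetical per-port RX state; trap_oversize mirrors the patch's flag. */
struct port_rx {
	size_t max_frame_len;	/* assumed limit derived from the MTU */
	bool trap_oversize;	/* true when the MTU error trap is enabled */
};

/*
 * Frames within the limit pass. Oversize frames are reported to the trap
 * when it is enabled on this port, and silently dropped otherwise.
 */
static enum rx_verdict classify_frame(const struct port_rx *port,
				      size_t frame_len)
{
	if (frame_len <= port->max_frame_len)
		return RX_PASS;
	return port->trap_oversize ? RX_TRAP : RX_DROP;
}
```

Because the flag lives in per-port state, enabling the trap on one port does not change the verdict for oversize frames arriving on another.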

 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h   |  2 +
 drivers/net/ethernet/mellanox/mlx5/core/en/traps.c | 38 +
 drivers/net/ethernet/mellanox/mlx5/core/en/traps.h | 14 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 48 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c| 11 -
 6 files changed, 112 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/traps.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/traps.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 9826a041e407..32436325725c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -25,7 +25,7 @@ mlx5_core-$(CONFIG_MLX5_CORE_EN) += en_main.o en_common.o en_fs.o en_ethtool.o \
en_tx.o en_rx.o en_dim.o en_txrx.o en/xdp.o en_stats.o \
en_selftest.o en/port.o en/monitor_stats.o en/health.o \
en/reporter_tx.o en/reporter_rx.o en/params.o en/xsk/pool.o \
-   en/xsk/setup.o en/xsk/rx.o en/xsk/tx.o en/devlink.o
+   en/xsk/setup.o en/xsk/rx.o en/xsk/tx.o en/devlink.o en/traps.o
 
 #
 # Netdev extra
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 0df40d24acb0..6e652a513a84 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -824,6 +824,8 @@ struct mlx5e_priv {
struct mlx5e_hv_vhca_stats_agent stats_agent;
 #endif
struct mlx5e_scratchpadscratchpad;
+   bool trap_oversize;
+   void *trap_mtu;
 };
 
 struct mlx5e_rx_handlers {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/traps.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en/traps.c
new file mode 100644
index ..211407666c3a
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/traps.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Mellanox Technologies.
+#include "traps.h"
+
+#define MLX5E_TRAP(_id, _type, _group_id)   \
+   DEVLINK_TRAP_GENERIC(_type, DROP, _id,  \
+DEVLINK_TRAP_GROUP_GENERIC_ID_##_group_id, \
+DEVLINK_TRAP_METADATA_TYPE_F_IN_PORT)
+static struct devlink_trap mlx5e_traps_arr[] = {
+   MLX5E_TRAP(MTU_ERROR, EXCEPTION, L2_DROPS),
+};
+
+int mlx5e_devlink_traps_create(struct mlx5e_priv *priv)
+{
+   struct devlink_port *dl_port = &priv->dl_port;
+
+   return  devlink_port_traps_register(dl_port, mlx5e_traps_arr,
+  ARRAY_SIZE(mlx5e_traps_arr),
+  priv);
+}
+
+void mlx5e_devlink_traps_destroy(struct mlx5e_priv *priv)
+{
+   struct devlink_port *dl_port = &priv->dl_port;
+
+   devlink_port_traps_unregister(dl_port, mlx5e_traps_arr,
+ ARRAY_SIZE(mlx5e_traps_arr));
+}
+
+struct devlink_trap *mlx5e_trap_lookup(u16 id)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(mlx5e_traps_arr); i++)
+   if (mlx5e_traps_arr[i].id == id)
+   return &mlx5e_traps_arr[i];
+   return NULL;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/traps.h 
b/drivers/net/ethernet/mellanox/mlx5/core/en/traps.h
new file mode 100644
index ..7d95cd4b571c
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/traps.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2020 Mellanox Technologies.*/
+
+#ifndef __MLX5E_EN_TRAPS_H
+#define __MLX5E_EN_TRAPS_H
+
+#include "en.h"
+
+int mlx5e_devlink_traps_create(struct mlx5e_priv *priv);
+void mlx5e_devlink_traps_destroy(struct mlx5e_priv *priv);
+struct devlink_trap *mlx5e_trap_lookup(u16 id);
+
+#endif
+
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 472252ea67a1..81d1e6186bb8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -64,6 +64,7 @@
 #include "en/hv_vhca_stats.h"
 #include "en/devlink.h"
 #include "lib/mlx5.h"
+#include "en/traps.h"
 
 bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
 {
@@ -5003,6 +5004,50 @@ void mlx5e_destroy_q_counters(struct mlx5e_priv *priv)
}
 }
 
+static int mlx5e_devlink_trap_init(struct devlink_port *devlink_port,
+  const struct devlink_trap *trap,
+

[PATCH] netdevsim: fix semicolon.cocci warnings

2020-09-17 Thread kernel test robot

From: kernel test robot 

drivers/net/netdevsim/port_function.c:122:2-3: Unneeded semicolon
drivers/net/netdevsim/port_function.c:140:2-3: Unneeded semicolon


Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Parav Pandit 
Signed-off-by: kernel test robot 
---

url:
https://github.com/0day-ci/linux/commits/Parav-Pandit/devlink-Add-SF-add-delete-devlink-ops/20200917-162417
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 
b948577b984a01d24d401d2264efbccc7f0146c1

 port_function.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/netdevsim/port_function.c
+++ b/drivers/net/netdevsim/port_function.c
@@ -119,7 +119,7 @@ nsim_devlink_port_function_alloc(struct
break;
default:
break;
-   };
+   }
return port;
 
 fn_ida_err:
@@ -137,7 +137,7 @@ static void nsim_devlink_port_function_f
break;
default:
break;
-   };
+   }
ida_simple_remove(&dev->port_functions.ida, port->port_index);
free_netdev(port->netdev);
 }

Re: [PATCH v3] arm64: bpf: Fix branch offset in JIT

2020-09-17 Thread Catalin Marinas

On Thu, 17 Sep 2020 11:49:25 +0300, Ilias Apalodimas wrote:
> Running the eBPF test_verifier leads to random errors looking like this:
> 
> [ 6525.735488] Unexpected kernel BRK exception at EL1
> [ 6525.735502] Internal error: ptrace BRK handler: f2000100 [#1] SMP
> [ 6525.741609] Modules linked in: nls_utf8 cifs libdes libarc4 dns_resolver 
> fscache binfmt_misc nls_ascii nls_cp437 vfat fat aes_ce_blk crypto_simd 
> cryptd aes_ce_cipher ghash_ce gf128mul efi_pstore sha2_ce sha256_arm64 
> sha1_ce evdev efivars efivarfs ip_tables x_tables autofs4 btrfs 
> blake2b_generic xor xor_neon zstd_compress raid6_pq libcrc32c crc32c_generic 
> ahci xhci_pci libahci xhci_hcd igb libata i2c_algo_bit nvme realtek usbcore 
> nvme_core scsi_mod t10_pi netsec mdio_devres of_mdio gpio_keys fixed_phy 
> libphy gpio_mb86s7x
> [ 6525.787760] CPU: 3 PID: 7881 Comm: test_verifier Tainted: GW   
>   5.9.0-rc1+ #47
> [ 6525.796111] Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS 
> build #1 Jun  6 2020
> [ 6525.804812] pstate: 2005 (nzCv daif -PAN -UAO BTYPE=--)
> [ 6525.810390] pc : bpf_prog_c3d01833289b6311_F+0xc8/0x9f4
> [ 6525.815613] lr : bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c
> [ 6525.820832] sp : 8000130cbb80
> [ 6525.824141] x29: 8000130cbbb0 x28: 
> [ 6525.829451] x27: 05ef6fcbf39b x26: 
> [ 6525.834759] x25: 8000130cbb80 x24: 800011dc7038
> [ 6525.840067] x23: 8000130cbd00 x22: 0008f624d080
> [ 6525.845375] x21: 0001 x20: 800011dc7000
> [ 6525.850682] x19:  x18: 
> [ 6525.855990] x17:  x16: 
> [ 6525.861298] x15:  x14: 
> [ 6525.866606] x13:  x12: 
> [ 6525.871913] x11: 0001 x10: 800a660c
> [ 6525.877220] x9 : 800010951810 x8 : 8000130cbc38
> [ 6525.882528] x7 :  x6 : 009864cfa881
> [ 6525.887836] x5 : 00ff x4 : 002880ba1a0b3e9f
> [ 6525.893144] x3 : 0018 x2 : 800a4374
> [ 6525.898452] x1 : 000a x0 : 0009
> [ 6525.903760] Call trace:
> [ 6525.906202]  bpf_prog_c3d01833289b6311_F+0xc8/0x9f4
> [ 6525.911076]  bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c
> [ 6525.915957]  bpf_dispatcher_xdp_func+0x14/0x20
> [ 6525.920398]  bpf_test_run+0x70/0x1b0
> [ 6525.923969]  bpf_prog_test_run_xdp+0xec/0x190
> [ 6525.928326]  __do_sys_bpf+0xc88/0x1b28
> [ 6525.932072]  __arm64_sys_bpf+0x24/0x30
> [ 6525.935820]  el0_svc_common.constprop.0+0x70/0x168
> [ 6525.940607]  do_el0_svc+0x28/0x88
> [ 6525.943920]  el0_sync_handler+0x88/0x190
> [ 6525.947838]  el0_sync+0x140/0x180
> [ 6525.951154] Code: d4202000 d4202000 d4202000 d4202000 (d4202000)
> [ 6525.957249] ---[ end trace cecc3f93b14927e2 ]---
> 
> [...]

Applied to arm64 (for-next/fixes), thanks!

[1/1] arm64: bpf: Fix branch offset in JIT
  https://git.kernel.org/arm64/c/32f6865c7aa3

-- 
Catalin

Re: [PATCH rdma-next v2 0/3] Fix in-kernel active_speed type

2020-09-17 Thread Jason Gunthorpe

On Thu, Sep 17, 2020 at 12:02:20PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky 
> 
> Changelog:
> v2:
>  * Changed WARN_ON casting to be saturated value instead while returning 
> active_speed
>to the user.
> v1: https://lore.kernel.org/linux-rdma/20200902074503.743310-1-l...@kernel.org
>  * Changed patch #1 to fix memory corruption to help with bisect. No
>change in series, because the added code is changed anyway in patch
>#3.
> v0:
>  * https://lore.kernel.org/linux-rdma/20200824105826.1093613-1-l...@kernel.org
> 
> 
> IBTA declares speed as 16 bits, but kernel stores it in u8. This series
> fixes in-kernel declaration while keeping external interface intact.
> 
> Thanks
> 
> Aharon Landau (3):
>   net/mlx5: Refactor query port speed functions
>   RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
>   RDMA: Fix link active_speed size

Looks OK, can you update the shared branch?

Thanks,
Jason

[PATCH bpf v1] tools/bpftool: support passing BPFTOOL_VERSION to make

2020-09-17 Thread Tony Ambardar

This change facilitates out-of-tree builds, packaging, and versioning for
test and debug purposes. Defining BPFTOOL_VERSION allows self-contained
builds within the tools tree, since it avoids use of the 'kernelversion'
target in the top-level makefile, which would otherwise pull in several
other includes from outside the tools tree.

Signed-off-by: Tony Ambardar 
---
 tools/bpf/bpftool/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 8462690a039b..4828913703b6 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -25,7 +25,7 @@ endif
 
 LIBBPF = $(LIBBPF_PATH)libbpf.a
 
-BPFTOOL_VERSION := $(shell make -rR --no-print-directory -sC ../../.. 
kernelversion)
+BPFTOOL_VERSION ?= $(shell make -rR --no-print-directory -sC ../../.. 
kernelversion)
 
 $(LIBBPF): FORCE
$(if $(LIBBPF_OUTPUT),@mkdir -p $(LIBBPF_OUTPUT))
-- 
2.25.1

[PATCH net] net: wilc1000: clean up resource in error path of init mon interface

2020-09-17 Thread Huang Guobin

wilc_wfi_init_mon_interface() does not free the monitor net_device when
register_netdevice() fails. Add the missing free_netdev() call to fix it.
Also, netdev_priv() can never return NULL, so remove the unnecessary
error handling.

Fixes: 588713006ea4 ("staging: wilc1000: avoid the use of 'wilc_wfi_mon' static 
variable")
Reported-by: Hulk Robot 
Signed-off-by: Huang Guobin 
---
 drivers/net/wireless/microchip/wilc1000/mon.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/microchip/wilc1000/mon.c 
b/drivers/net/wireless/microchip/wilc1000/mon.c
index 358ac8601333..b5a1b65c087c 100644
--- a/drivers/net/wireless/microchip/wilc1000/mon.c
+++ b/drivers/net/wireless/microchip/wilc1000/mon.c
@@ -235,11 +235,10 @@ struct net_device *wilc_wfi_init_mon_interface(struct 
wilc *wl,
 
if (register_netdevice(wl->monitor_dev)) {
netdev_err(real_dev, "register_netdevice failed\n");
+   free_netdev(wl->monitor_dev);
return NULL;
}
priv = netdev_priv(wl->monitor_dev);
-   if (!priv)
-   return NULL;
 
priv->real_ndev = real_dev;
 
-- 
2.25.1
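
The error path fixed by the wilc1000 patch above can be modeled in user
space. This is only a sketch under stated assumptions: fake_netdev,
fake_alloc_netdev(), fake_register_netdevice(), fake_free_netdev() and
init_mon_interface() are hypothetical stand-ins, not the driver's or the
kernel's API. It shows that a device allocated before a failed
registration must be freed before returning NULL.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a net_device; not the wilc1000 types. */
struct fake_netdev {
	int registered;
};

/* Number of devices allocated and not yet freed. */
static int live_devices;

static struct fake_netdev *fake_alloc_netdev(void)
{
	struct fake_netdev *dev = calloc(1, sizeof(*dev));

	if (dev)
		live_devices++;
	return dev;
}

static void fake_free_netdev(struct fake_netdev *dev)
{
	if (!dev)
		return;
	live_devices--;
	free(dev);
}

/* Returns 0 on success and nonzero on failure. */
static int fake_register_netdevice(struct fake_netdev *dev, int should_fail)
{
	if (should_fail)
		return -1;
	dev->registered = 1;
	return 0;
}

/* Mirrors the fixed flow: free the device when registration fails. */
static struct fake_netdev *init_mon_interface(int should_fail)
{
	struct fake_netdev *dev = fake_alloc_netdev();

	if (!dev)
		return NULL;

	if (fake_register_netdevice(dev, should_fail)) {
		fake_free_netdev(dev); /* the kind of call the patch adds */
		return NULL;
	}
	return dev;
}
```

Without the fake_free_netdev() call in the failure branch, live_devices
would stay at 1 after a failed init, which is the leak being described.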

[PATCH net v2] hinic: fix potential resource leak

2020-09-17 Thread Wei Li

rx_request_irq() simply returns whatever irq_set_affinity_hint() returns.
If that call fails, the NAPI context and the requested IRQ are not freed.
Add error exits that release them.

Signed-off-by: Wei Li 
---
v1 -> v2:
 - Free irq as well when irq_set_affinity_hint() fails.
---
 drivers/net/ethernet/huawei/hinic/hinic_rx.c | 21 +---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c 
b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
index 5bee951fe9d4..cc1d425d070c 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
@@ -543,18 +543,25 @@ static int rx_request_irq(struct hinic_rxq *rxq)
if (err) {
netif_err(nic_dev, drv, rxq->netdev,
  "Failed to set RX interrupt coalescing attribute\n");
-   rx_del_napi(rxq);
-   return err;
+   goto err_req_irq;
}
 
err = request_irq(rq->irq, rx_irq, 0, rxq->irq_name, rxq);
-   if (err) {
-   rx_del_napi(rxq);
-   return err;
-   }
+   if (err)
+   goto err_req_irq;
 
cpumask_set_cpu(qp->q_id % num_online_cpus(), &rq->affinity_mask);
-   return irq_set_affinity_hint(rq->irq, &rq->affinity_mask);
+   err = irq_set_affinity_hint(rq->irq, &rq->affinity_mask);
+   if (err)
+   goto err_irq_affinity;
+
+   return 0;
+
+err_irq_affinity:
+   free_irq(rq->irq, rxq);
+err_req_irq:
+   rx_del_napi(rxq);
+   return err;
 }
 
 static void rx_free_irq(struct hinic_rxq *rxq)
-- 
2.17.1
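
The goto-based unwinding used in the hinic patch above can be sketched
in user space. Everything here is an assumption made for illustration:
fake_rxq, the fail_point enum and fake_rx_request_irq() are hypothetical,
and the boolean flags merely stand in for the NAPI and IRQ resources.
The point is that each label releases what was acquired before the
failing step, in reverse order.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flags standing in for NAPI and IRQ state. */
struct fake_rxq {
	bool napi_added;
	bool irq_requested;
};

/* Selects which step of the setup sequence should fail. */
enum fail_point { FAIL_NONE, FAIL_COALESCE, FAIL_REQUEST_IRQ, FAIL_AFFINITY };

static int fake_rx_request_irq(struct fake_rxq *rxq, enum fail_point fail)
{
	int err;

	rxq->napi_added = true;		/* models rx_add_napi() */

	err = (fail == FAIL_COALESCE) ? -EIO : 0;
	if (err)
		goto err_req_irq;

	err = (fail == FAIL_REQUEST_IRQ) ? -EBUSY : 0;
	if (err)
		goto err_req_irq;
	rxq->irq_requested = true;	/* models a successful request_irq() */

	err = (fail == FAIL_AFFINITY) ? -EINVAL : 0;
	if (err)
		goto err_irq_affinity;

	return 0;

err_irq_affinity:
	rxq->irq_requested = false;	/* models free_irq() */
err_req_irq:
	rxq->napi_added = false;	/* models rx_del_napi() */
	return err;
}
```

A failure at the affinity step falls through both labels, so neither
resource is left behind.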

BPF redirect API design issue for BPF-prog MTU feedback?

2020-09-17 Thread Jesper Dangaard Brouer



As you likely know[1] I'm looking into moving the MTU check (for TC-BPF)
in __bpf_skb_max_len() when e.g. called by bpf_skb_adjust_room(),
because when redirecting packets to another netdev it is not correct to
limit the MTU based on the incoming netdev.

I was looking at doing the MTU check in the bpf_redirect() helper, because
at this point we know the netdev we redirect to, and returning an
indication/error that the MTU was exceeded would allow the BPF-prog logic
to react, e.g. by sending ICMP (instead of the packet getting silently
dropped). BUT this is not possible because the bpf_redirect(index, flags)
helper doesn't provide the packet context-object (so I cannot look up the
packet length).

Seeking input:

Should/can we change the bpf_redirect API or create a new helper with
packet-context?

 Note: We have the same need for the packet context for XDP when
 redirecting the new multi-buffer packets, as not all destination netdev
 will support these new multi-buffer packets.

I can of course do the MTU checks on the kernel side in skb_do_redirect, but
then how do people debug this? The packet will basically be silently dropped.
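
To make the question concrete, here is a rough user-space sketch of what
a redirect helper with packet context could check. It is purely
hypothetical: fake_pkt, fake_netdev and redirect_with_ctx() are invented
for illustration, the choice of -EMSGSIZE is an assumption, and link-layer
header length is ignored.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not kernel structures. */
struct fake_netdev {
	unsigned int mtu;
};

struct fake_pkt {
	unsigned int len;	/* packet length compared against the MTU */
};

/*
 * Hypothetical helper: because it receives the packet, it can compare
 * the length with the MTU of the target device and report the result
 * to the caller instead of dropping silently later.
 */
static int redirect_with_ctx(const struct fake_pkt *pkt,
			     const struct fake_netdev *target)
{
	if (!pkt || !target)
		return -EINVAL;
	if (pkt->len > target->mtu)
		return -EMSGSIZE;	/* assumed error code for "MTU exceeded" */
	return 0;
}
```

With such a return value the BPF-prog could branch on the MTU case, for
example to generate an ICMP reply.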



(Looking at how does BPF-prog logic handle MTU today)

How does bpf_skb_adjust_room() report that the MTU was exceeded?
Unfortunately it uses a common return code, -ENOTSUPP, which is used for
multiple cases (including MTU exceeded). Thus, the BPF-prog logic cannot
use this reliably to know if this is an MTU exceeded event. (I looked at
BPF-prog code and they all simply exit with TC_ACT_SHOT for all error
codes; Cloudflare has the most advanced handling with
metrics->errors_total_encap_adjust_failed++).
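
The ambiguity can be shown with a small user-space model. The names are
hypothetical (adjust_fail, adjust_room_shared_code(),
adjust_room_distinct_code()), FAKE_ENOTSUPP is defined locally because
the kernel-internal ENOTSUPP (524) is not in the user-space errno.h, and
-EMSGSIZE as a dedicated code is only an assumption.

```c
#include <errno.h>

/* Kernel-internal ENOTSUPP value, defined here for user space. */
#define FAKE_ENOTSUPP 524

/* Hypothetical failure reasons inside the helper. */
enum adjust_fail { ADJ_OK, ADJ_OTHER_UNSUPPORTED, ADJ_MTU_EXCEEDED };

/* Behavior as described above: one code shared by several failures. */
static int adjust_room_shared_code(enum adjust_fail why)
{
	return why == ADJ_OK ? 0 : -FAKE_ENOTSUPP;
}

/* Hypothetical alternative: a dedicated code for the MTU case. */
static int adjust_room_distinct_code(enum adjust_fail why)
{
	switch (why) {
	case ADJ_OK:
		return 0;
	case ADJ_MTU_EXCEEDED:
		return -EMSGSIZE;
	default:
		return -FAKE_ENOTSUPP;
	}
}
```

With the shared code the caller sees the same value for both failures;
only the second variant lets a program count or react to MTU events.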


[1] 
https://lore.kernel.org/bpf/159921182827.1260200.9699352760916903781.stgit@firesoul/
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

[PATCH -next v2] net: hsr: Convert to DEFINE_SHOW_ATTRIBUTE

2020-09-17 Thread Qinglang Miao

Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

Signed-off-by: Qinglang Miao 
---
v2: based on linux-next(20200917), and can be applied to
mainline cleanly now.

 net/hsr/hsr_debugfs.c | 21 ++---
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/net/hsr/hsr_debugfs.c b/net/hsr/hsr_debugfs.c
index 7e11a6c35..4cfd9e829 100644
--- a/net/hsr/hsr_debugfs.c
+++ b/net/hsr/hsr_debugfs.c
@@ -60,17 +60,7 @@ hsr_node_table_show(struct seq_file *sfp, void *data)
return 0;
 }
 
-/* hsr_node_table_open - Open the node_table file
- *
- * Description:
- * This routine opens a debugfs file node_table of specific hsr
- * or prp device
- */
-static int
-hsr_node_table_open(struct inode *inode, struct file *filp)
-{
-   return single_open(filp, hsr_node_table_show, inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(hsr_node_table);
 
 void hsr_debugfs_rename(struct net_device *dev)
 {
@@ -85,13 +75,6 @@ void hsr_debugfs_rename(struct net_device *dev)
priv->node_tbl_root = d;
 }
 
-static const struct file_operations hsr_fops = {
-   .open   = hsr_node_table_open,
-   .read   = seq_read,
-   .llseek = seq_lseek,
-   .release = single_release,
-};
-
 /* hsr_debugfs_init - create hsr node_table file for dumping
  * the node table
  *
@@ -113,7 +96,7 @@ void hsr_debugfs_init(struct hsr_priv *priv, struct 
net_device *hsr_dev)
 
de = debugfs_create_file("node_table", S_IFREG | 0444,
 priv->node_tbl_root, priv,
-&hsr_fops);
+&hsr_node_table_fops);
if (IS_ERR(de)) {
pr_err("Cannot create hsr node_table file\n");
debugfs_remove(priv->node_tbl_root);
-- 
2.23.0
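
For readers unfamiliar with the macro, the conversion above relies on
DEFINE_SHOW_ATTRIBUTE(name) generating name_open() and name_fops from an
existing name_show(). The user-space model below is a simplified sketch:
all fake_* types, fake_single_open() and FAKE_DEFINE_SHOW_ATTRIBUTE are
hypothetical stand-ins, and the real kernel macro also fills in the read,
llseek and release callbacks.

```c
#include <stddef.h>

/* Simplified stand-ins; the real kernel structs carry much more. */
struct fake_seq_file {
	void *private;
};

struct fake_inode {
	void *i_private;
};

struct fake_file {
	struct fake_seq_file seq;
	int (*show)(struct fake_seq_file *, void *);
};

struct fake_file_operations {
	int (*open)(struct fake_inode *, struct fake_file *);
};

/* Records the show callback and its private data on the file. */
static int fake_single_open(struct fake_file *file,
			    int (*show)(struct fake_seq_file *, void *),
			    void *data)
{
	file->show = show;
	file->seq.private = data;
	return 0;
}

/* Generates <name>_open() and <name>_fops from <name>_show(). */
#define FAKE_DEFINE_SHOW_ATTRIBUTE(name)				\
static int name##_open(struct fake_inode *inode,			\
		       struct fake_file *file)				\
{									\
	return fake_single_open(file, name##_show, inode->i_private);	\
}									\
static const struct fake_file_operations name##_fops = {		\
	.open = name##_open,						\
}

/* The only function the user of the macro has to write. */
static int node_table_show(struct fake_seq_file *sfp, void *data)
{
	(void)data;
	return *(int *)sfp->private;
}

FAKE_DEFINE_SHOW_ATTRIBUTE(node_table);
```

This is why the patch can delete the hand-written open function and
file_operations table and only has to rename the fops symbol at the
debugfs_create_file() call site.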

Re: [PATCH -next] dpaa2-eth: Convert to DEFINE_SHOW_ATTRIBUTE

2020-09-17 Thread miaoqinglang

On 2020/7/18 3:47, David Miller wrote:

From: Qinglang Miao 
Date: Thu, 16 Jul 2020 16:58:59 +0800

From: Yongqiang Liu 

Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

Signed-off-by: Yongqiang Liu 

This also does not apply cleanly to the net-next tree.

Hi David,

I've sent new patches against linux-next(20200917), and they can
be applied to mainline cleanly now.

Thanks.

[PATCH -next v2] dpaa2-eth: Convert to DEFINE_SHOW_ATTRIBUTE

2020-09-17 Thread Qinglang Miao

Signed-off-by: Qinglang Miao 
---
v2: based on linux-next(20200917), and can be applied to
mainline cleanly now.

 .../freescale/dpaa2/dpaa2-eth-debugfs.c   | 63 ++-
 1 file changed, 6 insertions(+), 57 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c 
b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
index 56d9927fb..b87db0846 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-debugfs.c
@@ -42,24 +42,7 @@ static int dpaa2_dbg_cpu_show(struct seq_file *file, void 
*offset)
return 0;
 }
 
-static int dpaa2_dbg_cpu_open(struct inode *inode, struct file *file)
-{
-   int err;
-   struct dpaa2_eth_priv *priv = (struct dpaa2_eth_priv *)inode->i_private;
-
-   err = single_open(file, dpaa2_dbg_cpu_show, priv);
-   if (err < 0)
-   netdev_err(priv->net_dev, "single_open() failed\n");
-
-   return err;
-}
-
-static const struct file_operations dpaa2_dbg_cpu_ops = {
-   .open = dpaa2_dbg_cpu_open,
-   .read = seq_read,
-   .llseek = seq_lseek,
-   .release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(dpaa2_dbg_cpu);
 
 static char *fq_type_to_str(struct dpaa2_eth_fq *fq)
 {
@@ -106,24 +89,7 @@ static int dpaa2_dbg_fqs_show(struct seq_file *file, void 
*offset)
return 0;
 }
 
-static int dpaa2_dbg_fqs_open(struct inode *inode, struct file *file)
-{
-   int err;
-   struct dpaa2_eth_priv *priv = (struct dpaa2_eth_priv *)inode->i_private;
-
-   err = single_open(file, dpaa2_dbg_fqs_show, priv);
-   if (err < 0)
-   netdev_err(priv->net_dev, "single_open() failed\n");
-
-   return err;
-}
-
-static const struct file_operations dpaa2_dbg_fq_ops = {
-   .open = dpaa2_dbg_fqs_open,
-   .read = seq_read,
-   .llseek = seq_lseek,
-   .release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(dpaa2_dbg_fqs);
 
 static int dpaa2_dbg_ch_show(struct seq_file *file, void *offset)
 {
@@ -151,24 +117,7 @@ static int dpaa2_dbg_ch_show(struct seq_file *file, void 
*offset)
return 0;
 }
 
-static int dpaa2_dbg_ch_open(struct inode *inode, struct file *file)
-{
-   int err;
-   struct dpaa2_eth_priv *priv = (struct dpaa2_eth_priv *)inode->i_private;
-
-   err = single_open(file, dpaa2_dbg_ch_show, priv);
-   if (err < 0)
-   netdev_err(priv->net_dev, "single_open() failed\n");
-
-   return err;
-}
-
-static const struct file_operations dpaa2_dbg_ch_ops = {
-   .open = dpaa2_dbg_ch_open,
-   .read = seq_read,
-   .llseek = seq_lseek,
-   .release = single_release,
-};
+DEFINE_SHOW_ATTRIBUTE(dpaa2_dbg_ch);
 
 void dpaa2_dbg_add(struct dpaa2_eth_priv *priv)
 {
@@ -179,13 +128,13 @@ void dpaa2_dbg_add(struct dpaa2_eth_priv *priv)
priv->dbg.dir = dir;
 
/* per-cpu stats file */
-   debugfs_create_file("cpu_stats", 0444, dir, priv, &dpaa2_dbg_cpu_ops);
+   debugfs_create_file("cpu_stats", 0444, dir, priv, &dpaa2_dbg_cpu_fops);
 
/* per-fq stats file */
-   debugfs_create_file("fq_stats", 0444, dir, priv, &dpaa2_dbg_fq_ops);
+   debugfs_create_file("fq_stats", 0444, dir, priv, &dpaa2_dbg_fqs_fops);
 
/* per-fq stats file */
-   debugfs_create_file("ch_stats", 0444, dir, priv, &dpaa2_dbg_ch_ops);
+   debugfs_create_file("ch_stats", 0444, dir, priv, &dpaa2_dbg_ch_fops);
 }
 
 void dpaa2_dbg_remove(struct dpaa2_eth_priv *priv)
-- 
2.23.0

[PATCH net] drivers: net: Fix *_ipsec_offload_ok(): Use ip_hdr family

2020-09-17 Thread Christian Langrock

xfrm_dev_offload_ok() is called with the unencrypted SKB. So in the case of
interfamily IPsec traffic (IPv4-in-IPv6 and IPv6-in-IPv4) the check
assumes the wrong family for the skb (the IP family of the state).
With this patch the IP header of the SKB is used to determine the
family.

Signed-off-by: Christian Langrock 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c   | 2 +-
 drivers/net/ethernet/intel/ixgbevf/ipsec.c   | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index eca73526ac86..3601dd293463 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -813,7 +813,7 @@ static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
  **/
 static bool ixgbe_ipsec_offload_ok(struct sk_buff *skb, struct
xfrm_state *xs)
 {
-   if (xs->props.family == AF_INET) {
+   if (ip_hdr(skb)->version == 4) {
    /* Offload with IPv4 options is not supported yet */
    if (ip_hdr(skb)->ihl != 5)
    return false;
diff --git a/drivers/net/ethernet/intel/ixgbevf/ipsec.c
b/drivers/net/ethernet/intel/ixgbevf/ipsec.c
index 5170dd9d8705..b1d72d5d1744 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ipsec.c
@@ -418,7 +418,7 @@ static void ixgbevf_ipsec_del_sa(struct xfrm_state *xs)
  **/
 static bool ixgbevf_ipsec_offload_ok(struct sk_buff *skb, struct
xfrm_state *xs)
 {
-   if (xs->props.family == AF_INET) {
+   if (ip_hdr(skb)->version == 4) {
    /* Offload with IPv4 options is not supported yet */
    if (ip_hdr(skb)->ihl != 5)
    return false;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
index d39989cddd90..e3a9b313b01f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
@@ -460,7 +460,7 @@ void mlx5e_ipsec_cleanup(struct mlx5e_priv *priv)
 
 static bool mlx5e_ipsec_offload_ok(struct sk_buff *skb, struct
xfrm_state *x)
 {
-   if (x->props.family == AF_INET) {
+   if (ip_hdr(skb)->version == 4) {
    /* Offload with IPv4 options is not supported yet */
    if (ip_hdr(skb)->ihl > 5)
    return false;
-- 
2.21.0

-- 
Dipl.-Inf.(FH) Christian Langrock
Senior Consultant
Network & Client Security
Division Public Authorities
secunet Security Networks AG 


Phone: +49 201 5454-3833 
E-Mail: christian.langr...@secunet.com

Ammonstraße 74 
01067 Dresden, Germany
www.secunet.com

__

Registered at: Kurfuerstenstrasse 58, 45138 Essen, Germany 
Amtsgericht Essen HRB 13615
Management Board: Dr Rainer Baumgart (CEO), Thomas Pleines 
Chairman of Supervisory Board: Ralf Wintergerst
__




Re: resolve_btfids breaks kernel cross-compilation

2020-09-17 Thread Seth Forshee

On Thu, Sep 17, 2020 at 11:14:06AM +0200, Jiri Olsa wrote:
> On Thu, Sep 17, 2020 at 10:38:12AM +0200, Jiri Olsa wrote:
> > On Thu, Sep 17, 2020 at 10:04:55AM +0200, Jiri Olsa wrote:
> > > On Wed, Sep 16, 2020 at 02:47:33PM -0500, Seth Forshee wrote:
> > > > The requirement to build resolve_btfids whenever CONFIG_DEBUG_INFO_BTF
> > > > is enabled breaks some cross builds. For example, when building a 64-bit
> > > > powerpc kernel on amd64 I get:
> > > > 
> > > >  Auto-detecting system features:
> > > >  ...libelf: [ on  ]
> > > >  ...  zlib: [ on  ]
> > > >  ...   bpf: [ OFF ]
> > > >  
> > > >  BPF API too old
> > > >  make[6]: *** [Makefile:295: bpfdep] Error 1
> > > > 
> > > > The contents of tools/bpf/resolve_btfids/feature/test-bpf.make.output:
> > > > 
> > > >  In file included from 
> > > > /home/sforshee/src/u-k/unstable/tools/arch/powerpc/include/uapi/asm/bitsperlong.h:11,
> > > >   from /usr/include/asm-generic/int-ll64.h:12,
> > > >   from /usr/include/asm-generic/types.h:7,
> > > >   from /usr/include/x86_64-linux-gnu/asm/types.h:1,
> > > >   from 
> > > > /home/sforshee/src/u-k/unstable/tools/include/linux/types.h:10,
> > > >   from 
> > > > /home/sforshee/src/u-k/unstable/tools/include/uapi/linux/bpf.h:11,
> > > >   from test-bpf.c:3:
> > > >  
> > > > /home/sforshee/src/u-k/unstable/tools/include/asm-generic/bitsperlong.h:14:2:
> > > >  error: #error Inconsistent word size. Check asm/bitsperlong.h
> > > > 14 | #error Inconsistent word size. Check asm/bitsperlong.h
> > > >|  ^
> > > > 
> > > > This is because tools/arch/powerpc/include/uapi/asm/bitsperlong.h sets
> > > > __BITS_PER_LONG based on the predefined compiler macro __powerpc64__,
> > > > which is not defined by the host compiler. What can we do to get cross
> > > > builds working again?
> > > 
> > > could you please share the command line and setup?
> > 
> > I just reproduced.. checking on fix
> 
> I still need to check on few things, but patch below should help

It does help with the word size problem, thanks.
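
For anyone following along, the word-size failure boils down to a
mismatch between what the arch header claims and what the host compiler
really uses. A rough user-space model, with hypothetical function names
(the real check is a preprocessor #error, not a function):

```c
#include <limits.h>

/*
 * Model of an arch header that reports 64 bits only when a
 * target-specific compiler macro is defined, 32 bits otherwise.
 */
static int arch_header_bits(int target_macro_defined)
{
	return target_macro_defined ? 64 : 32;
}

/* What the compiler building the tool actually uses for long. */
static int compiler_bits(void)
{
	return (int)(sizeof(long) * CHAR_BIT);
}

/* The consistency check: both views must agree. */
static int word_size_consistent(int target_macro_defined)
{
	return arch_header_bits(target_macro_defined) == compiler_bits();
}
```

A 64-bit host compiler does not define the target's macro, so the target
header claims 32 bits while long is 64 bits wide and the check fails.
Selecting the host's arch headers for a host tool avoids that mismatch.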

> we might have a problem for cross builds with different endianness
> than the host because libbpf does not support reading BTF data
> with different endianness, and we get:
> 
>   BTFIDS  vmlinux
> libbpf: non-native ELF endianness is not supported

Yes, I see this now when cross building for s390.
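
A check of this kind can be sketched as follows. This is an illustration
only, not libbpf's code: the function names are hypothetical, and the
constants are spelled out locally (EI_DATA is index 5 of e_ident, with 1
meaning little-endian and 2 meaning big-endian) so the block does not
depend on <elf.h>.

```c
#include <stddef.h>
#include <stdint.h>

#define FAKE_EI_DATA		5	/* index of the data-encoding byte */
#define FAKE_ELFDATA2LSB	1	/* little-endian object */
#define FAKE_ELFDATA2MSB	2	/* big-endian object */

/* Detect the byte order of the machine running this code. */
static int host_elf_data(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe ? FAKE_ELFDATA2LSB : FAKE_ELFDATA2MSB;
}

/* Returns 1 if the ELF identification bytes match the host byte order. */
static int elf_is_native_endian(const uint8_t *e_ident, size_t len)
{
	if (len <= FAKE_EI_DATA)
		return 0;
	return e_ident[FAKE_EI_DATA] == host_elf_data();
}
```

A big-endian s390 vmlinux inspected on a little-endian build host would
fail such a test, which matches the error message quoted below.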

Thanks,
Seth

> 
> jirka
> 
> 
> ---
> diff --git a/tools/bpf/resolve_btfids/Makefile 
> b/tools/bpf/resolve_btfids/Makefile
> index a88cd4426398..d3c818b8d8d3 100644
> --- a/tools/bpf/resolve_btfids/Makefile
> +++ b/tools/bpf/resolve_btfids/Makefile
> @@ -1,5 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  include ../../scripts/Makefile.include
> +include ../../scripts/Makefile.arch
>  
>  ifeq ($(srctree),)
>  srctree := $(patsubst %/,%,$(dir $(CURDIR)))
> @@ -29,6 +30,7 @@ endif
>  AR   = $(HOSTAR)
>  CC   = $(HOSTCC)
>  LD   = $(HOSTLD)
> +ARCH = $(HOSTARCH)
>  
>  OUTPUT ?= $(srctree)/tools/bpf/resolve_btfids/
>  
>

Re: [PATCH net-next v2] net: phy: bcm7xxx: request and manage GPHY clock

On Wed, Sep 16, 2020 at 07:04:13PM -0700, Florian Fainelli wrote:
> The internal Gigabit PHY on Broadcom STB chips has a digital clock which
> drives its MDIO interface among other things, the driver now requests
> and manages that clock during .probe() and .remove() accordingly.
> 
> Because the PHY driver can be probed with the clocks turned off we need
> to apply the dummy BMSR workaround during the driver probe function to
> ensure subsequent MDIO read or write towards the PHY will succeed.

Hi Florian

Is it worth mentioning this in the DT binding? It is all pretty much
standard lego pieces, but it has taken you a while to assemble them in
the correct way. So giving hints to others who might want to use these
STB chips could be nice.

Andrew

Re: [PATCH net 2/2] net: phy: Do not warn in phy_stop() on PHY_DOWN

On Wed, Sep 16, 2020 at 08:43:10PM -0700, Florian Fainelli wrote:
> When phy_is_started() was added to catch incorrect PHY states,
> phy_stop() would not be qualified against PHY_DOWN. It is possible to
> reach that state when the PHY driver has been unbound and the network
> device is then brought down.
> 
> Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
> Signed-off-by: Florian Fainelli 

Reviewed-by: Andrew Lunn 

Andrew
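
As a reading aid, the state qualification discussed in this patch can be
modeled in user space. The enum and both functions are hypothetical; in
particular the ordering of the states and the rule "started means state
is at or beyond UP" are assumptions of this sketch.

```c
#include <stdbool.h>

/* Hypothetical, simplified PHY state list. */
enum fake_phy_state {
	FAKE_PHY_DOWN,
	FAKE_PHY_READY,
	FAKE_PHY_HALTED,
	FAKE_PHY_UP,
	FAKE_PHY_RUNNING,
	FAKE_PHY_NOLINK,
};

/* Assumed rule: a PHY counts as started from UP onwards. */
static bool fake_phy_is_started(enum fake_phy_state state)
{
	return state >= FAKE_PHY_UP;
}

/* Before: any non-started state triggers the warning. */
static bool stop_warns_old(enum fake_phy_state state)
{
	return !fake_phy_is_started(state);
}

/* After: DOWN is tolerated, since an unbound driver leads there. */
static bool stop_warns_new(enum fake_phy_state state)
{
	return !fake_phy_is_started(state) && state != FAKE_PHY_DOWN;
}
```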

Re: [PATCH net 1/2] net: phy: Avoid NPD upon phy_detach() when driver is unbound

On Wed, Sep 16, 2020 at 08:43:09PM -0700, Florian Fainelli wrote:
> If we have unbound the PHY driver prior to calling phy_detach() (often
> via phy_disconnect()) then we can cause a NULL pointer de-reference
> accessing the driver owner member. The steps to reproduce are:
> 
> echo unimac-mdio-0:01 > /sys/class/net/eth0/phydev/driver/unbind
> ip link set eth0 down

Hi Florian

How forceful is this unbind? Can we actually block it while the
interface is up? Or would returning -EBUSY make sense?

  Andrew
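
The two options in this thread can be contrasted with a small user-space
sketch. All names are hypothetical (fake_driver, fake_phydev,
fake_unbind(), fake_detach_driver_name()); this is not the phylib API and
says nothing about whether the driver core allows unbind to be refused.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a PHY driver and a PHY device. */
struct fake_driver {
	const char *name;
};

struct fake_phydev {
	const struct fake_driver *drv;	/* NULL once the driver is unbound */
	bool netdev_up;
};

/* Option raised in the reply: refuse to unbind while the interface is up. */
static int fake_unbind(struct fake_phydev *phydev)
{
	if (phydev->netdev_up)
		return -EBUSY;
	phydev->drv = NULL;
	return 0;
}

/* Option implied by the patch title: tolerate a missing driver on detach. */
static const char *fake_detach_driver_name(const struct fake_phydev *phydev)
{
	return phydev->drv ? phydev->drv->name : NULL;
}
```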

[PATCH v3,net-next,2/4] octeontx2-af: add support to manage the CPT unit

2020-09-17 Thread Srujana Challa

From: Srujana 

The Admin function (AF) manages hardware resources on the cryptographic
acceleration unit (CPT). This patch adds a mailbox interface for PFs and
VFs to configure hardware resources for cryptography and inline-ipsec.

Signed-off-by: Suheil Chandran 
Signed-off-by: Vidya Sagar Velumuri 
Signed-off-by: Lukas Bartosik 
Signed-off-by: Srujana Challa 
---
 .../ethernet/marvell/octeontx2/af/Makefile|   3 +-
 .../net/ethernet/marvell/octeontx2/af/mbox.h  |  85 +
 .../net/ethernet/marvell/octeontx2/af/rvu.c   |   2 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |   7 +
 .../ethernet/marvell/octeontx2/af/rvu_cpt.c   | 343 ++
 .../marvell/octeontx2/af/rvu_debugfs.c| 342 +
 .../ethernet/marvell/octeontx2/af/rvu_nix.c   |  76 
 .../ethernet/marvell/octeontx2/af/rvu_reg.h   |  65 +++-
 8 files changed, 915 insertions(+), 8 deletions(-)
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/Makefile 
b/drivers/net/ethernet/marvell/octeontx2/af/Makefile
index 0bc2410c8949..657a89afbf75 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/Makefile
+++ b/drivers/net/ethernet/marvell/octeontx2/af/Makefile
@@ -8,4 +8,5 @@ obj-$(CONFIG_OCTEONTX2_AF) += octeontx2_af.o
 
 octeontx2_mbox-y := mbox.o
 octeontx2_af-y := cgx.o rvu.o rvu_cgx.o rvu_npa.o rvu_nix.o \
- rvu_reg.o rvu_npc.o rvu_debugfs.o ptp.o
+ rvu_reg.o rvu_npc.o rvu_debugfs.o ptp.o \
+ rvu_cpt.o
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h 
b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
index 4aaef0a2b51c..12e00d06c37a 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h
@@ -157,6 +157,13 @@ M(NPA_HWCTX_DISABLE,   0x403, npa_hwctx_disable, 
hwctx_disable_req, msg_rsp)\
 /* SSO/SSOW mbox IDs (range 0x600 - 0x7FF) */  \
 /* TIM mbox IDs (range 0x800 - 0x9FF) */   \
 /* CPT mbox IDs (range 0xA00 - 0xBFF) */   \
+M(CPT_LF_ALLOC,0xA00, cpt_lf_alloc, cpt_lf_alloc_req_msg,  
\
+  msg_rsp) \
+M(CPT_LF_FREE, 0xA01, cpt_lf_free, msg_req, msg_rsp)   \
+M(CPT_RD_WR_REGISTER,  0xA02, cpt_rd_wr_register,  cpt_rd_wr_reg_msg,  \
+  cpt_rd_wr_reg_msg)   \
+M(CPT_INLINE_IPSEC_CFG,0xA04, cpt_inline_ipsec_cfg,
\
+  cpt_inline_ipsec_cfg_msg, msg_rsp)   \
 /* NPC mbox IDs (range 0x6000 - 0x7FFF) */ \
 M(NPC_MCAM_ALLOC_ENTRY,0x6000, npc_mcam_alloc_entry, 
npc_mcam_alloc_entry_req,\
npc_mcam_alloc_entry_rsp)   \
@@ -222,6 +229,10 @@ M(NIX_BP_ENABLE,   0x8016, nix_bp_enable, nix_bp_cfg_req,  
\
nix_bp_cfg_rsp) \
 M(NIX_BP_DISABLE,  0x8017, nix_bp_disable, nix_bp_cfg_req, msg_rsp) \
 M(NIX_GET_MAC_ADDR, 0x8018, nix_get_mac_addr, msg_req, nix_get_mac_addr_rsp) \
+M(NIX_INLINE_IPSEC_CFG, 0x8019, nix_inline_ipsec_cfg,  \
+   nix_inline_ipsec_cfg, msg_rsp)  \
+M(NIX_INLINE_IPSEC_LF_CFG, 0x801a, nix_inline_ipsec_lf_cfg,\
+   nix_inline_ipsec_lf_cfg, msg_rsp)
 
 /* Messages initiated by AF (range 0xC00 - 0xDFF) */
 #define MBOX_UP_CGX_MESSAGES   \
@@ -715,6 +726,38 @@ struct nix_bp_cfg_rsp {
u8  chan_cnt; /* Number of channel for which bpids are assigned */
 };
 
+/* Global NIX inline IPSec configuration */
+struct nix_inline_ipsec_cfg {
+   struct mbox_msghdr hdr;
+   u32 cpt_credit;
+   struct {
+   u8 egrp;
+   u8 opcode;
+   } gen_cfg;
+   struct {
+   u16 cpt_pf_func;
+   u8 cpt_slot;
+   } inst_qsel;
+   u8 enable;
+};
+
+/* Per NIX LF inline IPSec configuration */
+struct nix_inline_ipsec_lf_cfg {
+   struct mbox_msghdr hdr;
+   u64 sa_base_addr;
+   struct {
+   u32 tag_const;
+   u16 lenm1_max;
+   u8 sa_pow2_size;
+   u8 tt;
+   } ipsec_cfg0;
+   struct {
+   u32 sa_idx_max;
+   u8 sa_idx_w;
+   } ipsec_cfg1;
+   u8 enable;
+};
+
 /* NPC mbox message structs */
 
 #define NPC_MCAM_ENTRY_INVALID 0x
@@ -879,4 +922,46 @@ struct ptp_rsp {
u64 clk;
 };
 
+/* CPT mailbox error codes
+ * Range 901 - 1000.
+ */
+enum cpt_af_status {
+   CPT_AF_ERR_PARAM= -901,
+   CPT_AF_ERR_GRP_INVALID  = -902,
+   CPT_AF_ERR_LF_INVALID   = -903,
+   CPT_AF_ERR_ACCESS_DENIED= -904,
+   CPT_AF_ERR_SSO_PF_FUNC_INVALID  = -905,
+   CPT_AF_ERR_NIX_PF_FUNC_INVALID  = -906,
+
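
The M(...) entries added to mbox.h above follow the X-macro pattern: one
list is expanded several times with different definitions of M. A
minimal user-space sketch, assuming a reduced three-argument M and a
hypothetical FAKE_MBOX_MESSAGES list (the IDs are copied from the patch,
the request and response types are left out):

```c
#include <stdint.h>

/* Hypothetical subset of the message list; IDs taken from the patch. */
#define FAKE_MBOX_MESSAGES					\
M(CPT_LF_ALLOC,		0xA00, cpt_lf_alloc)			\
M(CPT_LF_FREE,		0xA01, cpt_lf_free)			\
M(CPT_RD_WR_REGISTER,	0xA02, cpt_rd_wr_register)		\
M(CPT_INLINE_IPSEC_CFG,	0xA04, cpt_inline_ipsec_cfg)

/* First expansion: one enum constant per message. */
enum {
#define M(_name, _id, _fn) MBOX_MSG_##_name = _id,
	FAKE_MBOX_MESSAGES
#undef M
};

/* Second expansion: map a message ID back to its name. */
static const char *mbox_id2name(uint16_t id)
{
	switch (id) {
#define M(_name, _id, _fn) case _id: return #_name;
	FAKE_MBOX_MESSAGES
#undef M
	default:
		return "INVALID ID";
	}
}
```

Adding a message then means adding one line to the list; the enum and
the lookup stay in sync automatically.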

[PATCH net-next v4 4/5] ravb: Split delay handling in parsing and applying

Currently, full delay handling is done in both the probe and resume
paths.  Split it in two parts, so the resume path doesn't have to redo
the parsing part over and over again.

Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Sergei Shtylyov 
Reviewed-by: Florian Fainelli 
---
v4:
  - Add Reviewed-by,

v3:
  - No changes,

v2:
  - Add Reviewed-by,
  - Use 1 instead of true when assigning to a single-bit bitfield.
---
 drivers/net/ethernet/renesas/ravb.h  |  4 +++-
 drivers/net/ethernet/renesas/ravb_main.c | 21 -
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
index 9f88b5db4f89843a..e5ca12ce93c730a9 100644
--- a/drivers/net/ethernet/renesas/ravb.h
+++ b/drivers/net/ethernet/renesas/ravb.h
@@ -1036,7 +1036,9 @@ struct ravb_private {
unsigned no_avb_link:1;
unsigned avb_link_active_low:1;
unsigned wol_enabled:1;
-   int num_tx_desc;/* TX descriptors per packet */
+   unsigned rxcidm:1;  /* RX Clock Internal Delay Mode */
+   unsigned txcidm:1;  /* TX Clock Internal Delay Mode */
+   int num_tx_desc;/* TX descriptors per packet */
 };
 
 static inline u32 ravb_read(struct net_device *ndev, enum ravb_reg reg)
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 99f7aae102ce12a1..59dadd971345e0d1 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1989,23 +1989,32 @@ static const struct soc_device_attribute ravb_delay_mode_quirk_match[] = {
 };
 
 /* Set tx and rx clock internal delay modes */
-static void ravb_set_delay_mode(struct net_device *ndev)
+static void ravb_parse_delay_mode(struct net_device *ndev)
 {
struct ravb_private *priv = netdev_priv(ndev);
-   int set = 0;
 
if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII_ID ||
priv->phy_interface == PHY_INTERFACE_MODE_RGMII_RXID)
-   set |= APSR_DM_RDM;
+   priv->rxcidm = 1;
 
if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII_ID ||
priv->phy_interface == PHY_INTERFACE_MODE_RGMII_TXID) {
if (!WARN(soc_device_match(ravb_delay_mode_quirk_match),
  "phy-mode %s requires TX clock internal delay mode which is not supported by this hardware revision. Please update device tree",
  phy_modes(priv->phy_interface)))
-   set |= APSR_DM_TDM;
+   priv->txcidm = 1;
}
+}
 
+static void ravb_set_delay_mode(struct net_device *ndev)
+{
+   struct ravb_private *priv = netdev_priv(ndev);
+   u32 set = 0;
+
+   if (priv->rxcidm)
+   set |= APSR_DM_RDM;
+   if (priv->txcidm)
+   set |= APSR_DM_TDM;
ravb_modify(ndev, APSR, APSR_DM, set);
 }
 
@@ -2138,8 +2147,10 @@ static int ravb_probe(struct platform_device *pdev)
/* Request GTI loading */
ravb_modify(ndev, GCCR, GCCR_LTI, GCCR_LTI);
 
-   if (priv->chip_id != RCAR_GEN2)
+   if (priv->chip_id != RCAR_GEN2) {
+   ravb_parse_delay_mode(ndev);
ravb_set_delay_mode(ndev);
+   }
 
/* Allocate descriptor base address table */
priv->desc_bat_size = sizeof(struct ravb_desc) * DBAT_ENTRY_NUM;
-- 
2.17.1
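
The parse-once, apply-on-every-resume split described in this patch can be illustrated with a small userspace sketch. Everything below is a hypothetical stand-in (the enum, the bit values, and the `apsr` field that models the hardware register are assumptions, not the real ravb definitions); it only shows why the cached single-bit flags let the resume path skip the parsing step.

```c
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real ravb definitions. */
enum toy_phy_mode {
	TOY_RGMII,
	TOY_RGMII_ID,
	TOY_RGMII_RXID,
	TOY_RGMII_TXID,
};

#define TOY_DM_RDM 0x2u	/* assumed bit: RX clock internal delay */
#define TOY_DM_TDM 0x1u	/* assumed bit: TX clock internal delay */

struct toy_priv {
	enum toy_phy_mode mode;
	unsigned rxcidm:1;	/* cached result of parsing */
	unsigned txcidm:1;	/* cached result of parsing */
	uint32_t apsr;		/* models the hardware register */
};

/* Probe path only: derive the delay flags from the configuration once. */
static void toy_parse_delay_mode(struct toy_priv *p)
{
	if (p->mode == TOY_RGMII_ID || p->mode == TOY_RGMII_RXID)
		p->rxcidm = 1;
	if (p->mode == TOY_RGMII_ID || p->mode == TOY_RGMII_TXID)
		p->txcidm = 1;
}

/* Probe and resume paths: write the cached flags to the register. */
static void toy_set_delay_mode(struct toy_priv *p)
{
	uint32_t set = 0;

	if (p->rxcidm)
		set |= TOY_DM_RDM;
	if (p->txcidm)
		set |= TOY_DM_TDM;
	/* Read-modify-write of the two delay bits only. */
	p->apsr = (p->apsr & ~(TOY_DM_RDM | TOY_DM_TDM)) | set;
}
```

In this model, probe would call both functions, while resume would call only `toy_set_delay_mode()` after the register contents were lost.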

Re: RTL8402 stops working after hibernate/resume

2020-09-17 Thread Petr Tesarik

Hi Heiner,

any comment on my findings?

On Thu, 3 Sep 2020 10:41:22 +0200
Petr Tesarik  wrote:

> Hi Heiner,
> 
> this issue was on the back-burner for some time, but I've got some
> interesting news now.
> 
> On Sat, 18 Jul 2020 14:07:50 +0200
> Heiner Kallweit  wrote:
> 
> >[...]
> > Maybe the following gives us an idea:
> > Please do "ethtool -d " after boot and after resume from suspend,
> > and check for differences.  
> 
> The register dump did not reveal anything of interest - the only
> differences were in the physical addresses after a device reopen.
> 
> However, knowing that reloading the driver can fix the issue, I copied
> the initialization sequence from init_one() to rtl8169_resume() and
> gave it a try. That works!
> 
> Then I started removing the initialization calls one by one. This
> exercise left me with a call to rtl_init_rxcfg(), which simply sets the
> RxConfig register. In other words, this is the difference between
> 5.8.4 and my working version:
> 
> --- linux-orig/drivers/net/ethernet/realtek/r8169_main.c  2020-09-02 22:43:09.361951750 +0200
> +++ linux/drivers/net/ethernet/realtek/r8169_main.c   2020-09-03 10:36:23.915803703 +0200
> @@ -4925,6 +4925,9 @@
>  
>   clk_prepare_enable(tp->clk);
>  
> + if (tp->mac_version == RTL_GIGA_MAC_VER_37)
> + RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST);
> +
>   if (netif_running(tp->dev))
>   __rtl8169_resume(tp);
>  
> This is quite surprising, at least when the device is managed by
> NetworkManager, because then it is closed on wakeup, and the open
> method should call rtl_init_rxcfg() anyway. So, it might be a timing
> issue, or incorrect order of register writes.
> 
> Since I have no idea why the above change fixes my issue, I'm hesitant
> to post it as a patch. It might break other people's systems...

Petr T



[PATCH net-next v4 0/5] net/ravb: Add support for explicit internal clock delay configuration

Hi David, Jakub,

Some Renesas EtherAVB variants support internal clock delay
configuration, which can add larger delays than the delays that are
typically supported by the PHY (using an "rgmii-*id" PHY mode, and/or
"[rt]xc-skew-ps" properties).

Historically, the EtherAVB driver configured these delays based on the
"rgmii-*id" PHY mode. This caused issues with PHY drivers that
implement PHY internal delays properly[1]. Hence a backwards-compatible
workaround was added by masking the PHY mode[2].

This patch series implements the next step of the plan outlined in [3],
and adds proper support for explicit configuration of the MAC internal
clock delays using new "[rt]x-internal-delay-ps" properties. If none of
these properties is present, the driver falls back to the old handling.

This can be considered the MAC counterpart of commit 9150069bf5fc0e86
("dt-bindings: net: Add tx and rx internal delays"), which applies to
the PHY. Note that unlike commit 92252eec913b2dd5 ("net: phy: Add a
helper to return the index for of the internal delay"), no helpers are
provided to parse the DT properties, as so far there is a single user
only, which supports only zero or a single fixed value. Of course such
helpers can be added later, when the need arises, or when deemed useful
otherwise.

This series consists of 3 parts:
1. DT binding updates documenting the new properties, for both the
generic ethernet-controller and the EtherAVB-specific bindings,
2. Conversion to json-schema of the Renesas EtherAVB DT bindings.
Technically, the conversion is independent of all of the above.
I included it in this series, as it shows how all sanity checks on
"[rt]x-internal-delay-ps" values are implemented as DT binding
checks,
3. EtherAVB driver update implementing support for the new properties.

Given Rob has provided his acks for the DT binding updates, all of this
can be merged through net-next.

Changes compared to v3[4]:
- Add Reviewed-by,
- Drop the DT updates, as they will be merged through renesas-devel and
arm-soc, and have a hard dependency on this series.

Changes compared to v2[5]:
- Update recently added board DTS files,
- Add Reviewed-by.

Changes compared to v1[6]:
- Added "[PATCH 1/7] dt-bindings: net: ethernet-controller: Add
internal delay properties",
- Replace "renesas,[rt]xc-delay-ps" by "[rt]x-internal-delay-ps",
- Incorporated EtherAVB DT binding conversion to json-schema,
- Add Reviewed-by.

Impacted, tested:
- Salvator-X(S) with R-Car H3 ES1.0 and ES2.0, M3-W, and M3-N.

Not impacted, tested:
- Ebisu with R-Car E3.

Impacted, not tested:
- Salvator-X(S) with other SoC variants,
- ULCB with R-Car H3/M3-W/M3-N variants,
- V3MSK and Eagle with R-Car V3M,
- Draak with R-Car V3H,
- HiHope RZ/G2[MN] with RZ/G2M or RZ/G2N,
- Beacon EmbeddedWorks RZ/G2M Development Kit.

To ease testing, I have pushed this series and the DT updates to the
topic/ravb-internal-clock-delays-v4 branch of my renesas-drivers
repository at
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git.

Thanks for applying!

References:
[1] Commit bcf3440c6dd78bfe ("net: phy: micrel: add phy-mode support
for the KSZ9031 PHY")
[2] Commit 9b23203c32ee02cd ("ravb: Mask PHY mode to avoid inserting
delays twice").
https://lore.kernel.org/r/20200529122540.31368-1-geert+rene...@glider.be/
[3]
https://lore.kernel.org/r/CAMuHMdU+MR-2tr3-pH55G0GqPG9HwH3XUd=8hzxprfdmgqe...@mail.gmail.com/
[4]
https://lore.kernel.org/linux-devicetree/20200819134344.27813-1-geert+rene...@glider.be/
[5]
https://lore.kernel.org/linux-devicetree/20200706143529.18306-1-geert+rene...@glider.be/
[6]
https://lore.kernel.org/linux-devicetree/20200619191554.24942-1-geert+rene...@glider.be/

Geert Uytterhoeven (5):
dt-bindings: net: ethernet-controller: Add internal delay properties
dt-bindings: net: renesas,ravb: Document internal clock delay
properties
dt-bindings: net: renesas,etheravb: Convert to json-schema
ravb: Split delay handling in parsing and applying
ravb: Add support for explicit internal clock delay configuration

.../bindings/net/ethernet-controller.yaml | 14 +
.../bindings/net/renesas,etheravb.yaml| 261 ++
.../devicetree/bindings/net/renesas,ravb.txt | 134 -
drivers/net/ethernet/renesas/ravb.h | 5 +-
drivers/net/ethernet/renesas/ravb_main.c | 53 +++-
5 files changed, 320 insertions(+), 147 deletions(-)
create mode 100644 Documentation/devicetree/bindings/net/renesas,etheravb.yaml
delete mode 100644 Documentation/devicetree/bindings/net/renesas,ravb.txt

--
2.17.1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something

RE: [PATCH net-next 5/8] netdevsim: Add support for add and delete of a PCI PF port




> From: kernel test robot 
> Sent: Thursday, September 17, 2020 4:46 PM
> 
> Hi Parav,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on net-next/master]
> 
> url:    https://github.com/0day-ci/linux/commits/Parav-Pandit/devlink-Add-SF-add-delete-devlink-ops/20200917-162417
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git b948577b984a01d24d401d2264efbccc7f0146c1
> config: i386-randconfig-c003-20200917 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> 
> 
> coccinelle warnings: (new ones prefixed by >>)
> 
> >> drivers/net/netdevsim/port_function.c:122:2-3: Unneeded semicolon
>drivers/net/netdevsim/port_function.c:140:2-3: Unneeded semicolon
> 
> Please review and possibly fold the followup patch.
> 

Sending v2 containing the fix.
Thanks.

[PATCH net-next v4 3/5] dt-bindings: net: renesas,etheravb: Convert to json-schema

Convert the Renesas Ethernet AVB (EthernetAVB-IF) Device Tree binding
documentation to json-schema.

Add missing properties.
Update the example to match reality.

Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Sergei Shtylyov 
Reviewed-by: Rob Herring 
Reviewed-by: Florian Fainelli 
---
v4:
  - Add Reviewed-by,

v3:
  - Add Reviewed-by,

v2:
  - Add Reviewed-by,
  - Replace "renesas,[rt]xc-delay-ps" by "[rt]x-internal-delay-ps", for
which the base definition is imported from ethernet-controller.yaml.
---
 .../bindings/net/renesas,etheravb.yaml| 261 ++
 .../devicetree/bindings/net/renesas,ravb.txt  | 137 -
 2 files changed, 261 insertions(+), 137 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/renesas,etheravb.yaml
 delete mode 100644 Documentation/devicetree/bindings/net/renesas,ravb.txt

diff --git a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
new file mode 100644
index ..e13653051b23d5f7
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
@@ -0,0 +1,261 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/renesas,etheravb.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Renesas Ethernet AVB
+
+maintainers:
+  - Sergei Shtylyov 
+
+properties:
+  compatible:
+oneOf:
+  - items:
+  - enum:
+  - renesas,etheravb-r8a7742  # RZ/G1H
+  - renesas,etheravb-r8a7743  # RZ/G1M
+  - renesas,etheravb-r8a7744  # RZ/G1N
+  - renesas,etheravb-r8a7745  # RZ/G1E
+  - renesas,etheravb-r8a77470 # RZ/G1C
+  - renesas,etheravb-r8a7790  # R-Car H2
+  - renesas,etheravb-r8a7791  # R-Car M2-W
+  - renesas,etheravb-r8a7792  # R-Car V2H
+  - renesas,etheravb-r8a7793  # R-Car M2-N
+  - renesas,etheravb-r8a7794  # R-Car E2
+  - const: renesas,etheravb-rcar-gen2 # R-Car Gen2 and RZ/G1
+
+  - items:
+  - enum:
+  - renesas,etheravb-r8a774a1 # RZ/G2M
+  - renesas,etheravb-r8a774b1 # RZ/G2N
+  - renesas,etheravb-r8a774c0 # RZ/G2E
+  - renesas,etheravb-r8a7795  # R-Car H3
+  - renesas,etheravb-r8a7796  # R-Car M3-W
+  - renesas,etheravb-r8a77961 # R-Car M3-W+
+  - renesas,etheravb-r8a77965 # R-Car M3-N
+  - renesas,etheravb-r8a77970 # R-Car V3M
+  - renesas,etheravb-r8a77980 # R-Car V3H
+  - renesas,etheravb-r8a77990 # R-Car E3
+  - renesas,etheravb-r8a77995 # R-Car D3
+  - const: renesas,etheravb-rcar-gen3 # R-Car Gen3 and RZ/G2
+
+  reg: true
+
+  interrupts: true
+
+  interrupt-names: true
+
+  clocks:
+maxItems: 1
+
+  iommus:
+maxItems: 1
+
+  power-domains:
+maxItems: 1
+
+  resets:
+maxItems: 1
+
+  phy-mode: true
+
+  phy-handle: true
+
+  '#address-cells':
+description: Number of address cells for the MDIO bus.
+const: 1
+
+  '#size-cells':
+description: Number of size cells on the MDIO bus.
+const: 0
+
+  renesas,no-ether-link:
+type: boolean
+description:
+  Specify when a board does not provide a proper AVB_LINK signal.
+
+  renesas,ether-link-active-low:
+type: boolean
+description:
+  Specify when the AVB_LINK signal is active-low instead of normal
+  active-high.
+
+  rx-internal-delay-ps:
+enum: [0, 1800]
+
+  tx-internal-delay-ps:
+enum: [0, 2000]
+
+patternProperties:
+  "^ethernet-phy@[0-9a-f]$":
+type: object
+$ref: ethernet-phy.yaml#
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - power-domains
+  - resets
+  - phy-mode
+  - phy-handle
+  - '#address-cells'
+  - '#size-cells'
+
+allOf:
+  - $ref: ethernet-controller.yaml#
+
+  - if:
+  properties:
+compatible:
+  contains:
+enum:
+  - renesas,etheravb-rcar-gen2
+  - renesas,etheravb-r8a7795
+  - renesas,etheravb-r8a7796
+  - renesas,etheravb-r8a77961
+  - renesas,etheravb-r8a77965
+then:
+  properties:
+reg:
+  items:
+- description: MAC register block
+- description: Stream buffer
+else:
+  properties:
+reg:
+  items:
+- description: MAC register block
+
+  - if:
+  properties:
+compatible:
+  contains:
+const: renesas,etheravb-rcar-gen2
+then:
+  properties:
+interrupts:
+  maxItems: 1
+interrupt-names:
+  items:
+- const: mux
+rx-internal-delay-ps: false
+else:
+  properties:
+interrupts:
+  minItems: 25
+  maxItems: 25
+inter

[PATCH bpf-next] bpf: add support for other map types to bpf_map_lookup_and_delete_elem

2020-09-17 Thread Luka Oreskovic

Since this function already exists, it made sense to implement it for
map types other than stack and queue. This patch adds the necessary parts
from bpf_map_lookup_elem and bpf_map_delete_elem so it works as expected
for all map types.

Signed-off-by: Luka Oreskovic 
CC: Juraj Vijtiuk 
CC: Luka Perkov 
---
 kernel/bpf/syscall.c | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2ce32cad5c8e..955de6ca8c45 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1475,6 +1475,9 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
if (CHECK_ATTR(BPF_MAP_LOOKUP_AND_DELETE_ELEM))
return -EINVAL;
 
+   if (attr->flags & ~BPF_F_LOCK)
+   return -EINVAL;
+
f = fdget(ufd);
map = __bpf_map_get(f);
if (IS_ERR(map))
@@ -1485,13 +1488,19 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
goto err_put;
}
 
+   if ((attr->flags & BPF_F_LOCK) &&
+   !map_value_has_spin_lock(map)) {
+   err = -EINVAL;
+   goto err_put;
+   }
+
key = __bpf_copy_key(ukey, map->key_size);
if (IS_ERR(key)) {
err = PTR_ERR(key);
goto err_put;
}
 
-   value_size = map->value_size;
+   value_size = bpf_map_value_size(map);
 
err = -ENOMEM;
value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
@@ -1502,7 +1511,24 @@ static int map_lookup_and_delete_elem(union bpf_attr *attr)
map->map_type == BPF_MAP_TYPE_STACK) {
err = map->ops->map_pop_elem(map, value);
} else {
-   err = -ENOTSUPP;
+   err = bpf_map_copy_value(map, key, value, attr->flags);
+   if (err)
+   goto free_value;
+
+   if (bpf_map_is_dev_bound(map)) {
+   err = bpf_map_offload_delete_elem(map, key);
+   } else if (IS_FD_PROG_ARRAY(map) ||
+  map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
+   /* These maps require sleepable context */
+   err = map->ops->map_delete_elem(map, key);
+   } else {
+   bpf_disable_instrumentation();
+   rcu_read_lock();
+   err = map->ops->map_delete_elem(map, key);
+   rcu_read_unlock();
+   bpf_enable_instrumentation();
+   maybe_wait_bpf_programs(map);
+   }
}
 
if (err)
-- 
2.26.2
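
The lookup-then-delete ordering this patch adds for generic map types can be sketched in userspace C. The toy map below is entirely hypothetical (a fixed-size array, an assumed `TOY_F_LOCK` flag bit, and invented function names); it is not the kernel implementation. It only illustrates the three steps the patch performs: reject unknown flags, copy the value out, and delete the element only if the copy succeeded.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_SLOTS  8
#define TOY_F_LOCK 4u	/* hypothetical flag bit, the only one accepted */

/* Hypothetical fixed-size map keyed by u32; not a kernel data structure. */
struct toy_map {
	int used[TOY_SLOTS];
	uint32_t key[TOY_SLOTS];
	uint64_t val[TOY_SLOTS];
};

/* Return the slot index holding key, or -1 if the key is absent. */
static int toy_find(const struct toy_map *m, uint32_t key)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		if (m->used[i] && m->key[i] == key)
			return i;
	return -1;
}

/* Insert or overwrite a key; fails with -ENOSPC when the map is full. */
static int toy_update(struct toy_map *m, uint32_t key, uint64_t val)
{
	int idx = toy_find(m, key);

	for (int i = 0; idx < 0 && i < TOY_SLOTS; i++)
		if (!m->used[i])
			idx = i;
	if (idx < 0)
		return -ENOSPC;
	m->used[idx] = 1;
	m->key[idx] = key;
	m->val[idx] = val;
	return 0;
}

/* Copy the value out first, then delete; unknown flags are rejected. */
static int toy_lookup_and_delete(struct toy_map *m, uint32_t key,
				 uint64_t *out, unsigned int flags)
{
	int idx;

	if (flags & ~TOY_F_LOCK)
		return -EINVAL;
	idx = toy_find(m, key);
	if (idx < 0)
		return -ENOENT;
	*out = m->val[idx];	/* lookup step */
	m->used[idx] = 0;	/* delete step, reached only after the copy */
	return 0;
}
```

A second call with the same key returns `-ENOENT`, because the first call already removed the element.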

[PATCH net-next v4 1/5] dt-bindings: net: ethernet-controller: Add internal delay properties

Internal Receive and Transmit Clock Delays are a common setting for
RGMII capable devices.

While these delays are typically applied by the PHY, some MACs support
configuring internal clock delay settings, too.  Hence add standardized
properties to configure this.

This is the MAC counterpart of commit 9150069bf5fc0e86 ("dt-bindings:
net: Add tx and rx internal delays"), which applies to the PHY.

Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Rob Herring 
Reviewed-by: Florian Fainelli 
---
v4:
  - Add Reviewed-by,

v3:
  - Add Reviewed-by,

v2:
  - New.
---
 .../bindings/net/ethernet-controller.yaml  | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/ethernet-controller.yaml b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
index 1c4474036d46a9dc..e9bb386066540676 100644
--- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml
+++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
@@ -119,6 +119,13 @@ properties:
   and is useful for determining certain configuration settings
   such as flow control thresholds.
 
+  rx-internal-delay-ps:
+$ref: /schemas/types.yaml#/definitions/uint32
+description: |
+  RGMII Receive Clock Delay defined in pico seconds.
+  This is used for controllers that have configurable RX internal delays.
+  If this property is present then the MAC applies the RX delay.
+
   sfp:
 $ref: /schemas/types.yaml#definitions/phandle
 description:
@@ -130,6 +137,13 @@ properties:
   The size of the controller's transmit fifo in bytes. This
   is used for components that can have configurable fifo sizes.
 
+  tx-internal-delay-ps:
+$ref: /schemas/types.yaml#/definitions/uint32
+description: |
+  RGMII Transmit Clock Delay defined in pico seconds.
+  This is used for controllers that have configurable TX internal delays.
+  If this property is present then the MAC applies the TX delay.
+
   managed:
 description:
   Specifies the PHY management type. If auto is set and fixed-link
-- 
2.17.1

[PATCH net-next v4 2/5] dt-bindings: net: renesas,ravb: Document internal clock delay properties

Some EtherAVB variants support internal clock delay configuration, which
can add larger delays than the delays that are typically supported by
the PHY (using an "rgmii-*id" PHY mode, and/or "[rt]xc-skew-ps"
properties).

Add properties for configuring the internal MAC delays.
These properties are mandatory, even when specified as zero, to
distinguish between old and new DTBs.

Update the (bogus) example accordingly.

Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Sergei Shtylyov 
Reviewed-by: Rob Herring 
Reviewed-by: Florian Fainelli 
---
v4:
  - Add Reviewed-by,

v3:
  - Add Reviewed-by,

v2:
  - Replace "renesas,[rt]xc-delay-ps" by "[rt]x-internal-delay-ps",
  - Add "(bogus)" to the example update, to avoid people considering it
a one-to-one conversion.
---
 .../devicetree/bindings/net/renesas,ravb.txt  | 29 ++-
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/renesas,ravb.txt b/Documentation/devicetree/bindings/net/renesas,ravb.txt
index 032b76f14f4fdb38..4a62dd11d5c488f4 100644
--- a/Documentation/devicetree/bindings/net/renesas,ravb.txt
+++ b/Documentation/devicetree/bindings/net/renesas,ravb.txt
@@ -64,6 +64,18 @@ Optional properties:
 AVB_LINK signal.
 - renesas,ether-link-active-low: boolean, specify when the AVB_LINK signal is
 active-low instead of normal active-high.
+- rx-internal-delay-ps: Internal RX clock delay.
+   This property is mandatory and valid only on R-Car Gen3
+   and RZ/G2 SoCs.
+   Valid values are 0 and 1800.
+   A non-zero value is allowed only if phy-mode = "rgmii".
+   Zero is not supported on R-Car D3.
+- tx-internal-delay-ps: Internal TX clock delay.
+   This property is mandatory and valid only on R-Car H3,
+   M3-W, M3-W+, M3-N, V3M, and V3H, and RZ/G2M and RZ/G2N.
+   Valid values are 0 and 2000.
+   A non-zero value is allowed only if phy-mode = "rgmii".
+   Zero is not supported on R-Car V3H.
 
 Example:
 
@@ -105,8 +117,10 @@ Example:
  "ch24";
clocks = <&cpg CPG_MOD 812>;
power-domains = <&cpg>;
-   phy-mode = "rgmii-id";
+   phy-mode = "rgmii";
phy-handle = <&phy0>;
+   rx-internal-delay-ps = <0>;
+   tx-internal-delay-ps = <2000>;
 
pinctrl-0 = <&ether_pins>;
pinctrl-names = "default";
@@ -115,18 +129,7 @@ Example:
#size-cells = <0>;
 
phy0: ethernet-phy@0 {
-   rxc-skew-ps = <900>;
-   rxdv-skew-ps = <0>;
-   rxd0-skew-ps = <0>;
-   rxd1-skew-ps = <0>;
-   rxd2-skew-ps = <0>;
-   rxd3-skew-ps = <0>;
-   txc-skew-ps = <900>;
-   txen-skew-ps = <0>;
-   txd0-skew-ps = <0>;
-   txd1-skew-ps = <0>;
-   txd2-skew-ps = <0>;
-   txd3-skew-ps = <0>;
+   rxc-skew-ps = <1500>;
reg = <0>;
interrupt-parent = <&gpio2>;
interrupts = <11 IRQ_TYPE_LEVEL_LOW>;
-- 
2.17.1
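
The value constraints documented in this binding update can be expressed as a small check. The following is a sketch only: the function name is invented, the phy-mode is passed as a plain string, and the SoC-specific rules (which SoCs accept the properties, and the "zero is not supported" cases on R-Car D3 and V3H) are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the generic rules from the binding text:
 *   - rx-internal-delay-ps must be 0 or 1800,
 *   - tx-internal-delay-ps must be 0 or 2000,
 *   - a non-zero value is allowed only if phy-mode = "rgmii".
 * SoC-specific restrictions are intentionally not modelled here.
 */
static bool toy_delay_props_valid(const char *phy_mode,
				  uint32_t rx_ps, uint32_t tx_ps)
{
	if (rx_ps != 0 && rx_ps != 1800)
		return false;
	if (tx_ps != 0 && tx_ps != 2000)
		return false;
	if ((rx_ps || tx_ps) && strcmp(phy_mode, "rgmii") != 0)
		return false;
	return true;
}
```

Under these rules, the updated example in the patch (phy-mode "rgmii", RX delay 0, TX delay 2000) passes, while the same delays combined with "rgmii-id" would be rejected.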

[PATCH v3,net-next,0/4] Add Support for Marvell OcteonTX2 Cryptographic

2020-09-17 Thread Srujana Challa

The following series adds support for the Marvell Cryptographic Acceleration
Unit (CPT) on the OcteonTX2 CN96XX SoC.
This series was tested with CRYPTO_EXTRA_TESTS enabled and
CRYPTO_DISABLE_TESTS disabled.

Changes since v2:
 * Fixed C=1 warnings.
 * Added code to exit CPT VF driver gracefully.
 * Moved OcteonTx2 asm code to a header file under include/linux/soc/

Changes since v1:
 * Moved Makefile changes from patch4 to patch2 and patch3.

Srujana Challa (3):
  octeontx2-pf: move asm code to include/linux/soc
  octeontx2-af: add support to manage the CPT unit
  drivers: crypto: add support for OCTEONTX2 CPT engine
  drivers: crypto: add the Virtual Function driver for OcteonTX2 CPT

 MAINTAINERS   |2 +
 drivers/crypto/marvell/Kconfig|   17 +
 drivers/crypto/marvell/Makefile   |1 +
 drivers/crypto/marvell/octeontx2/Makefile |   10 +
 .../marvell/octeontx2/otx2_cpt_common.h   |   53 +
 .../marvell/octeontx2/otx2_cpt_hw_types.h |  467 
 .../marvell/octeontx2/otx2_cpt_mbox_common.c  |  286 +++
 .../marvell/octeontx2/otx2_cpt_mbox_common.h  |  100 +
 .../marvell/octeontx2/otx2_cpt_reqmgr.h   |  197 ++
 drivers/crypto/marvell/octeontx2/otx2_cptlf.h |  356 +++
 .../marvell/octeontx2/otx2_cptlf_main.c   |  967 
 drivers/crypto/marvell/octeontx2/otx2_cptpf.h |   79 +
 .../marvell/octeontx2/otx2_cptpf_main.c   |  598 +
 .../marvell/octeontx2/otx2_cptpf_mbox.c   |  694 ++
 .../marvell/octeontx2/otx2_cptpf_ucode.c  | 2173 +
 .../marvell/octeontx2/otx2_cptpf_ucode.h  |  180 ++
 drivers/crypto/marvell/octeontx2/otx2_cptvf.h |   29 +
 .../marvell/octeontx2/otx2_cptvf_algs.c   | 1698 +
 .../marvell/octeontx2/otx2_cptvf_algs.h   |  172 ++
 .../marvell/octeontx2/otx2_cptvf_main.c   |  229 ++
 .../marvell/octeontx2/otx2_cptvf_mbox.c   |  189 ++
 .../marvell/octeontx2/otx2_cptvf_reqmgr.c |  540 
 .../ethernet/marvell/octeontx2/af/Makefile|3 +-
 .../net/ethernet/marvell/octeontx2/af/mbox.h  |   85 +
 .../net/ethernet/marvell/octeontx2/af/rvu.c   |2 +-
 .../net/ethernet/marvell/octeontx2/af/rvu.h   |7 +
 .../ethernet/marvell/octeontx2/af/rvu_cpt.c   |  343 +++
 .../marvell/octeontx2/af/rvu_debugfs.c|  342 +++
 .../ethernet/marvell/octeontx2/af/rvu_nix.c   |   76 +
 .../ethernet/marvell/octeontx2/af/rvu_reg.h   |   65 +-
 .../marvell/octeontx2/nic/otx2_common.h   |   13 +-
 include/linux/soc/marvell/octeontx2/asm.h |   29 +
 32 files changed, 9982 insertions(+), 20 deletions(-)
 create mode 100644 drivers/crypto/marvell/octeontx2/Makefile
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cpt_common.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cpt_hw_types.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cpt_mbox_common.c
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cpt_mbox_common.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cpt_reqmgr.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptlf.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptlf_main.c
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptpf.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptpf_main.c
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptpf_mbox.c
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.c
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptpf_ucode.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptvf.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptvf_algs.c
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptvf_algs.h
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptvf_main.c
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptvf_mbox.c
 create mode 100644 drivers/crypto/marvell/octeontx2/otx2_cptvf_reqmgr.c
 create mode 100644 drivers/net/ethernet/marvell/octeontx2/af/rvu_cpt.c
 create mode 100644 include/linux/soc/marvell/octeontx2/asm.h

-- 
2.28.0

[PATCH net-next v4 5/5] ravb: Add support for explicit internal clock delay configuration

Some EtherAVB variants support internal clock delay configuration, which
can add larger delays than the delays that are typically supported by
the PHY (using an "rgmii-*id" PHY mode, and/or "[rt]xc-skew-ps"
properties).

Historically, the EtherAVB driver configured these delays based on the
"rgmii-*id" PHY mode.  This caused issues with PHY drivers that
implement PHY internal delays properly[1].  Hence a backwards-compatible
workaround was added by masking the PHY mode[2].

Add proper support for explicit configuration of the MAC internal clock
delays using the new "[rt]x-internal-delay-ps" properties.
Fall back to the old handling if none of these properties is present.

[1] Commit bcf3440c6dd78bfe ("net: phy: micrel: add phy-mode support for
the KSZ9031 PHY")
[2] Commit 9b23203c32ee02cd ("ravb: Mask PHY mode to avoid inserting
delays twice").
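
The explicit-or-fallback decision described above can be sketched in plain C. The types and names here are hypothetical (an optional-u32 struct stands in for a device tree property that may or may not be present), and the hardware-revision quirk check that guards the legacy TX delay is omitted for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real kernel types. */
enum toy_mode { TOY_M_RGMII, TOY_M_RGMII_ID, TOY_M_RGMII_RXID, TOY_M_RGMII_TXID };

/* Models a DT property that may be absent. */
struct toy_opt_u32 {
	bool present;
	uint32_t value;
};

struct toy_delay_cfg {
	unsigned rxcidm:1;		/* MAC applies RX clock delay */
	unsigned txcidm:1;		/* MAC applies TX clock delay */
	unsigned rgmii_override:1;	/* PHY must be told plain "rgmii" */
};

static struct toy_delay_cfg toy_parse_delays(enum toy_mode mode,
					     struct toy_opt_u32 rx,
					     struct toy_opt_u32 tx)
{
	struct toy_delay_cfg c = { 0 };
	bool explicit_delay = false;

	/* New style: any explicit property wins, even when its value is 0. */
	if (rx.present) {
		c.rxcidm = !!rx.value;
		explicit_delay = true;
	}
	if (tx.present) {
		c.txcidm = !!tx.value;
		explicit_delay = true;
	}
	if (explicit_delay)
		return c;

	/* Legacy fallback: derive the MAC delays from the rgmii-*id mode. */
	if (mode == TOY_M_RGMII_ID || mode == TOY_M_RGMII_RXID) {
		c.rxcidm = 1;
		c.rgmii_override = 1;
	}
	if (mode == TOY_M_RGMII_ID || mode == TOY_M_RGMII_TXID) {
		c.txcidm = 1;
		c.rgmii_override = 1;
	}
	return c;
}
```

The override flag is set only on the legacy path, which matches the intent that the PHY mode is masked only when the MAC inserted a delay on behalf of an old-style "rgmii-*id" description.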

Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Sergei Shtylyov 
Reviewed-by: Florian Fainelli 
---
v4:
  - Add Reviewed-by,

v3:
  - No changes,

v2:
  - Add Reviewed-by,
  - Split long line,
  - Replace "renesas,[rt]xc-delay-ps" by "[rt]x-internal-delay-ps",
  - Use 1 instead of true when assigning to a single-bit bitfield.
---
 drivers/net/ethernet/renesas/ravb.h  |  1 +
 drivers/net/ethernet/renesas/ravb_main.c | 36 ++--
 2 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
index e5ca12ce93c730a9..7453b17a37a2c8d0 100644
--- a/drivers/net/ethernet/renesas/ravb.h
+++ b/drivers/net/ethernet/renesas/ravb.h
@@ -1038,6 +1038,7 @@ struct ravb_private {
unsigned wol_enabled:1;
unsigned rxcidm:1;  /* RX Clock Internal Delay Mode */
unsigned txcidm:1;  /* TX Clock Internal Delay Mode */
+   unsigned rgmii_override:1;  /* Deprecated rgmii-*id behavior */
int num_tx_desc;/* TX descriptors per packet */
 };
 
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 59dadd971345e0d1..aa120e3f1e4d4da5 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1034,11 +1034,8 @@ static int ravb_phy_init(struct net_device *ndev)
pn = of_node_get(np);
}
 
-   iface = priv->phy_interface;
-   if (priv->chip_id != RCAR_GEN2 && phy_interface_mode_is_rgmii(iface)) {
-   /* ravb_set_delay_mode() takes care of internal delay mode */
-   iface = PHY_INTERFACE_MODE_RGMII;
-   }
+   iface = priv->rgmii_override ? PHY_INTERFACE_MODE_RGMII
+: priv->phy_interface;
phydev = of_phy_connect(ndev, pn, ravb_adjust_link, 0, iface);
of_node_put(pn);
if (!phydev) {
@@ -1989,20 +1986,41 @@ static const struct soc_device_attribute ravb_delay_mode_quirk_match[] = {
 };
 
 /* Set tx and rx clock internal delay modes */
-static void ravb_parse_delay_mode(struct net_device *ndev)
+static void ravb_parse_delay_mode(struct device_node *np, struct net_device *ndev)
 {
struct ravb_private *priv = netdev_priv(ndev);
+   bool explicit_delay = false;
+   u32 delay;
+
+   if (!of_property_read_u32(np, "rx-internal-delay-ps", &delay)) {
+   /* Valid values are 0 and 1800, according to DT bindings */
+   priv->rxcidm = !!delay;
+   explicit_delay = true;
+   }
+   if (!of_property_read_u32(np, "tx-internal-delay-ps", &delay)) {
+   /* Valid values are 0 and 2000, according to DT bindings */
+   priv->txcidm = !!delay;
+   explicit_delay = true;
+   }
 
+   if (explicit_delay)
+   return;
+
+   /* Fall back to legacy rgmii-*id behavior */
if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII_ID ||
-   priv->phy_interface == PHY_INTERFACE_MODE_RGMII_RXID)
+   priv->phy_interface == PHY_INTERFACE_MODE_RGMII_RXID) {
priv->rxcidm = 1;
+   priv->rgmii_override = 1;
+   }
 
if (priv->phy_interface == PHY_INTERFACE_MODE_RGMII_ID ||
priv->phy_interface == PHY_INTERFACE_MODE_RGMII_TXID) {
if (!WARN(soc_device_match(ravb_delay_mode_quirk_match),
  "phy-mode %s requires TX clock internal delay mode which is not supported by this hardware revision. Please update device tree",
- phy_modes(priv->phy_interface)))
+ phy_modes(priv->phy_interface))) {
priv->txcidm = 1;
+   priv->rgmii_override = 1;
+   }
}
 }
 
@@ -2148,7 +2166,7 @@ static int ravb_probe(struct platform_device *pdev)
ravb_modify(ndev, GCCR, GCCR_LTI, GCCR_LTI);
 
if (priv->chip_id != RCAR_GEN2) {
-   ravb_parse_delay_mode(ndev);
+   ravb_parse_delay_mode(np, ndev);

Re: [PATCH -next v2] dpaa2-eth: Convert to DEFINE_SHOW_ATTRIBUTE

2020-09-17 Thread Ioana Ciornei

On Thu, Sep 17, 2020 at 08:45:08PM +0800, Qinglang Miao wrote:
> Signed-off-by: Qinglang Miao 

Reviewed-by: Ioana Ciornei 

> ---
> v2: based on linux-next(20200917), and can be applied to
> mainline cleanly now.
> 
>  .../freescale/dpaa2/dpaa2-eth-debugfs.c   | 63 ++-
>  1 file changed, 6 insertions(+), 57 deletions(-)
>

[PATCH v3,net-next,1/4] octeontx2-pf: move asm code to include/linux/soc

2020-09-17 Thread Srujana Challa

On the OcteonTX2 platform, CPT instruction enqueue and NIX
packet send are only possible via LMTST operations, which
use the LDEOR instruction. This patch moves the asm code
from the OcteonTX2 NIC driver to include/linux/soc, as it
will be used by both the OcteonTX2 CPT and NIC drivers for
LMTST.

Signed-off-by: Srujana Challa 
---
 MAINTAINERS   |  2 ++
 .../marvell/octeontx2/nic/otx2_common.h   | 13 +
 include/linux/soc/marvell/octeontx2/asm.h | 29 +++
 3 files changed, 32 insertions(+), 12 deletions(-)
 create mode 100644 include/linux/soc/marvell/octeontx2/asm.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 9a545a631f0d..95ddbb4f1a89 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10431,6 +10431,7 @@ M:  Srujana Challa 
 L: linux-cry...@vger.kernel.org
 S: Maintained
 F: drivers/crypto/marvell/
+F: include/linux/soc/marvell/octeontx2/
 
 MARVELL GIGABIT ETHERNET DRIVERS (skge/sky2)
 M: Mirko Lindner 
@@ -10503,6 +10504,7 @@ M:  hariprasad 
 L: netdev@vger.kernel.org
 S: Supported
 F: drivers/net/ethernet/marvell/octeontx2/nic/
+F: include/linux/soc/marvell/octeontx2/
 
 MARVELL OCTEONTX2 RVU ADMIN FUNCTION DRIVER
 M: Sunil Goutham 
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
index ac47762cce9b..12311964d9d6 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include "otx2_reg.h"
@@ -420,21 +421,9 @@ static inline u64 otx2_atomic64_add(u64 incr, u64 *ptr)
return result;
 }
 
-static inline u64 otx2_lmt_flush(uint64_t addr)
-{
-   u64 result = 0;
-
-   __asm__ volatile(".cpu  generic+lse\n"
-"ldeor xzr,%x[rf],[%[rs]]"
-: [rf]"=r"(result)
-: [rs]"r"(addr));
-   return result;
-}
-
 #else
 #define otx2_write128(lo, hi, addr)
 #define otx2_atomic64_add(incr, ptr)   ({ *ptr += incr; })
-#define otx2_lmt_flush(addr)   ({ 0; })
 #endif
 
 /* Alloc pointer from pool/aura */
diff --git a/include/linux/soc/marvell/octeontx2/asm.h 
b/include/linux/soc/marvell/octeontx2/asm.h
new file mode 100644
index ..ae2279fe830a
--- /dev/null
+++ b/include/linux/soc/marvell/octeontx2/asm.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ * Copyright (C) 2020 Marvell.
+ */
+
+#ifndef __SOC_OTX2_ASM_H
+#define __SOC_OTX2_ASM_H
+
+#if defined(CONFIG_ARM64)
+/*
+ * otx2_lmt_flush is used for LMT store operation.
+ * On octeontx2 platform CPT instruction enqueue and
+ * NIX packet send are only possible via LMTST
+ * operations and it uses LDEOR instruction targeting
+ * the coprocessor address.
+ */
+#define otx2_lmt_flush(ioaddr)  \
+({  \
+   u64 result = 0; \
+   __asm__ volatile(".cpu  generic+lse\n"  \
+"ldeor xzr, %x[rf], [%[rs]]"   \
+: [rf]"=r" (result)\
+: [rs]"r" (ioaddr));   \
+   (result);   \
+})
+#else
+#define otx2_lmt_flush(ioaddr)  ({ 0; })
+#endif
+
+#endif /* __SOC_OTX2_ASM_H */
-- 
2.28.0

Re: [PATCH bpf v1] tools/bpftool: support passing BPFTOOL_VERSION to make

2020-09-17 Thread Quentin Monnet

On 17/09/2020 12:58, Tony Ambardar wrote:
> This change facilitates out-of-tree builds, packaging, and versioning for
> test and debug purposes. Defining BPFTOOL_VERSION allows self-contained
> builds within the tools tree, since it avoids use of the 'kernelversion'
> target in the top-level makefile, which would otherwise pull in several
> other includes from outside the tools tree.
> 
> Signed-off-by: Tony Ambardar 

Acked-by: Quentin Monnet

Re: [PATCH 3/3] docs: bpf: ringbuf.rst: fix a broken cross-reference

2020-09-17 Thread Alexei Starovoitov

On Thu, Sep 17, 2020 at 1:04 AM Mauro Carvalho Chehab
 wrote:
>
> Sphinx warns about a broken cross-reference:
>
> Documentation/bpf/ringbuf.rst:194: WARNING: Unknown target name: 
> "bench_ringbufs.c".
>
> It seems that the original idea was to add a reference for this file:
>
> tools/testing/selftests/bpf/benchs/bench_ringbufs.c
>
> However, this won't work as such file is not part of the
> documentation output dir. It could be possible to use
> an extension like interSphinx in order to make external
> references to be pointed to some website (like kernel.org),
> where the file is stored, but currently we don't use it.
>
> It would also be possible to include this file as a
> literal include, placing it inside Documentation/bpf.
>
> For now, let's take the simplest approach: just drop
> the "_" markup at the end of the reference. This
> should solve the warning, and it sounds quite obvious
> that the file to see is at the Kernel tree.
>
> Signed-off-by: Mauro Carvalho Chehab 
> ---
>  Documentation/bpf/ringbuf.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/bpf/ringbuf.rst b/Documentation/bpf/ringbuf.rst
> index 4d4f3bcb1477..6a615cd62bda 100644
> --- a/Documentation/bpf/ringbuf.rst
> +++ b/Documentation/bpf/ringbuf.rst
> @@ -197,7 +197,7 @@ a self-pacing notifications of new data being 
> availability.
>  being available after commit only if consumer has already caught up right up 
> to
>  the record being committed. If not, consumer still has to catch up and thus
>  will see new data anyways without needing an extra poll notification.
> -Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbufs.c_) show 
> that
> +Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbufs.c) show 
> that

This fix already landed in bpf and net trees.
Did you miss it?

Re: ath11k: initialize wmi config based on hw_params

2020-09-17 Thread Kalle Valo

Colin Ian King  writes:

> Hi,
>
> static analysis with Coverity has detected a duplicated assignment issue
> with the following commit:
>
> commit 2d4bcbed5b7d53e19fc158885e7340b464b64507
> Author: Carl Huang 
> Date:   Mon Aug 17 13:31:51 2020 +0300
>
> ath11k: initialize wmi config based on hw_params
>
> The analysis is as follows:
>
>
>  74config->beacon_tx_offload_max_vdev = 0x2;
>  75config->num_multicast_filter_entries = 0x20;
>  76config->num_wow_filters = 0x16;
>
> Unused value (UNUSED_VALUE)
> assigned_value: Assigning value 1U to config->num_keep_alive_pattern
> here, but that stored value is overwritten before it can be used.
>  77config->num_keep_alive_pattern = 0x1;
>
> value_overwrite: Overwriting previous write to
> config->num_keep_alive_pattern with value 0U.
>
>  78config->num_keep_alive_pattern = 0;
>
>
> I'm not sure if one of these assignments is redundant, or perhaps one of
> the assignments is meant to be setting a different structure element.

0x1 assignment should be removed, I'll send a patch.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

[PATCH net-next] selftests: Set default protocol for raw sockets in nettest

2020-09-17 Thread David Ahern

IPPROTO_IP (0) is not valid for raw sockets. Default the protocol for
raw sockets to IPPROTO_RAW if the protocol has not been set via the -P
option.

Signed-off-by: David Ahern 
---
 tools/testing/selftests/net/nettest.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/net/nettest.c 
b/tools/testing/selftests/net/nettest.c
index 93208caacbe6..f75c53ce0a2d 100644
--- a/tools/testing/selftests/net/nettest.c
+++ b/tools/testing/selftests/net/nettest.c
@@ -1667,6 +1667,8 @@ int main(int argc, char *argv[])
case 'R':
args.type = SOCK_RAW;
args.port = 0;
+   if (!args.protocol)
+   args.protocol = IPPROTO_RAW;
break;
case 'P':
pe = getprotobyname(optarg);
-- 
2.24.3 (Apple Git-128)
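
The defaulting rule in this patch can be sketched as a standalone helper.
This is only an illustration under stated assumptions, not the nettest
code: pick_protocol() is a hypothetical name, and a protocol value of 0 is
assumed to mean "not set via -P".

```c
#include <netinet/in.h>   /* IPPROTO_RAW, IPPROTO_ICMP */
#include <sys/socket.h>   /* SOCK_RAW, SOCK_STREAM */

/*
 * Hypothetical helper (not part of nettest): return the protocol to hand
 * to socket(). A raw socket with no explicit protocol falls back to
 * IPPROTO_RAW; every other combination is returned unchanged.
 */
static int pick_protocol(int type, int protocol)
{
	if (type == SOCK_RAW && protocol == 0)
		return IPPROTO_RAW;
	return protocol;
}
```

Unlike the option-parsing hunk above, the helper does not depend on the
order in which the options are seen; an explicitly chosen protocol is kept
either way.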

Re: [PATCH net-next 6/7] lockdep: provide dummy forward declaration of *_is_held() helpers

2020-09-17 Thread peterz

On Wed, Sep 16, 2020 at 11:45:27AM -0700, Jakub Kicinski wrote:
> When CONFIG_LOCKDEP is not set, lock_is_held() and lockdep_is_held()
> are not declared or defined. This forces all callers to use ifdefs
> around these checks.
> 
> Recent RCU changes added a lot of lockdep_is_held() calls inside
> rcu_dereference_protected(). rcu_dereference_protected() hides
> its argument on !LOCKDEP builds, but this may lead to unused variable
> warnings.
> 
> Provide forward declarations of lock_is_held() and lockdep_is_held()
> but never define them. This way callers can keep them visible to
> the compiler on !LOCKDEP builds and instead depend on dead code
> elimination to remove the references before the linker barfs.
> 
> We need lock_is_held() for RCU.
> 
> Signed-off-by: Jakub Kicinski 
> --
> CC: pet...@infradead.org
> CC: mi...@redhat.com
> CC: w...@kernel.org
> ---
>  include/linux/lockdep.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
> index 6a584b3e5c74..6b5bbc536bf6 100644
> --- a/include/linux/lockdep.h
> +++ b/include/linux/lockdep.h
> @@ -371,6 +371,12 @@ static inline void lockdep_unregister_key(struct 
> lock_class_key *key)
>  
>  #define lockdep_depth(tsk)   (0)
>  
> +/*
> + * Dummy forward declarations, allow users to write less ifdef-y code
> + * and depend on dead code elimination.
> + */
> +int lock_is_held(const void *);

extern int lock_is_held(const struct lockdep_map *);

> +int lockdep_is_held(const void *);

extern

I suppose we can't pull the lockdep_is_held() definition out from under
CONFIG_LOCKDEP because it does the ->dep_map dereference and many types
will simply not have that member.

>  #define lockdep_is_held_type(l, r)   (1)
>  
>  #define lockdep_assert_held(l)   do { (void)(l); } while 
> (0)

Re: [PATCH bpf-next v4] bpf: using rcu_read_lock for bpf_sk_storage_map iterator

On Wed, Sep 16, 2020 at 03:46:45PM -0700, Yonghong Song wrote:
> If a bucket contains a lot of sockets, during bpf_iter traversing
> a bucket, concurrent userspace bpf_map_update_elem() and
> bpf program bpf_sk_storage_{get,delete}() may experience
> some undesirable delays as they will compete with bpf_iter
> for bucket lock.
> 
> Note that the number of buckets for bpf_sk_storage_map
> is roughly the same as the number of cpus. So if there
> are lots of sockets in the system, each bucket could
> contain lots of sockets.
> 
> Different actual use cases may experience different delays.
> Here, using selftest bpf_iter subtest bpf_sk_storage_map,
> I hacked the kernel with ktime_get_mono_fast_ns()
> to collect the time when a bucket was locked
> during bpf_iter prog traversing that bucket. This way,
> the maximum incurred delay was measured w.r.t. the
> number of elements in a bucket.
> # elems in each bucket  delay(ns)
>   64                    17000
>   256                   72512
>   2048                  875246
> 
> The potential delays will be further increased if
> we have even more elements in a bucket. Using rcu_read_lock()
> is a reasonable compromise here. It may lose some precision, e.g.,
> access stale sockets, but it will not hurt performance of
> bpf program or user space application which also tries
> to get/delete or update map elements.
Acked-by: Martin KaFai Lau

[PATCH net-next] net: mdio: octeon: Select MDIO_DEVRES

This driver makes use of devm_mdiobus_alloc_size. To ensure this is
available select MDIO_DEVRES which provides it.

Signed-off-by: Andrew Lunn 
---
 drivers/net/mdio/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/mdio/Kconfig b/drivers/net/mdio/Kconfig
index 1299880dfe74..840727cc9499 100644
--- a/drivers/net/mdio/Kconfig
+++ b/drivers/net/mdio/Kconfig
@@ -138,6 +138,7 @@ config MDIO_OCTEON
depends on (64BIT && OF_MDIO) || COMPILE_TEST
depends on HAS_IOMEM
select MDIO_CAVIUM
+   select MDIO_DEVRES
help
  This module provides a driver for the Octeon and ThunderX MDIO
  buses. It is required by the Octeon and ThunderX ethernet device
-- 
2.28.0

Re: [PATCH] ath10k: qmi: Skip host capability request for Xiaomi Poco F1

2020-09-17 Thread Bjorn Andersson

On Thu 17 Sep 02:41 CDT 2020, Amit Pundir wrote:

> Workaround to get WiFi working on Xiaomi Poco F1 (sdm845)
> phone. We get a non-fatal QMI_ERR_MALFORMED_MSG_V01 error
> message in ath10k_qmi_host_cap_send_sync(), but we can still
> bring up WiFi services successfully on AOSP if we ignore it.
> 
> We suspect either the host cap is not implemented or there
> may be firmware specific issues. Firmware version is
> QC_IMAGE_VERSION_STRING=WLAN.HL.2.0.c3-00257-QCAHLSWMTPLZ-1
> 
> qcom,snoc-host-cap-8bit-quirk didn't help. If I use this
> quirk, then the host capability request does get accepted,
> but we run into fatal "msa info req rejected" error and
> WiFi interface doesn't come up.
> 

What happens if you skip sending the host-cap message? I had one
firmware version for which I implemented a
"qcom,snoc-host-cap-skip-quirk".

But testing showed that the link was pretty unusable - pushing any real
amount of data would cause it to silently stop working - and I realized
that I could use the linux-firmware wlanmdsp.mbn instead, which works
great on all my devices...

> Attempts are being made to debug the failure reasons but no
> luck so far. Hence this device specific workaround instead
> of checking for QMI_ERR_MALFORMED_MSG_V01 error message.
> Tried ath10k/WCN3990/hw1.0/wlanmdsp.mbn from the upstream
> linux-firmware project but it didn't help and neither did
> building board-2.bin file from stock bdwlan* files.
> 

"Didn't work" as in the wlanmdsp.mbn from linux-firmware failed to load
or some later problem?

Regards,
Bjorn

> This workaround will be removed once we have a viable fix.
> Thanks to postmarketOS guys for catching this.
> 
> Signed-off-by: Amit Pundir 
> ---
> Device-tree for Xiaomi Poco F1(Beryllium) got merged in
> qcom/arm64-for-5.10 last week
> https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git/commit/?id=77809cf74a8c
> 
>  drivers/net/wireless/ath/ath10k/qmi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/ath/ath10k/qmi.c 
> b/drivers/net/wireless/ath/ath10k/qmi.c
> index 0dee1353d395..37c5350eb8b1 100644
> --- a/drivers/net/wireless/ath/ath10k/qmi.c
> +++ b/drivers/net/wireless/ath/ath10k/qmi.c
> @@ -651,7 +651,8 @@ static int ath10k_qmi_host_cap_send_sync(struct 
> ath10k_qmi *qmi)
>  
>   /* older FW didn't support this request, which is not fatal */
>   if (resp.resp.result != QMI_RESULT_SUCCESS_V01 &&
> - resp.resp.error != QMI_ERR_NOT_SUPPORTED_V01) {
> + resp.resp.error != QMI_ERR_NOT_SUPPORTED_V01 &&
> + !of_machine_is_compatible("xiaomi,beryllium")) { /* Xiaomi Poco F1 
> workaround */
>   ath10k_err(ar, "host capability request rejected: %d\n", 
> resp.resp.error);
>   ret = -EINVAL;
>   goto out;
> -- 
> 2.7.4
>

Re: [PATCH net 1/2] net: phy: Avoid NPD upon phy_detach() when driver is unbound

2020-09-17 Thread Florian Fainelli





On 9/17/2020 6:15 AM, Andrew Lunn wrote:
> On Wed, Sep 16, 2020 at 08:43:09PM -0700, Florian Fainelli wrote:
>> If we have unbound the PHY driver prior to calling phy_detach() (often
>> via phy_disconnect()) then we can cause a NULL pointer de-reference
>> accessing the driver owner member. The steps to reproduce are:
>>
>> echo unimac-mdio-0:01 > /sys/class/net/eth0/phydev/driver/unbind
>> ip link set eth0 down
>
> Hi Florian
>
> How forceful is this unbind? Can we actually block it while the
> interface is up? Or returning -EBUSY would make sense.

It is not forceful, you can unbind the PHY driver from underneath the
net_device and nothing bad happens, really. This is not a very realistic 
or practical use case, but several years ago, I went into making sure we 
would not create NPD if that happened.

--
Florian

[PATCH bpf-next v4] bpf: using rcu_read_lock for bpf_sk_storage_map iterator

2020-09-17 Thread Yonghong Song

If a bucket contains a lot of sockets, during bpf_iter traversing
a bucket, concurrent userspace bpf_map_update_elem() and
bpf program bpf_sk_storage_{get,delete}() may experience
some undesirable delays as they will compete with bpf_iter
for bucket lock.

Note that the number of buckets for bpf_sk_storage_map
is roughly the same as the number of cpus. So if there
are lots of sockets in the system, each bucket could
contain lots of sockets.

Different actual use cases may experience different delays.
Here, using selftest bpf_iter subtest bpf_sk_storage_map,
I hacked the kernel with ktime_get_mono_fast_ns()
to collect the time when a bucket was locked
during bpf_iter prog traversing that bucket. This way,
the maximum incurred delay was measured w.r.t. the
number of elements in a bucket.
# elems in each bucket  delay(ns)
  64                    17000
  256                   72512
  2048                  875246

The potential delays will be further increased if
we have even more elements in a bucket. Using rcu_read_lock()
is a reasonable compromise here. It may lose some precision, e.g.,
access stale sockets, but it will not hurt performance of
bpf program or user space application which also tries
to get/delete or update map elements.

Cc: Martin KaFai Lau 
Acked-by: Song Liu 
Signed-off-by: Yonghong Song 
---
 net/core/bpf_sk_storage.c | 31 +--
 1 file changed, 13 insertions(+), 18 deletions(-)

Changelog:
  v3 -> v4:
 - use rcu_dereference/hlist_next_rcu for hlist_entry_safe. (Martin)
  v2 -> v3:
 - fix a bug hlist_for_each_entry() => hlist_for_each_entry_rcu(). (Martin)
 - use rcu_dereference() instead of rcu_dereference_raw() for lockdep 
checking. (Martin)
  v1 -> v2:
- added some performance numbers. (Song)
- tried to silence some sparse complaints, but some are still left, like
context imbalance in "..." - different lock contexts for basic block,
  where the code is too hard for sparse to analyze. (Jakub)

diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 4a86ea34f29e..6b6ba874061c 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -678,6 +678,7 @@ struct bpf_iter_seq_sk_storage_map_info {
 static struct bpf_local_storage_elem *
 bpf_sk_storage_map_seq_find_next(struct bpf_iter_seq_sk_storage_map_info *info,
 struct bpf_local_storage_elem *prev_selem)
+   __acquires(RCU) __releases(RCU)
 {
struct bpf_local_storage *sk_storage;
struct bpf_local_storage_elem *selem;
@@ -696,16 +697,16 @@ bpf_sk_storage_map_seq_find_next(struct 
bpf_iter_seq_sk_storage_map_info *info,
selem = prev_selem;
count = 0;
while (selem) {
-   selem = hlist_entry_safe(selem->map_node.next,
+   selem = 
hlist_entry_safe(rcu_dereference(hlist_next_rcu(&selem->map_node)),
 struct bpf_local_storage_elem, 
map_node);
if (!selem) {
/* not found, unlock and go to the next bucket */
b = &smap->buckets[bucket_id++];
-   raw_spin_unlock_bh(&b->lock);
+   rcu_read_unlock();
skip_elems = 0;
break;
}
-   sk_storage = rcu_dereference_raw(selem->local_storage);
+   sk_storage = rcu_dereference(selem->local_storage);
if (sk_storage) {
info->skip_elems = skip_elems + count;
return selem;
@@ -715,10 +716,10 @@ bpf_sk_storage_map_seq_find_next(struct 
bpf_iter_seq_sk_storage_map_info *info,
 
for (i = bucket_id; i < (1U << smap->bucket_log); i++) {
b = &smap->buckets[i];
-   raw_spin_lock_bh(&b->lock);
+   rcu_read_lock();
count = 0;
-   hlist_for_each_entry(selem, &b->list, map_node) {
-   sk_storage = rcu_dereference_raw(selem->local_storage);
+   hlist_for_each_entry_rcu(selem, &b->list, map_node) {
+   sk_storage = rcu_dereference(selem->local_storage);
if (sk_storage && count >= skip_elems) {
info->bucket_id = i;
info->skip_elems = count;
@@ -726,7 +727,7 @@ bpf_sk_storage_map_seq_find_next(struct 
bpf_iter_seq_sk_storage_map_info *info,
}
count++;
}
-   raw_spin_unlock_bh(&b->lock);
+   rcu_read_unlock();
skip_elems = 0;
}
 
@@ -785,7 +786,7 @@ static int __bpf_sk_storage_map_seq_show(struct seq_file 
*seq,
ctx.meta = &meta;
ctx.map = info->map;
if (selem) {
-   sk_storage = rcu_dereference_raw(selem->local_storage);
+

Re: [PATCH bpf-next v5 2/8] bpf: verifier: refactor check_attach_btf_id()

2020-09-17 Thread Andrii Nakryiko

On Thu, Sep 17, 2020 at 3:06 AM Toke Høiland-Jørgensen  wrote:
>
> Andrii Nakryiko  writes:
>
> >>
> >> +int bpf_check_attach_target(struct bpf_verifier_log *log,
> >> +   const struct bpf_prog *prog,
> >> +   const struct bpf_prog *tgt_prog,
> >> +   u32 btf_id,
> >> +   struct btf_func_model *fmodel,
> >> +   long *tgt_addr,
> >> +   const char **tgt_name,
> >> +   const struct btf_type **tgt_type);
> >
> > So this is obviously an abomination of a function signature,
> > especially for one exported to other files.
> >
> > One candidate to remove would be tgt_type, which is supposed to be a
> > derivative of target BTF (vmlinux or tgt_prog->btf) + btf_id,
> > **except** (and that's how I found the bug below), in case of
> > fentry/fexit programs attaching to "conservative" BPF functions, in
> > which case what's stored in aux->attach_func_proto is different from
> > what is passed into btf_distill_func_proto. So that's a bug already
> > (you'll return NULL in some cases for tgt_type, while it has to always
> > be non-NULL).
>
> Okay, looked at this in more detail, and I don't think the refactored
> code is doing anything different from the pre-refactor version?
>
> Before we had this:
>
> if (tgt_prog && conservative) {
> prog->aux->attach_func_proto = NULL;
> t = NULL;
> }
>
> and now we just have
>
> if (tgt_prog && conservative)
> t = NULL;
>
> in bpf_check_attach_target(), which gets returned as tgt_type and
> subsequently assigned to prog->aux->attach_func_proto.

Yeah, you are totally right, I don't know how I missed that
`prog->aux->attach_func_proto = NULL;`, sorry about that.

>
> > But related to that is fmodel. It seems like bpf_check_attach_target()
> > has no interest in fmodel itself and is just passing it from
> > btf_distill_func_proto(). So I was about to suggest dropping fmodel
> > and calling btf_distill_func_proto() outside of
> > bpf_check_attach_target(), but given the conservative + fentry/fexit
> > quirk, it's probably going to be more confusing.
> >
> > So with all this, I suggest dropping the tgt_type output param
> > altogether and let callers do a `btf__type_by_id(tgt_prog ?
> > tgt_prog->aux->btf : btf_vmlinux, btf_id);`. That will both fix the
> > bug and will make this function's signature just a tad bit less
> > horrible.
>
> Thought about this, but the logic also does a few transformations of the
> type itself, e.g., this for bpf_trace_raw_tp:
>
> tname += sizeof(prefix) - 1;
> t = btf_type_by_id(btf, t->type);
> if (!btf_type_is_ptr(t))
> /* should never happen in valid vmlinux build */
> return -EINVAL;
> t = btf_type_by_id(btf, t->type);
> if (!btf_type_is_func_proto(t))
> /* should never happen in valid vmlinux build */
> return -EINVAL;
>
> so to catch this we really do have to return the type from the function
> as well.

yeah, with func_proto sometimes being null, btf_id isn't enough, so
that can't be done anyways.

>
> I do agree that the function signature is a tad on the long side, but I
> couldn't think of any good way of making it smaller. I considered
> replacing the last two return values with a boolean 'save' parameter,
> that would just make it save the values directly in prog->aux; but I
> actually find it easier to reason about a function that is strictly
> checking things and returning the result, instead of 'sometimes modify'
> semantics...

I agree, modifying prog->aux would be worse. And
btf_distill_func_proto() can't be extracted right away, because it
doesn't happen for the RAW_TP case. Oh well, we'll have to live with
an 8-argument function, I suppose.

Please add my ack when you post a new version:

Acked-by: Andrii Nakryiko 

>
> -Toke
>

Re: [PATCH rdma-next v2 0/3] Fix in-kernel active_speed type

On Thu, Sep 17, 2020 at 08:41:54AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 17, 2020 at 12:02:20PM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky 
> >
> > Changelog:
> > v2:
> >  * Changed WARN_ON casting to be saturated value instead while returning 
> > active_speed
> >to the user.
> > v1: 
> > https://lore.kernel.org/linux-rdma/20200902074503.743310-1-l...@kernel.org
> >  * Changed patch #1 to fix memory corruption to help with bisect. No
> >change in series, because the added code is changed anyway in patch
> >#3.
> > v0:
> >  * 
> > https://lore.kernel.org/linux-rdma/20200824105826.1093613-1-l...@kernel.org
> >
> >
> > IBTA declares speed as 16 bits, but kernel stores it in u8. This series
> > fixes in-kernel declaration while keeping external interface intact.
> >
> > Thanks
> >
> > Aharon Landau (3):
> >   net/mlx5: Refactor query port speed functions
> >   RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
> >   RDMA: Fix link active_speed size
>
> Look OK, can you update the shared branch?

I pushed first two patches to mlx5-next branch:

e27014bdb47e RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
639bf4415cad net/mlx5: Refactor query port speed functions

Thanks

>
> Thanks,
> Jason
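
The v2 changelog above mentions returning a saturated value to the user
instead of a WARN_ON cast. The series' actual code is not quoted in this
message, so the following is only a sketch of what a saturating u16 to u8
conversion can look like; speed_to_u8_saturated() is a made-up name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: narrow a 16-bit speed value to an 8-bit field,
 * clamping to UINT8_MAX instead of silently truncating the high bits.
 */
static uint8_t speed_to_u8_saturated(uint16_t speed)
{
	return speed > UINT8_MAX ? UINT8_MAX : (uint8_t)speed;
}
```

With a plain cast, a value such as 0x100 would wrap to 0; with the clamp it
is reported as 255, which at least stays monotonic.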

[PATCH] net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMAC

2020-09-17 Thread Necip Fazil Yildiran

When IPV6_SEG6_HMAC is enabled and CRYPTO is disabled, it results in the
following Kbuild warning:

WARNING: unmet direct dependencies detected for CRYPTO_HMAC
  Depends on [n]: CRYPTO [=n]
  Selected by [y]:
  - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]

WARNING: unmet direct dependencies detected for CRYPTO_SHA1
  Depends on [n]: CRYPTO [=n]
  Selected by [y]:
  - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]

WARNING: unmet direct dependencies detected for CRYPTO_SHA256
  Depends on [n]: CRYPTO [=n]
  Selected by [y]:
  - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]

The reason is that IPV6_SEG6_HMAC selects CRYPTO_HMAC, CRYPTO_SHA1, and
CRYPTO_SHA256 without depending on or selecting CRYPTO while those configs
are subordinate to CRYPTO.

Honor the kconfig menu hierarchy to remove kconfig dependency warnings.

Fixes: bf355b8d2c30 ("ipv6: sr: add core files for SR HMAC support")
Signed-off-by: Necip Fazil Yildiran 
---
 net/ipv6/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 76bff79d6fed..747f56e0c636 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -303,6 +303,7 @@ config IPV6_SEG6_LWTUNNEL
 config IPV6_SEG6_HMAC
bool "IPv6: Segment Routing HMAC support"
depends on IPV6
+   select CRYPTO
select CRYPTO_HMAC
select CRYPTO_SHA1
select CRYPTO_SHA256
-- 
2.25.1

[PATCH bpf-next] bpf: Use hlist_add_head_rcu when linking to local_storage

The local_storage->list will be traversed by rcu reader in parallel.
Thus, hlist_add_head_rcu() is needed in bpf_selem_link_storage_nolock().
This patch fixes it.

This part of the code has recently been refactored in bpf-next
and this patch makes changes to the new file "bpf_local_storage.c".
Instead of using the original offending commit in the Fixes tag,
the commit that created the file "bpf_local_storage.c" is used.

A separate fix has been provided to the bpf tree.

Fixes: 450af8d0f6be ("bpf: Split bpf_local_storage to bpf_sk_storage")
Signed-off-by: Martin KaFai Lau 
---
 kernel/bpf/bpf_local_storage.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c
index ffa7d11fc2bd..5d3a7af9ba9b 100644
--- a/kernel/bpf/bpf_local_storage.c
+++ b/kernel/bpf/bpf_local_storage.c
@@ -159,7 +159,7 @@ void bpf_selem_link_storage_nolock(struct bpf_local_storage 
*local_storage,
   struct bpf_local_storage_elem *selem)
 {
RCU_INIT_POINTER(selem->local_storage, local_storage);
-   hlist_add_head(&selem->snode, &local_storage->list);
+   hlist_add_head_rcu(&selem->snode, &local_storage->list);
 }
 
 void bpf_selem_unlink_map(struct bpf_local_storage_elem *selem)
-- 
2.24.1
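
For readers less familiar with why the _rcu variant matters here, a
userspace analogy may help. This is a sketch under assumptions, not kernel
code: C11 atomics stand in for the kernel's publish/subscribe primitives,
the list is singly linked rather than an hlist, all names are invented, and
writers are assumed to be serialised by some external lock (as the _nolock
suffix above suggests for the kernel function).

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
	int value;
	struct node *_Atomic next;
};

struct list {
	struct node *_Atomic first;
};

static void list_init(struct list *l)
{
	atomic_init(&l->first, NULL);
}

/*
 * Writer side: link the new node to the current head first, then publish
 * it with a release store, so a concurrent reader that observes the new
 * head also observes the node's initialised fields.
 */
static void list_add_head_published(struct list *l, struct node *n)
{
	struct node *old = atomic_load_explicit(&l->first, memory_order_relaxed);

	atomic_store_explicit(&n->next, old, memory_order_relaxed);
	atomic_store_explicit(&l->first, n, memory_order_release);
}

/* Reader side: acquire loads pair with the writer's release store. */
static int list_sum(struct list *l)
{
	struct node *n;
	int sum = 0;

	for (n = atomic_load_explicit(&l->first, memory_order_acquire); n;
	     n = atomic_load_explicit(&n->next, memory_order_acquire))
		sum += n->value;
	return sum;
}
```

A plain, unordered store of the head pointer (the analogue of
hlist_add_head()) gives no such guarantee to a reader running in parallel.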

Re: [PATCH bpf-next v3] bpf: using rcu_read_lock for bpf_sk_storage_map iterator

On Tue, Sep 15, 2020 at 11:16:49PM -0700, Yonghong Song wrote:
[ ... ]

> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
> index 4a86ea34f29e..d43c3d6d0693 100644
> --- a/net/core/bpf_sk_storage.c
> +++ b/net/core/bpf_sk_storage.c
> @@ -678,6 +678,7 @@ struct bpf_iter_seq_sk_storage_map_info {
>  static struct bpf_local_storage_elem *
>  bpf_sk_storage_map_seq_find_next(struct bpf_iter_seq_sk_storage_map_info 
> *info,
>struct bpf_local_storage_elem *prev_selem)
> + __acquires(RCU) __releases(RCU)
>  {
>   struct bpf_local_storage *sk_storage;
>   struct bpf_local_storage_elem *selem;
In the while loop earlier in this function, if I read it correctly,
it is sort of continuing the earlier hlist_for_each_entry_rcu() for the
same bucket, so the hlist_entry_safe() needs to be changed also.
Something like this (uncompiled code):

while (selem) {
-   selem = hlist_entry_safe(selem->map_node.next,
+   selem = 
hlist_entry_safe(rcu_dereference(hlist_next_rcu(&selem->map_node)),
 struct bpf_local_storage_elem, 
map_node);
if (!selem) {
/* not found, unlock and go to the next bucket */

> @@ -701,11 +702,11 @@ bpf_sk_storage_map_seq_find_next(struct 
> bpf_iter_seq_sk_storage_map_info *info,
>   if (!selem) {
>   /* not found, unlock and go to the next bucket */
>   b = &smap->buckets[bucket_id++];
> - raw_spin_unlock_bh(&b->lock);
> + rcu_read_unlock();
>   skip_elems = 0;
>   break;
>   }
> - sk_storage = rcu_dereference_raw(selem->local_storage);
> + sk_storage = rcu_dereference(selem->local_storage);
>   if (sk_storage) {
>   info->skip_elems = skip_elems + count;
>   return selem;

Re: [PATCH bpf-next v3] bpf: using rcu_read_lock for bpf_sk_storage_map iterator

2020-09-17 Thread Yonghong Song





On 9/16/20 10:55 AM, Martin KaFai Lau wrote:
> On Tue, Sep 15, 2020 at 11:16:49PM -0700, Yonghong Song wrote:
> [ ... ]
>
>> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
>> index 4a86ea34f29e..d43c3d6d0693 100644
>> --- a/net/core/bpf_sk_storage.c
>> +++ b/net/core/bpf_sk_storage.c
>> @@ -678,6 +678,7 @@ struct bpf_iter_seq_sk_storage_map_info {
>>   static struct bpf_local_storage_elem *
>>   bpf_sk_storage_map_seq_find_next(struct bpf_iter_seq_sk_storage_map_info 
>> *info,
>>  struct bpf_local_storage_elem *prev_selem)
>> +   __acquires(RCU) __releases(RCU)
>>   {
>> struct bpf_local_storage *sk_storage;
>> struct bpf_local_storage_elem *selem;
>
> In the while loop earlier in this function, if I read it correctly,
> it is sort of continuing the earlier hlist_for_each_entry_rcu() for the
> same bucket, so the hlist_entry_safe() needs to be changed also.
> Something like this (uncompiled code):
>
>  while (selem) {
> -   selem = hlist_entry_safe(selem->map_node.next,
> +   selem = 
> hlist_entry_safe(rcu_dereference(hlist_next_rcu(&selem->map_node)),
>   struct bpf_local_storage_elem, 
> map_node);
>  if (!selem) {
>  /* not found, unlock and go to the next bucket */

Thanks and Ack. Will send v4 shortly.

>> @@ -701,11 +702,11 @@ bpf_sk_storage_map_seq_find_next(struct 
>> bpf_iter_seq_sk_storage_map_info *info,
>> if (!selem) {
>> /* not found, unlock and go to the next bucket */
>> b = &smap->buckets[bucket_id++];
>> -   raw_spin_unlock_bh(&b->lock);
>> +   rcu_read_unlock();
>> skip_elems = 0;
>> break;
>> }
>> -   sk_storage = rcu_dereference_raw(selem->local_storage);
>> +   sk_storage = rcu_dereference(selem->local_storage);
>> if (sk_storage) {
>> info->skip_elems = skip_elems + count;
>> return selem;

Re: [PATCH bpf-next v5 5/8] bpf: Fix context type resolving for extension programs

2020-09-17 Thread Toke Høiland-Jørgensen

Andrii Nakryiko  writes:

> On Wed, Sep 16, 2020 at 12:59 PM Andrii Nakryiko
>  wrote:
>>
>> On Tue, Sep 15, 2020 at 5:50 PM Toke Høiland-Jørgensen  
>> wrote:
>> >
>> > From: Toke Høiland-Jørgensen 
>> >
>> > Eelco reported we can't properly access arguments if the tracing
>> > program is attached to extension program.
>> >
>> > Having following program:
>> >
>> >   SEC("classifier/test_pkt_md_access")
>> >   int test_pkt_md_access(struct __sk_buff *skb)
>> >
>> > with its extension:
>> >
>> >   SEC("freplace/test_pkt_md_access")
>> >   int test_pkt_md_access_new(struct __sk_buff *skb)
>> >
>> > and tracing that extension with:
>> >
>> >   SEC("fentry/test_pkt_md_access_new")
>> >   int BPF_PROG(fentry, struct sk_buff *skb)
>> >
>> > It's not possible to access skb argument in the fentry program,
>> > with following error from verifier:
>> >
>> >   ; int BPF_PROG(fentry, struct sk_buff *skb)
>> >   0: (79) r1 = *(u64 *)(r1 +0)
>> >   invalid bpf_context access off=0 size=8
>> >
>> > The problem is that btf_ctx_access gets the context type for the
>> > traced program, which is in this case the extension.
>> >
>> > But when we trace extension program, we want to get the context
>> > type of the program that the extension is attached to, so we can
>> > access the argument properly in the trace program.
>> >
>> > This version of the patch is tweaked slightly from Jiri's original one,
>> > since the refactoring in the previous patches means we have to get the
>> > target prog type from the new variable in prog->aux instead of directly
>> > from the target prog.
>> >
>> > Reported-by: Eelco Chaudron 
>> > Suggested-by: Jiri Olsa 
>> > Signed-off-by: Toke Høiland-Jørgensen 
>> > ---
>> >  kernel/bpf/btf.c |9 -
>> >  1 file changed, 8 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
>> > index 9228af9917a8..55f7b2ba1cbd 100644
>> > --- a/kernel/bpf/btf.c
>> > +++ b/kernel/bpf/btf.c
>> > @@ -3860,7 +3860,14 @@ bool btf_ctx_access(int off, int size, enum 
>> > bpf_access_type type,
>> >
>> > info->reg_type = PTR_TO_BTF_ID;
>> > if (tgt_prog) {
>> > -   ret = btf_translate_to_vmlinux(log, btf, t, 
>> > tgt_prog->type, arg);
>> > +   enum bpf_prog_type tgt_type;
>> > +
>> > +   if (tgt_prog->type == BPF_PROG_TYPE_EXT)
>> > +   tgt_type = tgt_prog->aux->tgt_prog_type;
>>
>> what if tgt_prog->aux->tgt_prog_type is also BPF_PROG_TYPE_EXT? Should
>> this be a loop?
>
> ok, never mind this specifically. there is an explicit check
>
> if (tgt_prog->type == prog->type) {
> verbose(env, "Cannot recursively attach\n");
> return -EINVAL;
> }
>
> that will prevent this.
>
> But, I think we still will be able to construct a long chain of
> fmod_ret -> freplace -> fmod_ret -> freplace -> and so on ad
> infinitum. Can you please construct such a selftest? And then we
> should probably fix those checks to also disallow FMOD_RET, in
> addition to BPF_TRACE_FENTRY/FEXIT (and someone more familiar with LSM
> prog type should check if that can cause any problems).

Huh, I thought fmod_ret was supposed to be for kernel functions only?
However, I can't really point to anywhere in the code that ensures this,
other than check_attach_modify_return(), but I think that will allow a
bpf function as long as its name starts with "security_" ?

Is there actually any use case for modify_return being attached to a BPF
function (you could just use freplace instead, couldn't you?). Or should
we just disallow that entirely (if I'm not missing somewhere it's
already blocked)?

-Toke
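
For readers following the quoted hunk, the type selection it introduces can be modelled in plain C. This is a minimal sketch under assumptions: every `toy_` name is a hypothetical stand-in rather than a kernel definition, and because the `else` branch is cut off in the quote above, falling back to the target's own type is assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for enum bpf_prog_type; not the kernel values. */
enum toy_prog_type {
	TOY_PROG_TYPE_SCHED_CLS,
	TOY_PROG_TYPE_TRACING,
	TOY_PROG_TYPE_EXT,
};

/* Only the field that the quoted hunk reads from prog->aux. */
struct toy_prog_aux {
	enum toy_prog_type tgt_prog_type;
};

struct toy_prog {
	enum toy_prog_type type;
	struct toy_prog_aux *aux;
};

/*
 * Pick the program type whose context layout applies to the traced
 * target: for an extension, the type of the program it replaces.
 */
static enum toy_prog_type toy_ctx_type(const struct toy_prog *tgt_prog)
{
	if (tgt_prog->type == TOY_PROG_TYPE_EXT)
		return tgt_prog->aux->tgt_prog_type;
	return tgt_prog->type; /* assumed fallback, not shown in the quote */
}
```

With a classifier replaced by an extension, `toy_ctx_type()` yields the classifier type for both programs, which is the property the fentry program needs in order to read its `skb` argument.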

Re: [PATCH net-next] net: mdio: octeon: Select MDIO_DEVRES

2020-09-17 Thread Randy Dunlap

On 9/17/20 9:19 AM, Andrew Lunn wrote:
> This driver makes use of devm_mdiobus_alloc_size. To ensure this is
> available select MDIO_DEVRES which provides it.
> 

Reported-by: Randy Dunlap 
Acked-by: Randy Dunlap  # build-tested

Thanks.

> Signed-off-by: Andrew Lunn 
> ---
>  drivers/net/mdio/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/mdio/Kconfig b/drivers/net/mdio/Kconfig
> index 1299880dfe74..840727cc9499 100644
> --- a/drivers/net/mdio/Kconfig
> +++ b/drivers/net/mdio/Kconfig
> @@ -138,6 +138,7 @@ config MDIO_OCTEON
>   depends on (64BIT && OF_MDIO) || COMPILE_TEST
>   depends on HAS_IOMEM
>   select MDIO_CAVIUM
> + select MDIO_DEVRES
>   help
> This module provides a driver for the Octeon and ThunderX MDIO
> buses. It is required by the Octeon and ThunderX ethernet device
> 


-- 
~Randy

[PATCH bpf] bpf: Use hlist_add_head_rcu when linking to sk_storage

The sk_storage->list will be traversed by rcu reader in parallel.
Thus, hlist_add_head_rcu() is needed in __selem_link_sk().  This
patch fixes it.

This part of the code has recently been refactored in bpf-next.
A separate fix will be provided for the bpf-next tree.

Fixes: 6ac99e8f23d4 ("bpf: Introduce bpf sk local storage")
Signed-off-by: Martin KaFai Lau 
---
 net/core/bpf_sk_storage.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
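
Note for readers: the difference between the plain and the _rcu list insertion is the ordering of the final pointer store. The sketch below is a userspace approximation using C11 atomics, not the kernel implementation; the `toy_` names are hypothetical, writers are assumed to be serialised by the caller, and an acquire load stands in for rcu_dereference().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_node {
	int value;
	struct toy_node *_Atomic next;
};

struct toy_list {
	struct toy_node *_Atomic first;
};

/*
 * Writer: initialise the node completely, then publish it with a
 * release store so that a concurrent reader which sees the new head
 * also sees its ->value and ->next.
 */
static void toy_add_head_publish(struct toy_list *l, struct toy_node *n)
{
	struct toy_node *first =
		atomic_load_explicit(&l->first, memory_order_relaxed);

	atomic_store_explicit(&n->next, first, memory_order_relaxed);
	atomic_store_explicit(&l->first, n, memory_order_release);
}

/* Reader: walk the list without taking the writer's lock. */
static int toy_sum(struct toy_list *l)
{
	struct toy_node *n =
		atomic_load_explicit(&l->first, memory_order_acquire);
	int sum = 0;

	while (n) {
		sum += n->value;
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	}
	return sum;
}
```

With a plain store in place of the release store, a reader running in parallel could in principle observe the new head before the node's fields are visible, which is the hazard this patch closes.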

diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index b988f48153a4..d4d2a56e9d4a 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -219,7 +219,7 @@ static void __selem_link_sk(struct bpf_sk_storage 
*sk_storage,
struct bpf_sk_storage_elem *selem)
 {
RCU_INIT_POINTER(selem->sk_storage, sk_storage);
-   hlist_add_head(&selem->snode, &sk_storage->list);
+   hlist_add_head_rcu(&selem->snode, &sk_storage->list);
 }
 
 static void selem_unlink_map(struct bpf_sk_storage_elem *selem)
-- 
2.24.1

[PATCH net-next v2 0/8] devlink: Add SF add/delete devlink ops

Hi Dave, Jakub,

Similar to a PCI VF, a PCI SF represents a portion of the device.
A PCI SF is represented using a new devlink port flavour.

This short series implements a small part of the RFC described in detail at [1]
and [2].

This series:
(a) extends devlink core to expose a new devlink port flavour 'pcisf'
(b) exposes a new user interface to add/delete a devlink port
(c) extends the netdevsim driver to simulate PCI PF and SF ports
(d) adds a port function state attribute

Patch summary:
Patch-1 Extends devlink to expose new PCI SF port flavour
Patch-2 Extends devlink to let user add, delete devlink Port
Patch-3 Prepare code to handle multiple port attributes
Patch-4 Extends devlink to let user get, set function state
Patch-5 Extends netdevsim driver to simulate PCI PF ports
Patch-6 Extends netdevsim driver to simulate hw_addr get/set
Patch-7 Extends netdevsim driver to simulate function state get/set
Patch-8 Extends netdevsim driver to simulate PCI SF ports

[1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/
[2] https://marc.info/?l=linux-netdev&m=15855592851&w=2

---
Changelog:
v1->v2:
 - Fixed extra semicolon at end of switch case reported by coccinelle

Parav Pandit (8):
  devlink: Introduce PCI SF port flavour and port attribute
  devlink: Support add and delete devlink port
  devlink: Prepare code to fill multiple port function attributes
  devlink: Support get and set state of port function
  netdevsim: Add support for add and delete of a PCI PF port
  netdevsim: Simulate get/set hardware address of a PCI port
  netdevsim: Simulate port function state for a PCI port
  netdevsim: Add support for add and delete PCI SF port

 drivers/net/netdevsim/Makefile|   3 +-
 drivers/net/netdevsim/dev.c   |  14 +
 drivers/net/netdevsim/netdevsim.h |  32 ++
 drivers/net/netdevsim/port_function.c | 498 ++
 include/net/devlink.h |  75 
 include/uapi/linux/devlink.h  |  13 +
 net/core/devlink.c| 230 ++--
 7 files changed, 840 insertions(+), 25 deletions(-)
 create mode 100644 drivers/net/netdevsim/port_function.c

-- 
2.26.2

[PATCH net-next v2 2/8] devlink: Support add and delete devlink port

Extend the devlink interface so that the user can add and delete a port.
Extend devlink to pass such user requests to the driver, which adds or
deletes the port in the device.

When driver routines are invoked, the devlink instance lock is not held.
This enables the driver to register and unregister several devlink
objects (port, health reporter, resource etc.) by using existing
devlink APIs.
This also helps to uniformly use the same port registration code
during driver unload and during port deletion initiated by the user.

Examples of add, show and delete commands:
Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device

$ devlink port show netdevsim/netdevsim10/0
netdevsim/netdevsim10/0: type eth netdev eni10np1 flavour physical port 1 
splittable false

$ devlink port add netdevsim/netdevsim10 flavour pcipf pfnum 0

$ devlink port show netdevsim/netdevsim10/1
netdevsim/netdevsim10/1: type eth netdev eni10npf0 flavour pcipf controller 0 
pfnum 0 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port show netdevsim/netdevsim10/1 -jp
{
"port": {
"netdevsim/netdevsim10/1": {
"type": "eth",
"netdev": "eni10npf0",
"flavour": "pcipf",
"controller": 0,
"pfnum": 0,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:00:00:00:00:00",
"state": "inactive"
}
}
}
}

$ devlink port del netdevsim/netdevsim10/1

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 include/net/devlink.h | 38 
 net/core/devlink.c| 67 +++
 2 files changed, 105 insertions(+)
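
Note for readers: the port_new() kernel-doc in this patch asks drivers to return -EOPNOTSUPP for an unsupported flavour or attribute. A minimal userspace sketch of such a check follows. The `toy_` types are hypothetical mirrors of `devlink_port_new_attrs` (with `unsigned int` bitfields for portability), and the policy of rejecting an sfnum on a PF port is an assumption made for illustration, not something this patch specifies.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical mirror of the flavours relevant to this series. */
enum toy_port_flavour {
	TOY_FLAVOUR_PHYSICAL,
	TOY_FLAVOUR_PCI_PF,
	TOY_FLAVOUR_PCI_SF,
};

/* Mirrors the layout idea of devlink_port_new_attrs: value + valid bit. */
struct toy_port_new_attrs {
	enum toy_port_flavour flavour;
	unsigned int port_index;
	uint32_t controller;
	uint32_t sfnum;
	uint16_t pfnum;
	unsigned int port_index_valid:1,
		     controller_valid:1,
		     sfnum_valid:1;
};

/* Returns 0 if the request is acceptable, -EOPNOTSUPP otherwise. */
static int toy_port_new_check(const struct toy_port_new_attrs *attrs)
{
	switch (attrs->flavour) {
	case TOY_FLAVOUR_PCI_PF:
		/* Assumed policy: an SF number is meaningless for a PF port. */
		if (attrs->sfnum_valid)
			return -EOPNOTSUPP;
		return 0;
	case TOY_FLAVOUR_PCI_SF:
		return 0;
	default:
		/* Flavours the driver cannot create on user request. */
		return -EOPNOTSUPP;
	}
}
```

The valid bits let the driver distinguish "user asked for index 0" from "user did not specify an index", which matters when the driver auto-assigns the missing values.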

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 1edb558125b0..ebab2c0360d0 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -142,6 +142,17 @@ struct devlink_port {
struct mutex reporters_lock; /* Protects reporter_list */
 };
 
+struct devlink_port_new_attrs {
+   enum devlink_port_flavour flavour;
+   unsigned int port_index;
+   u32 controller;
+   u32 sfnum;
+   u16 pfnum;
+   u8 port_index_valid:1,
+  controller_valid:1,
+  sfnum_valid:1;
+};
+
 struct devlink_sb_pool_info {
enum devlink_sb_pool_type pool_type;
u32 size;
@@ -1189,6 +1200,33 @@ struct devlink_ops {
int (*port_function_hw_addr_set)(struct devlink *devlink, struct 
devlink_port *port,
 const u8 *hw_addr, int hw_addr_len,
 struct netlink_ext_ack *extack);
+   /**
+* @port_new: Port add function.
+*
+* Should be used by device driver to let caller add new port of a 
specified flavour
+* with optional attributes.
+* Driver should return -EOPNOTSUPP if it doesn't support port addition 
of a specified
+* flavour or specified attributes. Driver should set extack error 
message in case of fail
+* to add the port.
+* devlink core does not hold a devlink instance lock when this 
callback is invoked.
+* Driver must ensure synchronization when adding or deleting a port. 
Driver must
+* register a port with devlink core.
+*/
+   int (*port_new)(struct devlink *devlink, const struct 
devlink_port_new_attrs *attrs,
+   struct netlink_ext_ack *extack);
+   /**
+* @port_del: Port delete function.
+*
+* Should be used by device driver to let caller delete port which was 
previously created
+* using port_new() callback.
+* Driver should return -EOPNOTSUPP if it doesn't support port deletion.
+* Driver should set extack error message in case of fail to delete the 
port.
+* devlink core does not hold a devlink instance lock when this 
callback is invoked.
+* Driver must ensure synchronization when adding or deleting a port. 
Driver must
+* register a port with devlink core.
+*/
+   int (*port_del)(struct devlink *devlink, unsigned int port_index,
+   struct netlink_ext_ack *extack);
 };
 
 static inline void *devlink_priv(struct devlink *devlink)
diff --git a/net/core/devlink.c b/net/core/devlink.c
index fada660fd515..e93730065c57 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -991,6 +991,57 @@ static int devlink_nl_cmd_port_unsplit_doit(struct sk_buff 
*skb,
return devlink_port_unsplit(devlink, port_index, info->extack);
 }
 
+static int devlink_nl_cmd_port_new_doit(struct sk_buff *skb, struct genl_info 
*info)
+{
+   struct netlink_ext_ack *extack = info->extack;
+   struct devlink_port_new_attrs new_attrs = {};
+   struct devlink *devlink = info->user_ptr[0];
+
+   if (!info->attrs[DEVLINK_ATTR_PORT_FLAVOUR] ||
+   !info->attrs[DEVLINK_ATTR_PORT_

[PATCH net-next v2 1/8] devlink: Introduce PCI SF port flavour and port attribute

A PCI sub-function (SF) represents a portion of the device similar
to PCI VF.

In an eswitch, a PCI SF may have a port which is normally represented
using a representor netdevice.
To give better visibility of the eswitch port, its association with the
SF, and its representor netdevice, introduce a PCI SF port flavour.

When the devlink port flavour is PCI SF, fill up the PCI SF attributes
of the port.

Extend port name creation using the PCI PF and SF number scheme on a
best effort basis, so that vendor drivers can skip defining their own
scheme.

An example view of a PCI SF port.

$ devlink port show netdevsim/netdevsim10/2
netdevsim/netdevsim10/2: type eth netdev eni10npf0sf44 flavour pcisf controller 
0 pfnum 0 sfnum 44 external false splittable false
  function:
hw_addr 00:00:00:00:00:00

$ devlink port show netdevsim/netdevsim10/2 -jp
{
"port": {
"netdevsim/netdevsim10/2": {
"type": "eth",
"netdev": "eni10npf0sf44",
"flavour": "pcisf",
"controller": 0,
"pfnum": 0,
"sfnum": 44,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:00:00:00:00:00"
}
}
}
}

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 include/net/devlink.h| 17 +
 include/uapi/linux/devlink.h |  7 +++
 net/core/devlink.c   | 37 
 3 files changed, 61 insertions(+)
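
Note for readers: the netdev name `eni10npf0sf44` in the example above suggests that the port name for an SF is built from the PF and SF numbers. The sketch below shows that scheme in userspace C. The `"pf%usf%u"` format is inferred from the example output only (the relevant hunk is truncated here), and the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a port name such as "pf0sf44" from PF and SF numbers.
 * Returns 0 on success or -EINVAL if the name does not fit in buf.
 */
static int toy_sf_port_name(char *buf, size_t len,
			    unsigned int pf, unsigned int sf)
{
	int n = snprintf(buf, len, "pf%usf%u", pf, sf);

	if (n < 0 || (size_t)n >= len)
		return -EINVAL;
	return 0;
}
```

Checking the snprintf() return value against the buffer length avoids silently handing out a truncated name, which could otherwise collide with the name of another port.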

diff --git a/include/net/devlink.h b/include/net/devlink.h
index 48b1c1ef1ebd..1edb558125b0 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -83,6 +83,20 @@ struct devlink_port_pci_vf_attrs {
u8 external:1;
 };
 
+/**
+ * struct devlink_port_pci_sf_attrs - devlink port's PCI SF attributes
+ * @controller: Associated controller number
+ * @pf: Associated PCI PF number for this port.
+ * @sf: Associated PCI SF of the PCI PF for this port.
+ * @external: when set, indicates if a port is for an external controller
+ */
+struct devlink_port_pci_sf_attrs {
+   u32 controller;
+   u16 pf;
+   u32 sf;
+   u8 external:1;
+};
+
 /**
  * struct devlink_port_attrs - devlink port object
  * @flavour: flavour of the port
@@ -104,6 +118,7 @@ struct devlink_port_attrs {
struct devlink_port_phys_attrs phys;
struct devlink_port_pci_pf_attrs pci_pf;
struct devlink_port_pci_vf_attrs pci_vf;
+   struct devlink_port_pci_sf_attrs pci_sf;
};
 };
 
@@ -1230,6 +1245,8 @@ void devlink_port_attrs_pci_pf_set(struct devlink_port 
*devlink_port, u32 contro
   u16 pf, bool external);
 void devlink_port_attrs_pci_vf_set(struct devlink_port *devlink_port, u32 
controller,
   u16 pf, u16 vf, bool external);
+void devlink_port_attrs_pci_sf_set(struct devlink_port *devlink_port, u32 
controller,
+  u16 pf, u32 sf, bool external);
 int devlink_sb_register(struct devlink *devlink, unsigned int sb_index,
u32 size, u16 ingress_pools_count,
u16 egress_pools_count, u16 ingress_tc_count,
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 631f5bdf1707..09c41b9ce407 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -195,6 +195,11 @@ enum devlink_port_flavour {
  * port that faces the PCI VF.
  */
DEVLINK_PORT_FLAVOUR_VIRTUAL, /* Any virtual port facing the user. */
+
+   DEVLINK_PORT_FLAVOUR_PCI_SF, /* Represents eswitch port
+ * for the PCI SF. It is an internal
+ * port that faces the PCI SF.
+ */
 };
 
 enum devlink_param_cmode {
@@ -462,6 +467,8 @@ enum devlink_attr {
 
DEVLINK_ATTR_PORT_EXTERNAL, /* u8 */
DEVLINK_ATTR_PORT_CONTROLLER_NUMBER,/* u32 */
+
+   DEVLINK_ATTR_PORT_PCI_SF_NUMBER,/* u32 */
/* add new attributes above here, update the policy in devlink.c */
 
__DEVLINK_ATTR_MAX,
diff --git a/net/core/devlink.c b/net/core/devlink.c
index e5b71f3c2d4d..fada660fd515 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -539,6 +539,15 @@ static int devlink_nl_port_attrs_put(struct sk_buff *msg,
if (nla_put_u8(msg, DEVLINK_ATTR_PORT_EXTERNAL, 
attrs->pci_vf.external))
return -EMSGSIZE;
break;
+   case DEVLINK_PORT_FLAVOUR_PCI_SF:
+   if (nla_put_u32(msg, DEVLINK_ATTR_PORT_CONTROLLER_NUMBER,
+   attrs->pci_sf.controller) ||
+   nla_put_u16(msg, DEVLINK_ATTR_PORT_PCI_PF_NUMBER, 
attrs->pci_sf.pf) ||
+   nla_put_u32(msg, DEVLINK_ATTR_PORT_PCI_SF_NUMBER, 
attrs->pci_sf.sf))
+

[PATCH net-next v2 8/8] netdevsim: Add support for add and delete PCI SF port

Simulate PCI SF ports. Allow user to create one or more PCI SF ports.

Examples:

Create a PCI PF and PCI SF port.
$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0
$ devlink port add netdevsim/netdevsim10/11 flavour pcisf pfnum 0 sfnum 44
$ devlink port show netdevsim/netdevsim10/11
netdevsim/netdevsim10/11: type eth netdev eni10npf0sf44 flavour pcisf 
controller 0 pfnum 0 sfnum 44 external true splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port function set netdevsim/netdevsim10/11 hw_addr 00:11:22:33:44:55 
state active

$ devlink port show netdevsim/netdevsim10/11 -jp
{
"port": {
"netdevsim/netdevsim10/11": {
"type": "eth",
"netdev": "eni10npf0sf44",
"flavour": "pcisf",
"controller": 0,
"pfnum": 0,
"sfnum": 44,
"external": true,
"splittable": false,
"function": {
"hw_addr": "00:11:22:33:44:55",
"state": "active"
}
}
}
}

Delete newly added devlink port
$ devlink port del netdevsim/netdevsim10/11

Add devlink port of flavour 'pcisf' where port index and sfnum are
auto assigned by driver.
$ devlink port add netdevsim/netdevsim10 flavour pcisf controller 0 pfnum 0

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 drivers/net/netdevsim/netdevsim.h |  1 +
 drivers/net/netdevsim/port_function.c | 95 +--
 2 files changed, 92 insertions(+), 4 deletions(-)
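
Note for readers: the SF number is taken from the user when `sfnum_valid` is set and auto-assigned otherwise, which the diff below does with ida_alloc_range() and ida_alloc(). The following userspace sketch models that choice with a 64-entry bitmap. All `toy_` names are hypothetical, and returning -ENOSPC for an ID that is already taken follows the documented kernel IDA behaviour rather than anything stated in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_IDS 64

/* One bit per allocated ID. */
struct toy_ida {
	uint64_t used;
};

/* Allocate exactly @id, or fail if it is taken or out of range. */
static int toy_ida_alloc_exact(struct toy_ida *ida, unsigned int id)
{
	if (id >= TOY_MAX_IDS || (ida->used & (UINT64_C(1) << id)))
		return -ENOSPC;
	ida->used |= UINT64_C(1) << id;
	return (int)id;
}

/* Allocate the lowest free ID. */
static int toy_ida_alloc_any(struct toy_ida *ida)
{
	unsigned int id;

	for (id = 0; id < TOY_MAX_IDS; id++) {
		if (!(ida->used & (UINT64_C(1) << id))) {
			ida->used |= UINT64_C(1) << id;
			return (int)id;
		}
	}
	return -ENOSPC;
}

static void toy_ida_free(struct toy_ida *ida, unsigned int id)
{
	if (id < TOY_MAX_IDS)
		ida->used &= ~(UINT64_C(1) << id);
}

/* User-supplied sfnum if valid, otherwise let the allocator choose. */
static int toy_sfnum_alloc(struct toy_ida *ida, bool sfnum_valid,
			   unsigned int sfnum)
{
	return sfnum_valid ? toy_ida_alloc_exact(ida, sfnum)
			   : toy_ida_alloc_any(ida);
}
```

Requesting a range whose minimum and maximum are the same value is what turns a general allocator into an "exactly this ID or fail" operation, so a duplicate sfnum is rejected without a separate lookup.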

diff --git a/drivers/net/netdevsim/netdevsim.h 
b/drivers/net/netdevsim/netdevsim.h
index 0ea9705eda38..c70782e444d5 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -222,6 +222,7 @@ struct nsim_dev {
struct list_head head;
struct ida ida;
struct ida pfnum_ida;
+   struct ida sfnum_ida;
} port_functions;
 };
 
diff --git a/drivers/net/netdevsim/port_function.c 
b/drivers/net/netdevsim/port_function.c
index 99581d3d15fe..e1812acd55b4 100644
--- a/drivers/net/netdevsim/port_function.c
+++ b/drivers/net/netdevsim/port_function.c
@@ -13,10 +13,12 @@ struct nsim_port_function {
unsigned int port_index;
enum devlink_port_flavour flavour;
u32 controller;
+   u32 sfnum;
u16 pfnum;
struct nsim_port_function *pf_port; /* Valid only for SF port */
u8 hw_addr[ETH_ALEN];
u8 state; /* enum devlink_port_function_state */
+   int refcount; /* Counts how many sf ports are attached to this pf 
port. */
 };
 
 void nsim_dev_port_function_init(struct nsim_dev *nsim_dev)
@@ -25,10 +27,13 @@ void nsim_dev_port_function_init(struct nsim_dev *nsim_dev)
INIT_LIST_HEAD(&nsim_dev->port_functions.head);
ida_init(&nsim_dev->port_functions.ida);
ida_init(&nsim_dev->port_functions.pfnum_ida);
+   ida_init(&nsim_dev->port_functions.sfnum_ida);
 }
 
 void nsim_dev_port_function_exit(struct nsim_dev *nsim_dev)
 {
+   WARN_ON(!ida_is_empty(&nsim_dev->port_functions.sfnum_ida));
+   ida_destroy(&nsim_dev->port_functions.sfnum_ida);
WARN_ON(!ida_is_empty(&nsim_dev->port_functions.pfnum_ida));
ida_destroy(&nsim_dev->port_functions.pfnum_ida);
WARN_ON(!ida_is_empty(&nsim_dev->port_functions.ida));
@@ -119,9 +124,24 @@ nsim_devlink_port_function_alloc(struct nsim_dev *dev, 
const struct devlink_port
goto fn_ida_err;
port->pfnum = ret;
break;
+   case DEVLINK_PORT_FLAVOUR_PCI_SF:
+   if (attrs->sfnum_valid)
+   ret = ida_alloc_range(&dev->port_functions.sfnum_ida, 
attrs->sfnum,
+ attrs->sfnum, GFP_KERNEL);
+   else
+   ret = ida_alloc(&dev->port_functions.sfnum_ida, 
GFP_KERNEL);
+   if (ret < 0)
+   goto fn_ida_err;
+   port->sfnum = ret;
+   port->pfnum = attrs->pfnum;
+   break;
default:
break;
}
+   /* refcount_t is not needed as port is protected by 
port_functions.mutex.
+* This count is to keep track of how many SF ports are attached to a PF 
port.
+*/
+   port->refcount = 1;
return port;
 
 fn_ida_err:
@@ -137,6 +157,9 @@ static void nsim_devlink_port_function_free(struct nsim_dev 
*dev, struct nsim_po
case DEVLINK_PORT_FLAVOUR_PCI_PF:
ida_simple_remove(&dev->port_functions.pfnum_ida, port->pfnum);
break;
+   case DEVLINK_PORT_FLAVOUR_PCI_SF:
+   ida_simple_remove(&dev->port_functions.sfnum_ida, port->sfnum);
+   break;
default:
break;
}
@@ -170,6 +193,11 @@ nsim_dev_port_port_exists(struct nsim_dev *nsim_dev, const 
struct devlink_port_n
if (attrs->flavour == DEVLINK_PORT_FLAVOUR_PCI_P

[PATCH net-next v2 5/8] netdevsim: Add support for add and delete of a PCI PF port

Simulate PCI PF ports. Allow user to create one or more PCI PF ports.

Examples:

Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device

Add and show devlink port of flavour 'pcipf' for PF number 0.

$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0

$ devlink port show netdevsim/netdevsim10/10
netdevsim/netdevsim10/10: type eth netdev eni10npf0 flavour pcipf controller 0 
pfnum 0 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

Delete newly added devlink port
$ devlink port del netdevsim/netdevsim10/10

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
Changelog:
v1->v2:
 - Fixed extra semicolon at end of switch case reported by coccinelle
---
 drivers/net/netdevsim/Makefile|   3 +-
 drivers/net/netdevsim/dev.c   |  10 +
 drivers/net/netdevsim/netdevsim.h |  19 ++
 drivers/net/netdevsim/port_function.c | 337 ++
 4 files changed, 368 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/netdevsim/port_function.c

diff --git a/drivers/net/netdevsim/Makefile b/drivers/net/netdevsim/Makefile
index ade086eed955..e69e895af62c 100644
--- a/drivers/net/netdevsim/Makefile
+++ b/drivers/net/netdevsim/Makefile
@@ -3,7 +3,8 @@
 obj-$(CONFIG_NETDEVSIM) += netdevsim.o
 
 netdevsim-objs := \
-   netdev.o dev.o ethtool.o fib.o bus.o health.o udp_tunnels.o
+   netdev.o dev.o ethtool.o fib.o bus.o health.o udp_tunnels.o \
+   port_function.o
 
 ifeq ($(CONFIG_BPF_SYSCALL),y)
 netdevsim-objs += \
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index 32f339fedb21..e3b81c8b5125 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -884,6 +884,8 @@ static const struct devlink_ops nsim_dev_devlink_ops = {
.trap_group_set = nsim_dev_devlink_trap_group_set,
.trap_policer_set = nsim_dev_devlink_trap_policer_set,
.trap_policer_counter_get = nsim_dev_devlink_trap_policer_counter_get,
+   .port_new = nsim_dev_devlink_port_new,
+   .port_del = nsim_dev_devlink_port_del,
 };
 
 #define NSIM_DEV_MAX_MACS_DEFAULT 32
@@ -1017,6 +1019,8 @@ static int nsim_dev_reload_create(struct nsim_dev 
*nsim_dev,
  nsim_dev->ddir,
  nsim_dev,
&nsim_dev_take_snapshot_fops);
+
+   nsim_dev_port_function_enable(nsim_dev);
return 0;
 
 err_health_exit:
@@ -1050,6 +1054,7 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
nsim_dev->max_macs = NSIM_DEV_MAX_MACS_DEFAULT;
nsim_dev->test1 = NSIM_DEV_TEST1_DEFAULT;
spin_lock_init(&nsim_dev->fa_cookie_lock);
+   nsim_dev_port_function_init(nsim_dev);
 
dev_set_drvdata(&nsim_bus_dev->dev, nsim_dev);
 
@@ -1097,6 +1102,7 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
if (err)
goto err_bpf_dev_exit;
 
+   nsim_dev_port_function_enable(nsim_dev);
devlink_params_publish(devlink);
devlink_reload_enable(devlink);
return 0;
@@ -1131,6 +1137,9 @@ static void nsim_dev_reload_destroy(struct nsim_dev 
*nsim_dev)
 
if (devlink_is_reload_failed(devlink))
return;
+
+   /* Disable and destroy any user created devlink ports */
+   nsim_dev_port_function_disable(nsim_dev);
debugfs_remove(nsim_dev->take_snapshot);
nsim_dev_port_del_all(nsim_dev);
nsim_dev_health_exit(nsim_dev);
@@ -1155,6 +1164,7 @@ void nsim_dev_remove(struct nsim_bus_dev *nsim_bus_dev)
  ARRAY_SIZE(nsim_devlink_params));
devlink_unregister(devlink);
devlink_resources_unregister(devlink, NULL);
+   nsim_dev_port_function_exit(nsim_dev);
devlink_free(devlink);
 }
 
diff --git a/drivers/net/netdevsim/netdevsim.h 
b/drivers/net/netdevsim/netdevsim.h
index 0c86561e6d8d..aec3c4d5fda7 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -213,6 +213,16 @@ struct nsim_dev {
bool ipv4_only;
u32 sleep;
} udp_ports;
+   struct {
+   refcount_t refcount; /* refcount along with disable_complete 
serializes
+ * port operations with port function 
disablement
+ * during driver unload.
+ */
+   struct completion disable_complete;
+   struct list_head head;
+   struct ida ida;
+   struct ida pfnum_ida;
+   } port_functions;
 };
 
 static inline struct net *nsim_dev_net(struct nsim_dev *nsim_dev)
@@ -283,3 +293,12 @@ struct nsim_bus_dev {
 
 int nsim_bus_init(void);
 void nsim_bus_exit(void);
+
+void nsim_dev_port_function_init(struct nsim_dev *nsim_dev);
+void nsim_dev_port_function_exit(struct nsim_dev *nsim_de

[PATCH net-next v2 6/8] netdevsim: Simulate get/set hardware address of a PCI port

Allow users to get/set hardware address for the PCI port.

The example below creates one devlink port, queries it, and sets a
hardware address.

Example of a PCI PF port which supports a port function hw_addr set:
Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device

$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0
$ devlink port show netdevsim/netdevsim10/10
netdevsim/netdevsim10/10: type eth netdev eni10npf0 flavour pcipf controller 0 
pfnum 0 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port function set netdevsim/netdevsim10/10 hw_addr 00:11:22:33:44:55

$ devlink port show netdevsim/netdevsim10/10 -jp
{
"port": {
"netdevsim/netdevsim10/10": {
"type": "eth",
"netdev": "eni10npf0",
"flavour": "pcipf",
"controller": 0,
"pfnum": 0,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:11:22:33:44:55"
}
}
}
}

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 drivers/net/netdevsim/dev.c   |  2 ++
 drivers/net/netdevsim/netdevsim.h |  6 
 drivers/net/netdevsim/port_function.c | 44 +++
 3 files changed, 52 insertions(+)
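
Note for readers: the get/set callbacks added below copy a fixed-size Ethernet address and reject any other length. A minimal userspace sketch of that pair follows. The `toy_` names are hypothetical, and since the set routine is truncated in this message, returning -EINVAL for a bad length is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN 6 /* same value as the kernel's ETH_ALEN */

struct toy_port_function {
	uint8_t hw_addr[TOY_ETH_ALEN];
};

/* Store a new address; only 6-byte Ethernet addresses are accepted. */
static int toy_hw_addr_set(struct toy_port_function *port,
			   const uint8_t *hw_addr, int hw_addr_len)
{
	if (hw_addr_len != TOY_ETH_ALEN)
		return -EINVAL; /* assumed error code */
	memcpy(port->hw_addr, hw_addr, TOY_ETH_ALEN);
	return 0;
}

/* Report the stored address and its length to the caller. */
static int toy_hw_addr_get(const struct toy_port_function *port,
			   uint8_t *hw_addr, int *hw_addr_len)
{
	memcpy(hw_addr, port->hw_addr, TOY_ETH_ALEN);
	*hw_addr_len = TOY_ETH_ALEN;
	return 0;
}
```

A freshly zeroed port reports 00:00:00:00:00:00, which matches the `devlink port show` output above before `hw_addr` is set.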

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index e3b81c8b5125..ef2e293f358b 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -886,6 +886,8 @@ static const struct devlink_ops nsim_dev_devlink_ops = {
.trap_policer_counter_get = nsim_dev_devlink_trap_policer_counter_get,
.port_new = nsim_dev_devlink_port_new,
.port_del = nsim_dev_devlink_port_del,
+   .port_function_hw_addr_get = nsim_dev_port_function_hw_addr_get,
+   .port_function_hw_addr_set = nsim_dev_port_function_hw_addr_set,
 };
 
 #define NSIM_DEV_MAX_MACS_DEFAULT 32
diff --git a/drivers/net/netdevsim/netdevsim.h 
b/drivers/net/netdevsim/netdevsim.h
index aec3c4d5fda7..8dc8f4e5dcd8 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -302,3 +302,9 @@ int nsim_dev_devlink_port_new(struct devlink *devlink, 
const struct devlink_port
  struct netlink_ext_ack *extack);
 int nsim_dev_devlink_port_del(struct devlink *devlink, unsigned int port_index,
  struct netlink_ext_ack *extack);
+int nsim_dev_port_function_hw_addr_get(struct devlink *devlink, struct 
devlink_port *port,
+  u8 *hw_addr, int *hw_addr_len,
+  struct netlink_ext_ack *extack);
+int nsim_dev_port_function_hw_addr_set(struct devlink *devlink, struct 
devlink_port *port,
+  const u8 *hw_addr, int hw_addr_len,
+  struct netlink_ext_ack *extack);
diff --git a/drivers/net/netdevsim/port_function.c 
b/drivers/net/netdevsim/port_function.c
index 4f3e9cc9489f..6feeeaf19ce8 100644
--- a/drivers/net/netdevsim/port_function.c
+++ b/drivers/net/netdevsim/port_function.c
@@ -15,6 +15,7 @@ struct nsim_port_function {
u32 controller;
u16 pfnum;
struct nsim_port_function *pf_port; /* Valid only for SF port */
+   u8 hw_addr[ETH_ALEN];
 };
 
 void nsim_dev_port_function_init(struct nsim_dev *nsim_dev)
@@ -335,3 +336,46 @@ void nsim_dev_port_function_disable(struct nsim_dev 
*nsim_dev)
nsim_devlink_port_function_free(nsim_dev, port);
}
 }
+
+static struct nsim_port_function *nsim_dev_to_port_function(struct nsim_dev 
*nsim_dev,
+   struct devlink_port 
*dl_port)
+{
+   if (nsim_dev_port_index_internal(nsim_dev, dl_port->index))
+   return ERR_PTR(-EOPNOTSUPP);
+   return container_of(dl_port, struct nsim_port_function, dl_port);
+}
+
+int nsim_dev_port_function_hw_addr_get(struct devlink *devlink, struct 
devlink_port *dl_port,
+  u8 *hw_addr, int *hw_addr_len,
+  struct netlink_ext_ack *extack)
+{
+   struct nsim_dev *nsim_dev = devlink_priv(devlink);
+   struct nsim_port_function *port;
+
+   port = nsim_dev_to_port_function(nsim_dev, dl_port);
+   if (IS_ERR(port))
+   return PTR_ERR(port);
+
+   memcpy(hw_addr, port->hw_addr, ETH_ALEN);
+   *hw_addr_len = ETH_ALEN;
+   return 0;
+}
+
+int nsim_dev_port_function_hw_addr_set(struct devlink *devlink, struct 
devlink_port *dl_port,
+  const u8 *hw_addr, int hw_addr_len,
+  struct netlink_ext_ack *extack)
+{
+   struct nsim_dev *nsim_dev = devlink_priv(devlink);
+   struct nsim_port_function *port;
+
+   if (hw_addr_len != ETH_ALEN) {
+   NL_SET_ERR_MSG

[PATCH net-next v2 4/8] devlink: Support get and set state of port function

A devlink port function can be in either the active or the inactive state.
Allow users to get and set the port function's state.

Example of a PCI SF port which supports a port function:
Create a device with ID=10 and one physical port.
$ echo "10 1" > /sys/bus/netdevsim/new_device
$ devlink port show
netdevsim/netdevsim10/0: type eth netdev eth0 flavour physical port 1 
splittable false

$ devlink port add netdevsim/netdevsim10/10 flavour pcipf pfnum 0
$ devlink port add netdevsim/netdevsim10/11 flavour pcisf pfnum 0 sfnum 44
$ devlink port show netdevsim/netdevsim10/11
netdevsim/netdevsim10/11: type eth netdev eni10npf0sf44 flavour pcisf 
controller 0 pfnum 0 sfnum 44 external false splittable false
  function:
hw_addr 00:00:00:00:00:00 state inactive

$ devlink port function set netdevsim/netdevsim10/11 hw_addr 00:11:22:33:44:55 
state active

$ devlink port show netdevsim/netdevsim10/11 -jp
{
"port": {
"netdevsim/netdevsim10/11": {
"type": "eth",
"netdev": "eni10npf0sf44",
"flavour": "pcisf",
"controller": 0,
"pfnum": 0,
"sfnum": 44,
"external": false,
"splittable": false,
"function": {
"hw_addr": "00:11:22:33:44:55",
"state": "active"
}
}
}
}

Signed-off-by: Parav Pandit 
Reviewed-by: Jiri Pirko 
---
 include/net/devlink.h| 20 ++
 include/uapi/linux/devlink.h |  6 +++
 net/core/devlink.c   | 77 +++-
 3 files changed, 101 insertions(+), 2 deletions(-)
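
Note for readers: the state attribute is a u8 restricted to two values, and the diff below adds a validity helper for it. The sketch that follows mirrors that helper in userspace C and adds a name lookup matching the `inactive`/`active` strings in the example output. The `toy_` names are hypothetical, and the name helper is an illustration that is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of enum devlink_port_function_state. */
enum toy_fn_state {
	TOY_FN_STATE_INACTIVE,
	TOY_FN_STATE_ACTIVE,
};

/* Accept only the two defined states, as the range policy does. */
static bool toy_fn_state_valid(unsigned int state)
{
	return state == TOY_FN_STATE_INACTIVE ||
	       state == TOY_FN_STATE_ACTIVE;
}

/* Map a state to the string shown by the tool; NULL if unknown. */
static const char *toy_fn_state_name(unsigned int state)
{
	switch (state) {
	case TOY_FN_STATE_INACTIVE:
		return "inactive";
	case TOY_FN_STATE_ACTIVE:
		return "active";
	default:
		return NULL;
	}
}
```

Validating the value coming back from the driver as well as the value coming in from netlink keeps an out-of-range driver answer from reaching user space.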

diff --git a/include/net/devlink.h b/include/net/devlink.h
index ebab2c0360d0..500c22835686 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1200,6 +1200,26 @@ struct devlink_ops {
int (*port_function_hw_addr_set)(struct devlink *devlink, struct 
devlink_port *port,
 const u8 *hw_addr, int hw_addr_len,
 struct netlink_ext_ack *extack);
+   /**
+* @port_function_state_get: Port function's state get function.
+*
+* Should be used by device drivers to report the state of a function 
managed
+* by the devlink port. Driver should return -EOPNOTSUPP if it doesn't 
support port
+* function handling for a particular port.
+*/
+   int (*port_function_state_get)(struct devlink *devlink, struct 
devlink_port *port,
+  enum devlink_port_function_state *state,
+  struct netlink_ext_ack *extack);
+   /**
+* @port_function_state_set: Port function's state set function.
+*
+* Should be used by device drivers to set the state of a function 
managed
+* by the devlink port. Driver should return -EOPNOTSUPP if it doesn't 
support port
+* function handling for a particular port.
+*/
+   int (*port_function_state_set)(struct devlink *devlink, struct 
devlink_port *port,
+  enum devlink_port_function_state state,
+  struct netlink_ext_ack *extack);
/**
 * @port_new: Port add function.
 *
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 09c41b9ce407..8e513f1cd638 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -518,9 +518,15 @@ enum devlink_resource_unit {
 enum devlink_port_function_attr {
DEVLINK_PORT_FUNCTION_ATTR_UNSPEC,
DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR, /* binary */
+   DEVLINK_PORT_FUNCTION_ATTR_STATE,   /* u8 */
 
__DEVLINK_PORT_FUNCTION_ATTR_MAX,
DEVLINK_PORT_FUNCTION_ATTR_MAX = __DEVLINK_PORT_FUNCTION_ATTR_MAX - 1
 };
 
+enum devlink_port_function_state {
+   DEVLINK_PORT_FUNCTION_STATE_INACTIVE,
+   DEVLINK_PORT_FUNCTION_STATE_ACTIVE,
+};
+
 #endif /* _UAPI_LINUX_DEVLINK_H_ */
diff --git a/net/core/devlink.c b/net/core/devlink.c
index d152489e48da..c82098cb75da 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -87,6 +87,9 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(devlink_hwerr);
 
 static const struct nla_policy 
devlink_function_nl_policy[DEVLINK_PORT_FUNCTION_ATTR_MAX + 1] = {
[DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR] = { .type = NLA_BINARY },
+   [DEVLINK_PORT_FUNCTION_ATTR_STATE] =
+   NLA_POLICY_RANGE(NLA_U8, DEVLINK_PORT_FUNCTION_STATE_INACTIVE,
+DEVLINK_PORT_FUNCTION_STATE_ACTIVE),
 };
 
 static LIST_HEAD(devlink_list);
@@ -595,6 +598,40 @@ devlink_port_function_hw_addr_fill(struct devlink 
*devlink, const struct devlink
return 0;
 }
 
+static bool devlink_port_function_state_valid(u8 state)
+{
+   return state == DEVLINK_PORT_FUNCTION_STATE_INACTIVE ||
+  state == DEVLINK_PORT_FUNCTION_STATE_ACTIVE;
+}
+
+static int devlink_port_function_state_fill(struct devlink *devlink, const 
str

[PATCH net-next v2 7/8] netdevsim: Simulate port function state for a PCI port