date:20201103

Re: [net-next V3 6/8] net: sched: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais



In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
  net/sched/sch_atm.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sched/sch_atm.c b/net/sched/sch_atm.c
index 1c281cc81f57..390d972bb2f0 100644
--- a/net/sched/sch_atm.c
+++ b/net/sched/sch_atm.c
@@ -466,10 +466,10 @@ drop: __maybe_unused
   * non-ATM interfaces.
   */

-static void sch_atm_dequeue(unsigned long data)
+static void sch_atm_dequeue(struct tasklet_struct *t)
  {
-   struct Qdisc *sch = (struct Qdisc *)data;
-   struct atm_qdisc_data *p = qdisc_priv(sch);
+   struct atm_qdisc_data *p = from_tasklet(p, t, task);
+   struct Qdisc *sch = (struct Qdisc *)((char *)p - sizeof(struct Qdisc));


Hmm... I think I would prefer not to bury implementation details in
net/sched/sch_atm.c, and instead define a helper in
include/net/pkt_sched.h:

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 4ed32e6b020145afb015c3c07d2ec3a613f1311d..15b1b30f454e4837cd1fc07bb3ff6b4f178b1d39 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -24,6 +24,11 @@ static inline void *qdisc_priv(struct Qdisc *q)
 return &q->privdata;
  }

+static inline struct Qdisc *qdisc_from_priv(void *priv)
+{
+   return container_of(priv, struct Qdisc, privdata);
+}
+
  /*
 Timer resolution MUST BE < 10% of min_schedulable_packet_size/bandwidth



Sure, I will have it updated and resent. Thanks.
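
For readers following along, the round trip between a qdisc and its
private area can be sketched in userspace as below. This is only an
illustration under stated assumptions: struct mock_qdisc and every
mock_* name are hypothetical stand-ins, not the kernel definitions, and
mock_container_of() is a simplified version of the kernel macro without
its type checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily reduced stand-in for struct Qdisc. */
struct mock_qdisc {
	unsigned int flags;
	unsigned int limit;
	/* The scheduler's private data follows the common part. */
	long privdata[];
};

/* Common part -> private area (same direction as qdisc_priv()). */
static inline void *mock_qdisc_priv(struct mock_qdisc *q)
{
	return q->privdata;
}

/* Private area -> common part (the direction the new helper adds). */
static inline struct mock_qdisc *mock_qdisc_from_priv(void *priv)
{
	assert(priv != NULL);
	return mock_container_of(priv, struct mock_qdisc, privdata);
}

/* Allocate a mock qdisc with priv_size bytes of private data. */
static struct mock_qdisc *mock_qdisc_alloc(size_t priv_size)
{
	return calloc(1, sizeof(struct mock_qdisc) + priv_size);
}
```

Because the offset comes from offsetof(), the sketch does not depend on
how the private area is padded or aligned, which is the property the
open-coded sizeof() arithmetic lacks.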

[PATCH net v2] net: openvswitch: silence suspicious RCU usage warning

2020-11-03 Thread Eelco Chaudron

Silence suspicious RCU usage warning in ovs_flow_tbl_masks_cache_resize()
by replacing rcu_dereference() with rcu_dereference_ovsl().

In addition, when creating a new datapath, make sure it's configured under
the ovs_lock.

Fixes: 9bf24f594c6a ("net: openvswitch: make masks cache size configurable")
Reported-by: syzbot+9a8f8bfcc56e85780...@syzkaller.appspotmail.com
Signed-off-by: Eelco Chaudron 
---
v2: - Moved local variable initialization above lock
- Renamed jump label to indicate unlocking

 net/openvswitch/datapath.c   |   14 +++---
 net/openvswitch/flow_table.c |2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 832f898edb6a..9d6ef6cb9b26 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1703,13 +1703,13 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
parms.port_no = OVSP_LOCAL;
parms.upcall_portids = a[OVS_DP_ATTR_UPCALL_PID];
 
-   err = ovs_dp_change(dp, a);
-   if (err)
-   goto err_destroy_meters;
-
/* So far only local changes have been made, now need the lock. */
ovs_lock();
 
+   err = ovs_dp_change(dp, a);
+   if (err)
+   goto err_unlock_and_destroy_meters;
+
vport = new_vport(&parms);
if (IS_ERR(vport)) {
err = PTR_ERR(vport);
@@ -1725,8 +1725,7 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
ovs_dp_reset_user_features(skb, info);
}
 
-   ovs_unlock();
-   goto err_destroy_meters;
+   goto err_unlock_and_destroy_meters;
}
 
err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
@@ -1741,7 +1740,8 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
ovs_notify(&dp_datapath_genl_family, reply, info);
return 0;
 
-err_destroy_meters:
+err_unlock_and_destroy_meters:
+   ovs_unlock();
ovs_meters_exit(dp);
 err_destroy_ports:
kfree(dp->ports);
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index f3486a37361a..c89c8da99f1a 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -390,7 +390,7 @@ static struct mask_cache *tbl_mask_cache_alloc(u32 size)
 }
 int ovs_flow_tbl_masks_cache_resize(struct flow_table *table, u32 size)
 {
-   struct mask_cache *mc = rcu_dereference(table->mask_cache);
+   struct mask_cache *mc = rcu_dereference_ovsl(table->mask_cache);
struct mask_cache *new;
 
if (size == mc->cache_size)

Re: [PATCH net] net: openvswitch: silence suspicious RCU usage warning

2020-11-03 Thread Eelco Chaudron




On 2 Nov 2020, at 20:51, Jakub Kicinski wrote:

> On Mon, 02 Nov 2020 09:52:19 +0100 Eelco Chaudron wrote:
>> On 30 Oct 2020, at 22:28, Jakub Kicinski wrote:
>>>> @@ -1695,6 +1695,9 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct genl_info *info)
>>>> 	if (err)
>>>> 		goto err_destroy_ports;
>>>>
>>>> +	/* So far only local changes have been made, now need the lock. */
>>>> +	ovs_lock();
>>>
>>> Should we move the lock below assignments to param?
>>>
>>> Looks a little strange to protect stack variables with a global lock.
>>
>> You are right, I should have moved it down after the assignment. I will
>> send out a v2.
>>
>>> Let's update the name of the label.
>>
>> Guess now it is, unlock and destroy meters, so what label are you
>> looking for?
>>
>> err_unlock_and_destroy_meters: which looks a bit long, or just
>> err_unlock:
>
> I feel like I saw some names like err_unlock_and_destroy_meters in OvS
> code, but can't find them in this file right now.
>
> I'd personally go for just err_unlock, or maybe err_unlock_ovs as is
> used in other functions in this file.
>
> But as long as it starts with err_unlock it's fine by me :)

Ack, sent out a v2.
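
The pattern under discussion (take the lock, then route every later
failure through one label that drops it) can be sketched in userspace
as follows. This is a minimal illustration, not the OvS code: the
mock_* functions are hypothetical, and a counter stands in for
ovs_lock()/ovs_unlock() so that lock balance can be checked.

```c
#include <errno.h>

static int mock_lock_depth;	/* 0 means "not held" */

static void mock_lock(void)   { mock_lock_depth++; }
static void mock_unlock(void) { mock_lock_depth--; }

/* Hypothetical steps that may fail once the lock is held. */
static int mock_change(int fail)    { return fail ? -EINVAL : 0; }
static int mock_new_vport(int fail) { return fail ? -ENOMEM : 0; }

static int mock_dp_new(int fail_change, int fail_vport)
{
	int err;

	/* Only local (stack) setup happens before this point. */
	mock_lock();

	err = mock_change(fail_change);
	if (err)
		goto err_unlock;

	err = mock_new_vport(fail_vport);
	if (err)
		goto err_unlock;

	mock_unlock();
	return 0;

err_unlock:
	/* Single exit for every failure that happens under the lock. */
	mock_unlock();
	/* Further teardown (meters, ports, ...) would follow here. */
	return err;
}
```

Whatever the label ends up being called, starting it with err_unlock
tells the reader that jumping there is only valid while the lock is
held.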

Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support

2020-11-03 Thread Jiri Benc

On Mon, 2 Nov 2020 22:58:06 -0800, Andrii Nakryiko wrote:
> But I don't think I got a real answer as to what's the exact reason
> against the submodule. Like what "inappropriate" even means in this
> case? Jesper's security argument so far was the only objective
> criteria, as far as I can tell.

It's the fundamental objection. Distributions in general have the "no
bundled libraries" policy. It is sometimes annoying but it helps to
understand that the policy is not a whim of distros, it's coming from
years of experience with package maintenance for security and stability.

> But I also see that using libbpf through submodule gives iproute2
> exact control over which version of libbpf is being used. And that
> does not depend at all on any specific Linux distribution, its
> version, LTS vs non-LTS, etc. iproute2 will just work the same across
> all of them. So matches your stated goals very directly and
> explicitly.

If you take this route, the end result would be all dependencies for
all projects being included as submodules and bundled. At first
sight, this sounds easier for the developers. Why bother with dynamic
linking at all? Everything can be linked statically.

The result would be a nightmare for both distros and users. No timely
security updates possible, critical bugs not being fixed in some
programs, etc. There is enough experience with this kind of setup to
conclude it is not the right way to go.

Yes, dynamic linking is initially more work for developers of both apps
and libraries. However, it pays off over time: there's no need to keep
track of security and other important fixes in the dependencies; that
comes for free from the distro work.

Btw, taking the bundling to the extreme, every app could bundle its own
well tested and compatible kernel version and be run in a VM. This
might sound far fetched but there were actual attempts to do that. It
didn't take off; I think part of the reason was that the Linux kernel
is very good at keeping its APIs stable.

And I'm convinced this is the way to go for libraries, too: put an
emphasis on API stability. Make it easy to get consumed and updated
under the hood. Everybody wins this way.

 Jiri

Re: [PATCH 41/41] realtek: rtw88: pci: Add prototypes for .probe, .remove and .shutdown

2020-11-03 Thread Lee Jones

On Mon, 02 Nov 2020, Brian Norris wrote:

> On Mon, Nov 2, 2020 at 3:25 AM Lee Jones  wrote:
> > --- a/drivers/net/wireless/realtek/rtw88/pci.h
> > +++ b/drivers/net/wireless/realtek/rtw88/pci.h
> > @@ -212,6 +212,10 @@ struct rtw_pci {
> > void __iomem *mmap;
> >  };
> >
> > +int rtw_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id);
> > +void rtw_pci_remove(struct pci_dev *pdev);
> > +void rtw_pci_shutdown(struct pci_dev *pdev);
> > +
> >
> 
> These definitions are already in 4 other header files:
> 
> drivers/net/wireless/realtek/rtw88/rtw8723de.h
> drivers/net/wireless/realtek/rtw88/rtw8821ce.h
> drivers/net/wireless/realtek/rtw88/rtw8822be.h
> drivers/net/wireless/realtek/rtw88/rtw8822ce.h
> 
> Seems like you should be moving them, not just adding yet another duplicate.

I followed the current convention.

Happy to optimise if that's what is required.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [PATCHv3 iproute2-next 0/5] iproute2: add libbpf support

2020-11-03 Thread Daniel Borkmann


On 11/3/20 7:58 AM, Andrii Nakryiko wrote:

On Mon, Nov 2, 2020 at 7:47 AM David Ahern  wrote:

On 10/29/20 9:11 AM, Hangbin Liu wrote:

This series converts iproute2 to use libbpf for loading and attaching
BPF programs when it is available. This means that iproute2 will
correctly process BTF information and support the new-style BTF-defined
maps, while keeping compatibility with the old internal map definition
syntax.

This is achieved by checking for libbpf at './configure' time, and using
it if available. By default the system libbpf will be used, but static
linking against a custom libbpf version can be achieved by passing
LIBBPF_DIR to configure. FORCE_LIBBPF can be set to force configure to
abort if no suitable libbpf is found (useful for automatic packaging
that wants to enforce the dependency).

The old iproute2 bpf code is kept and will be used if no suitable libbpf
is available. When using libbpf, wrapper code ensures that iproute2 will
still understand the old map definition format, including populating
map-in-map and tail call maps before load.

The examples in bpf/examples are kept, and a separate set of examples
are added with BTF-based map definitions for those examples where this
is possible (libbpf doesn't currently support declaratively populating
tail call maps).

Finally, thanks a lot to Toke for his help on this patch set.


In regards to comments from v2 of the series:

iproute2 is a stable, production package that requires minimal support
from external libraries. The external packages it does require are also
stable with few to no relevant changes.

bpf and libbpf on the other hand are under active development and
rapidly changing month over month. The git submodule approach has its
conveniences for rapid development but is inappropriate for a package
like iproute2 and will not be considered.


I thought last time this discussion came up there was consensus that the
submodule could be an explicit opt in for the configure script at least?

[PATCH v8 0/4] Add support for mv88e6393x family of Marvell

2020-11-03 Thread Pavana Sharma

Updated patchset with following changes.
- Add kerneldoc for 5GBASER phy interface
- Remove lane param initialization wherever it is not needed.

Pavana Sharma (4):
  dt-bindings: net: Add 5GBASER phy interface mode
  net: phy: Add 5GBASER interface mode
  net: dsa: mv88e6xxx: Change serdes lane parameter from u8 type to int
  net: dsa: mv88e6xxx: Add support for mv88e6393x family of Marvell

 .../bindings/net/ethernet-controller.yaml |   2 +
 drivers/net/dsa/mv88e6xxx/chip.c  | 164 +-
 drivers/net/dsa/mv88e6xxx/chip.h  |  20 +-
 drivers/net/dsa/mv88e6xxx/global1.h   |   2 +
 drivers/net/dsa/mv88e6xxx/global2.h   |   8 +
 drivers/net/dsa/mv88e6xxx/port.c  | 240 +-
 drivers/net/dsa/mv88e6xxx/port.h  |  43 ++-
 drivers/net/dsa/mv88e6xxx/serdes.c| 295 +++---
 drivers/net/dsa/mv88e6xxx/serdes.h|  91 --
 include/linux/phy.h   |   5 +
 10 files changed, 781 insertions(+), 89 deletions(-)

-- 
2.17.1

[PATCH v8 1/4] dt-bindings: net: Add 5GBASER phy interface mode

2020-11-03 Thread Pavana Sharma

Add 5gbase-r PHY interface mode.

Signed-off-by: Pavana Sharma 
---
 Documentation/devicetree/bindings/net/ethernet-controller.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/ethernet-controller.yaml b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
index fdf709817218..aa6ae7851de9 100644
--- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml
+++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
@@ -89,6 +89,8 @@ properties:
   - trgmii
   - 1000base-x
   - 2500base-x
+  # 5GBASE-R
+  - 5gbase-r
   - rxaui
   - xaui
 
-- 
2.17.1

[PATCH v8 3/4] net: dsa: mv88e6xxx: Change serdes lane parameter from u8 type to int

2020-11-03 Thread Pavana Sharma

Returning 0 is no longer an error case for the MV88E6393 family,
which has serdes lane numbers 0, 9 or 10.
With this change, .serdes_get_lane returns either the lane number
or an error (-ENODEV).

Signed-off-by: Pavana Sharma 
---
 drivers/net/dsa/mv88e6xxx/chip.c   | 28 +--
 drivers/net/dsa/mv88e6xxx/chip.h   | 16 +++
 drivers/net/dsa/mv88e6xxx/port.c   |  6 +--
 drivers/net/dsa/mv88e6xxx/serdes.c | 74 +++---
 drivers/net/dsa/mv88e6xxx/serdes.h | 50 ++--
 5 files changed, 87 insertions(+), 87 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index bd297ae7cf9e..d32731a7c658 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -485,12 +485,12 @@ static int mv88e6xxx_serdes_pcs_get_state(struct dsa_switch *ds, int port,
  struct phylink_link_state *state)
 {
struct mv88e6xxx_chip *chip = ds->priv;
-   u8 lane;
+   int lane;
int err;
 
mv88e6xxx_reg_lock(chip);
lane = mv88e6xxx_serdes_get_lane(chip, port);
-   if (lane && chip->info->ops->serdes_pcs_get_state)
+   if ((lane >= 0) && chip->info->ops->serdes_pcs_get_state)
err = chip->info->ops->serdes_pcs_get_state(chip, port, lane,
state);
else
@@ -506,11 +506,11 @@ static int mv88e6xxx_serdes_pcs_config(struct mv88e6xxx_chip *chip, int port,
   const unsigned long *advertise)
 {
const struct mv88e6xxx_ops *ops = chip->info->ops;
-   u8 lane;
+   int lane;
 
if (ops->serdes_pcs_config) {
lane = mv88e6xxx_serdes_get_lane(chip, port);
-   if (lane)
+   if (lane >= 0)
return ops->serdes_pcs_config(chip, port, lane, mode,
  interface, advertise);
}
@@ -522,15 +522,15 @@ static void mv88e6xxx_serdes_pcs_an_restart(struct dsa_switch *ds, int port)
 {
struct mv88e6xxx_chip *chip = ds->priv;
const struct mv88e6xxx_ops *ops;
+   int lane;
int err = 0;
-   u8 lane;
 
ops = chip->info->ops;
 
if (ops->serdes_pcs_an_restart) {
mv88e6xxx_reg_lock(chip);
lane = mv88e6xxx_serdes_get_lane(chip, port);
-   if (lane)
+   if (lane >= 0)
err = ops->serdes_pcs_an_restart(chip, port, lane);
mv88e6xxx_reg_unlock(chip);
 
@@ -544,11 +544,11 @@ static int mv88e6xxx_serdes_pcs_link_up(struct mv88e6xxx_chip *chip, int port,
int speed, int duplex)
 {
const struct mv88e6xxx_ops *ops = chip->info->ops;
-   u8 lane;
+   int lane;
 
if (!phylink_autoneg_inband(mode) && ops->serdes_pcs_link_up) {
lane = mv88e6xxx_serdes_get_lane(chip, port);
-   if (lane)
+   if (lane >= 0)
return ops->serdes_pcs_link_up(chip, port, lane,
   speed, duplex);
}
@@ -2422,11 +2422,11 @@ static irqreturn_t mv88e6xxx_serdes_irq_thread_fn(int irq, void *dev_id)
struct mv88e6xxx_chip *chip = mvp->chip;
irqreturn_t ret = IRQ_NONE;
int port = mvp->port;
-   u8 lane;
+   int lane;
 
mv88e6xxx_reg_lock(chip);
lane = mv88e6xxx_serdes_get_lane(chip, port);
-   if (lane)
+   if (lane >= 0)
ret = mv88e6xxx_serdes_irq_status(chip, port, lane);
mv88e6xxx_reg_unlock(chip);
 
@@ -2434,7 +2434,7 @@ static irqreturn_t mv88e6xxx_serdes_irq_thread_fn(int irq, void *dev_id)
 }
 
 static int mv88e6xxx_serdes_irq_request(struct mv88e6xxx_chip *chip, int port,
-   u8 lane)
+   int lane)
 {
struct mv88e6xxx_port *dev_id = &chip->ports[port];
unsigned int irq;
@@ -2463,7 +2463,7 @@ static int mv88e6xxx_serdes_irq_request(struct mv88e6xxx_chip *chip, int port,
 }
 
 static int mv88e6xxx_serdes_irq_free(struct mv88e6xxx_chip *chip, int port,
-u8 lane)
+int lane)
 {
struct mv88e6xxx_port *dev_id = &chip->ports[port];
unsigned int irq = dev_id->serdes_irq;
@@ -2488,11 +2488,11 @@ static int mv88e6xxx_serdes_irq_free(struct mv88e6xxx_chip *chip, int port,
 static int mv88e6xxx_serdes_power(struct mv88e6xxx_chip *chip, int port,
  bool on)
 {
-   u8 lane;
+   int lane;
int err;
 
lane = mv88e6xxx_serdes_get_lane(chip, port);
-   if (!lane)
+   if (lane < 0)
return 0;
 
if (on) {
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index 81c244fc0419..d81f586d67e8 100644
--- a/dri
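
The reason for moving from u8 to int can be sketched in plain C. With
an unsigned lane, 0 doubled as "no lane", which stops working once lane
0 is valid; with a signed return, a negative errno is the only failure
value. The functions below are hypothetical illustrations, and the
assumption that ports 0, 9 and 10 map to lanes of the same number is
made only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical lookup: only ports 0, 9 and 10 have a serdes lane. */
static int mock_serdes_get_lane(int port)
{
	switch (port) {
	case 0:
	case 9:
	case 10:
		return port;	/* lane 0 is a valid answer here */
	default:
		return -ENODEV;	/* no serdes on this port */
	}
}

/*
 * Caller pattern after the conversion: test "lane < 0", not "!lane".
 * Returns 1 if a lane was powered, 0 if the port has no serdes.
 */
static int mock_serdes_power(int port, int *powered_lane)
{
	int lane;

	assert(powered_lane != NULL);

	lane = mock_serdes_get_lane(port);
	if (lane < 0)
		return 0;

	*powered_lane = lane;
	return 1;
}
```

With the old "if (!lane) return 0;" check, the port 0 case in this
sketch would have been silently skipped.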

[PATCH v8 2/4] net: phy: Add 5GBASER interface mode

2020-11-03 Thread Pavana Sharma

Add 5GBASE-R phy interface mode

Signed-off-by: Pavana Sharma 
---
 include/linux/phy.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index eb3cb1a98b45..71e280059ec5 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -106,6 +106,7 @@ extern const int phy_10gbit_features_array[1];
  * @PHY_INTERFACE_MODE_TRGMII: Turbo RGMII
  * @PHY_INTERFACE_MODE_1000BASEX: 1000 BaseX
  * @PHY_INTERFACE_MODE_2500BASEX: 2500 BaseX
+ * @PHY_INTERFACE_MODE_5GBASER: 5G BaseR
  * @PHY_INTERFACE_MODE_RXAUI: Reduced XAUI
  * @PHY_INTERFACE_MODE_XAUI: 10 Gigabit Attachment Unit Interface
  * @PHY_INTERFACE_MODE_10GBASER: 10G BaseR
@@ -137,6 +138,8 @@ typedef enum {
PHY_INTERFACE_MODE_TRGMII,
PHY_INTERFACE_MODE_1000BASEX,
PHY_INTERFACE_MODE_2500BASEX,
+   /* 5GBASE-R mode */
+   PHY_INTERFACE_MODE_5GBASER,
PHY_INTERFACE_MODE_RXAUI,
PHY_INTERFACE_MODE_XAUI,
/* 10GBASE-R, XFI, SFI - single lane 10G Serdes */
@@ -215,6 +218,8 @@ static inline const char *phy_modes(phy_interface_t interface)
return "1000base-x";
case PHY_INTERFACE_MODE_2500BASEX:
return "2500base-x";
+   case PHY_INTERFACE_MODE_5GBASER:
+   return "5gbase-r";
case PHY_INTERFACE_MODE_RXAUI:
return "rxaui";
case PHY_INTERFACE_MODE_XAUI:
-- 
2.17.1
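
As a rough illustration of why the enum value, the kerneldoc entry and
the string are added together, the sketch below mimics the
enum-to-string mapping in userspace and adds a reverse lookup. All
mock_* names are hypothetical, the enum is a reduced subset, and the
reverse lookup is an addition for this example rather than something
the patch touches.

```c
#include <assert.h>
#include <string.h>

/* Reduced, hypothetical subset of the interface mode enum. */
enum mock_phy_mode {
	MOCK_MODE_NA,
	MOCK_MODE_1000BASEX,
	MOCK_MODE_2500BASEX,
	MOCK_MODE_5GBASER,
	MOCK_MODE_RXAUI,
	MOCK_MODE_MAX,
};

/* Enum -> device tree string, one case per mode. */
static const char *mock_phy_modes(enum mock_phy_mode mode)
{
	switch (mode) {
	case MOCK_MODE_NA:
		return "";
	case MOCK_MODE_1000BASEX:
		return "1000base-x";
	case MOCK_MODE_2500BASEX:
		return "2500base-x";
	case MOCK_MODE_5GBASER:
		return "5gbase-r";
	case MOCK_MODE_RXAUI:
		return "rxaui";
	default:
		return "unknown";
	}
}

/* String -> enum, by walking the table above. */
static enum mock_phy_mode mock_phy_mode_from_string(const char *name)
{
	int i;

	assert(name != NULL);

	for (i = MOCK_MODE_NA + 1; i < MOCK_MODE_MAX; i++)
		if (strcmp(name, mock_phy_modes((enum mock_phy_mode)i)) == 0)
			return (enum mock_phy_mode)i;

	return MOCK_MODE_NA;
}
```

A mode that is missing from the switch falls through to "unknown" and
can never be matched by name, which is why the string case has to
accompany the new enum entry.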

Re: [PATCH 05/41] rtl8192cu: trx: Demote clear abuse of kernel-doc format

2020-11-03 Thread Lee Jones

On Mon, 02 Nov 2020, Larry Finger wrote:

> On 11/2/20 5:23 AM, Lee Jones wrote:
> > Fixes the following W=1 kernel build warning(s):
> > 
> >   drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c:455: warning: Function parameter or member 'txdesc' not described in '_rtl_tx_desc_checksum'
> > 
> > Cc: Ping-Ke Shih 
> > Cc: Kalle Valo 
> > Cc: "David S. Miller" 
> > Cc: Jakub Kicinski 
> > Cc: Larry Finger 
> > Cc: linux-wirel...@vger.kernel.org
> > Cc: netdev@vger.kernel.org
> > Signed-off-by: Lee Jones 
> > ---
> >   drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c
> > index 1ad0cf37f60bb..87f959d5d861d 100644
> > --- a/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c
> > +++ b/drivers/net/wireless/realtek/rtlwifi/rtl8192cu/trx.c
> > @@ -448,7 +448,7 @@ static void _rtl_fill_usb_tx_desc(__le32 *txdesc)
> > set_tx_desc_first_seg(txdesc, 1);
> >   }
> > -/**
> > +/*
> >*For HW recovery information
> >*/
> >   static void _rtl_tx_desc_checksum(__le32 *txdesc)
> > 
> 
> Did you check this patch with checkpatch.pl?

Yes.

> I think you substituted one
> warning for another. The wireless-testing trees previously did not accept a
> bare "/*", which is why "/**" was present.

I don't see a problem.

$ git format-patch -n1 --stdout 8cd8b929e0458 | ./scripts/checkpatch.pl 
total: 0 errors, 0 warnings, 0 checks, 8 lines checked

"[PATCH 1/1] rtl8192cu: trx: Demote clear abuse of kernel-doc format"
  has no obvious style problems and is ready for submission.

> This particular instance should have
> /* For HW recovery information */
> as the comment.


[PATCH v8 4/4] net: dsa: mv88e6xxx: Add support for mv88e6393x family of Marvell

2020-11-03 Thread Pavana Sharma

The Marvell 88E6393X device is a single-chip integration of an 11-port
Ethernet switch with eight integrated Gigabit Ethernet (GbE) transceivers
and three 10-Gigabit interfaces.

This patch adds functionality specific to the mv88e6393x family (88E6393X,
88E6193X and 88E6191X).

Co-developed-by: Ashkan Boldaji 
Signed-off-by: Ashkan Boldaji 
Signed-off-by: Pavana Sharma 
---
Changes in v2:
  - Fix a warning (Reported-by: kernel test robot )
Changes in v3:
  - Fix 'unused function' warning
Changes in v4-v8:
  - Incorporated feedback from maintainers.
---
 drivers/net/dsa/mv88e6xxx/chip.c| 136 
 drivers/net/dsa/mv88e6xxx/chip.h|   4 +
 drivers/net/dsa/mv88e6xxx/global1.h |   2 +
 drivers/net/dsa/mv88e6xxx/global2.h |   8 +
 drivers/net/dsa/mv88e6xxx/port.c| 234 
 drivers/net/dsa/mv88e6xxx/port.h|  43 -
 drivers/net/dsa/mv88e6xxx/serdes.c  | 225 +-
 drivers/net/dsa/mv88e6xxx/serdes.h  |  41 -
 8 files changed, 689 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index d32731a7c658..bfcbe70affa3 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -635,6 +635,24 @@ static void mv88e6390x_phylink_validate(struct mv88e6xxx_chip *chip, int port,
mv88e6390_phylink_validate(chip, port, mask, state);
 }
 
+static void mv88e6393x_phylink_validate(struct mv88e6xxx_chip *chip, int port,
+   unsigned long *mask,
+   struct phylink_link_state *state)
+{
+   if (port == 0 || port == 9 || port == 10) {
+   phylink_set(mask, 10000baseT_Full);
+   phylink_set(mask, 10000baseKR_Full);
+   phylink_set(mask, 5000baseT_Full);
+   phylink_set(mask, 2500baseX_Full);
+   phylink_set(mask, 2500baseT_Full);
+   }
+
+   phylink_set(mask, 1000baseT_Full);
+   phylink_set(mask, 1000baseX_Full);
+
+   mv88e6065_phylink_validate(chip, port, mask, state);
+}
+
 static void mv88e6xxx_validate(struct dsa_switch *ds, int port,
   unsigned long *supported,
   struct phylink_link_state *state)
@@ -3906,6 +3924,55 @@ static const struct mv88e6xxx_ops mv88e6191_ops = {
.phylink_validate = mv88e6390_phylink_validate,
 };
 
+static const struct mv88e6xxx_ops mv88e6393x_ops = {
+   /* MV88E6XXX_FAMILY_6393 */
+   .setup_errata = mv88e6393x_setup_errata,
+   .irl_init_all = mv88e6390_g2_irl_init_all,
+   .get_eeprom = mv88e6xxx_g2_get_eeprom8,
+   .set_eeprom = mv88e6xxx_g2_set_eeprom8,
+   .set_switch_mac = mv88e6xxx_g2_set_switch_mac,
+   .phy_read = mv88e6xxx_g2_smi_phy_read,
+   .phy_write = mv88e6xxx_g2_smi_phy_write,
+   .port_set_link = mv88e6xxx_port_set_link,
+   .port_set_speed_duplex = mv88e6393x_port_set_speed_duplex,
+   .port_set_rgmii_delay = mv88e6390_port_set_rgmii_delay,
+   .port_tag_remap = mv88e6390_port_tag_remap,
+   .port_set_frame_mode = mv88e6351_port_set_frame_mode,
+   .port_set_egress_floods = mv88e6352_port_set_egress_floods,
+   .port_set_ether_type = mv88e6393x_port_set_ether_type,
+   .port_set_jumbo_size = mv88e6165_port_set_jumbo_size,
+   .port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
+   .port_pause_limit = mv88e6390_port_pause_limit,
+   .port_set_cmode = mv88e6393x_port_set_cmode,
+   .port_disable_learn_limit = mv88e6xxx_port_disable_learn_limit,
+   .port_disable_pri_override = mv88e6xxx_port_disable_pri_override,
+   .port_get_cmode = mv88e6352_port_get_cmode,
+   .stats_snapshot = mv88e6390_g1_stats_snapshot,
+   .stats_set_histogram = mv88e6390_g1_stats_set_histogram,
+   .stats_get_sset_count = mv88e6320_stats_get_sset_count,
+   .stats_get_strings = mv88e6320_stats_get_strings,
+   .stats_get_stats = mv88e6390_stats_get_stats,
+   .set_cpu_port = mv88e6393x_port_set_cpu_dest,
+   .set_egress_port = mv88e6393x_set_egress_port,
+   .watchdog_ops = &mv88e6390_watchdog_ops,
+   .mgmt_rsvd2cpu = mv88e6393x_port_mgmt_rsvd2cpu,
+   .pot_clear = mv88e6xxx_g2_pot_clear,
+   .reset = mv88e6352_g1_reset,
+   .rmu_disable = mv88e6390_g1_rmu_disable,
+   .vtu_getnext = mv88e6390_g1_vtu_getnext,
+   .vtu_loadpurge = mv88e6390_g1_vtu_loadpurge,
+   .serdes_power = mv88e6393x_serdes_power,
+   .serdes_get_lane = mv88e6393x_serdes_get_lane,
+   .serdes_pcs_get_state = mv88e6390_serdes_pcs_get_state,
+   .serdes_irq_mapping = mv88e6390_serdes_irq_mapping,
+   .serdes_irq_enable = mv88e6393x_serdes_irq_enable,
+   .serdes_irq_status = mv88e6393x_serdes_irq_status,
+   .gpio_ops = &mv88e6352_gpio_ops,
+   .avb_ops = &mv88e6390_avb_ops,
+   .ptp_ops = &mv88e6352_ptp_ops,
+   .phylink_validate = mv88e6393x_phylink_validate,
+
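
The per-port mode selection in the validate hook above can be sketched
with a plain bitmask. This is an illustration only: the mock_* names
and the reduced enum are hypothetical, and a single unsigned long
replaces the kernel's link mode bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical set of link modes for the example. */
enum mock_link_mode {
	MOCK_1000baseT_Full,
	MOCK_1000baseX_Full,
	MOCK_2500baseX_Full,
	MOCK_2500baseT_Full,
	MOCK_5000baseT_Full,
	MOCK_10000baseT_Full,
	MOCK_MODE_COUNT,
};

static void mock_set(unsigned long *mask, enum mock_link_mode m)
{
	assert(mask != NULL);
	*mask |= 1UL << m;
}

static bool mock_test(unsigned long mask, enum mock_link_mode m)
{
	return (mask & (1UL << m)) != 0;
}

/* Ports 0, 9 and 10 get the multi-gigabit modes, every port gets 1G. */
static unsigned long mock_6393x_modes(int port)
{
	unsigned long mask = 0;

	if (port == 0 || port == 9 || port == 10) {
		mock_set(&mask, MOCK_10000baseT_Full);
		mock_set(&mask, MOCK_5000baseT_Full);
		mock_set(&mask, MOCK_2500baseX_Full);
		mock_set(&mask, MOCK_2500baseT_Full);
	}

	mock_set(&mask, MOCK_1000baseT_Full);
	mock_set(&mask, MOCK_1000baseX_Full);

	return mask;
}
```

In this sketch, port 9 reports the 10G mode while port 3 reports only
the two 1G modes.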

Re: [PATCH] vhost/vsock: add IOTLB API support

2020-11-03 Thread Jason Wang




On 2020/11/3 1:11 AM, Stefano Garzarella wrote:

On Fri, Oct 30, 2020 at 07:44:43PM +0800, Jason Wang wrote:


On 2020/10/30 6:54 PM, Stefano Garzarella wrote:

On Fri, Oct 30, 2020 at 06:02:18PM +0800, Jason Wang wrote:


On 2020/10/30 1:43 AM, Stefano Garzarella wrote:

This patch enables the IOTLB API support for vhost-vsock devices,
allowing the userspace to emulate an IOMMU for the guest.

These changes were made following vhost-net. In detail, this patch:
- exposes VIRTIO_F_ACCESS_PLATFORM feature and inits the iotlb
  device if the feature is acked
- implements VHOST_GET_BACKEND_FEATURES and
  VHOST_SET_BACKEND_FEATURES ioctls
- calls vq_meta_prefetch() before vq processing to prefetch vq
  metadata address in IOTLB
- provides .read_iter, .write_iter, and .poll callbacks for the
  chardev; they are used by the userspace to exchange IOTLB messages

This patch was tested with QEMU and a patch applied [1] to fix a
simple issue:
    $ qemu -M q35,accel=kvm,kernel-irqchip=split \
   -drive file=fedora.qcow2,format=qcow2,if=virtio \
   -device intel-iommu,intremap=on \
   -device vhost-vsock-pci,guest-cid=3,iommu_platform=on



Patch looks good, but a question:

It looks to me like you don't enable ATS, which means vhost won't get
any invalidation requests. Or did I miss anything?




You're right, I didn't see invalidation requests, only miss and 
updates.
Now I have tried to enable 'ats' and 'device-iotlb' but I still 
don't see any invalidation.


How can I test it? (Sorry but I don't have much experience yet with 
vIOMMU)



I guess it's because of the batched unmap. Maybe you can try to use
"intel_iommu=strict" in the guest kernel command line to see if it works.


Btw, make sure QEMU contains patch [1]. Otherwise ATS won't
be enabled for recent Linux kernels in the guest.


The problem was my kernel, it was built with a tiny configuration.
Using fedora stock kernel I can see the 'invalidate' requests, but I 
also had the following issues.


Do they ring any bells?

$ ./qemu -m 4G -smp 4 -M q35,accel=kvm,kernel-irqchip=split \
    -drive file=fedora.qcow2,format=qcow2,if=virtio \
    -device intel-iommu,intremap=on,device-iotlb=on \
    -device vhost-vsock-pci,guest-cid=6,iommu_platform=on,ats=on,id=v1

    qemu-system-x86_64: vtd_iova_to_slpte: detected IOVA overflow (iova=0x1d4030c0)



It's a hint that the IOVA exceeds the AW. It might be worth checking
whether the missed IOVA reported from the IOTLB is legal.


Thanks
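
As a rough illustration of what "the IOVA exceeds the AW" means, the
check can be written as below. This is a sketch under assumptions: the
function name is hypothetical, it is not the QEMU implementation, and
the address width used by the setup in this thread is not stated, so
the widths in the usage note are examples only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * An IOVA fits an address width of aw_bits if no bit at or above
 * position aw_bits is set. A width of 64 or more accepts everything.
 */
static bool mock_iova_within_aw(uint64_t iova, unsigned int aw_bits)
{
	if (aw_bits >= 64)
		return true;

	return (iova >> aw_bits) == 0;
}
```

For example, with a 39-bit width, any address at or above 1ULL << 39 is
rejected, while the same address is accepted with a 48-bit width.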


    qemu-system-x86_64: vtd_iommu_translate: detected translation failure (dev=00:03:00, iova=0x1d4030c0)
    qemu-system-x86_64: New fault is not recorded due to compression of faults


Guest kernel messages:
    [   44.940872] DMAR: DRHD: handling fault status reg 2
    [   44.941989] DMAR: [DMA Read] Request device [00:03.0] PASID     
 fault addr 88W

    [   49.785884] DMAR: DRHD: handling fault status reg 2
    [   49.788874] DMAR: [DMA Read] Request device [00:03.0] PASID     
 fault addr 88W



QEMU: b149dea55c Merge remote-tracking branch 
'remotes/cschoenebeck/tags/pull-9p-20201102' into staging


Linux guest: 5.8.16-200.fc32.x86_64


Thanks,
Stefano

Re: [PATCH v9 2/2] net: Add mhi-net driver

2020-11-03 Thread Loic Poulain

Hi Jakub,

On Mon, 2 Nov 2020 at 23:40, Jakub Kicinski  wrote:
>
> On Fri, 30 Oct 2020 11:48:15 +0100 Loic Poulain wrote:
> > This patch adds a new network driver implementing MHI transport for
> > network packets. Packets can be in any format, though QMAP (rmnet)
> > is the usual protocol (flow control + PDN mux).
> >
> > It supports two MHI devices: IP_HW0, which is the path to the IPA
> > (IP accelerator) on qcom modems, and IP_SW0, which is the
> > software-driven IP path (to the modem CPU).
> >
> > Signed-off-by: Loic Poulain 
> > Reviewed-by: Manivannan Sadhasivam 
>
> > +static int mhi_ndo_stop(struct net_device *ndev)
> > +{
> > + struct mhi_net_dev *mhi_netdev = netdev_priv(ndev);
> > +
> > + netif_stop_queue(ndev);
> > + netif_carrier_off(ndev);
> > + cancel_delayed_work_sync(&mhi_netdev->rx_refill);
>
> Where do you free the allocated skbs? Does
> mhi_unprepare_from_transfer() do that?

When a buffer is queued, it is owned by the device until the transfer
callback (ul_cb/dl_cb) is called. mhi_unprepare_from_transfer() causes
the MHI channels to be reset, which in turn releases the buffers: for
each buffer, the MHI core calls the mhi-net transfer callback with
-ENOTCONN status, and we free it there.

>
> The skbs should be freed somehow in .ndo_stop().

The skbs are released in remove() (mhi_unprepare_from_transfer). I do
not do prepare/unprepare in ndo_open/ndo_stop because we need to have
the channels started during the whole life of the interface. That's
because starting them sets up a kind of internal routing on the
device/modem side. Indeed, if the channels are not started, configuring
the modem (via out-of-band QMI, AT commands, etc.) is not possible.
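
The ownership flow described above can be sketched in userspace as
follows. This is a simplified model, not the MHI core: struct mock_chan
and all mock_* functions are hypothetical, and malloc()/free() stand in
for skb allocation and release.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MOCK_QUEUE_SZ 4

struct mock_chan;
typedef void (*mock_xfer_cb)(struct mock_chan *c, void *buf, int status);

struct mock_chan {
	void *bufs[MOCK_QUEUE_SZ];	/* buffers currently owned by the device */
	int queued;
	int freed_on_reset;
	mock_xfer_cb xfer_cb;
};

/* Hand a freshly allocated buffer to the (mock) device. */
static int mock_queue_buf(struct mock_chan *c, size_t len)
{
	void *buf;

	if (c->queued == MOCK_QUEUE_SZ)
		return -EAGAIN;

	buf = malloc(len);
	if (!buf)
		return -ENOMEM;

	c->bufs[c->queued++] = buf;
	return 0;
}

/* Transfer callback: ownership returns to the driver, which frees. */
static void mock_dl_callback(struct mock_chan *c, void *buf, int status)
{
	if (status == -ENOTCONN)
		c->freed_on_reset++;	/* channel reset: do not refill */

	/* On success a real driver would pass the data up the stack. */
	free(buf);
}

/* Channel reset: every queued buffer is completed with -ENOTCONN. */
static void mock_unprepare_from_transfer(struct mock_chan *c)
{
	assert(c->xfer_cb != NULL);

	while (c->queued > 0) {
		void *buf = c->bufs[--c->queued];

		c->xfer_cb(c, buf, -ENOTCONN);
	}
}
```

In this model nothing is freed at stop time. Buffers stay with the
device until the channel is reset, which mirrors the remove() path
described above.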

>
> > + return 0;
> > +}
> > +
> > +static int mhi_ndo_xmit(struct sk_buff *skb, struct net_device *ndev)
> > +{
> > + struct mhi_net_dev *mhi_netdev = netdev_priv(ndev);
> > + struct mhi_device *mdev = mhi_netdev->mdev;
> > + int err;
> > +
> > + err = mhi_queue_skb(mdev, DMA_TO_DEVICE, skb, skb->len, MHI_EOT);
> > + if (unlikely(err)) {
> > + net_err_ratelimited("%s: Failed to queue TX buf (%d)\n",
> > + ndev->name, err);
> > +
> > + u64_stats_update_begin(&mhi_netdev->stats.tx_syncp);
> > + u64_stats_inc(&mhi_netdev->stats.tx_dropped);
> > + u64_stats_update_end(&mhi_netdev->stats.tx_syncp);
> > +
> > + /* drop the packet */
> > + kfree_skb(skb);
>
> dev_kfree_skb_any()
>
> > + }
> > +
> > + if (mhi_queue_is_full(mdev, DMA_TO_DEVICE))
> > + netif_stop_queue(ndev);
> > +
> > + return NETDEV_TX_OK;
> > +}
>
> > +static void mhi_net_dl_callback(struct mhi_device *mhi_dev,
> > + struct mhi_result *mhi_res)
> > +{
> > + struct mhi_net_dev *mhi_netdev = dev_get_drvdata(&mhi_dev->dev);
> > + struct sk_buff *skb = mhi_res->buf_addr;
> > + int remaining;
> > +
> > + remaining = atomic_dec_return(&mhi_netdev->stats.rx_queued);
> > +
> > + if (unlikely(mhi_res->transaction_status)) {
> > + u64_stats_update_begin(&mhi_netdev->stats.rx_syncp);
> > + u64_stats_inc(&mhi_netdev->stats.rx_errors);
> > + u64_stats_update_end(&mhi_netdev->stats.rx_syncp);
> > +
> > + kfree_skb(skb);
>
> Are you sure this never runs with irqs disabled or from irq context?
>
> Otherwise dev_kfree_skb_any().

Yes, will fix that.

>
> > +
> > + /* MHI layer resetting the DL channel */
> > + if (mhi_res->transaction_status == -ENOTCONN)
> > + return;
> > + } else {
> > + u64_stats_update_begin(&mhi_netdev->stats.rx_syncp);
> > + u64_stats_inc(&mhi_netdev->stats.rx_packets);
> > + u64_stats_add(&mhi_netdev->stats.rx_bytes, mhi_res->bytes_xferd);
> > + u64_stats_update_end(&mhi_netdev->stats.rx_syncp);
> > +
> > + skb->protocol = htons(ETH_P_MAP);
> > + skb_put(skb, mhi_res->bytes_xferd);
> > + netif_rx(skb);
> > + }
> > +
> > + /* Refill if RX buffers queue becomes low */
> > + if (remaining <= mhi_netdev->rx_queue_sz / 2)
> > + schedule_delayed_work(&mhi_netdev->rx_refill, 0);
> > +}
> > +
> > +static void mhi_net_ul_callback(struct mhi_device *mhi_dev,
> > + struct mhi_result *mhi_res)
> > +{
> > + struct mhi_net_dev *mhi_netdev = dev_get_drvdata(&mhi_dev->dev);
> > + struct net_device *ndev = mhi_netdev->ndev;
> > + struct sk_buff *skb = mhi_res->buf_addr;
> > +
> > + /* Hardware has consumed the buffer, so free the skb (which is not
> > +  * freed by the MHI stack) and perform accounting.
> > +  */
> > + consume_skb(skb);
>
> ditto
>
> > + u64_stats_update_begin(&mhi_netdev->stats.tx_syncp);
> > + if (unlikely(mhi_res->transaction_status)) {
> > + u64_stats_inc(&mhi_netdev->stats.tx_errors);

[net-next v4 0/8] net: convert tasklets to use new tasklet_setup API

2020-11-03 Thread Allen Pais

From: Allen Pais 

Commit 12cc923f1ccc ("tasklet: Introduce new initialization API")
introduced a new tasklet initialization API. This series converts
all the net/* drivers to use the new tasklet_setup() API.

This series is based on net-next (9faebeb2d).

v3:
  introduce qdisc_from_priv(), suggested by Eric Dumazet.
v2:
  get rid of QDISC_ALIGN()
v1:
  fix kerneldoc

Allen Pais (8):
  net: dccp: convert tasklets to use new tasklet_setup() API
  net: ipv4: convert tasklets to use new tasklet_setup() API
  net: mac80211: convert tasklets to use new tasklet_setup() API
  net: mac802154: convert tasklets to use new tasklet_setup() API
  net: rds: convert tasklets to use new tasklet_setup() API
  net: sched: convert tasklets to use new tasklet_setup() API
  net: smc: convert tasklets to use new tasklet_setup() API
  net: xfrm: convert tasklets to use new tasklet_setup() API

 include/net/pkt_sched.h|  5 +
 net/dccp/timer.c   | 12 ++--
 net/ipv4/tcp_output.c  |  8 +++-
 net/mac80211/ieee80211_i.h |  4 ++--
 net/mac80211/main.c| 14 +-
 net/mac80211/tx.c  |  5 +++--
 net/mac80211/util.c|  5 +++--
 net/mac802154/main.c   |  8 +++-
 net/rds/ib_cm.c| 14 ++
 net/sched/sch_atm.c|  8 
 net/smc/smc_cdc.c  |  6 +++---
 net/smc/smc_wr.c   | 14 ++
 net/xfrm/xfrm_input.c  |  7 +++
 13 files changed, 52 insertions(+), 58 deletions(-)

-- 
2.25.1
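
For readers new to the conversion, the pattern applied throughout the series can be sketched in userspace as follows. This is only a minimal illustration: struct tasklet_struct and tasklet_setup() below are simplified stand-ins rather than the kernel definitions, and demo_dev, demo_handler() and demo_run_once() are hypothetical names. The from_tasklet() macro is written as a container_of() wrapper, which is how the kernel helper works.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel structure. */
struct tasklet_struct {
	void (*callback)(struct tasklet_struct *t);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing object from the embedded tasklet (GNU C typeof). */
#define from_tasklet(var, t, field) \
	container_of(t, __typeof__(*var), field)

/* Stand-in for tasklet_setup(): no opaque data argument is stored. */
static void tasklet_setup(struct tasklet_struct *t,
			  void (*callback)(struct tasklet_struct *))
{
	t->callback = callback;
}

/* Hypothetical driver state that embeds its tasklet. */
struct demo_dev {
	int handled;
	struct tasklet_struct task;
};

static void demo_handler(struct tasklet_struct *t)
{
	/* Replaces the old "(struct demo_dev *)data" cast. */
	struct demo_dev *dev = from_tasklet(dev, t, task);

	dev->handled++;
}

/* Invokes the callback once, as the softirq core would. */
static int demo_run_once(void)
{
	struct demo_dev dev = { .handled = 0 };

	tasklet_setup(&dev.task, demo_handler);
	assert(dev.task.callback == demo_handler);
	dev.task.callback(&dev.task);
	return dev.handled;
}
```

The callback only receives the tasklet pointer, so the owning structure has to embed the tasklet for from_tasklet() to work. That is why each patch names the tasklet member explicitly.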

[net-next v4 1/8] net: dccp: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
 net/dccp/timer.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/dccp/timer.c b/net/dccp/timer.c
index a934d2932373..db768f223ef7 100644
--- a/net/dccp/timer.c
+++ b/net/dccp/timer.c
@@ -215,13 +215,14 @@ static void dccp_delack_timer(struct timer_list *t)
 
 /**
  * dccp_write_xmitlet  -  Workhorse for CCID packet dequeueing interface
- * @data: Socket to act on
+ * @t: pointer to the tasklet associated with this handler
  *
  * See the comments above %ccid_dequeueing_decision for supported modes.
  */
-static void dccp_write_xmitlet(unsigned long data)
+static void dccp_write_xmitlet(struct tasklet_struct *t)
 {
-   struct sock *sk = (struct sock *)data;
+   struct dccp_sock *dp = from_tasklet(dp, t, dccps_xmitlet);
+   struct sock *sk = &dp->dccps_inet_connection.icsk_inet.sk;
 
bh_lock_sock(sk);
if (sock_owned_by_user(sk))
@@ -235,16 +236,15 @@ static void dccp_write_xmitlet(unsigned long data)
 static void dccp_write_xmit_timer(struct timer_list *t)
 {
struct dccp_sock *dp = from_timer(dp, t, dccps_xmit_timer);
-   struct sock *sk = &dp->dccps_inet_connection.icsk_inet.sk;
 
-   dccp_write_xmitlet((unsigned long)sk);
+   dccp_write_xmitlet(&dp->dccps_xmitlet);
 }
 
 void dccp_init_xmit_timers(struct sock *sk)
 {
struct dccp_sock *dp = dccp_sk(sk);
 
-   tasklet_init(&dp->dccps_xmitlet, dccp_write_xmitlet, (unsigned long)sk);
+   tasklet_setup(&dp->dccps_xmitlet, dccp_write_xmitlet);
timer_setup(&dp->dccps_xmit_timer, dccp_write_xmit_timer, 0);
inet_csk_init_xmit_timers(sk, &dccp_write_timer, &dccp_delack_timer,
  &dccp_keepalive_timer);
-- 
2.25.1

[net-next v4 8/8] net: xfrm: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
 net/xfrm/xfrm_input.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 37456d022cfa..be6351e3f3cd 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -760,9 +760,9 @@ int xfrm_input_resume(struct sk_buff *skb, int nexthdr)
 }
 EXPORT_SYMBOL(xfrm_input_resume);
 
-static void xfrm_trans_reinject(unsigned long data)
+static void xfrm_trans_reinject(struct tasklet_struct *t)
 {
-   struct xfrm_trans_tasklet *trans = (void *)data;
+   struct xfrm_trans_tasklet *trans = from_tasklet(trans, t, tasklet);
struct sk_buff_head queue;
struct sk_buff *skb;
 
@@ -818,7 +818,6 @@ void __init xfrm_input_init(void)
 
trans = &per_cpu(xfrm_trans_tasklet, i);
__skb_queue_head_init(&trans->queue);
-   tasklet_init(&trans->tasklet, xfrm_trans_reinject,
-(unsigned long)trans);
+   tasklet_setup(&trans->tasklet, xfrm_trans_reinject);
}
 }
-- 
2.25.1

[net-next v4 6/8] net: sched: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
 include/net/pkt_sched.h | 5 +
 net/sched/sch_atm.c | 8 
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 4ed32e6b0201..15b1b30f454e 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -24,6 +24,11 @@ static inline void *qdisc_priv(struct Qdisc *q)
return &q->privdata;
 }
 
+static inline struct Qdisc *qdisc_from_priv(void *priv)
+{
+   return container_of(priv, struct Qdisc, privdata);
+}
+
 /* 
Timer resolution MUST BE < 10% of min_schedulable_packet_size/bandwidth

diff --git a/net/sched/sch_atm.c b/net/sched/sch_atm.c
index 1c281cc81f57..007bd2d9f1ff 100644
--- a/net/sched/sch_atm.c
+++ b/net/sched/sch_atm.c
@@ -466,10 +466,10 @@ drop: __maybe_unused
  * non-ATM interfaces.
  */
 
-static void sch_atm_dequeue(unsigned long data)
+static void sch_atm_dequeue(struct tasklet_struct *t)
 {
-   struct Qdisc *sch = (struct Qdisc *)data;
-   struct atm_qdisc_data *p = qdisc_priv(sch);
+   struct atm_qdisc_data *p = from_tasklet(p, t, task);
+   struct Qdisc *sch = qdisc_from_priv(p);
struct atm_flow_data *flow;
struct sk_buff *skb;
 
@@ -563,7 +563,7 @@ static int atm_tc_init(struct Qdisc *sch, struct nlattr *opt,
if (err)
return err;
 
-   tasklet_init(&p->task, sch_atm_dequeue, (unsigned long)sch);
+   tasklet_setup(&p->task, sch_atm_dequeue);
return 0;
 }
 
-- 
2.25.1
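
The relationship between qdisc_priv() and the new qdisc_from_priv() helper can be checked with a small userspace sketch. It assumes a heavily simplified struct Qdisc (the real one has many more fields), and demo_priv and demo_round_trip() are hypothetical names standing in for a qdisc's private data and its user.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in: only the fields needed for the illustration. */
struct Qdisc {
	unsigned int handle;
	long privdata[];	/* private area follows the header */
};

static inline void *qdisc_priv(struct Qdisc *q)
{
	return &q->privdata;
}

static inline struct Qdisc *qdisc_from_priv(void *priv)
{
	return container_of(priv, struct Qdisc, privdata);
}

/* Hypothetical private data, standing in for struct atm_qdisc_data. */
struct demo_priv {
	int flows;
};

/* Returns 1 if qdisc_from_priv() inverts qdisc_priv(), -1 on allocation failure. */
static int demo_round_trip(void)
{
	struct Qdisc *sch = malloc(sizeof(*sch) + sizeof(struct demo_priv));
	struct demo_priv *p;
	int ok;

	if (!sch)
		return -1;

	p = qdisc_priv(sch);
	/* The private area starts at the privdata member. */
	assert((char *)p == (char *)sch + offsetof(struct Qdisc, privdata));
	ok = (qdisc_from_priv(p) == sch);
	free(sch);
	return ok;
}
```

Using container_of() keeps the layout knowledge in one header instead of open-coding pointer arithmetic in sch_atm.c.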

[net-next v4 7/8] net: smc: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
 net/smc/smc_cdc.c |  6 +++---
 net/smc/smc_wr.c  | 14 ++
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c
index b1ce6ccbfaec..f23f558054a7 100644
--- a/net/smc/smc_cdc.c
+++ b/net/smc/smc_cdc.c
@@ -389,9 +389,9 @@ static void smc_cdc_msg_recv(struct smc_sock *smc, struct smc_cdc_msg *cdc)
  * Context:
  * - tasklet context
  */
-static void smcd_cdc_rx_tsklet(unsigned long data)
+static void smcd_cdc_rx_tsklet(struct tasklet_struct *t)
 {
-   struct smc_connection *conn = (struct smc_connection *)data;
+   struct smc_connection *conn = from_tasklet(conn, t, rx_tsklet);
struct smcd_cdc_msg *data_cdc;
struct smcd_cdc_msg cdc;
struct smc_sock *smc;
@@ -411,7 +411,7 @@ static void smcd_cdc_rx_tsklet(unsigned long data)
  */
 void smcd_cdc_rx_init(struct smc_connection *conn)
 {
-   tasklet_init(&conn->rx_tsklet, smcd_cdc_rx_tsklet, (unsigned long)conn);
+   tasklet_setup(&conn->rx_tsklet, smcd_cdc_rx_tsklet);
 }
 
 /* init, exit, misc **/
diff --git a/net/smc/smc_wr.c b/net/smc/smc_wr.c
index 1e23cdd41eb1..cbc73a7e4d59 100644
--- a/net/smc/smc_wr.c
+++ b/net/smc/smc_wr.c
@@ -131,9 +131,9 @@ static inline void smc_wr_tx_process_cqe(struct ib_wc *wc)
wake_up(&link->wr_tx_wait);
 }
 
-static void smc_wr_tx_tasklet_fn(unsigned long data)
+static void smc_wr_tx_tasklet_fn(struct tasklet_struct *t)
 {
-   struct smc_ib_device *dev = (struct smc_ib_device *)data;
+   struct smc_ib_device *dev = from_tasklet(dev, t, send_tasklet);
struct ib_wc wc[SMC_WR_MAX_POLL_CQE];
int i = 0, rc;
int polled = 0;
@@ -435,9 +435,9 @@ static inline void smc_wr_rx_process_cqes(struct ib_wc wc[], int num)
}
 }
 
-static void smc_wr_rx_tasklet_fn(unsigned long data)
+static void smc_wr_rx_tasklet_fn(struct tasklet_struct *t)
 {
-   struct smc_ib_device *dev = (struct smc_ib_device *)data;
+   struct smc_ib_device *dev = from_tasklet(dev, t, recv_tasklet);
struct ib_wc wc[SMC_WR_MAX_POLL_CQE];
int polled = 0;
int rc;
@@ -698,10 +698,8 @@ void smc_wr_remove_dev(struct smc_ib_device *smcibdev)
 
 void smc_wr_add_dev(struct smc_ib_device *smcibdev)
 {
-   tasklet_init(&smcibdev->recv_tasklet, smc_wr_rx_tasklet_fn,
-(unsigned long)smcibdev);
-   tasklet_init(&smcibdev->send_tasklet, smc_wr_tx_tasklet_fn,
-(unsigned long)smcibdev);
+   tasklet_setup(&smcibdev->recv_tasklet, smc_wr_rx_tasklet_fn);
+   tasklet_setup(&smcibdev->send_tasklet, smc_wr_tx_tasklet_fn);
 }
 
 int smc_wr_create_link(struct smc_link *lnk)
-- 
2.25.1

[net-next v4 2/8] net: ipv4: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
 net/ipv4/tcp_output.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index bf48cd73e967..6e998d428ceb 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1038,9 +1038,9 @@ static void tcp_tsq_handler(struct sock *sk)
  * transferring tsq->head because tcp_wfree() might
  * interrupt us (non NAPI drivers)
  */
-static void tcp_tasklet_func(unsigned long data)
+static void tcp_tasklet_func(struct tasklet_struct *t)
 {
-   struct tsq_tasklet *tsq = (struct tsq_tasklet *)data;
+   struct tsq_tasklet *tsq = from_tasklet(tsq, t, tasklet);
LIST_HEAD(list);
unsigned long flags;
struct list_head *q, *n;
@@ -1125,9 +1125,7 @@ void __init tcp_tasklet_init(void)
struct tsq_tasklet *tsq = &per_cpu(tsq_tasklet, i);
 
INIT_LIST_HEAD(&tsq->head);
-   tasklet_init(&tsq->tasklet,
-tcp_tasklet_func,
-(unsigned long)tsq);
+   tasklet_setup(&tsq->tasklet, tcp_tasklet_func);
}
 }
 
-- 
2.25.1

[net-next v4 4/8] net: mac802154: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Acked-by: Stefan Schmidt 
Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
 net/mac802154/main.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/mac802154/main.c b/net/mac802154/main.c
index 06ea0f8bfd5c..520cedc594e1 100644
--- a/net/mac802154/main.c
+++ b/net/mac802154/main.c
@@ -20,9 +20,9 @@
 #include "ieee802154_i.h"
 #include "cfg.h"
 
-static void ieee802154_tasklet_handler(unsigned long data)
+static void ieee802154_tasklet_handler(struct tasklet_struct *t)
 {
-   struct ieee802154_local *local = (struct ieee802154_local *)data;
+   struct ieee802154_local *local = from_tasklet(local, t, tasklet);
struct sk_buff *skb;
 
while ((skb = skb_dequeue(&local->skb_queue))) {
@@ -91,9 +91,7 @@ ieee802154_alloc_hw(size_t priv_data_len, const struct ieee802154_ops *ops)
INIT_LIST_HEAD(&local->interfaces);
mutex_init(&local->iflist_mtx);
 
-   tasklet_init(&local->tasklet,
-ieee802154_tasklet_handler,
-(unsigned long)local);
+   tasklet_setup(&local->tasklet, ieee802154_tasklet_handler);
 
skb_queue_head_init(&local->skb_queue);
 
-- 
2.25.1

[net-next v4 5/8] net: rds: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
 net/rds/ib_cm.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index b36b60668b1d..d06398be4b80 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -314,9 +314,9 @@ static void poll_scq(struct rds_ib_connection *ic, struct ib_cq *cq,
}
 }
 
-static void rds_ib_tasklet_fn_send(unsigned long data)
+static void rds_ib_tasklet_fn_send(struct tasklet_struct *t)
 {
-   struct rds_ib_connection *ic = (struct rds_ib_connection *)data;
+   struct rds_ib_connection *ic = from_tasklet(ic, t, i_send_tasklet);
struct rds_connection *conn = ic->conn;
 
rds_ib_stats_inc(s_ib_tasklet_call);
@@ -354,9 +354,9 @@ static void poll_rcq(struct rds_ib_connection *ic, struct ib_cq *cq,
}
 }
 
-static void rds_ib_tasklet_fn_recv(unsigned long data)
+static void rds_ib_tasklet_fn_recv(struct tasklet_struct *t)
 {
-   struct rds_ib_connection *ic = (struct rds_ib_connection *)data;
+   struct rds_ib_connection *ic = from_tasklet(ic, t, i_recv_tasklet);
struct rds_connection *conn = ic->conn;
struct rds_ib_device *rds_ibdev = ic->rds_ibdev;
struct rds_ib_ack_state state;
@@ -1219,10 +1219,8 @@ int rds_ib_conn_alloc(struct rds_connection *conn, gfp_t gfp)
}
 
INIT_LIST_HEAD(&ic->ib_node);
-   tasklet_init(&ic->i_send_tasklet, rds_ib_tasklet_fn_send,
-(unsigned long)ic);
-   tasklet_init(&ic->i_recv_tasklet, rds_ib_tasklet_fn_recv,
-(unsigned long)ic);
+   tasklet_setup(&ic->i_send_tasklet, rds_ib_tasklet_fn_send);
+   tasklet_setup(&ic->i_recv_tasklet, rds_ib_tasklet_fn_recv);
mutex_init(&ic->i_recv_mutex);
 #ifndef KERNEL_HAS_ATOMIC64
spin_lock_init(&ic->i_ack_lock);
-- 
2.25.1

[net-next v4 3/8] net: mac80211: convert tasklets to use new tasklet_setup() API

2020-11-03 Thread Allen Pais

From: Allen Pais 

In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Reviewed-by: Johannes Berg 
Signed-off-by: Romain Perier 
Signed-off-by: Allen Pais 
---
 net/mac80211/ieee80211_i.h |  4 ++--
 net/mac80211/main.c| 14 +-
 net/mac80211/tx.c  |  5 +++--
 net/mac80211/util.c|  5 +++--
 4 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index 2a21226fb518..2a3b0ee65637 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -1795,7 +1795,7 @@ static inline bool ieee80211_sdata_running(struct ieee80211_sub_if_data *sdata)
 
 /* tx handling */
 void ieee80211_clear_tx_pending(struct ieee80211_local *local);
-void ieee80211_tx_pending(unsigned long data);
+void ieee80211_tx_pending(struct tasklet_struct *t);
 netdev_tx_t ieee80211_monitor_start_xmit(struct sk_buff *skb,
 struct net_device *dev);
 netdev_tx_t ieee80211_subif_start_xmit(struct sk_buff *skb,
@@ -2146,7 +2146,7 @@ void ieee80211_txq_remove_vlan(struct ieee80211_local *local,
   struct ieee80211_sub_if_data *sdata);
 void ieee80211_fill_txq_stats(struct cfg80211_txq_stats *txqstats,
  struct txq_info *txqi);
-void ieee80211_wake_txqs(unsigned long data);
+void ieee80211_wake_txqs(struct tasklet_struct *t);
 void ieee80211_send_auth(struct ieee80211_sub_if_data *sdata,
 u16 transaction, u16 auth_alg, u16 status,
 const u8 *extra, size_t extra_len, const u8 *bssid,
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 523380aed92e..48ab05186610 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -220,9 +220,9 @@ u32 ieee80211_reset_erp_info(struct ieee80211_sub_if_data *sdata)
   BSS_CHANGED_ERP_SLOT;
 }
 
-static void ieee80211_tasklet_handler(unsigned long data)
+static void ieee80211_tasklet_handler(struct tasklet_struct *t)
 {
-   struct ieee80211_local *local = (struct ieee80211_local *) data;
+   struct ieee80211_local *local = from_tasklet(local, t, tasklet);
struct sk_buff *skb;
 
while ((skb = skb_dequeue(&local->skb_queue)) ||
@@ -733,16 +733,12 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
skb_queue_head_init(&local->pending[i]);
atomic_set(&local->agg_queue_stop[i], 0);
}
-   tasklet_init(&local->tx_pending_tasklet, ieee80211_tx_pending,
-(unsigned long)local);
+   tasklet_setup(&local->tx_pending_tasklet, ieee80211_tx_pending);
 
if (ops->wake_tx_queue)
-   tasklet_init(&local->wake_txqs_tasklet, ieee80211_wake_txqs,
-(unsigned long)local);
+   tasklet_setup(&local->wake_txqs_tasklet, ieee80211_wake_txqs);
 
-   tasklet_init(&local->tasklet,
-ieee80211_tasklet_handler,
-(unsigned long) local);
+   tasklet_setup(&local->tasklet, ieee80211_tasklet_handler);
 
skb_queue_head_init(&local->skb_queue);
skb_queue_head_init(&local->skb_queue_unreliable);
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 8ba10a48ded4..a50c0edb1153 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -4406,9 +4406,10 @@ static bool ieee80211_tx_pending_skb(struct ieee80211_local *local,
 /*
  * Transmit all pending packets. Called from tasklet.
  */
-void ieee80211_tx_pending(unsigned long data)
+void ieee80211_tx_pending(struct tasklet_struct *t)
 {
-   struct ieee80211_local *local = (struct ieee80211_local *)data;
+   struct ieee80211_local *local = from_tasklet(local, t,
+tx_pending_tasklet);
unsigned long flags;
int i;
bool txok;
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 49342060490f..a25e47750ed9 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -386,9 +386,10 @@ _ieee80211_wake_txqs(struct ieee80211_local *local, unsigned long *flags)
rcu_read_unlock();
 }
 
-void ieee80211_wake_txqs(unsigned long data)
+void ieee80211_wake_txqs(struct tasklet_struct *t)
 {
-   struct ieee80211_local *local = (struct ieee80211_local *)data;
+   struct ieee80211_local *local = from_tasklet(local, t,
+wake_txqs_tasklet);
unsigned long flags;
 
spin_lock_irqsave(&local->queue_stop_reason_lock, flags);
-- 
2.25.1

Re: [PATCH net-next v2] net/usb/r8153_ecm: support ECM mode for RTL8153

2020-11-03 Thread Greg Kroah-Hartman

On Mon, Nov 02, 2020 at 11:47:18AM -0800, Jakub Kicinski wrote:
> On Mon, 2 Nov 2020 07:20:15 + Hayes Wang wrote:
> > Jakub Kicinski 
> > > Can you describe the use case in more detail?
> > > 
> > > AFAICT r8152 defines a match for the exact same device.
> > > Does it not mean that which driver is used will be somewhat random
> > > if both are built?  
> > 
> > I export rtl_get_version() from r8152. It returns a non-zero
> > value if r8152 can support this device. Both r8152 and r8153_ecm
> > check the return value of rtl_get_version() in probe().
> > Therefore, if rtl_get_version() returns a non-zero value, r8152
> > is used for the device in vendor mode. Otherwise, r8153_ecm
> > is used for the device in ECM mode.
> 
> Oh, I see, I missed that the rtl_get_version() checking is the inverse
> of r8152.
> 
> > > > +/* Define these values to match your device */
> > > > +#define VENDOR_ID_REALTEK  0x0bda
> > > > +#define VENDOR_ID_MICROSOFT0x045e
> > > > +#define VENDOR_ID_SAMSUNG  0x04e8
> > > > +#define VENDOR_ID_LENOVO   0x17ef
> > > > +#define VENDOR_ID_LINKSYS  0x13b1
> > > > +#define VENDOR_ID_NVIDIA   0x0955
> > > > +#define VENDOR_ID_TPLINK   0x2357  
> > > 
> > > $ git grep 0x2357 | grep -i tplink
> > > drivers/net/usb/cdc_ether.c:#define TPLINK_VENDOR_ID  0x2357
> > > drivers/net/usb/r8152.c:#define VENDOR_ID_TPLINK  0x2357
> > > drivers/usb/serial/option.c:#define TPLINK_VENDOR_ID  0x2357
> > > 
> > > $ git grep 0x17ef | grep -i lenovo
> > > drivers/hid/hid-ids.h:#define USB_VENDOR_ID_LENOVO0x17ef
> > > drivers/hid/wacom.h:#define USB_VENDOR_ID_LENOVO  0x17ef
> > > drivers/net/usb/cdc_ether.c:#define LENOVO_VENDOR_ID  0x17ef
> > > drivers/net/usb/r8152.c:#define VENDOR_ID_LENOVO  0x17ef
> > > 
> > > Time to consolidate those vendor id defines perhaps?  
> > 
> > It seems that there is no such header file which I could include
> > or add the new vendor IDs.
> 
> Please create one. (Adding Greg KH to the recipients, in case there is
> a reason that USB subsystem doesn't have a common vendor id header.)

There is a reason, it's a nightmare to maintain and handle merges for,
just don't do it.

Read the comments at the top of the pci_ids.h file if you are curious
why we don't even do this for PCI device ids anymore for the past 10+
years.

So no, please do not create such a common file, it is not needed or a
good idea.

thanks,

greg k-h

[PATCH bpf 2/2] libbpf: fix possible use after free in xsk_socket__delete

2020-11-03 Thread Magnus Karlsson

From: Magnus Karlsson 

Fix a possible use after free in xsk_socket__delete that will happen
if xsk_put_ctx() frees the ctx. To fix, save the umem reference taken
from the context and just use that instead.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Magnus Karlsson 
---
 tools/lib/bpf/xsk.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 504b7a8..9bc537d 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -892,6 +892,7 @@ void xsk_socket__delete(struct xsk_socket *xsk)
 {
size_t desc_sz = sizeof(struct xdp_desc);
struct xdp_mmap_offsets off;
+   struct xsk_umem *umem;
struct xsk_ctx *ctx;
int err;
 
@@ -899,6 +900,7 @@ void xsk_socket__delete(struct xsk_socket *xsk)
return;
 
ctx = xsk->ctx;
+   umem = ctx->umem;
if (ctx->prog_fd != -1) {
xsk_delete_bpf_maps(xsk);
close(ctx->prog_fd);
@@ -918,11 +920,11 @@ void xsk_socket__delete(struct xsk_socket *xsk)
 
xsk_put_ctx(ctx);
 
-   ctx->umem->refcount--;
+   umem->refcount--;
/* Do not close an fd that also has an associated umem connected
 * to it.
 */
-   if (xsk->fd != ctx->umem->fd)
+   if (xsk->fd != umem->fd)
close(xsk->fd);
free(xsk);
 }
-- 
2.7.4
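
The ordering problem fixed by this patch can be reproduced in isolation. The sketch below uses hypothetical, simplified types (demo_umem, demo_ctx) and a demo_put_ctx() that frees the context when the last reference is dropped. It is not the libbpf implementation, only an illustration of why the umem pointer has to be read before the put.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the libbpf structures. */
struct demo_umem {
	int refcount;
};

struct demo_ctx {
	struct demo_umem *umem;
	int refcount;
};

/* Drops one reference and frees the context if it was the last one. */
static void demo_put_ctx(struct demo_ctx *ctx)
{
	if (--ctx->refcount == 0)
		free(ctx);
}

/* Mirrors the fixed ordering: read ctx->umem before the put. */
static void demo_socket_delete(struct demo_ctx *ctx)
{
	struct demo_umem *umem = ctx->umem;

	demo_put_ctx(ctx);	/* ctx may be freed here */
	umem->refcount--;	/* safe: no access through ctx */
}

/* Deletes the only user of a context and reports the umem refcount. */
static int demo_delete_last_user(void)
{
	struct demo_umem umem = { .refcount = 1 };
	struct demo_ctx *ctx = malloc(sizeof(*ctx));

	assert(ctx != NULL);
	ctx->umem = &umem;
	ctx->refcount = 1;
	demo_socket_delete(ctx);
	return umem.refcount;
}
```

Writing ctx->umem->refcount-- after the put would read freed memory whenever the put drops the last reference, which is the case the patch guards against.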

[PATCH bpf 1/2] libbpf: fix null dereference in xsk_socket__delete

2020-11-03 Thread Magnus Karlsson

From: Magnus Karlsson 

Fix a possible null pointer dereference in xsk_socket__delete that
will occur if a null pointer is fed into the function.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Reported-by: Andrii Nakryiko 
Signed-off-by: Magnus Karlsson 
---
 tools/lib/bpf/xsk.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index e3c98c0..504b7a8 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -891,13 +891,14 @@ int xsk_umem__delete(struct xsk_umem *umem)
 void xsk_socket__delete(struct xsk_socket *xsk)
 {
size_t desc_sz = sizeof(struct xdp_desc);
-   struct xsk_ctx *ctx = xsk->ctx;
struct xdp_mmap_offsets off;
+   struct xsk_ctx *ctx;
int err;
 
if (!xsk)
return;
 
+   ctx = xsk->ctx;
if (ctx->prog_fd != -1) {
xsk_delete_bpf_maps(xsk);
close(ctx->prog_fd);
-- 
2.7.4
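
The bug class here is a dereference in a declaration initializer that runs before the NULL check. A minimal sketch of the corrected shape, using hypothetical demo_sock and demo_ctx types rather than the real libbpf ones, could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct xsk_ctx and struct xsk_socket. */
struct demo_ctx {
	int prog_fd;
};

struct demo_sock {
	struct demo_ctx *ctx;
};

/* Returns 0 if there was nothing to delete, 1 if teardown ran. */
static int demo_delete(struct demo_sock *xsk)
{
	struct demo_ctx *ctx;	/* declared, but not yet loaded from xsk */

	if (!xsk)
		return 0;

	ctx = xsk->ctx;		/* dereference only after the NULL check */
	assert(ctx != NULL);	/* assumption of this sketch: live sockets have a ctx */
	ctx->prog_fd = -1;
	return 1;
}
```

With the initializer form "struct demo_ctx *ctx = xsk->ctx;" the load would happen before the check, so passing NULL would crash instead of returning early.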

[PATCH bpf 0/2] libbpf: fix two bugs in xsk_socket__delete

2020-11-03 Thread Magnus Karlsson

This small series fixes two bugs in xsk_socket__delete. Details can be
found in the individual commit messages, but a brief summary follows:

Patch 1: fix null pointer dereference in xsk_socket__delete
Patch 2: fix possible use after free in xsk_socket__delete

This patch set has been applied against commit 7a078d2d1880
("libbpf, hashmap: Fix undefined behavior in hash_bits").

Thanks: Magnus

Magnus Karlsson (2):
  libbpf: fix null dereference in xsk_socket__delete
  libbpf: fix possible use after free in xsk_socket__delete

 tools/lib/bpf/xsk.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--
2.7.4

[PATCH net-next v3 2/2] net/usb/r8153_ecm: support ECM mode for RTL8153

2020-11-03 Thread Hayes Wang

Support ECM mode based on cdc_ether with the related mii
functions, when CONFIG_USB_RTL8152 is not set, or the
device is not supported by the r8152 driver.

Both r8152 and r8153_ecm check the return value of
rtl8152_get_version() in probe(). If rtl8152_get_version()
returns a non-zero value, r8152 is used for the device
in vendor mode. Otherwise, r8153_ecm is used for the
device in ECM mode.

Signed-off-by: Hayes Wang 
---
 drivers/net/usb/Makefile|   2 +-
 drivers/net/usb/r8152.c |  22 +
 drivers/net/usb/r8153_ecm.c | 162 
 include/linux/usb/r8152.h   |  30 +++
 4 files changed, 197 insertions(+), 19 deletions(-)
 create mode 100644 drivers/net/usb/r8153_ecm.c
 create mode 100644 include/linux/usb/r8152.h

diff --git a/drivers/net/usb/Makefile b/drivers/net/usb/Makefile
index 99fd12be2111..99381e6bea78 100644
--- a/drivers/net/usb/Makefile
+++ b/drivers/net/usb/Makefile
@@ -13,7 +13,7 @@ obj-$(CONFIG_USB_LAN78XX) += lan78xx.o
 obj-$(CONFIG_USB_NET_AX8817X)  += asix.o
 asix-y := asix_devices.o asix_common.o ax88172a.o
 obj-$(CONFIG_USB_NET_AX88179_178A)  += ax88179_178a.o
-obj-$(CONFIG_USB_NET_CDCETHER) += cdc_ether.o
+obj-$(CONFIG_USB_NET_CDCETHER) += cdc_ether.o r8153_ecm.o
 obj-$(CONFIG_USB_NET_CDC_EEM)  += cdc_eem.o
 obj-$(CONFIG_USB_NET_DM9601)   += dm9601.o
 obj-$(CONFIG_USB_NET_SR9700)   += sr9700.o
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index d8ae89aa470c..41b803729996 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -26,7 +26,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 /* Information for net-next */
 #define NETNEXT_VERSION"11"
@@ -654,18 +654,6 @@ enum rtl_register_content {
 
 #define INTR_LINK  0x0004
 
-#define RTL8152_REQT_READ  0xc0
-#define RTL8152_REQT_WRITE 0x40
-#define RTL8152_REQ_GET_REGS   0x05
-#define RTL8152_REQ_SET_REGS   0x05
-
-#define BYTE_EN_DWORD  0xff
-#define BYTE_EN_WORD   0x33
-#define BYTE_EN_BYTE   0x11
-#define BYTE_EN_SIX_BYTES  0x3f
-#define BYTE_EN_START_MASK 0x0f
-#define BYTE_EN_END_MASK   0xf0
-
 #define RTL8153_MAX_PACKET 9216 /* 9K */
 #define RTL8153_MAX_MTU(RTL8153_MAX_PACKET - VLAN_ETH_HLEN - \
 ETH_FCS_LEN)
@@ -693,9 +681,6 @@ enum rtl8152_flags {
 #define DEVICE_ID_THINKPAD_THUNDERBOLT3_DOCK_GEN2  0x3082
 #define DEVICE_ID_THINKPAD_USB_C_DOCK_GEN2 0xa387
 
-#define MCU_TYPE_PLA   0x0100
-#define MCU_TYPE_USB   0x
-
 struct tally_counter {
__le64  tx_packets;
__le64  rx_packets;
@@ -6607,7 +6592,7 @@ static int rtl_fw_init(struct r8152 *tp)
return 0;
 }
 
-static u8 rtl_get_version(struct usb_interface *intf)
+u8 rtl8152_get_version(struct usb_interface *intf)
 {
struct usb_device *udev = interface_to_usbdev(intf);
u32 ocp_data = 0;
@@ -6665,12 +6650,13 @@ static u8 rtl_get_version(struct usb_interface *intf)
 
return version;
 }
+EXPORT_SYMBOL_GPL(rtl8152_get_version);
 
 static int rtl8152_probe(struct usb_interface *intf,
 const struct usb_device_id *id)
 {
struct usb_device *udev = interface_to_usbdev(intf);
-   u8 version = rtl_get_version(intf);
+   u8 version = rtl8152_get_version(intf);
struct r8152 *tp;
struct net_device *netdev;
int ret;
diff --git a/drivers/net/usb/r8153_ecm.c b/drivers/net/usb/r8153_ecm.c
new file mode 100644
index ..13eba7a72633
--- /dev/null
+++ b/drivers/net/usb/r8153_ecm.c
@@ -0,0 +1,162 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define OCP_BASE   0xe86c
+
+static int pla_read_word(struct usbnet *dev, u16 index)
+{
+   u16 byen = BYTE_EN_WORD;
+   u8 shift = index & 2;
+   __le32 tmp;
+   int ret;
+
+   if (shift)
+   byen <<= shift;
+
+   index &= ~3;
+
+   ret = usbnet_read_cmd(dev, RTL8152_REQ_GET_REGS, RTL8152_REQT_READ, index,
+ MCU_TYPE_PLA | byen, &tmp, sizeof(tmp));
+   if (ret < 0)
+   goto out;
+
+   ret = __le32_to_cpu(tmp);
+   ret >>= (shift * 8);
+   ret &= 0x;
+
+out:
+   return ret;
+}
+
+static int pla_write_word(struct usbnet *dev, u16 index, u32 data)
+{
+   u32 mask = 0x;
+   u16 byen = BYTE_EN_WORD;
+   u8 shift = index & 2;
+   __le32 tmp;
+   int ret;
+
+   data &= mask;
+
+   if (shift) {
+   byen <<= shift;
+   mask <<= (shift * 8);
+   data <<= (shift * 8);
+   }
+
+   index &= ~3;
+
+   ret = usbnet_read_cmd(dev, RTL8152_REQ_GET_REGS, RTL8152_REQT_READ, index,
+ MCU_TYPE_PLA | byen, &tmp, sizeof(tmp));
+
+   if (ret < 0)
+   goto out;
+
+   data |
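
The probe split described in the commit message above can be modelled as two mutually exclusive checks on the same version value. This is a sketch under assumptions: demo_r8152_binds() and demo_r8153_ecm_binds() are hypothetical stand-ins for the two probe routines, and a version of 0 is taken to mean "not recognised by r8152", as the commit message implies.

```c
#include <assert.h>

/* Assumption: 0 is the "unknown version" value returned for unsupported chips. */
enum { DEMO_VER_UNKNOWN = 0 };

enum demo_driver { DEMO_R8152, DEMO_R8153_ECM };

/* r8152 (vendor mode) binds only when the version is recognised. */
static int demo_r8152_binds(unsigned char version)
{
	return version != DEMO_VER_UNKNOWN;
}

/* r8153_ecm (ECM mode) binds only when r8152 would decline. */
static int demo_r8153_ecm_binds(unsigned char version)
{
	return version == DEMO_VER_UNKNOWN;
}

static enum demo_driver demo_pick_driver(unsigned char version)
{
	/* Exactly one of the two probe routines accepts the device. */
	assert(demo_r8152_binds(version) != demo_r8153_ecm_binds(version));
	return demo_r8152_binds(version) ? DEMO_R8152 : DEMO_R8153_ECM;
}
```

Because the two checks are exact inverses, the driver choice does not depend on probe order when both modules are built.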

[PATCH net-next v3 1/2] include/linux/usb: new header file for the vendor ID of USB devices

2020-11-03 Thread Hayes Wang

Add a new header file usb_vendor_id.h to consolidate the definitions
of the USB vendor IDs which may be used by the cdc_ether and
r8152 drivers.

Signed-off-by: Hayes Wang 
---
 drivers/net/usb/cdc_ether.c   | 139 +-
 drivers/net/usb/r8152.c   |  48 +--
 include/linux/usb/usb_vendor_id.h |  51 +++
 3 files changed, 133 insertions(+), 105 deletions(-)
 create mode 100644 include/linux/usb/usb_vendor_id.h

diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c
index 8c1d61c2cbac..1f6d9b46883a 100644
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 #if IS_ENABLED(CONFIG_USB_NET_RNDIS_HOST)
@@ -540,22 +541,6 @@ static const struct driver_info wwan_info = {
 
 /*-*/
 
-#define HUAWEI_VENDOR_ID   0x12D1
-#define NOVATEL_VENDOR_ID  0x1410
-#define ZTE_VENDOR_ID  0x19D2
-#define DELL_VENDOR_ID 0x413C
-#define REALTEK_VENDOR_ID  0x0bda
-#define SAMSUNG_VENDOR_ID  0x04e8
-#define LENOVO_VENDOR_ID   0x17ef
-#define LINKSYS_VENDOR_ID  0x13b1
-#define NVIDIA_VENDOR_ID   0x0955
-#define HP_VENDOR_ID   0x03f0
-#define MICROSOFT_VENDOR_ID0x045e
-#define UBLOX_VENDOR_ID0x1546
-#define TPLINK_VENDOR_ID   0x2357
-#define AQUANTIA_VENDOR_ID 0x2eca
-#define ASIX_VENDOR_ID 0x0b95
-
 static const struct usb_device_id  products[] = {
 /* BLACKLIST !!
  *
@@ -661,49 +646,49 @@ static const struct usb_device_id products[] = {
 
 /* Novatel USB551L and MC551 - handled by qmi_wwan */
 {
-   USB_DEVICE_AND_INTERFACE_INFO(NOVATEL_VENDOR_ID, 0xB001, USB_CLASS_COMM,
-   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_NOVATEL, 0xB001, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
.driver_info = 0,
 },
 
 /* Novatel E362 - handled by qmi_wwan */
 {
-   USB_DEVICE_AND_INTERFACE_INFO(NOVATEL_VENDOR_ID, 0x9010, USB_CLASS_COMM,
-   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_NOVATEL, 0x9010, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
.driver_info = 0,
 },
 
 /* Dell Wireless 5800 (Novatel E362) - handled by qmi_wwan */
 {
-   USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x8195, USB_CLASS_COMM,
-   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_DELL, 0x8195, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
.driver_info = 0,
 },
 
 /* Dell Wireless 5800 (Novatel E362) - handled by qmi_wwan */
 {
-   USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x8196, USB_CLASS_COMM,
-   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_DELL, 0x8196, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
.driver_info = 0,
 },
 
 /* Dell Wireless 5804 (Novatel E371) - handled by qmi_wwan */
 {
-   USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, 0x819b, USB_CLASS_COMM,
-   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_DELL, 0x819b, 
USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_ETHERNET, 
USB_CDC_PROTO_NONE),
.driver_info = 0,
 },
 
 /* Novatel Expedite E371 - handled by qmi_wwan */
 {
-   USB_DEVICE_AND_INTERFACE_INFO(NOVATEL_VENDOR_ID, 0x9011, USB_CLASS_COMM,
-   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_NOVATEL, 0x9011, 
USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_ETHERNET, 
USB_CDC_PROTO_NONE),
.driver_info = 0,
 },
 
 /* HP lt2523 (Novatel E371) - handled by qmi_wwan */
 {
-   USB_DEVICE_AND_INTERFACE_INFO(HP_VENDOR_ID, 0x421d, USB_CLASS_COMM,
+   USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_HP, 0x421d, USB_CLASS_COMM,
  USB_CDC_SUBCLASS_ETHERNET, 
USB_CDC_PROTO_NONE),
.driver_info = 0,
 },
@@ -717,127 +702,127 @@ static const struct usb_device_id   products[] = {
 
 /* Huawei E1820 - handled by qmi_wwan */
 {
-   USB_DEVICE_INTERFACE_NUMBER(HUAWEI_VENDOR_ID, 0x14ac, 1),
+   USB_DEVICE_INTERFACE_NUMBER(USB_VENDOR_ID_HUAWEI, 0x14ac, 1),
.driver_info = 0,
 },
 
 /* Realtek RTL8152 Based USB 2.0 Ethernet Adapters */
 {
-   USB_DEVICE_AND_INTERFACE_INFO(REALTEK_VENDOR_ID, 0x8152, USB_CLASS_COMM,
-   USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+   USB_DEVICE_AND_INTER

[PATCH net-next v3 0/2] drivers/net/usb: support ECM mode for RTL8153

2020-11-03 Thread Hayes Wang

v3:
Move original patch to #2. And add a new patch #1 to consolidate vendor ID
of USB devices.

v2:
Add include/linux/usb/r8152.h to avoid the warning about
no previous prototype for rtl8152_get_version.

Hayes Wang (2):
  include/linux/usb: new header file for the vendor ID of USB devices
  net/usb/r8153_ecm: support ECM mode for RTL8153

 drivers/net/usb/Makefile  |   2 +-
 drivers/net/usb/cdc_ether.c   |  93 +++--
 drivers/net/usb/r8152.c   |  68 +
 drivers/net/usb/r8153_ecm.c   | 162 ++
 include/linux/usb/r8152.h |  30 ++
 include/linux/usb/usb_vendor_id.h |  51 ++
 6 files changed, 306 insertions(+), 100 deletions(-)
 create mode 100644 drivers/net/usb/r8153_ecm.c
 create mode 100644 include/linux/usb/r8152.h
 create mode 100644 include/linux/usb/usb_vendor_id.h

-- 
2.26.2

RE: [PATCH net-next v2] net/usb/r8153_ecm: support ECM mode for RTL8153

2020-11-03 Thread Hayes Wang

Greg Kroah-Hartman 
> Sent: Tuesday, November 3, 2020 5:33 PM
[...]
> There is a reason, it's a nightmare to maintain and handle merges for,
> just don't do it.
> 
> Read the comments at the top of the pci_ids.h file if you are curious
> why we don't even do this for PCI device ids anymore for the past 10+
> years.
> 
> So no, please do not create such a common file, it is not needed or a
> good idea.

Oops. I have sent it.

Re: [PATCH net-next v3 1/2] include/linux/usb: new header file for the vendor ID of USB devices

2020-11-03 Thread Greg KH

On Tue, Nov 03, 2020 at 05:46:37PM +0800, Hayes Wang wrote:
> diff --git a/include/linux/usb/usb_vendor_id.h 
> b/include/linux/usb/usb_vendor_id.h
> new file mode 100644
> index ..23b6e6849515
> --- /dev/null
> +++ b/include/linux/usb/usb_vendor_id.h
> @@ -0,0 +1,51 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +

No, this is not ok, sorry.  Please see the top of the pci_ids.h file why
we do not do this.

There is nothing wrong with putting the individual ids in the different
drivers, we don't want one single huge file that is a pain for merges
and builds.  We learn from our past mistakes, please do not fail to
learn from history :)

thanks,

greg k-h

Re: [PATCH v7 0/6] CTU CAN FD open-source IP core SocketCAN driver, PCI, platform integration and documentation

2020-11-03 Thread Pavel Pisa

Hello Marc,

thanks for response

On Saturday 31 of October 2020 12:35:11 Marc Kleine-Budde wrote:
> On 10/30/20 11:19 PM, Pavel Pisa wrote:
> > This driver adds support for the CTU CAN FD open-source IP core.
>
> Please fix the following checkpatch warnings/errors:

Yes, I will recheck with the current checkpatch. I used the one from 5.4,
and something may have been overlooked during the last updates.

> -
> drivers/net/can/ctucanfd/ctucanfd_frame.h
> -
> CHECK: Please don't use multiple blank lines
> #46: FILE: drivers/net/can/ctucanfd/ctucanfd_frame.h:46:

OK, we find a reason for this blank line in header generator.

> CHECK: Prefer kernel type 'u32' over 'uint32_t'
> #49: FILE: drivers/net/can/ctucanfd/ctucanfd_frame.h:49:
> + uint32_t u32;

In this case, please confirm that your personal opinion is also
against uint32_t in headers and that you request the change.

uint32_t is used in many kernel headers, and in this case it
allows our tooling to use the headers to cross-check that the HDL
design matches the HW access in C.

If the reasons to remove uint32_t prevail, we need to
separate the Linux generator from the one used for other
purposes. Once we add a Linux mode, we can revamp the
headers even more, and in that case we can even invest
the time to switch from structure bitfields to plain bitmask
defines. It is quite a lot of work and takes some time,
but if there is consensus, I will do it during the next weeks.
I would like to see what the preferred way to define
register bitfields is. I personally like the RTEMS approach,
for which we prepared a generator from parsed PDFs
when we added the BSP for TMS570

https://git.rtems.org/rtems/tree/bsps/arm/tms570/include/bsp/ti_herc/reg_dcan.h#n152

Another solution I like (biased, because I designed it)
is

  #define __val2mfld(mask,val) (((mask)&~((mask)<<1))*(val)&(mask))
  #define __mfld2val(mask,val) (((val)&(mask))/((mask)&~((mask)<<1)))

https://gitlab.com/pikron/sw-base/sysless/-/blob/master/arch/arm/generic/defines/cpu_def.h#L314

which allows the use of simple masks, i.e.
  #define SSP_CR0_DSS_m  0x000f  /* Data Size Select (num bits - 1) */
  #define SSP_CR0_FRF_m  0x0030  /* Frame Format: 0 SPI, 1 TI, 2 Microwire */
  #define SSP_CR0_CPOL_m 0x0040  /* SPI Clock Polarity. 0 low between frames, 1 high */

https://gitlab.com/pikron/sw-base/sysless/-/blob/master/libs4c/spi/spi_lpcssp.c#L46

in the sources

  lpcssp_drv->ssp_regs->CR0 =
__val2mfld(SSP_CR0_DSS_m, lpcssp_drv->data16_fl? 16 - 1 : 8 - 1) |
__val2mfld(SSP_CR0_FRF_m, 0) |
(msg->size_mode & SPI_MODE_CPOL? SSP_CR0_CPOL_m: 0) |
(msg->size_mode & SPI_MODE_CPHA? SSP_CR0_CPHA_m: 0) |
__val2mfld(SSP_CR0_SCR_m, rate);

https://gitlab.com/pikron/sw-base/sysless/-/blob/master/libs4c/spi/spi_lpcssp.c#L217
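
For readers who want to see what the two macros above compute, here is a
minimal userspace sketch. It assumes contiguous masks, and it renames the
macros to VAL2MFLD/MFLD2VAL (hypothetical names, chosen only because
identifiers starting with two underscores are reserved in plain C).
pack_cr0() and mfld_demo() are likewise illustration-only helpers, not
part of the linked sources.

```c
#include <assert.h>
#include <stdint.h>

/* (mask & ~(mask << 1)) isolates the lowest set bit of a contiguous mask.
 * Multiplying by it shifts a value into the field; dividing shifts it out.
 */
#define VAL2MFLD(mask, val) (((mask) & ~((mask) << 1)) * (val) & (mask))
#define MFLD2VAL(mask, val) (((val) & (mask)) / ((mask) & ~((mask) << 1)))

#define SSP_CR0_DSS_m 0x000fu	/* bits 0..3 */
#define SSP_CR0_FRF_m 0x0030u	/* bits 4..5 */

/* Hypothetical helper: build a register value from two field values. */
static uint32_t pack_cr0(uint32_t dss, uint32_t frf)
{
	return VAL2MFLD(SSP_CR0_DSS_m, dss) | VAL2MFLD(SSP_CR0_FRF_m, frf);
}

void mfld_demo(void)
{
	uint32_t cr0 = pack_cr0(7, 2);

	assert(cr0 == 0x27u);				/* 7 | (2 << 4) */
	assert(MFLD2VAL(SSP_CR0_DSS_m, cr0) == 7u);
	assert(MFLD2VAL(SSP_CR0_FRF_m, cr0) == 2u);
	/* A value too wide for the field is truncated by the final "& mask". */
	assert(VAL2MFLD(SSP_CR0_FRF_m, 5u) == 0x10u);
}
```

Only the mask is needed per field; the shift count is derived from it at
compile time, so there is no separate *_SHIFT define to keep in sync.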

If you have a preferred Linux style, then please send us pointers.
In fact, Ondrej Ille based his structure bitfields style
on another driver included in the Linux kernel, and it seems
to be a problem now. So when I invest my time, I want to use a style
which pleases me and others.

Thanks for the support and best wishes,

Pavel Pisa

Re: [PATCH v2 0/8] slab: provide and use krealloc_array()

2020-11-03 Thread Bartosz Golaszewski

On Tue, Nov 3, 2020 at 5:14 AM Joe Perches  wrote:
>
> On Mon, 2020-11-02 at 16:20 +0100, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski 
> >
> > Andy brought to my attention the fact that users allocating an array of
> > equally sized elements should check if the size multiplication doesn't
> > overflow. This is why we have helpers like kmalloc_array().
> >
> > However we don't have krealloc_array() equivalent and there are many
> > users who do their own multiplication when calling krealloc() for arrays.
> >
> > This series provides krealloc_array() and uses it in a couple places.
>
> My concern about this is a possible assumption that __GFP_ZERO will
> work, and as far as I know, it will not.
>

Yeah so I had this concern for devm_krealloc() and even sent a patch
that extended it to honor __GFP_ZERO before I noticed that regular
krealloc() silently ignores __GFP_ZERO. I'm not sure if this is on
purpose. Maybe we should either make krealloc() honor __GFP_ZERO or
explicitly state in its documentation that it ignores it?

This concern isn't really related to this patch as such - it's more of
a general krealloc() inconsistency.
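
To make the two points in this thread concrete (the overflow check on the
size multiplication, and the fact that a grown buffer is not zeroed for
you), here is a rough userspace analogue. It is only a sketch built on
realloc(); realloc_array() and realloc_array_zero() are hypothetical
names and are not the kernel helpers from the series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Refuse the request if n * size would overflow size_t; the old buffer
 * is left untouched in that case.
 */
static void *realloc_array(void *p, size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return realloc(p, n * size);
}

/* realloc() does not clear the newly added tail, so the caller has to
 * remember the old element count and zero the difference itself.
 */
static void *realloc_array_zero(void *p, size_t old_n, size_t new_n,
				size_t size)
{
	void *q = realloc_array(p, new_n, size);

	if (q && new_n > old_n)
		memset((char *)q + old_n * size, 0, (new_n - old_n) * size);
	return q;
}

void realloc_array_demo(void)
{
	int *v = realloc_array_zero(NULL, 0, 2, sizeof(*v));

	assert(v);
	v[0] = 11;
	v[1] = 22;
	v = realloc_array_zero(v, 2, 4, sizeof(*v));
	assert(v);
	assert(v[0] == 11 && v[1] == 22);	/* old contents preserved */
	assert(v[2] == 0 && v[3] == 0);		/* new tail explicitly zeroed */
	free(v);
}
```

The explicit old_n parameter is the awkward part: zeroing the tail needs
the previous size, which a plain realloc-style call does not carry.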

Bartosz

RE: [PATCH net-next 6/7] drivers: net: smc911x: Fix cast from pointer to integer of different size

2020-11-03 Thread David Laight

From: Jakub Kicinski
> Sent: 02 November 2020 23:48
> 
> On Sat, 31 Oct 2020 01:49:57 +0100 Andrew Lunn wrote:
> > drivers/net/ethernet/smsc/smc911x.c: In function 
> > ‘smc911x_hardware_send_pkt’:
> > drivers/net/ethernet/smsc/smc911x.c:471:11: warning: cast from pointer to 
> > integer of different size
> [-Wpointer-to-int-cast]
> >   471 |  cmdA = (((u32)skb->data & 0x3) << 16) |
> >
> > When built on 64bit targets, the skb->data pointer cannot be cast to a
> > u32 in a meaningful way. Use long instead.
> >
> > Signed-off-by: Andrew Lunn 
> > ---
> >  drivers/net/ethernet/smsc/smc911x.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/smsc/smc911x.c 
> > b/drivers/net/ethernet/smsc/smc911x.c
> > index 4ec292563f38..f37832540364 100644
> > --- a/drivers/net/ethernet/smsc/smc911x.c
> > +++ b/drivers/net/ethernet/smsc/smc911x.c
> > @@ -466,9 +466,9 @@ static void smc911x_hardware_send_pkt(struct net_device 
> > *dev)
> > TX_CMD_A_INT_FIRST_SEG_ | TX_CMD_A_INT_LAST_SEG_ |
> > skb->len;
> >  #else
> > -   buf = (char*)((u32)skb->data & ~0x3);
> > -   len = (skb->len + 3 + ((u32)skb->data & 3)) & ~0x3;
> > -   cmdA = (((u32)skb->data & 0x3) << 16) |
> > +   buf = (char *)((long)skb->data & ~0x3);
> > +   len = (skb->len + 3 + ((long)skb->data & 3)) & ~0x3;
> > +   cmdA = (((long)skb->data & 0x3) << 16) |
> 
> Probably best if you swap the (long) for something unsigned here as
> well.

It would be much clearer with a temporary variable:
offset = (unsigned long)skb->data & 3;
buf = skb->data - offset;
len = skb->len + offset;
cmdA = offset << 16 | ...
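
A self-contained sketch of the temporary-variable idea, for illustration
only. It works on a plain byte pointer instead of an skb; struct tx_span
and tx_align() are hypothetical names, and the round-up of len to a
multiple of four is taken from the patch lines quoted above rather than
from the shortened snippet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tx_span {
	const unsigned char *buf;	/* data pointer rounded down to 4 bytes */
	size_t len;			/* byte count rounded up to 4 bytes */
	unsigned long offset;		/* 0..3 bytes skipped at the front */
};

static struct tx_span tx_align(const unsigned char *data, size_t data_len)
{
	struct tx_span s;

	/* Only the low two address bits matter, so the cast width is harmless. */
	s.offset = (unsigned long)((uintptr_t)data & 3);
	s.buf = data - s.offset;
	s.len = (data_len + s.offset + 3) & ~(size_t)3;
	return s;
}

void tx_align_demo(void)
{
	static uint32_t storage[4];	/* guarantees 4-byte alignment */
	const unsigned char *raw = (const unsigned char *)storage;
	struct tx_span s = tx_align(raw + 3, 5);

	assert(s.offset == 3);
	assert(s.buf == raw);
	assert(s.len == 8);		/* (5 + 3 + 3) & ~3 */
	assert((s.offset << 16) == 0x30000UL);
}
```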

   David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

RE: [PATCH net-next] drivers: net: sky2: Fix -Wstringop-truncation with W=1

2020-11-03 Thread David Laight

From: Jakub Kicinski
> Sent: 03 November 2020 00:01
> 
> On Sat, 31 Oct 2020 18:40:28 +0100 Andrew Lunn wrote:
> > In function ‘strncpy’,
> > inlined from ‘sky2_name’ at drivers/net/ethernet/marvell/sky2.c:4903:3,
> > inlined from ‘sky2_probe’ at drivers/net/ethernet/marvell/sky2.c:5049:2:
> > ./include/linux/string.h:297:30: warning: ‘__builtin_strncpy’ specified 
> > bound 16 equals destination
> size [-Wstringop-truncation]
> >
> > None of the device names are 16 characters long, so it was never an
> > issue, but reduce the length of the buffer size by one to avoid the
> > warning.
> >
> > Signed-off-by: Andrew Lunn 
> > ---
> >  drivers/net/ethernet/marvell/sky2.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/marvell/sky2.c 
> > b/drivers/net/ethernet/marvell/sky2.c
> > index 25981a7a43b5..35b0ec5afe13 100644
> > --- a/drivers/net/ethernet/marvell/sky2.c
> > +++ b/drivers/net/ethernet/marvell/sky2.c
> > @@ -4900,7 +4900,7 @@ static const char *sky2_name(u8 chipid, char *buf, 
> > int sz)
> > };
> >
> > if (chipid >= CHIP_ID_YUKON_XL && chipid <= CHIP_ID_YUKON_OP_2)
> > -   strncpy(buf, name[chipid - CHIP_ID_YUKON_XL], sz);
> > +   strncpy(buf, name[chipid - CHIP_ID_YUKON_XL], sz - 1);
> 
> Hm. This irks the eye a little. AFAIK the idiomatic code would be:
> 
>   strncpy(buf, name..., sz - 1);
>   buf[sz - 1] = '\0';
> 
> Perhaps it's easier to convert to strscpy()/strscpy_pad()?
> 
> > else
> > snprintf(buf, sz, "(chip %#x)", chipid);
> > return buf;

Is the pad needed?
It isn't present in the 'else' branch.
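
The difference behind the padding question can be shown in userspace.
This is a sketch only: copy_name() is a hypothetical helper following the
strncpy()-plus-terminator idiom quoted above, and strscpy() itself is not
used because it is a kernel-only API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bounded copy that always terminates the destination. */
static char *copy_name(char *buf, size_t sz, const char *name)
{
	if (sz == 0)
		return buf;
	strncpy(buf, name, sz - 1);	/* NUL-pads up to sz - 1 bytes */
	buf[sz - 1] = '\0';		/* terminator for the truncated case */
	return buf;
}

void copy_name_demo(void)
{
	char a[8], b[8];

	memset(a, 'x', sizeof(a));
	memset(b, 'x', sizeof(b));

	copy_name(a, sizeof(a), "ab");
	snprintf(b, sizeof(b), "%s", "ab");

	assert(strcmp(a, "ab") == 0);
	assert(strcmp(b, "ab") == 0);
	assert(a[5] == '\0');	/* strncpy() padded the rest of the buffer */
	assert(b[5] == 'x');	/* snprintf() wrote only "ab" and one NUL */

	/* Over-long input is cut to sz - 1 characters. */
	copy_name(a, sizeof(a), "0123456789");
	assert(strcmp(a, "0123456") == 0);
}
```

So the two branches already differ in whether the tail of the buffer is
cleared; whether that matters depends on how the buffer is used afterwards.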

David


[PATCH net-next v2 00/15] net/smc: extend diagnostic netlink interface

2020-11-03 Thread Karsten Graul

Please apply the following patch series for smc to netdev's net-next tree.

This patch series refactors the current netlink API in smc_diag module
which is used for diagnostic purposes and extends the netlink API in a
backward compatible way so that the extended API can provide information
about SMC linkgroups, links and devices (both for SMC-R and SMC-D) and
can still work with the legacy netlink API.

Please note that patch 9 triggers a checkpatch warning because a comment
line was added using the style of the already existing comment block.

v2: in patch 10, add missing include to uapi header smc_diag.h

Guvenc Gulce (14):
  net/smc: Use active link of the connection
  net/smc: Add connection counters for links
  net/smc: Add link counters for IB device ports
  net/smc: Add diagnostic information to smc ib-device
  net/smc: Add diagnostic information to link structure
  net/smc: Refactor the netlink reply processing routine
  net/smc: Add ability to work with extended SMC netlink API
  net/smc: Introduce SMCR get linkgroup command
  net/smc: Introduce SMCR get link command
  net/smc: Add SMC-D Linkgroup diagnostic support
  net/smc: Add support for obtaining SMCD device list
  net/smc: Add support for obtaining SMCR device list
  net/smc: Refactor smc ism v2 capability handling
  net/smc: Add support for obtaining system information

Karsten Graul (1):
  net/smc: use helper smc_conn_abort() in listen processing

 include/net/smc.h |   2 +-
 include/uapi/linux/smc.h  |   8 +
 include/uapi/linux/smc_diag.h | 109 +
 net/smc/af_smc.c  |  29 +-
 net/smc/smc.h |   5 +-
 net/smc/smc_clc.c |   6 +
 net/smc/smc_clc.h |   1 +
 net/smc/smc_core.c|  32 +-
 net/smc/smc_core.h|  32 +-
 net/smc/smc_diag.c| 766 +-
 net/smc/smc_ib.c  |  49 +++
 net/smc/smc_ib.h  |   4 +-
 net/smc/smc_ism.c |  12 +-
 net/smc/smc_ism.h |   5 +-
 net/smc/smc_pnet.c|   3 +
 15 files changed, 939 insertions(+), 124 deletions(-)

-- 
2.17.1

[PATCH net-next v2 10/15] net/smc: Introduce SMCR get link command

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Introduce get link command which loops through
all available links of all available link groups. It
uses the SMC-R linkgroup list as entry point, not
the socket list, which makes linkgroup diagnosis
possible, in case linkgroup does not contain active
connections anymore.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 include/uapi/linux/smc_diag.h |  8 +
 net/smc/smc_diag.c| 62 ++-
 2 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/smc_diag.h b/include/uapi/linux/smc_diag.h
index 6ae028344b6d..a57df0296aa4 100644
--- a/include/uapi/linux/smc_diag.h
+++ b/include/uapi/linux/smc_diag.h
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -79,6 +80,7 @@ enum {
 /* SMC_DIAG_GET_LGR_INFO command extensions */
 enum {
SMC_DIAG_LGR_INFO_SMCR = 1,
+   SMC_DIAG_LGR_INFO_SMCR_LINK,
 };
 
 #define SMC_DIAG_MAX (__SMC_DIAG_MAX - 1)
@@ -129,6 +131,12 @@ struct smc_diag_linkinfo {
__u8 ibport;/* RDMA device port number */
__u8 gid[40];   /* local GID */
__u8 peer_gid[40];  /* peer GID */
+   /* Fields above used by legacy v1 code */
+   __u32 conn_cnt;
+   __u8 netdev[IFNAMSIZ];  /* ethernet device name */
+   __u8 link_uid[4];   /* unique link id */
+   __u8 peer_link_uid[4];  /* unique peer link id */
+   __u32 link_state;   /* link state */
 };
 
 struct smc_diag_lgrinfo {
diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index c53904b3f350..6885814b6e4f 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -20,6 +20,7 @@
 #include 
 
 #include "smc.h"
+#include "smc_ib.h"
 #include "smc_core.h"
 
 struct smc_diag_dump_ctx {
@@ -203,6 +204,54 @@ static bool smc_diag_fill_dmbinfo(struct sock *sk, struct 
sk_buff *skb)
return true;
 }
 
+static int smc_diag_fill_lgr_link(struct smc_link_group *lgr,
+ struct smc_link *link,
+ struct sk_buff *skb,
+ struct netlink_callback *cb,
+ struct smc_diag_req_v2 *req)
+{
+   struct smc_diag_linkinfo link_info;
+   int dummy = 0, rc = 0;
+   struct nlmsghdr *nlh;
+
+   nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, MAGIC_SEQ_V2_ACK,
+   cb->nlh->nlmsg_type, 0, NLM_F_MULTI);
+
+   memset(&link_info, 0, sizeof(link_info));
+   link_info.link_state = link->state;
+   link_info.link_id = link->link_id;
+   link_info.conn_cnt = atomic_read(&link->conn_cnt);
+   link_info.ibport = link->ibport;
+
+   memcpy(link_info.link_uid, link->link_uid,
+  sizeof(link_info.link_uid));
+   snprintf(link_info.ibname, sizeof(link_info.ibname), "%s",
+link->ibname);
+   snprintf(link_info.netdev, sizeof(link_info.netdev), "%s",
+link->ndevname);
+   memcpy(link_info.peer_link_uid, link->peer_link_uid,
+  sizeof(link_info.peer_link_uid));
+
+   smc_gid_be16_convert(link_info.gid,
+link->gid);
+   smc_gid_be16_convert(link_info.peer_gid,
+link->peer_gid);
+
+   /* Just a command place holder to signal back the command reply type */
+   if (nla_put(skb, SMC_DIAG_GET_LGR_INFO, sizeof(dummy), &dummy) < 0)
+   goto errout;
+   if (nla_put(skb, SMC_DIAG_LGR_INFO_SMCR_LINK,
+   sizeof(link_info), &link_info) < 0)
+   goto errout;
+
+   nlmsg_end(skb, nlh);
+   return rc;
+
+errout:
+   nlmsg_cancel(skb, nlh);
+   return -EMSGSIZE;
+}
+
 static int smc_diag_fill_lgr(struct smc_link_group *lgr,
 struct sk_buff *skb,
 struct netlink_callback *cb,
@@ -238,7 +287,7 @@ static int smc_diag_handle_lgr(struct smc_link_group *lgr,
   struct smc_diag_req_v2 *req)
 {
struct nlmsghdr *nlh;
-   int rc = 0;
+   int i, rc = 0;
 
nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, MAGIC_SEQ_V2_ACK,
cb->nlh->nlmsg_type, 0, NLM_F_MULTI);
@@ -250,6 +299,17 @@ static int smc_diag_handle_lgr(struct smc_link_group *lgr,
goto errout;
 
nlmsg_end(skb, nlh);
+
+   if ((req->cmd_ext & (1 << (SMC_DIAG_LGR_INFO_SMCR_LINK - 1)))) {
+   for (i = 0; i < SMC_LINKS_PER_LGR_MAX; i++) {
+   if (!smc_link_usable(&lgr->lnk[i]))
+   continue;
+   rc = smc_diag_fill_lgr_link(lgr, &lgr->lnk[i], skb,
+   cb, req);
+   if (rc < 0)
+   goto errout;
+   }
+   }
return rc;
 
 errout:
-- 
2.17.1
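
A small sketch of the extension test used in smc_diag_handle_lgr() in the
patch above. It assumes that cmd_ext is a bitmask in which extension
number N is requested through bit N - 1, which is how the expression in
the hunk reads. The enum is a local stand-in for the uapi values, and
ext_bit()/ext_requested() are hypothetical helpers, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins mirroring the numbering of the enum in the patch. */
enum {
	LGR_INFO_SMCR = 1,
	LGR_INFO_SMCR_LINK,	/* = 2 */
};

/* Extension numbers start at 1, so extension N lives in bit N - 1. */
static uint32_t ext_bit(int ext)
{
	return UINT32_C(1) << (ext - 1);
}

static int ext_requested(uint32_t cmd_ext, int ext)
{
	return (cmd_ext & ext_bit(ext)) != 0;
}

void ext_demo(void)
{
	uint32_t cmd_ext = ext_bit(LGR_INFO_SMCR_LINK);

	assert(cmd_ext == 0x2u);
	assert(ext_requested(cmd_ext, LGR_INFO_SMCR_LINK));
	assert(!ext_requested(cmd_ext, LGR_INFO_SMCR));
}
```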

[PATCH net-next v2 05/15] net/smc: Add diagnostic information to smc ib-device

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

During smc ib-device creation, add network device name to smc
ib-device structure. Register for netdevice name changes and
update ib-device accordingly. This is needed for diagnostic purposes.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 net/smc/smc_ib.c   | 47 ++
 net/smc/smc_ib.h   |  2 ++
 net/smc/smc_pnet.c |  3 +++
 3 files changed, 52 insertions(+)

diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 1c314dbdc7fa..c4a04e868bf0 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -557,6 +557,52 @@ static void smc_ib_cleanup_per_ibdev(struct smc_ib_device 
*smcibdev)
 
 static struct ib_client smc_ib_client;
 
+static void smc_copy_netdev_name(struct smc_ib_device *smcibdev, int port)
+{
+   struct ib_device *ibdev = smcibdev->ibdev;
+   struct net_device *ndev;
+
+   if (ibdev->ops.get_netdev) {
+   ndev = ibdev->ops.get_netdev(ibdev, port + 1);
+   if (ndev) {
+   snprintf((char *)&smcibdev->netdev[port],
+sizeof(smcibdev->netdev[port]),
+"%s", ndev->name);
+   dev_put(ndev);
+   }
+   }
+}
+
+void smc_ib_ndev_name_change(struct net_device *ndev)
+{
+   struct smc_ib_device *smcibdev;
+   struct ib_device *libdev;
+   struct net_device *lndev;
+   u8 port_cnt;
+   int i;
+
+   mutex_lock(&smc_ib_devices.mutex);
+   list_for_each_entry(smcibdev, &smc_ib_devices.list, list) {
+   port_cnt = smcibdev->ibdev->phys_port_cnt;
+   for (i = 0;
+i < min_t(size_t, port_cnt, SMC_MAX_PORTS);
+i++) {
+   libdev = smcibdev->ibdev;
+   if (libdev->ops.get_netdev) {
+   lndev = libdev->ops.get_netdev(libdev, i + 1);
+   if (lndev)
+   dev_put(lndev);
+   if (lndev == ndev) {
+   snprintf((char *)&smcibdev->netdev[i],
+sizeof(smcibdev->netdev[i]),
+"%s", ndev->name);
+   }
+   }
+   }
+   }
+   mutex_unlock(&smc_ib_devices.mutex);
+}
+
 /* callback function for ib_register_client() */
 static int smc_ib_add_dev(struct ib_device *ibdev)
 {
@@ -596,6 +642,7 @@ static int smc_ib_add_dev(struct ib_device *ibdev)
if (smc_pnetid_by_dev_port(ibdev->dev.parent, i,
   smcibdev->pnetid[i]))
smc_pnetid_by_table_ib(smcibdev, i + 1);
+   smc_copy_netdev_name(smcibdev, i);
pr_warn_ratelimited("smc:ib device %s port %d has pnetid "
"%.16s%s\n",
smcibdev->ibdev->name, i + 1,
diff --git a/net/smc/smc_ib.h b/net/smc/smc_ib.h
index 3e6bfeddd53b..b0868146b46b 100644
--- a/net/smc/smc_ib.h
+++ b/net/smc/smc_ib.h
@@ -54,11 +54,13 @@ struct smc_ib_device {  /* 
ib-device infos for smc */
wait_queue_head_t   lnks_deleted;   /* wait 4 removal of all links*/
struct mutexmutex;  /* protect dev setup+cleanup */
atomic_tlnk_cnt_by_port[SMC_MAX_PORTS];/*#lnk per port*/
+   u8  netdev[SMC_MAX_PORTS][IFNAMSIZ];/* ndev names */
 };
 
 struct smc_buf_desc;
 struct smc_link;
 
+void smc_ib_ndev_name_change(struct net_device *ndev);
 int smc_ib_register_client(void) __init;
 void smc_ib_unregister_client(void);
 bool smc_ib_port_active(struct smc_ib_device *smcibdev, u8 ibport);
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index f3c18b991d35..b0f40d73afd6 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -828,6 +828,9 @@ static int smc_pnet_netdev_event(struct notifier_block 
*this,
case NETDEV_UNREGISTER:
smc_pnet_remove_by_ndev(event_dev);
return NOTIFY_OK;
+   case NETDEV_CHANGENAME:
+   smc_ib_ndev_name_change(event_dev);
+   return NOTIFY_OK;
case NETDEV_REGISTER:
smc_pnet_add_by_ndev(event_dev);
return NOTIFY_OK;
-- 
2.17.1

[PATCH net-next v2 04/15] net/smc: Add link counters for IB device ports

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Add link counters to the structure of the smc ib device, one counter per
ib port. Increase/decrease the counters as needed in the corresponding
routines.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 net/smc/smc_core.c | 3 +++
 net/smc/smc_ib.h   | 1 +
 2 files changed, 4 insertions(+)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 6e2077161267..da94725deb09 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -316,6 +316,7 @@ int smcr_link_init(struct smc_link_group *lgr, struct 
smc_link *lnk,
lnk->link_idx = link_idx;
lnk->smcibdev = ini->ib_dev;
lnk->ibport = ini->ib_port;
+   atomic_inc(&ini->ib_dev->lnk_cnt_by_port[ini->ib_port - 1]);
lnk->path_mtu = ini->ib_dev->pattr[ini->ib_port - 1].active_mtu;
atomic_set(&lnk->conn_cnt, 0);
smc_llc_link_set_uid(lnk);
@@ -360,6 +361,7 @@ int smcr_link_init(struct smc_link_group *lgr, struct 
smc_link *lnk,
smc_llc_link_clear(lnk, false);
 out:
put_device(&ini->ib_dev->ibdev->dev);
+   atomic_dec(&ini->ib_dev->lnk_cnt_by_port[ini->ib_port - 1]);
memset(lnk, 0, sizeof(struct smc_link));
lnk->state = SMC_LNK_UNUSED;
if (!atomic_dec_return(&ini->ib_dev->lnk_cnt))
@@ -750,6 +752,7 @@ void smcr_link_clear(struct smc_link *lnk, bool log)
smc_ib_dealloc_protection_domain(lnk);
smc_wr_free_link_mem(lnk);
put_device(&lnk->smcibdev->ibdev->dev);
+   atomic_dec(&lnk->smcibdev->lnk_cnt_by_port[lnk->ibport - 1]);
smcibdev = lnk->smcibdev;
memset(lnk, 0, sizeof(struct smc_link));
lnk->state = SMC_LNK_UNUSED;
diff --git a/net/smc/smc_ib.h b/net/smc/smc_ib.h
index 2ce481187dd0..3e6bfeddd53b 100644
--- a/net/smc/smc_ib.h
+++ b/net/smc/smc_ib.h
@@ -53,6 +53,7 @@ struct smc_ib_device {/* 
ib-device infos for smc */
atomic_tlnk_cnt;/* number of links on ibdev */
wait_queue_head_t   lnks_deleted;   /* wait 4 removal of all links*/
struct mutexmutex;  /* protect dev setup+cleanup */
+   atomic_tlnk_cnt_by_port[SMC_MAX_PORTS];/*#lnk per port*/
 };
 
 struct smc_buf_desc;
-- 
2.17.1
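
The per-port counting in the patch above can be sketched in userspace with
C11 atomics. This is an illustration only: struct port_counters and the
three helpers are hypothetical names, MAX_PORTS stands in for
SMC_MAX_PORTS, and ports are 1-based as in the "ib_port - 1" indexing of
the patch.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_PORTS 2

struct port_counters {
	atomic_int lnk_cnt_by_port[MAX_PORTS];	/* links per port */
};

static void port_counters_init(struct port_counters *d)
{
	for (int i = 0; i < MAX_PORTS; i++)
		atomic_init(&d->lnk_cnt_by_port[i], 0);
}

/* Called when a link is set up on a port (1-based port number). */
static void link_get(struct port_counters *d, int ibport)
{
	atomic_fetch_add(&d->lnk_cnt_by_port[ibport - 1], 1);
}

/* Called when a link is cleared, or when link setup fails after the
 * counter was already incremented.
 */
static void link_put(struct port_counters *d, int ibport)
{
	atomic_fetch_sub(&d->lnk_cnt_by_port[ibport - 1], 1);
}

static int links_on_port(struct port_counters *d, int ibport)
{
	return atomic_load(&d->lnk_cnt_by_port[ibport - 1]);
}

void port_counters_demo(void)
{
	struct port_counters d;

	port_counters_init(&d);
	link_get(&d, 1);
	link_get(&d, 2);
	link_put(&d, 1);	/* e.g. the error path of link init */
	assert(links_on_port(&d, 1) == 0);
	assert(links_on_port(&d, 2) == 1);
}
```

Every increment needs a matching decrement on both the error path and the
normal teardown path, which is why the patch touches three places.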

[PATCH net-next v2 12/15] net/smc: Add support for obtaining SMCD device list

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Deliver SMCD device information via netlink based
diagnostic interface.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 include/uapi/linux/smc.h  |  2 +
 include/uapi/linux/smc_diag.h | 20 +
 net/smc/smc_core.h| 27 +
 net/smc/smc_diag.c| 76 +++
 net/smc/smc_ib.h  |  1 -
 5 files changed, 125 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/smc.h b/include/uapi/linux/smc.h
index 635e2c2aeac5..736e8b98c8a5 100644
--- a/include/uapi/linux/smc.h
+++ b/include/uapi/linux/smc.h
@@ -38,4 +38,6 @@ enum {/* SMC PNET Table 
commands */
 #define SMC_LGR_ID_SIZE4
 #define SMC_MAX_HOSTNAME_LEN   32 /* Max length of hostname */
 #define SMC_MAX_EID_LEN32 /* Max length of eid */
+#define SMC_MAX_PORTS  2 /* Max # of ports per ib device */
+#define SMC_PCI_ID_STR_LEN 16 /* Max length of pci id string */
 #endif /* _UAPI_LINUX_SMC_H */
diff --git a/include/uapi/linux/smc_diag.h b/include/uapi/linux/smc_diag.h
index 5a80172df757..ab8f76bdd1a4 100644
--- a/include/uapi/linux/smc_diag.h
+++ b/include/uapi/linux/smc_diag.h
@@ -74,6 +74,7 @@ enum {
 /* V2 Commands */
 enum {
SMC_DIAG_GET_LGR_INFO = SMC_DIAG_EXTS_PER_CMD,
+   SMC_DIAG_GET_DEV_INFO,
__SMC_DIAG_EXT_MAX,
 };
 
@@ -84,6 +85,11 @@ enum {
SMC_DIAG_LGR_INFO_SMCD,
 };
 
+/* SMC_DIAG_GET_DEV_INFO command extensions */
+enum {
+   SMC_DIAG_DEV_INFO_SMCD = 1,
+};
+
 #define SMC_DIAG_MAX (__SMC_DIAG_MAX - 1)
 #define SMC_DIAG_EXT_MAX (__SMC_DIAG_EXT_MAX - 1)
 
@@ -164,6 +170,20 @@ struct smcd_diag_dmbinfo { /* SMC-D Socket 
internals */
struct smc_diag_v2_lgr_info v2_lgr_info; /* SMCv2 info */
 };
 
+struct smc_diag_dev_info {
+   /* Pnet ID per device port */
+   __u8pnet_id[SMC_MAX_PORTS][SMC_MAX_PNETID_LEN];
+   /* whether pnetid is set by user */
+   __u8pnetid_by_user[SMC_MAX_PORTS];
+   __u32   use_cnt;/* Number of linkgroups */
+   __u8is_critical;/* Is device critical */
+   __u32   pci_fid;/* PCI FID */
+   __u16   pci_pchid;  /* PCI CHID */
+   __u16   pci_vendor; /* PCI Vendor */
+   __u16   pci_device; /* PCI Device Vendor ID */
+   __u8pci_id[SMC_PCI_ID_STR_LEN]; /* PCI ID */
+};
+
 struct smc_diag_lgr {
__u8lgr_id[SMC_LGR_ID_SIZE]; /* Linkgroup identifier */
__u8lgr_role;   /* Linkgroup role */
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 639c7565b302..0f966a21c223 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -13,6 +13,7 @@
 #define _SMC_CORE_H
 
 #include 
+#include 
 #include 
 
 #include "smc.h"
@@ -366,6 +367,32 @@ static inline bool smc_link_active(struct smc_link *lnk)
return lnk->state == SMC_LNK_ACTIVE;
 }
 
+struct smc_pci_dev {
+   __u32   pci_fid;
+   __u16   pci_pchid;
+   __u16   pci_vendor;
+   __u16   pci_device;
+   __u8pci_id[SMC_PCI_ID_STR_LEN];
+};
+
+static inline void smc_set_pci_values(struct pci_dev *pci_dev,
+ struct smc_pci_dev *smc_dev)
+{
+   smc_dev->pci_vendor = pci_dev->vendor;
+   smc_dev->pci_device = pci_dev->device;
+   snprintf(smc_dev->pci_id, sizeof(smc_dev->pci_id), "%s",
+pci_name(pci_dev));
+#if IS_ENABLED(CONFIG_S390)
+   {
+   struct zpci_dev *zdev;
+
+   zdev = to_zpci(pci_dev);
+   smc_dev->pci_fid = zdev->fid;
+   smc_dev->pci_pchid = zdev->pchid;
+   }
+#endif
+}
+
 struct smc_sock;
 struct smc_clc_msg_accept_confirm;
 struct smc_clc_msg_local;
diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index fcff07a9ea47..252aae0b11d9 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -448,6 +448,78 @@ static int smc_diag_fill_smcd_dev(struct smcd_dev_list 
*dev_list,
return rc;
 }
 
+static int smc_diag_handle_smcd_dev(struct smcd_dev *smcd,
+   struct sk_buff *skb,
+   struct netlink_callback *cb,
+   struct smc_diag_req_v2 *req)
+{
+   struct smc_diag_dev_info smc_diag_dev;
+   struct smc_pci_dev smc_pci_dev;
+   struct nlmsghdr *nlh;
+   int dummy = 0;
+   int rc = 0;
+
+   nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, MAGIC_SEQ_V2_ACK,
+   cb->nlh->nlmsg_type, 0, NLM_F_MULTI);
+   if (!nlh)
+   return -EMSGSIZE;
+
+   memset(&smc_diag_dev, 0, sizeof(smc_diag_dev));
+   memset(&smc_pci_dev, 0, sizeof(smc_pci_dev));
+   smc_diag_dev.use_cnt = atomic_read(&smcd->lgr_cnt);
+   smc_diag_

[PATCH net-next v2 06/15] net/smc: Add diagnostic information to link structure

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

During link creation add network and ib-device name to
link structure. This is needed for diagnostic purposes.

When diagnostic information is gathered, we need to traverse the
device, linkgroup and link structures. To do that, we need to hold
a spinlock for the linkgroup list. Without this diagnostic
information in the link structure, the device list mutex would also
have to be held to dereference the device pointer in the link
structure, which is impossible while already holding a spinlock.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 net/smc/smc_core.c | 10 ++
 net/smc/smc_core.h |  3 +++
 2 files changed, 13 insertions(+)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index da94725deb09..28fc583d9033 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -303,6 +303,15 @@ static u8 smcr_next_link_id(struct smc_link_group *lgr)
return link_id;
 }
 
+static inline void smcr_copy_dev_info_to_link(struct smc_link *link)
+{
+   struct smc_ib_device *smcibdev = link->smcibdev;
+
+   memcpy(link->ibname, smcibdev->ibdev->name, sizeof(link->ibname));
+   memcpy(link->ndevname, smcibdev->netdev[link->ibport - 1],
+  sizeof(link->ndevname));
+}
+
 int smcr_link_init(struct smc_link_group *lgr, struct smc_link *lnk,
   u8 link_idx, struct smc_init_info *ini)
 {
@@ -317,6 +326,7 @@ int smcr_link_init(struct smc_link_group *lgr, struct 
smc_link *lnk,
lnk->smcibdev = ini->ib_dev;
lnk->ibport = ini->ib_port;
atomic_inc(&ini->ib_dev->lnk_cnt_by_port[ini->ib_port - 1]);
+   smcr_copy_dev_info_to_link(lnk);
lnk->path_mtu = ini->ib_dev->pattr[ini->ib_port - 1].active_mtu;
atomic_set(&lnk->conn_cnt, 0);
smc_llc_link_set_uid(lnk);
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 83a88a4635db..bd16d63c5222 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -124,6 +124,9 @@ struct smc_link {
u8  link_is_asym;   /* is link asymmetric? */
struct smc_link_group   *lgr;   /* parent link group */
struct work_struct  link_down_wrk;  /* wrk to bring link down */
+   /* Diagnostic relevant link information */
+   u8  ibname[IB_DEVICE_NAME_MAX];/* ib device name */
+   u8  ndevname[IFNAMSIZ];/* network device name */
 
enum smc_link_state state;  /* state of link */
struct delayed_work llc_testlink_wrk; /* testlink worker */
-- 
2.17.1

[PATCH net-next v2 14/15] net/smc: Refactor smc ism v2 capability handling

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Encapsulate the smc ism v2 capability boolean value
in a function for better information hiding.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 net/smc/af_smc.c  | 12 ++--
 net/smc/smc_ism.c |  9 -
 net/smc/smc_ism.h |  5 ++---
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index bc3e45289771..850e6df47a59 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -668,7 +668,7 @@ static int smc_find_proposal_devices(struct smc_sock *smc,
ini->smc_type_v1 = SMC_TYPE_N;
} /* else RDMA is supported for this connection */
}
-   if (smc_ism_v2_capable && smc_find_ism_v2_device_clnt(smc, ini))
+   if (smc_ism_is_v2_capable() && smc_find_ism_v2_device_clnt(smc, ini))
ini->smc_type_v2 = SMC_TYPE_N;
 
/* if neither ISM nor RDMA are supported, fallback */
@@ -920,7 +920,7 @@ static int smc_connect_check_aclc(struct smc_init_info *ini,
 /* perform steps before actually connecting */
 static int __smc_connect(struct smc_sock *smc)
 {
-   u8 version = smc_ism_v2_capable ? SMC_V2 : SMC_V1;
+   u8 version = smc_ism_is_v2_capable() ? SMC_V2 : SMC_V1;
struct smc_clc_msg_accept_confirm_v2 *aclc2;
struct smc_clc_msg_accept_confirm *aclc;
struct smc_init_info *ini = NULL;
@@ -945,9 +945,9 @@ static int __smc_connect(struct smc_sock *smc)
version);
 
ini->smcd_version = SMC_V1;
-   ini->smcd_version |= smc_ism_v2_capable ? SMC_V2 : 0;
+   ini->smcd_version |= smc_ism_is_v2_capable() ? SMC_V2 : 0;
ini->smc_type_v1 = SMC_TYPE_B;
-   ini->smc_type_v2 = smc_ism_v2_capable ? SMC_TYPE_D : SMC_TYPE_N;
+   ini->smc_type_v2 = smc_ism_is_v2_capable() ? SMC_TYPE_D : SMC_TYPE_N;
 
/* get vlan id from IP device */
if (smc_vlan_by_tcpsk(smc->clcsock, ini)) {
@@ -1354,7 +1354,7 @@ static int smc_listen_v2_check(struct smc_sock *new_smc,
rc = SMC_CLC_DECL_PEERNOSMC;
goto out;
}
-   if (!smc_ism_v2_capable) {
+   if (!smc_ism_is_v2_capable()) {
ini->smcd_version &= ~SMC_V2;
rc = SMC_CLC_DECL_NOISM2SUPP;
goto out;
@@ -1680,7 +1680,7 @@ static void smc_listen_work(struct work_struct *work)
 {
struct smc_sock *new_smc = container_of(work, struct smc_sock,
smc_listen_work);
-   u8 version = smc_ism_v2_capable ? SMC_V2 : SMC_V1;
+   u8 version = smc_ism_is_v2_capable() ? SMC_V2 : SMC_V1;
struct socket *newclcsock = new_smc->clcsock;
struct smc_clc_msg_accept_confirm *cclc;
struct smc_clc_msg_proposal_area *buf;
diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c
index 5bb2c7fb4ea8..2a2571637bc6 100644
--- a/net/smc/smc_ism.c
+++ b/net/smc/smc_ism.c
@@ -22,7 +22,7 @@ struct smcd_dev_list smcd_dev_list = {
 };
 EXPORT_SYMBOL_GPL(smcd_dev_list);
 
-bool smc_ism_v2_capable;
+static bool smc_ism_v2_capable;
 
 /* Test if an ISM communication is possible - same CPC */
 int smc_ism_cantalk(u64 peer_gid, unsigned short vlan_id, struct smcd_dev *smcd)
@@ -53,6 +53,13 @@ u16 smc_ism_get_chid(struct smcd_dev *smcd)
 }
 EXPORT_SYMBOL_GPL(smc_ism_get_chid);
 
+/* HW supports ISM V2 and thus System EID is defined */
+bool smc_ism_is_v2_capable(void)
+{
+   return smc_ism_v2_capable;
+}
+EXPORT_SYMBOL_GPL(smc_ism_is_v2_capable);
+
 /* Set a connection using this DMBE. */
 void smc_ism_set_conn(struct smc_connection *conn)
 {
diff --git a/net/smc/smc_ism.h b/net/smc/smc_ism.h
index 8048e09ddcf8..481a4b7df30b 100644
--- a/net/smc/smc_ism.h
+++ b/net/smc/smc_ism.h
@@ -10,6 +10,7 @@
 #define SMCD_ISM_H
 
 #include 
+#include 
 #include 
 
 #include "smc.h"
@@ -20,9 +21,6 @@ struct smcd_dev_list {/* List of SMCD devices */
 };
 
 extern struct smcd_dev_list smcd_dev_list;  /* list of smcd devices */
-extern bool smc_ism_v2_capable; /* HW supports ISM V2 and thus
-                                 * System EID is defined
-                                 */
 
 struct smc_ism_vlanid {/* VLAN id set on ISM device */
struct list_head list;
@@ -52,5 +50,6 @@ int smc_ism_write(struct smcd_dev *dev, const struct smc_ism_position *pos,
 int smc_ism_signal_shutdown(struct smc_link_group *lgr);
 void smc_ism_get_system_eid(struct smcd_dev *dev, u8 **eid);
 u16 smc_ism_get_chid(struct smcd_dev *dev);
+bool smc_ism_is_v2_capable(void);
 void smc_ism_init(void);
 #endif
-- 
2.17.1

[PATCH net-next v2 07/15] net/smc: Refactor the netlink reply processing routine

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Refactor the netlink reply processing routine so that
it provides sub functions for specific parts of the processing.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 net/smc/smc_diag.c | 218 +++--
 1 file changed, 133 insertions(+), 85 deletions(-)

diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index c2225231f679..44be723c97fe 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -69,35 +69,25 @@ static void smc_diag_msg_common_fill(struct smc_diag_msg *r, struct sock *sk)
}
 }
 
-static int smc_diag_msg_attrs_fill(struct sock *sk, struct sk_buff *skb,
-  struct smc_diag_msg *r,
-  struct user_namespace *user_ns)
+static bool smc_diag_msg_attrs_fill(struct sock *sk, struct sk_buff *skb,
+   struct smc_diag_msg *r,
+   struct user_namespace *user_ns)
 {
-   if (nla_put_u8(skb, SMC_DIAG_SHUTDOWN, sk->sk_shutdown))
-   return 1;
+   if (nla_put_u8(skb, SMC_DIAG_SHUTDOWN, sk->sk_shutdown) < 0)
+   return false;
 
r->diag_uid = from_kuid_munged(user_ns, sock_i_uid(sk));
r->diag_inode = sock_i_ino(sk);
-   return 0;
+   return true;
 }
 
-static int __smc_diag_dump(struct sock *sk, struct sk_buff *skb,
-  struct netlink_callback *cb,
-  const struct smc_diag_req *req,
-  struct nlattr *bc)
+static bool smc_diag_fill_base_struct(struct sock *sk, struct sk_buff *skb,
+ struct netlink_callback *cb,
+ struct smc_diag_msg *r)
 {
struct smc_sock *smc = smc_sk(sk);
-   struct smc_diag_fallback fallback;
struct user_namespace *user_ns;
-   struct smc_diag_msg *r;
-   struct nlmsghdr *nlh;
 
-   nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
-   cb->nlh->nlmsg_type, sizeof(*r), NLM_F_MULTI);
-   if (!nlh)
-   return -EMSGSIZE;
-
-   r = nlmsg_data(nlh);
smc_diag_msg_common_fill(r, sk);
r->diag_state = sk->sk_state;
if (smc->use_fallback)
@@ -107,89 +97,148 @@ static int __smc_diag_dump(struct sock *sk, struct sk_buff *skb,
else
r->diag_mode = SMC_DIAG_MODE_SMCR;
user_ns = sk_user_ns(NETLINK_CB(cb->skb).sk);
-   if (smc_diag_msg_attrs_fill(sk, skb, r, user_ns))
-   goto errout;
+   if (!smc_diag_msg_attrs_fill(sk, skb, r, user_ns))
+   return false;
 
+   return true;
+}
+
+static bool smc_diag_fill_fallback(struct sock *sk, struct sk_buff *skb)
+{
+   struct smc_diag_fallback fallback;
+   struct smc_sock *smc = smc_sk(sk);
+
+   memset(&fallback, 0, sizeof(fallback));
fallback.reason = smc->fallback_rsn;
fallback.peer_diagnosis = smc->peer_diagnosis;
if (nla_put(skb, SMC_DIAG_FALLBACK, sizeof(fallback), &fallback) < 0)
+   return false;
+
+   return true;
+}
+
+static bool smc_diag_fill_conninfo(struct sock *sk, struct sk_buff *skb)
+{
+   struct smc_host_cdc_msg *local_tx, *local_rx;
+   struct smc_diag_conninfo cinfo;
+   struct smc_connection *conn;
+   struct smc_sock *smc;
+
+   smc = smc_sk(sk);
+   conn = &smc->conn;
+   local_tx = &conn->local_tx_ctrl;
+   local_rx = &conn->local_rx_ctrl;
+   memset(&cinfo, 0, sizeof(cinfo));
+   cinfo.token = conn->alert_token_local;
+   cinfo.sndbuf_size = conn->sndbuf_desc ? conn->sndbuf_desc->len : 0;
+   cinfo.rmbe_size = conn->rmb_desc ? conn->rmb_desc->len : 0;
+   cinfo.peer_rmbe_size = conn->peer_rmbe_size;
+
+   cinfo.rx_prod.wrap = local_rx->prod.wrap;
+   cinfo.rx_prod.count = local_rx->prod.count;
+   cinfo.rx_cons.wrap = local_rx->cons.wrap;
+   cinfo.rx_cons.count = local_rx->cons.count;
+
+   cinfo.tx_prod.wrap = local_tx->prod.wrap;
+   cinfo.tx_prod.count = local_tx->prod.count;
+   cinfo.tx_cons.wrap = local_tx->cons.wrap;
+   cinfo.tx_cons.count = local_tx->cons.count;
+
+   cinfo.tx_prod_flags = *(u8 *)&local_tx->prod_flags;
+   cinfo.tx_conn_state_flags = *(u8 *)&local_tx->conn_state_flags;
+   cinfo.rx_prod_flags = *(u8 *)&local_rx->prod_flags;
+   cinfo.rx_conn_state_flags = *(u8 *)&local_rx->conn_state_flags;
+
+   cinfo.tx_prep.wrap = conn->tx_curs_prep.wrap;
+   cinfo.tx_prep.count = conn->tx_curs_prep.count;
+   cinfo.tx_sent.wrap = conn->tx_curs_sent.wrap;
+   cinfo.tx_sent.count = conn->tx_curs_sent.count;
+   cinfo.tx_fin.wrap = conn->tx_curs_fin.wrap;
+   cinfo.tx_fin.count = conn->tx_curs_fin.count;
+
+   if (nla_put(skb, SMC_DIAG_CONNINFO, sizeof(cinfo), &cinfo) < 0)
+   return false;
+
+   return true;
+}
+
+static bool smc_diag_fill_lgrinfo(struct

[PATCH net-next v2 02/15] net/smc: Use active link of the connection

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Use the active link of the connection directly, and not
the linkgroup's link array, when obtaining the link
data of the connection.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 net/smc/smc_diag.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index f15fca59b4b2..c2225231f679 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -160,17 +160,17 @@ static int __smc_diag_dump(struct sock *sk, struct sk_buff *skb,
!list_empty(&smc->conn.lgr->list)) {
struct smc_diag_lgrinfo linfo = {
.role = smc->conn.lgr->role,
-   .lnk[0].ibport = smc->conn.lgr->lnk[0].ibport,
-   .lnk[0].link_id = smc->conn.lgr->lnk[0].link_id,
+   .lnk[0].ibport = smc->conn.lnk->ibport,
+   .lnk[0].link_id = smc->conn.lnk->link_id,
};
 
memcpy(linfo.lnk[0].ibname,
   smc->conn.lgr->lnk[0].smcibdev->ibdev->name,
-  sizeof(smc->conn.lgr->lnk[0].smcibdev->ibdev->name));
+  sizeof(smc->conn.lnk->smcibdev->ibdev->name));
smc_gid_be16_convert(linfo.lnk[0].gid,
-smc->conn.lgr->lnk[0].gid);
+smc->conn.lnk->gid);
smc_gid_be16_convert(linfo.lnk[0].peer_gid,
-smc->conn.lgr->lnk[0].peer_gid);
+smc->conn.lnk->peer_gid);
 
if (nla_put(skb, SMC_DIAG_LGRINFO, sizeof(linfo), &linfo) < 0)
goto errout;
-- 
2.17.1

[PATCH net-next v2 03/15] net/smc: Add connection counters for links

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Add connection counters to the structure of the link.
Increase/decrease the counters as needed in the corresponding
routines.
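
As a rough userspace sketch of the bookkeeping described above (not the kernel code), the following uses C11 atomics in place of the kernel's atomic_t. The `demo_*` types and functions are hypothetical; they mirror the assign, switch and unregister steps of the patch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for struct smc_link / smc_connection. */
struct demo_link {
	atomic_int conn_cnt;	/* connections currently using this link */
};

struct demo_conn {
	struct demo_link *lnk;	/* active link of the connection, or NULL */
};

/* Start counting at zero, as atomic_set(&lnk->conn_cnt, 0) does on link init. */
void demo_link_init(struct demo_link *lnk)
{
	atomic_init(&lnk->conn_cnt, 0);
}

/* Assign a link to a connection and count it; fail if there is none. */
int demo_assign(struct demo_conn *conn, struct demo_link *lnk)
{
	conn->lnk = lnk;
	if (!conn->lnk)
		return -1;
	atomic_fetch_add(&conn->lnk->conn_cnt, 1);
	return 0;
}

/* Move a connection to another link, keeping both counters consistent. */
void demo_switch(struct demo_conn *conn, struct demo_link *to_lnk)
{
	atomic_fetch_sub(&conn->lnk->conn_cnt, 1);
	conn->lnk = to_lnk;
	atomic_fetch_add(&conn->lnk->conn_cnt, 1);
}

/* Drop the count when the connection goes away, if it had a link. */
void demo_unregister(struct demo_conn *conn)
{
	if (conn->lnk)
		atomic_fetch_sub(&conn->lnk->conn_cnt, 1);
}
```

Note that the sketch does no locking of its own; in the patch the switch happens with conn->send_lock held.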

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 net/smc/smc_core.c | 16 ++--
 net/smc/smc_core.h |  1 +
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 2b19863f7171..6e2077161267 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -139,6 +139,7 @@ static int smcr_lgr_conn_assign_link(struct smc_connection *conn, bool first)
}
if (!conn->lnk)
return SMC_CLC_DECL_NOACTLINK;
+   atomic_inc(&conn->lnk->conn_cnt);
return 0;
 }
 
@@ -180,6 +181,8 @@ static void __smc_lgr_unregister_conn(struct smc_connection *conn)
struct smc_link_group *lgr = conn->lgr;
 
rb_erase(&conn->alert_node, &lgr->conns_all);
+   if (conn->lnk)
+   atomic_dec(&conn->lnk->conn_cnt);
lgr->conns_num--;
conn->alert_token_local = 0;
sock_put(&smc->sk); /* sock_hold in smc_lgr_register_conn() */
@@ -314,6 +317,7 @@ int smcr_link_init(struct smc_link_group *lgr, struct smc_link *lnk,
lnk->smcibdev = ini->ib_dev;
lnk->ibport = ini->ib_port;
lnk->path_mtu = ini->ib_dev->pattr[ini->ib_port - 1].active_mtu;
+   atomic_set(&lnk->conn_cnt, 0);
smc_llc_link_set_uid(lnk);
INIT_WORK(&lnk->link_down_wrk, smc_link_down_work);
if (!ini->ib_dev->initialized) {
@@ -526,6 +530,14 @@ static int smc_switch_cursor(struct smc_sock *smc, struct smc_cdc_tx_pend *pend,
return rc;
 }
 
+static inline void smc_switch_link_and_count(struct smc_connection *conn,
+struct smc_link *to_lnk)
+{
+   atomic_dec(&conn->lnk->conn_cnt);
+   conn->lnk = to_lnk;
+   atomic_inc(&conn->lnk->conn_cnt);
+}
+
 struct smc_link *smc_switch_conns(struct smc_link_group *lgr,
  struct smc_link *from_lnk, bool is_dev_err)
 {
@@ -574,7 +586,7 @@ struct smc_link *smc_switch_conns(struct smc_link_group *lgr,
smc->sk.sk_state == SMC_PEERABORTWAIT ||
smc->sk.sk_state == SMC_PROCESSABORT) {
spin_lock_bh(&conn->send_lock);
-   conn->lnk = to_lnk;
+   smc_switch_link_and_count(conn, to_lnk);
spin_unlock_bh(&conn->send_lock);
continue;
}
@@ -588,7 +600,7 @@ struct smc_link *smc_switch_conns(struct smc_link_group *lgr,
}
/* avoid race with smcr_tx_sndbuf_nonempty() */
spin_lock_bh(&conn->send_lock);
-   conn->lnk = to_lnk;
+   smc_switch_link_and_count(conn, to_lnk);
rc = smc_switch_cursor(smc, pend, wr_buf);
spin_unlock_bh(&conn->send_lock);
sock_put(&smc->sk);
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 9aee54a6bcba..83a88a4635db 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -129,6 +129,7 @@ struct smc_link {
struct delayed_work llc_testlink_wrk; /* testlink worker */
struct completion   llc_testlink_resp; /* wait for rx of testlink */
int llc_testlink_time; /* testlink interval */
+   atomic_t            conn_cnt;
 };
 
 /* For now we just allow one parallel link per link group. The SMC protocol
-- 
2.17.1

[PATCH net-next v2 09/15] net/smc: Introduce SMCR get linkgroup command

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Introduce the get linkgroup command, which loops through
all available SMC-R linkgroups. It uses the SMC-R linkgroup
list as its entry point, not the socket list, which makes
linkgroup diagnosis possible even when a linkgroup no longer
contains active connections.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 include/net/smc.h |  2 +-
 include/uapi/linux/smc.h  |  5 ++
 include/uapi/linux/smc_diag.h | 43 +
 net/smc/smc.h |  5 +-
 net/smc/smc_core.c|  3 +-
 net/smc/smc_core.h|  1 -
 net/smc/smc_diag.c| 88 +++
 7 files changed, 141 insertions(+), 6 deletions(-)

diff --git a/include/net/smc.h b/include/net/smc.h
index e441aa97ad61..59d25dcb8e92 100644
--- a/include/net/smc.h
+++ b/include/net/smc.h
@@ -10,8 +10,8 @@
  */
 #ifndef _SMC_H
 #define _SMC_H
+#include 
 
-#define SMC_MAX_PNETID_LEN 16  /* Max. length of PNET id */
 
 struct smc_hashinfo {
rwlock_t lock;
diff --git a/include/uapi/linux/smc.h b/include/uapi/linux/smc.h
index 0e11ca421ca4..635e2c2aeac5 100644
--- a/include/uapi/linux/smc.h
+++ b/include/uapi/linux/smc.h
@@ -3,6 +3,7 @@
  *  Shared Memory Communications over RDMA (SMC-R) and RoCE
  *
  *  Definitions for generic netlink based configuration of an SMC-R PNET table
+ *  Definitions for SMC Linkgroup and Devices.
  *
  *  Copyright IBM Corp. 2016
  *
@@ -33,4 +34,8 @@ enum { /* SMC PNET Table commands */
 #define SMCR_GENL_FAMILY_NAME  "SMC_PNETID"
 #define SMCR_GENL_FAMILY_VERSION   1
 
+#define SMC_MAX_PNETID_LEN     16 /* Max. length of PNET id */
+#define SMC_LGR_ID_SIZE        4
+#define SMC_MAX_HOSTNAME_LEN   32 /* Max length of hostname */
+#define SMC_MAX_EID_LEN        32 /* Max length of eid */
 #endif /* _UAPI_LINUX_SMC_H */
diff --git a/include/uapi/linux/smc_diag.h b/include/uapi/linux/smc_diag.h
index 236c1c52d562..6ae028344b6d 100644
--- a/include/uapi/linux/smc_diag.h
+++ b/include/uapi/linux/smc_diag.h
@@ -4,8 +4,10 @@
 
 #include 
 #include 
+#include 
 #include 
 
+#define SMC_DIAG_EXTS_PER_CMD 16
 /* Sequence numbers */
 enum {
MAGIC_SEQ = 123456,
@@ -21,6 +23,17 @@ struct smc_diag_req {
struct inet_diag_sockid id;
 };
 
+/* Request structure v2 */
+struct smc_diag_req_v2 {
+   __u8    diag_family;
+   __u8    pad[2];
+   __u8    diag_ext;   /* Query extended information */
+   struct inet_diag_sockid id;
+   __u32   cmd;
+   __u32   cmd_ext;
+   __u8    cmd_val[8];
+};
+
 /* Base info structure. It contains socket identity (addrs/ports/cookie) based
  * on the internal clcsock, and more SMC-related socket data
  */
@@ -57,7 +70,19 @@ enum {
__SMC_DIAG_MAX,
 };
 
+/* V2 Commands */
+enum {
+   SMC_DIAG_GET_LGR_INFO = SMC_DIAG_EXTS_PER_CMD,
+   __SMC_DIAG_EXT_MAX,
+};
+
+/* SMC_DIAG_GET_LGR_INFO command extensions */
+enum {
+   SMC_DIAG_LGR_INFO_SMCR = 1,
+};
+
 #define SMC_DIAG_MAX (__SMC_DIAG_MAX - 1)
+#define SMC_DIAG_EXT_MAX (__SMC_DIAG_EXT_MAX - 1)
 
 /* SMC_DIAG_CONNINFO */
 
@@ -88,6 +113,14 @@ struct smc_diag_conninfo {
struct smc_diag_cursor  tx_fin; /* confirmed sent cursor */
 };
 
+struct smc_diag_v2_lgr_info {
+   __u8    smc_version;        /* SMC Version */
+   __u8    peer_smc_release;   /* Peer SMC Version */
+   __u8    peer_os;            /* Peer operating system */
+   __u8    negotiated_eid[SMC_MAX_EID_LEN]; /* Negotiated EID */
+   __u8    peer_hostname[SMC_MAX_HOSTNAME_LEN]; /* Peer host */
+};
+
 /* SMC_DIAG_LINKINFO */
 
 struct smc_diag_linkinfo {
@@ -116,4 +149,14 @@ struct smcd_diag_dmbinfo { /* SMC-D Socket internals */
__aligned_u64   peer_token; /* Token of remote DMBE */
 };
 
+struct smc_diag_lgr {
+   __u8    lgr_id[SMC_LGR_ID_SIZE]; /* Linkgroup identifier */
+   __u8    lgr_role;   /* Linkgroup role */
+   __u8    lgr_type;   /* Linkgroup type */
+   __u8    pnet_id[SMC_MAX_PNETID_LEN]; /* Linkgroup pnet id */
+   __u8    vlan_id;    /* Linkgroup vlan id */
+   __u32   conns_num;  /* Number of connections */
+   __u8    reserved;   /* Reserved for future use */
+   struct smc_diag_v2_lgr_info v2_lgr_info; /* SMCv2 info */
+};
 #endif /* _UAPI_SMC_DIAG_H_ */
diff --git a/net/smc/smc.h b/net/smc/smc.h
index d65e15f0c944..447cf9be979d 100644
--- a/net/smc/smc.h
+++ b/net/smc/smc.h
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include  /* __aligned */
+#include 
 #include 
 
 #include "smc_ib.h"
@@ -29,11 +30,9 @@
 * devices
 */
 
-#define SMC_MAX_HOSTNAME_LEN   32
-#define SMC_MAX_E

[PATCH net-next v2 13/15] net/smc: Add support for obtaining SMCR device list

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Deliver SMCR device information via netlink based
diagnostic interface.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 include/uapi/linux/smc_diag.h |   6 ++
 net/smc/smc_diag.c| 133 ++
 net/smc/smc_ib.c  |   2 +
 3 files changed, 141 insertions(+)

diff --git a/include/uapi/linux/smc_diag.h b/include/uapi/linux/smc_diag.h
index ab8f76bdd1a4..4c6332785533 100644
--- a/include/uapi/linux/smc_diag.h
+++ b/include/uapi/linux/smc_diag.h
@@ -88,6 +88,7 @@ enum {
 /* SMC_DIAG_GET_DEV_INFO command extensions */
 enum {
SMC_DIAG_DEV_INFO_SMCD = 1,
+   SMC_DIAG_DEV_INFO_SMCR,
 };
 
 #define SMC_DIAG_MAX (__SMC_DIAG_MAX - 1)
@@ -182,6 +183,11 @@ struct smc_diag_dev_info {
__u16   pci_vendor; /* PCI Vendor */
__u16   pci_device; /* PCI Device Vendor ID */
__u8    pci_id[SMC_PCI_ID_STR_LEN]; /* PCI ID */
+   __u8    dev_name[IB_DEVICE_NAME_MAX]; /* IB Device name */
+   __u8    netdev[SMC_MAX_PORTS][IFNAMSIZ]; /* Netdev name(s) */
+   __u8    port_state[SMC_MAX_PORTS]; /* IB Port State */
+   __u8    port_valid[SMC_MAX_PORTS]; /* Is IB Port valid */
+   __u32   lnk_cnt_by_port[SMC_MAX_PORTS]; /* # lnks per port */
 };
 
 struct smc_diag_lgr {
diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index 252aae0b11d9..58bfbe0bef4d 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -365,6 +365,34 @@ static int smc_diag_handle_lgr(struct smc_link_group *lgr,
return rc;
 }
 
+static bool smcr_diag_is_dev_critical(struct smc_lgr_list *smc_lgr,
+ struct smc_ib_device *smcibdev)
+{
+   struct smc_link_group *lgr;
+   bool rc = false;
+   int i;
+
+   spin_lock_bh(&smc_lgr->lock);
+   list_for_each_entry(lgr, &smc_lgr->list, list) {
+   if (lgr->is_smcd)
+   continue;
+   for (i = 0; i < SMC_LINKS_PER_LGR_MAX; i++) {
+   if (lgr->lnk[i].state == SMC_LNK_UNUSED)
+   continue;
+   if (lgr->lnk[i].smcibdev == smcibdev) {
+   if (lgr->type == SMC_LGR_SINGLE ||
+   lgr->type == SMC_LGR_ASYMMETRIC_LOCAL) {
+   rc = true;
+   goto out;
+   }
+   }
+   }
+   }
+out:
+   spin_unlock_bh(&smc_lgr->lock);
+   return rc;
+}
+
 static int smc_diag_fill_lgr_list(struct smc_lgr_list *smc_lgr,
  struct sk_buff *skb,
  struct netlink_callback *cb,
@@ -520,6 +548,108 @@ static int smc_diag_prep_smcd_dev(struct smcd_dev_list *dev_list,
return rc;
 }
 
+static inline void smc_diag_handle_dev_port(struct smc_diag_dev_info *smc_diag_dev,
+   struct ib_device *ibdev,
+   struct smc_ib_device *smcibdev,
+   int port)
+{
+   unsigned char port_state;
+
+   smc_diag_dev->port_valid[port] = 1;
+   snprintf((char *)&smc_diag_dev->netdev[port],
+sizeof(smc_diag_dev->netdev[port]),
+"%s", (char *)&smcibdev->netdev[port]);
+   snprintf((char *)&smc_diag_dev->pnet_id[port],
+sizeof(smc_diag_dev->pnet_id[port]), "%s",
+(char *)&smcibdev->pnetid[port]);
+   smc_diag_dev->pnetid_by_user[port] = smcibdev->pnetid_by_user[port];
+   port_state = smc_ib_port_active(smcibdev, port + 1);
+   smc_diag_dev->port_state[port] = port_state;
+   smc_diag_dev->lnk_cnt_by_port[port] =
+   atomic_read(&smcibdev->lnk_cnt_by_port[port]);
+}
+
+static int smc_diag_handle_smcr_dev(struct smc_ib_device *smcibdev,
+   struct sk_buff *skb,
+   struct netlink_callback *cb,
+   struct smc_diag_req_v2 *req)
+{
+   struct smc_diag_dev_info smc_dev;
+   struct smc_pci_dev smc_pci_dev;
+   struct pci_dev *pci_dev;
+   unsigned char is_crit;
+   struct nlmsghdr *nlh;
+   int dummy = 0;
+   int i, rc = 0;
+
+   nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, MAGIC_SEQ_V2_ACK,
+   cb->nlh->nlmsg_type, 0, NLM_F_MULTI);
+   if (!nlh)
+   return -EMSGSIZE;
+
+   memset(&smc_dev, 0, sizeof(smc_dev));
+   memset(&smc_pci_dev, 0, sizeof(smc_pci_dev));
+   for (i = 1; i <= SMC_MAX_PORTS; i++) {
+   if (rdma_is_port_valid(smcibdev->ibdev, i)) {
+   smc_diag_handle_dev_port(&smc_dev, smcibdev->ibdev,
+smcibdev, i - 1);
+

[PATCH net-next v2 01/15] net/smc: use helper smc_conn_abort() in listen processing

2020-11-03 Thread Karsten Graul

The helper smc_connect_abort() can be used by the listen processing
functions, too. Rename it to smc_conn_abort() to make the
purpose clearer.
No functional change.

Signed-off-by: Karsten Graul 
---
 net/smc/af_smc.c | 17 +
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 527185af7bf3..bc3e45289771 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -552,8 +552,7 @@ static int smc_connect_decline_fallback(struct smc_sock *smc, int reason_code,
return smc_connect_fallback(smc, reason_code);
 }
 
-/* abort connecting */
-static void smc_connect_abort(struct smc_sock *smc, int local_first)
+static void smc_conn_abort(struct smc_sock *smc, int local_first)
 {
if (local_first)
smc_lgr_cleanup_early(&smc->conn);
@@ -814,7 +813,7 @@ static int smc_connect_rdma(struct smc_sock *smc,
 
return 0;
 connect_abort:
-   smc_connect_abort(smc, ini->first_contact_local);
+   smc_conn_abort(smc, ini->first_contact_local);
mutex_unlock(&smc_client_lgr_pending);
smc->connect_nonblock = 0;
 
@@ -893,7 +892,7 @@ static int smc_connect_ism(struct smc_sock *smc,
 
return 0;
 connect_abort:
-   smc_connect_abort(smc, ini->first_contact_local);
+   smc_conn_abort(smc, ini->first_contact_local);
mutex_unlock(&smc_server_lgr_pending);
smc->connect_nonblock = 0;
 
@@ -1320,10 +1319,7 @@ static void smc_listen_decline(struct smc_sock *new_smc, int reason_code,
   int local_first, u8 version)
 {
/* RDMA setup failed, switch back to TCP */
-   if (local_first)
-   smc_lgr_cleanup_early(&new_smc->conn);
-   else
-   smc_conn_free(&new_smc->conn);
+   smc_conn_abort(new_smc, local_first);
if (reason_code < 0) { /* error, no fallback possible */
smc_listen_out_err(new_smc);
return;
@@ -1429,10 +1425,7 @@ static int smc_listen_ism_init(struct smc_sock *new_smc,
/* Create send and receive buffers */
rc = smc_buf_create(new_smc, true);
if (rc) {
-   if (ini->first_contact_local)
-   smc_lgr_cleanup_early(&new_smc->conn);
-   else
-   smc_conn_free(&new_smc->conn);
+   smc_conn_abort(new_smc, ini->first_contact_local);
return (rc == -ENOSPC) ? SMC_CLC_DECL_MAX_DMB :
 SMC_CLC_DECL_MEM;
}
-- 
2.17.1

[PATCH net-next v2 08/15] net/smc: Add ability to work with extended SMC netlink API

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

The smc_diag module should work with both the legacy and the
extended netlink API. This is done by using the sequence field
of the netlink message header. The sequence field is optional and is
filled with the constant value MAGIC_SEQ in the current
implementation.
The new constant values MAGIC_SEQ_V2 and MAGIC_SEQ_V2_ACK
signal the use of the new netlink API between userspace and
kernel.
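
The dispatch rule described above can be sketched in plain C as follows. The enum of sequence constants is copied from the patch; `enum demo_handler` and `demo_select_dump()` are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Sequence numbers, as defined by the patch in smc_diag.h. */
enum {
	MAGIC_SEQ = 123456,
	MAGIC_SEQ_V2,
	MAGIC_SEQ_V2_ACK,
};

/* Hypothetical labels for the two dump callbacks. */
enum demo_handler {
	DEMO_DUMP_LEGACY,	/* stands for smc_diag_dump */
	DEMO_DUMP_EXT,		/* stands for smc_diag_dump_ext */
};

/* Choose the dump callback from the request's nlmsg_seq value. */
enum demo_handler demo_select_dump(uint32_t nlmsg_seq)
{
	return nlmsg_seq >= MAGIC_SEQ_V2 ? DEMO_DUMP_EXT : DEMO_DUMP_LEGACY;
}
```

Because the test is `>=`, any request whose sequence number is MAGIC_SEQ_V2 or larger is routed to the extended handler, so a legacy client presumably has to keep sending MAGIC_SEQ (or a smaller value) to stay on the old path.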

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 include/uapi/linux/smc_diag.h |  7 +++
 net/smc/smc_diag.c| 21 +
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/smc_diag.h b/include/uapi/linux/smc_diag.h
index 8cb3a6fef553..236c1c52d562 100644
--- a/include/uapi/linux/smc_diag.h
+++ b/include/uapi/linux/smc_diag.h
@@ -6,6 +6,13 @@
 #include 
 #include 
 
+/* Sequence numbers */
+enum {
+   MAGIC_SEQ = 123456,
+   MAGIC_SEQ_V2,
+   MAGIC_SEQ_V2_ACK,
+};
+
 /* Request structure */
 struct smc_diag_req {
__u8    diag_family;
diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index 44be723c97fe..bc2b616524ff 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -293,19 +293,24 @@ static int smc_diag_dump(struct sk_buff *skb, struct netlink_callback *cb)
return skb->len;
 }
 
+static int smc_diag_dump_ext(struct sk_buff *skb, struct netlink_callback *cb)
+{
+   return skb->len;
+}
+
 static int smc_diag_handler_dump(struct sk_buff *skb, struct nlmsghdr *h)
 {
struct net *net = sock_net(skb->sk);
-
+   struct netlink_dump_control c = {
+   .min_dump_alloc = SKB_WITH_OVERHEAD(32768),
+   };
if (h->nlmsg_type == SOCK_DIAG_BY_FAMILY &&
h->nlmsg_flags & NLM_F_DUMP) {
-   {
-   struct netlink_dump_control c = {
-   .dump = smc_diag_dump,
-   .min_dump_alloc = SKB_WITH_OVERHEAD(32768),
-   };
-   return netlink_dump_start(net->diag_nlsk, skb, h, &c);
-   }
+   if (h->nlmsg_seq >= MAGIC_SEQ_V2)
+   c.dump = smc_diag_dump_ext;
+   else
+   c.dump = smc_diag_dump;
+   return netlink_dump_start(net->diag_nlsk, skb, h, &c);
}
return 0;
 }
-- 
2.17.1

[PATCH net-next v2 11/15] net/smc: Add SMC-D Linkgroup diagnostic support

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Deliver SMCD Linkgroup information via netlink based
diagnostic interface.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 include/uapi/linux/smc_diag.h |   7 +++
 net/smc/smc_diag.c| 108 ++
 net/smc/smc_ism.c |   2 +
 3 files changed, 117 insertions(+)

diff --git a/include/uapi/linux/smc_diag.h b/include/uapi/linux/smc_diag.h
index a57df0296aa4..5a80172df757 100644
--- a/include/uapi/linux/smc_diag.h
+++ b/include/uapi/linux/smc_diag.h
@@ -81,6 +81,7 @@ enum {
 enum {
SMC_DIAG_LGR_INFO_SMCR = 1,
SMC_DIAG_LGR_INFO_SMCR_LINK,
+   SMC_DIAG_LGR_INFO_SMCD,
 };
 
 #define SMC_DIAG_MAX (__SMC_DIAG_MAX - 1)
@@ -155,6 +156,12 @@ struct smcd_diag_dmbinfo { /* SMC-D Socket internals */
__aligned_u64   my_gid; /* My GID */
__aligned_u64   token;  /* Token of DMB */
__aligned_u64   peer_token; /* Token of remote DMBE */
+   /* Fields above used by legacy v1 code */
+   __u8    pnet_id[SMC_MAX_PNETID_LEN]; /* Pnet ID */
+   __u32   conns_num;  /* Number of connections */
+   __u16   chid;       /* Linkgroup CHID */
+   __u8    vlan_id;    /* Linkgroup vlan id */
+   struct smc_diag_v2_lgr_info v2_lgr_info; /* SMCv2 info */
 };
 
 struct smc_diag_lgr {
diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index 6885814b6e4f..fcff07a9ea47 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -21,6 +21,7 @@
 
 #include "smc.h"
 #include "smc_ib.h"
+#include "smc_ism.h"
 #include "smc_core.h"
 
 struct smc_diag_dump_ctx {
@@ -252,6 +253,53 @@ static int smc_diag_fill_lgr_link(struct smc_link_group *lgr,
return -EMSGSIZE;
 }
 
+static int smc_diag_fill_smcd_lgr(struct smc_link_group *lgr,
+ struct sk_buff *skb,
+ struct netlink_callback *cb,
+ struct smc_diag_req_v2 *req)
+{
+   struct smcd_diag_dmbinfo smcd_lgr;
+   struct nlmsghdr *nlh;
+   int dummy = 0;
+   int rc = 0;
+
+   nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, MAGIC_SEQ_V2_ACK,
+   cb->nlh->nlmsg_type, 0, NLM_F_MULTI);
+   if (!nlh)
+   return -EMSGSIZE;
+
+   memset(&smcd_lgr, 0, sizeof(smcd_lgr));
+   memcpy(&smcd_lgr.linkid, lgr->id, sizeof(lgr->id));
+   smcd_lgr.conns_num = lgr->conns_num;
+   smcd_lgr.vlan_id = lgr->vlan_id;
+   smcd_lgr.peer_gid = lgr->peer_gid;
+   smcd_lgr.my_gid = lgr->smcd->local_gid;
+   smcd_lgr.chid = smc_ism_get_chid(lgr->smcd);
+   memcpy(&smcd_lgr.v2_lgr_info.negotiated_eid, lgr->negotiated_eid,
+  sizeof(smcd_lgr.v2_lgr_info.negotiated_eid));
+   memcpy(&smcd_lgr.v2_lgr_info.peer_hostname, lgr->peer_hostname,
+  sizeof(smcd_lgr.v2_lgr_info.peer_hostname));
+   smcd_lgr.v2_lgr_info.peer_os = lgr->peer_os;
+   smcd_lgr.v2_lgr_info.peer_smc_release = lgr->peer_smc_release;
+   smcd_lgr.v2_lgr_info.smc_version = lgr->smc_version;
+   snprintf(smcd_lgr.pnet_id, sizeof(smcd_lgr.pnet_id), "%s",
+lgr->smcd->pnetid);
+
+   /* Just a command place holder to signal back the command reply type */
+   if (nla_put(skb, SMC_DIAG_GET_LGR_INFO, sizeof(dummy), &dummy) < 0)
+   goto errout;
+
+   if (nla_put(skb, SMC_DIAG_LGR_INFO_SMCD,
+   sizeof(smcd_lgr), &smcd_lgr) < 0)
+   goto errout;
+
+   nlmsg_end(skb, nlh);
+   return rc;
+errout:
+   nlmsg_cancel(skb, nlh);
+   return -EMSGSIZE;
+}
+
 static int smc_diag_fill_lgr(struct smc_link_group *lgr,
 struct sk_buff *skb,
 struct netlink_callback *cb,
@@ -343,6 +391,63 @@ static int smc_diag_fill_lgr_list(struct smc_lgr_list *smc_lgr,
return rc;
 }
 
+static int smc_diag_handle_smcd_lgr(struct smcd_dev *dev,
+   struct sk_buff *skb,
+   struct netlink_callback *cb,
+   struct smc_diag_req_v2 *req)
+{
+   struct smc_diag_dump_ctx *cb_ctx = smc_dump_context(cb);
+   struct smc_link_group *lgr;
+   int snum = cb_ctx->pos[1];
+   int rc = 0, num = 0;
+
+   spin_lock_bh(&dev->lgr_lock);
+   list_for_each_entry(lgr, &dev->lgr_list, list) {
+   if (lgr->is_smcd) {
+   if (num < snum)
+   goto next;
+   rc = smc_diag_fill_smcd_lgr(lgr, skb, cb, req);
+   if (rc < 0)
+   goto errout;
+next:
+   num++;
+   }
+   }
+errout:
+   spin_unlock_bh(&dev->lgr_lock);
+   cb_ctx->pos[1] = num;
+   return rc;
+}
+
+static int smc_diag_fill_smcd_dev(struct smcd_dev_list *dev_list,
+

[PATCH net-next v2 15/15] net/smc: Add support for obtaining system information

2020-11-03 Thread Karsten Graul

From: Guvenc Gulce 

Add new netlink command to obtain system information
of the smc module.

Signed-off-by: Guvenc Gulce 
Signed-off-by: Karsten Graul 
---
 include/uapi/linux/smc.h  |  1 +
 include/uapi/linux/smc_diag.h | 18 ++
 net/smc/smc_clc.c |  6 
 net/smc/smc_clc.h |  1 +
 net/smc/smc_diag.c| 62 +++
 net/smc/smc_ism.c |  1 +
 6 files changed, 89 insertions(+)

diff --git a/include/uapi/linux/smc.h b/include/uapi/linux/smc.h
index 736e8b98c8a5..04385a98037a 100644
--- a/include/uapi/linux/smc.h
+++ b/include/uapi/linux/smc.h
@@ -38,6 +38,7 @@ enum { /* SMC PNET Table commands */
 #define SMC_LGR_ID_SIZE        4
 #define SMC_MAX_HOSTNAME_LEN   32 /* Max length of hostname */
 #define SMC_MAX_EID_LEN        32 /* Max length of eid */
+#define SMC_MAX_EID            8 /* Max number of eids */
 #define SMC_MAX_PORTS          2 /* Max # of ports per ib device */
 #define SMC_PCI_ID_STR_LEN     16 /* Max length of pci id string */
 #endif /* _UAPI_LINUX_SMC_H */
diff --git a/include/uapi/linux/smc_diag.h b/include/uapi/linux/smc_diag.h
index 4c6332785533..7409e7a854df 100644
--- a/include/uapi/linux/smc_diag.h
+++ b/include/uapi/linux/smc_diag.h
@@ -75,6 +75,7 @@ enum {
 enum {
SMC_DIAG_GET_LGR_INFO = SMC_DIAG_EXTS_PER_CMD,
SMC_DIAG_GET_DEV_INFO,
+   SMC_DIAG_GET_SYS_INFO,
__SMC_DIAG_EXT_MAX,
 };
 
@@ -91,6 +92,11 @@ enum {
SMC_DIAG_DEV_INFO_SMCR,
 };
 
+/* SMC_DIAG_GET_SYS_INFO command extensions */
+enum {
+   SMC_DIAG_SYS_INFO = 1,
+};
+
 #define SMC_DIAG_MAX (__SMC_DIAG_MAX - 1)
 #define SMC_DIAG_EXT_MAX (__SMC_DIAG_EXT_MAX - 1)
 
@@ -131,6 +137,18 @@ struct smc_diag_v2_lgr_info {
__u8    peer_hostname[SMC_MAX_HOSTNAME_LEN]; /* Peer host */
 };
 
+
+struct smc_system_info {
+   __u8    smc_version;    /* SMC Version */
+   __u8    smc_release;    /* SMC Release */
+   __u8    ueid_count;     /* Number of UEIDs */
+   __u8    smc_ism_is_v2;  /* Is ISM SMC v2 capable */
+   __u32   reserved;       /* Reserved for future use */
+   __u8    local_hostname[SMC_MAX_HOSTNAME_LEN]; /* Hostnames */
+   __u8    seid[SMC_MAX_EID_LEN];  /* System EID */
+   __u8    ueid[SMC_MAX_EID][SMC_MAX_EID_LEN]; /* User EIDs */
+};
+
 /* SMC_DIAG_LINKINFO */
 
 struct smc_diag_linkinfo {
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c
index 696d89c2dce4..ca887ee6b249 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -772,6 +772,12 @@ int smc_clc_send_accept(struct smc_sock *new_smc, bool srv_first_contact,
return len > 0 ? 0 : len;
 }
 
+void smc_clc_get_hostname(u8 **host)
+{
+   *host = &smc_hostname[0];
+}
+EXPORT_SYMBOL_GPL(smc_clc_get_hostname);
+
 void __init smc_clc_init(void)
 {
struct new_utsname *u;
diff --git a/net/smc/smc_clc.h b/net/smc/smc_clc.h
index e7ab05683bc9..9ed9eb3abe46 100644
--- a/net/smc/smc_clc.h
+++ b/net/smc/smc_clc.h
@@ -334,5 +334,6 @@ int smc_clc_send_confirm(struct smc_sock *smc, bool clnt_first_contact,
 int smc_clc_send_accept(struct smc_sock *smc, bool srv_first_contact,
u8 version);
 void smc_clc_init(void) __init;
+void smc_clc_get_hostname(u8 **host);
 
 #endif
diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c
index 58bfbe0bef4d..a69a401329ab 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -23,6 +23,7 @@
 #include "smc_ib.h"
 #include "smc_ism.h"
 #include "smc_core.h"
+#include "smc_clc.h"
 
 struct smc_diag_dump_ctx {
int pos[2];
@@ -650,6 +651,63 @@ static int smc_diag_prep_smcr_dev(struct smc_ib_devices *dev_list,
return rc;
 }
 
+static int smc_diag_prep_sys_info(struct smcd_dev_list *dev_list,
+ struct sk_buff *skb,
+ struct netlink_callback *cb,
+ struct smc_diag_req_v2 *req)
+{
+   struct smc_diag_dump_ctx *cb_ctx = smc_dump_context(cb);
+   struct smc_system_info smc_sys_info;
+   int dummy = 0, rc = 0, num = 0;
+   struct smcd_dev *smcd_dev;
+   int snum = cb_ctx->pos[0];
+   struct nlmsghdr *nlh;
+   u8 *seid = NULL;
+   u8 *host = NULL;
+
+   nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, MAGIC_SEQ_V2_ACK,
+   cb->nlh->nlmsg_type, 0, NLM_F_MULTI);
+   if (!nlh)
+   return -EMSGSIZE;
+
+   if (snum > num)
+   goto errout;
+
+   memset(&smc_sys_info, 0, sizeof(smc_sys_info));
+   smc_sys_info.smc_ism_is_v2 = smc_ism_is_v2_capable();
+   smc_sys_info.smc_version = SMC_V2;
+   smc_sys_info.smc_release = SMC_RELEASE;
+   smc_clc_get_hostname(&host);
+
+   if (host)
+   memcpy(smc_sys_info.local_hos

[PATCH] IPv6: Set SIT tunnel hard_header_len to zero

2020-11-03 Thread Oliver Herms

Due to the legacy usage of hard_header_len for SIT tunnels while
already using infrastructure from net/ipv4/ip_tunnel.c, the
calculation of the path MTU in tnl_update_pmtu() is incorrect.
This leads to unnecessary creation of MTU exceptions for any
flow going over a SIT tunnel.

As SIT tunnels do not have a header themselves other than their
transport (L3, L2) headers, leave hard_header_len set to zero,
as tnl_update_pmtu() already takes care of the transport header
sizes.

This will also help avoid unnecessary IPv6 GC runs and the spinlock
contention seen when using SIT tunnels with more than
net.ipv6.route.gc_thresh flows.

Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Oliver Herms 
---
 net/ipv6/sit.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 5e2c34c0ac97..5e7983cb6154 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1128,7 +1128,6 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
if (tdev && !netif_is_l3_master(tdev)) {
int t_hlen = tunnel->hlen + sizeof(struct iphdr);
 
-   dev->hard_header_len = tdev->hard_header_len + sizeof(struct iphdr);
dev->mtu = tdev->mtu - t_hlen;
if (dev->mtu < IPV6_MIN_MTU)
dev->mtu = IPV6_MIN_MTU;
@@ -1426,7 +1425,6 @@ static void ipip6_tunnel_setup(struct net_device *dev)
dev->priv_destructor= ipip6_dev_free;
 
dev->type   = ARPHRD_SIT;
-   dev->hard_header_len= LL_MAX_HEADER + t_hlen;
dev->mtu= ETH_DATA_LEN - t_hlen;
dev->min_mtu= IPV6_MIN_MTU;
dev->max_mtu= IP6_MAX_MTU - t_hlen;
-- 
2.25.1
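
To make the double accounting described in the commit message above concrete, here is a minimal userspace sketch. It assumes a simplified model of the DF branch of the path-MTU calculation (route MTU minus hard_header_len, minus the outer IPv4 header, minus any tunnel header). The function names and the 14-byte Ethernet lower device are assumptions for illustration, not code from the patch.

```c
#include <assert.h>

/* Illustrative constants; not taken from the patch. */
#define IPV4_HDR_LEN 20  /* sizeof(struct iphdr) without options */
#define ETH_HDR_LEN  14  /* hard_header_len of an Ethernet lower device */

/*
 * Simplified model of the path-MTU calculation: the route MTU minus the
 * tunnel device's hard_header_len, the outer IPv4 header and any extra
 * tunnel header.
 */
int model_tunnel_pmtu(int route_mtu, int hard_header_len, int tunnel_hlen)
{
	assert(route_mtu > hard_header_len + IPV4_HDR_LEN + tunnel_hlen);
	return route_mtu - hard_header_len - IPV4_HDR_LEN - tunnel_hlen;
}

/* Legacy SIT value removed by the patch: lower device header + iphdr. */
int legacy_sit_hard_header_len(int lower_hard_header_len)
{
	return lower_hard_header_len + IPV4_HDR_LEN;
}
```

In this model a 1500-byte route MTU yields 1480 with hard_header_len left at zero, but only 1446 with the legacy value, because the outer IPv4 header is subtracted twice and the lower device header once more.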

[PATCH net] net/tls: Fix kernel panic when socket is in TLS ULP

2020-11-03 Thread Vinay Kumar Yadav

A user can initialize the TLS ULP with a setsockopt() call on a socket
before listen() in the case of tls-toe (TLS_HW_RECORD), and with the same
setsockopt() call on a connected socket in the case of kernel TLS (TLS_SW).
In the presence of tls-toe devices, the TLS ULP is initialized, a TLS context
is allocated per listen socket, and the socket listens at the adapter
as well as in the kernel TCP stack. Now consider the scenario in which
connections are established in the kernel stack.
Every close of a connection that was established in the kernel stack
clears the TLS context that was created on the listen socket, causing a
kernel panic.
Address the issue by setting the child socket to base (non TLS ULP)
when the TLS ULP is initialized on the parent socket (listen socket).

Fixes: 76f7164d02d4 ("net/tls: free ctx in sock destruct")
Signed-off-by: Vinay Kumar Yadav 
---
 .../chelsio/inline_crypto/chtls/chtls_cm.c|  3 +++
 net/tls/tls_main.c| 23 ++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c 
b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
index 63aacc184f68..c56cd9c1e40c 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
@@ -1206,6 +1206,9 @@ static struct sock *chtls_recv_sock(struct sock *lsk,
sk_setup_caps(newsk, dst);
ctx = tls_get_ctx(lsk);
newsk->sk_destruct = ctx->sk_destruct;
+   newsk->sk_prot = lsk->sk_prot;
+   inet_csk(newsk)->icsk_ulp_ops = inet_csk(lsk)->icsk_ulp_ops;
+   rcu_assign_pointer(inet_csk(newsk)->icsk_ulp_data, ctx);
csk->sk = newsk;
csk->passive_reap_next = oreq;
csk->tx_chan = cxgb4_port_chan(ndev);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 8d93cea99f2c..9682dacae30c 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -715,7 +715,7 @@ static int tls_init(struct sock *sk)
tls_build_proto(sk);
 
 #ifdef CONFIG_TLS_TOE
-   if (tls_toe_bypass(sk))
+   if (sk->sk_state == TCP_CLOSE && tls_toe_bypass(sk))
return 0;
 #endif
 
@@ -744,6 +744,24 @@ static int tls_init(struct sock *sk)
return rc;
 }
 
+#ifdef CONFIG_TLS_TOE
+static void tls_clone(const struct request_sock *req,
+ struct sock *newsk, const gfp_t priority)
+{
+   struct tls_context *ctx = tls_get_ctx(newsk);
+   struct inet_connection_sock *icsk = inet_csk(newsk);
+
+   /* In presence of TLS TOE devices, TLS ulp is initialized on listen
+* socket so lets child socket back to non tls ULP mode because tcp
+* connections can happen in non TLS TOE mode.
+*/
+   newsk->sk_prot = ctx->sk_proto;
+   newsk->sk_destruct = ctx->sk_destruct;
+   icsk->icsk_ulp_ops = NULL;
+   rcu_assign_pointer(icsk->icsk_ulp_data, NULL);
+}
+#endif
+
 static void tls_update(struct sock *sk, struct proto *p,
   void (*write_space)(struct sock *sk))
 {
@@ -857,6 +875,9 @@ static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = {
.update = tls_update,
.get_info   = tls_get_info,
.get_info_size  = tls_get_info_size,
+#ifdef CONFIG_TLS_TOE
+   .clone  = tls_clone
+#endif
 };
 
 static int __init tls_register(void)
-- 
2.18.1
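
The failure mode in the commit message above can be modelled in userspace. This is only a sketch under stated assumptions: struct model_sock and struct model_ctx are hypothetical stand-ins for the socket and the TLS context, and the three functions are invented for illustration. None of this is the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

struct model_ctx {                 /* stands in for struct tls_context */
	int release_count;         /* how often a close released this context */
};

struct model_sock {                /* stands in for the socket's ULP state */
	const char *proto;         /* "tls" or "tcp" */
	struct model_ctx *ulp_data;
};

/* Child creation copies the listener wholesale, ULP state included. */
struct model_sock model_inherit(const struct model_sock *listener)
{
	assert(listener != NULL);
	return *listener;
}

/* Clone hook in the spirit of tls_clone(): drop the inherited ULP state. */
void model_clone_reset(struct model_sock *child)
{
	child->proto = "tcp";
	child->ulp_data = NULL;
}

/* Close path: a socket that still carries ULP data releases it. */
void model_close(struct model_sock *sk)
{
	if (sk->ulp_data)
		sk->ulp_data->release_count++;
	sk->ulp_data = NULL;
}
```

Without the reset, closing a child releases the context that the listener still points to. With the reset, the child closes as a plain TCP socket and the listener's context is left alone.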

Re: [PATCH v3 net-next 09/12] net: dsa: tag_brcm: let DSA core deal with TX reallocation

2020-11-03 Thread Vladimir Oltean

On Mon, Nov 02, 2020 at 12:34:11PM -0800, Florian Fainelli wrote:
> On 11/1/2020 11:16 AM, Vladimir Oltean wrote:
> > Now that we have a central TX reallocation procedure that accounts for
> > the tagger's needed headroom in a generic way, we can remove the
> > skb_cow_head call.
> >
> > Cc: Florian Fainelli 
> > Signed-off-by: Vladimir Oltean 
>
> Reviewed-by: Florian Fainelli 

Florian, I just noticed that tag_brcm.c has an __skb_put_padto call,
even though it is not a tail tagger. This comes from commit:

commit bf08c34086d159edde5c54902dfa2caa4d9fbd8c
Author: Florian Fainelli 
Date:   Wed Jan 3 22:13:00 2018 -0800

net: dsa: Move padding into Broadcom tagger

Instead of having the different master network device drivers
potentially used by DSA/Broadcom tags, move the padding necessary for
the switches to accept short packets where it makes most sense: within
tag_brcm.c. This avoids multiplying the number of similar commits to
e.g: bgmac, bcmsysport, etc.

Signed-off-by: Florian Fainelli 
Signed-off-by: David S. Miller 

Do you remember why this was needed?
As far as I understand, either the DSA master driver or the MAC itself
should pad frames automatically. Is that not happening on Broadcom SoCs,
or why do you need to pad from DSA?
How should we deal with this? Having tag_brcm.c still do some potential
reallocation defeats the purpose of doing it centrally, in a way. I was
trying to change the prototype of struct dsa_device_ops::xmit to stop
returning a struct sk_buff *, and I stumbled upon this.
Should we just go ahead and pad everything unconditionally in DSA?
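
For readers following the padding question, here is a rough userspace sketch of what padding a short frame involves and why it can force a reallocation. pad_frame() and its buffer model are hypothetical, and the 60-byte minimum assumes the usual Ethernet minimum frame length without FCS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MIN_FRAME_LEN 60  /* minimum Ethernet frame length without FCS */

/*
 * Zero-pad the frame in buf (currently len bytes, capacity cap bytes) up to
 * min_len. Returns the new length, or 0 if the buffer is too small, which is
 * the case where a real driver would have to reallocate.
 */
size_t pad_frame(unsigned char *buf, size_t len, size_t cap, size_t min_len)
{
	assert(buf != NULL && len <= cap);
	if (len >= min_len)
		return len;            /* already long enough */
	if (min_len > cap)
		return 0;              /* not enough tailroom */
	memset(buf + len, 0, min_len - len);
	return min_len;
}
```

The zero return is the interesting case for this thread: when there is not enough tailroom, padding cannot be done in place, which is why a tagger that pads may still end up reallocating.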

Re: [PATCH v2 0/8] slab: provide and use krealloc_array()

2020-11-03 Thread Andy Shevchenko

On Tue, Nov 3, 2020 at 12:13 PM Bartosz Golaszewski  wrote:
> On Tue, Nov 3, 2020 at 5:14 AM Joe Perches  wrote:
> > On Mon, 2020-11-02 at 16:20 +0100, Bartosz Golaszewski wrote:
> > > From: Bartosz Golaszewski 

> Yeah so I had this concern for devm_krealloc() and even sent a patch
> that extended it to honor __GFP_ZERO before I noticed that regular
> krealloc() silently ignores __GFP_ZERO. I'm not sure if this is on
> purpose. Maybe we should either make krealloc() honor __GFP_ZERO or
> explicitly state in its documentation that it ignores it?

And my vote here is to ignore it, for the same reasons: respecting
realloc(3) and keeping to the common-sense idea of REallocating
(capital letters on purpose).

-- 
With Best Regards,
Andy Shevchenko
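
As a userspace illustration of the trade-off discussed above: realloc(3) does not zero the grown region, so a caller who wants that has to do it explicitly and has to know the old element count. The sketch below uses a hypothetical helper, realloc_array_zero(), which is not a kernel or libc API. It combines an overflow-checked array resize with zeroing of the new tail.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Grow (or shrink) an array like realloc(3) and zero only the newly added
 * tail. Returns NULL, leaving the old block untouched, on multiplication
 * overflow or allocation failure.
 */
void *realloc_array_zero(void *p, size_t old_n, size_t new_n, size_t size)
{
	assert(size != 0 && new_n != 0);
	if (new_n > SIZE_MAX / size)
		return NULL;              /* new_n * size would overflow */

	void *q = realloc(p, new_n * size);
	if (!q)
		return NULL;
	if (new_n > old_n)
		memset((char *)q + old_n * size, 0, (new_n - old_n) * size);
	return q;
}
```

The extra old_n argument is the cost of honoring a zeroing request on top of plain realloc(3) semantics in this model.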

Re: [PATCH] fsl/fman: add missing put_devcie() call in fman_port_probe()

2020-11-03 Thread yukuai (C)




On 2020/11/03 9:30, Jakub Kicinski wrote:
> On Sat, 31 Oct 2020 18:54:18 +0800 Yu Kuai wrote:
>> if of_find_device_by_node() succeeds, fman_port_probe() doesn't have a
>> corresponding put_device(). Thus add a jump target to fix the exception
>> handling for this function implementation.
>>
>> Fixes: 0572054617f3 ("fsl/fman: fix dereference null return value")
>> Signed-off-by: Yu Kuai 
>>
>> diff --git a/drivers/net/ethernet/freescale/fman/fman_port.c 
>> b/drivers/net/ethernet/freescale/fman/fman_port.c
>> index d9baac0dbc7d..576ce6df3fce 100644
>> --- a/drivers/net/ethernet/freescale/fman/fman_port.c
>> +++ b/drivers/net/ethernet/freescale/fman/fman_port.c
>> @@ -1799,13 +1799,13 @@ static int fman_port_probe(struct platform_device 
>> *of_dev)
>> of_node_put(fm_node);
>> if (!fm_pdev) {
>> err = -EINVAL;
>> -   goto return_err;
>> +   goto put_device;
>> }
>>
>> @@ -1898,6 +1898,8 @@ static int fman_port_probe(struct platform_device *of_dev)
>>
>>   return_err:
>> of_node_put(port_node);
>> +put_device:
>> +   put_device(&fm_pdev->dev);
>>   free_port:
>> kfree(port);
>> return err;
>
> This does not look right. You're jumping to put_device() when fm_pdev
> is NULL?

Hi,

Oops, it's a silly mistake. Will fix it in the V2 patch.

Thanks,
Yu Kuai

> The order of error handling should be the reverse of the order of
> execution of the function.
> .

[PATCH V2] fsl/fman: add missing put_devcie() call in fman_port_probe()

2020-11-03 Thread Yu Kuai

If of_find_device_by_node() succeeds, fman_port_probe() doesn't have a
corresponding put_device(). Thus add a jump target to fix the exception
handling for this function implementation.

Fixes: 0572054617f3 ("fsl/fman: fix dereference null return value")
Signed-off-by: Yu Kuai 
---
 .../net/ethernet/freescale/fman/fman_port.c   | 23 +--
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fman/fman_port.c 
b/drivers/net/ethernet/freescale/fman/fman_port.c
index d9baac0dbc7d..4ae5d844d1f5 100644
--- a/drivers/net/ethernet/freescale/fman/fman_port.c
+++ b/drivers/net/ethernet/freescale/fman/fman_port.c
@@ -1792,20 +1792,21 @@ static int fman_port_probe(struct platform_device 
*of_dev)
if (!fm_node) {
dev_err(port->dev, "%s: of_get_parent() failed\n", __func__);
err = -ENODEV;
-   goto return_err;
+   goto free_port;
}
 
+   of_node_put(port_node);
fm_pdev = of_find_device_by_node(fm_node);
of_node_put(fm_node);
if (!fm_pdev) {
err = -EINVAL;
-   goto return_err;
+   goto free_port;
}
 
fman = dev_get_drvdata(&fm_pdev->dev);
if (!fman) {
err = -EINVAL;
-   goto return_err;
+   goto put_device;
}
 
err = of_property_read_u32(port_node, "cell-index", &val);
@@ -1813,7 +1814,7 @@ static int fman_port_probe(struct platform_device *of_dev)
dev_err(port->dev, "%s: reading cell-index for %pOF failed\n",
__func__, port_node);
err = -EINVAL;
-   goto return_err;
+   goto put_device;
}
port_id = (u8)val;
port->dts_params.id = port_id;
@@ -1847,7 +1848,7 @@ static int fman_port_probe(struct platform_device *of_dev)
}  else {
dev_err(port->dev, "%s: Illegal port type\n", __func__);
err = -EINVAL;
-   goto return_err;
+   goto put_device;
}
 
port->dts_params.type = port_type;
@@ -1861,7 +1862,7 @@ static int fman_port_probe(struct platform_device *of_dev)
dev_err(port->dev, "%s: incorrect qman-channel-id\n",
__func__);
err = -EINVAL;
-   goto return_err;
+   goto put_device;
}
port->dts_params.qman_channel_id = qman_channel_id;
}
@@ -1871,20 +1872,18 @@ static int fman_port_probe(struct platform_device 
*of_dev)
dev_err(port->dev, "%s: of_address_to_resource() failed\n",
__func__);
err = -ENOMEM;
-   goto return_err;
+   goto put_device;
}
 
port->dts_params.fman = fman;
 
-   of_node_put(port_node);
-
dev_res = __devm_request_region(port->dev, &res, res.start,
resource_size(&res), "fman-port");
if (!dev_res) {
dev_err(port->dev, "%s: __devm_request_region() failed\n",
__func__);
err = -EINVAL;
-   goto free_port;
+   goto put_device;
}
 
port->dts_params.base_addr = devm_ioremap(port->dev, res.start,
@@ -1896,8 +1895,8 @@ static int fman_port_probe(struct platform_device *of_dev)
 
return 0;
 
-return_err:
-   of_node_put(port_node);
+put_device:
+   put_device(&fm_pdev->dev);
 free_port:
kfree(port);
return err;
-- 
2.25.4
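
The label ordering used in the patch above follows the common "unwind in reverse order of acquisition" pattern. Below is a small userspace sketch of that pattern. probe_model(), the resources A/B/C and the unwind_log trace are all hypothetical and exist only to make the release order observable.

```c
#include <assert.h>

/* Trace of release steps so the unwind order can be inspected. */
static char unwind_log[8];
static int unwind_pos;

static void log_step(char c)
{
	assert(unwind_pos < (int)sizeof(unwind_log) - 1);
	unwind_log[unwind_pos++] = c;
	unwind_log[unwind_pos] = '\0';
}

/*
 * Acquire A, then B, then C. fail_at selects which acquisition fails
 * (0 = none). Each label releases exactly what was acquired before the
 * jump, in reverse order of acquisition.
 */
int probe_model(int fail_at)
{
	int err = -1;

	unwind_pos = 0;
	unwind_log[0] = '\0';

	if (fail_at == 1)
		goto out;           /* A failed: nothing to release */
	if (fail_at == 2)
		goto release_a;     /* B failed: only A is held */
	if (fail_at == 3)
		goto release_b;     /* C failed: A and B are held */
	return 0;                   /* success: everything stays held */

release_b:
	log_step('B');
release_a:
	log_step('A');
out:
	return err;
}
```

Jumping to a label that releases something not yet acquired (such as put_device() on a NULL fm_pdev in V1) is the mistake this layout is meant to rule out.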

RE: [PATCH v2 net-next 0/3] fsl/qbman: in_interrupt() cleanup.

2020-11-03 Thread Madalin Bucur

> -----Original Message-----
> From: Sebastian Andrzej Siewior 
> Sent: 02 November 2020 01:23
> To: netdev@vger.kernel.org
> Cc: Horia Geanta ; Aymen Sghaier
> ; Herbert Xu ; David S.
> Miller ; Madalin Bucur ; Jakub
> Kicinski ; Leo Li ; Thomas Gleixner
> ; Sebastian Andrzej Siewior 
> Subject: [PATCH v2 net-next 0/3] fsl/qbman: in_interrupt() cleanup.
> 
> This is the in_interrupt() cleanup for the FSL DPAA framework and the two
> users.
> 
> The `napi' parameter has been renamed to `sched_napi', the other parts
> are the same as in the previous post [0].
> 
> [0] https://lore.kernel.org/linux-arm-kernel/20201027225454.3492351-1-bige...@linutronix.de/
> 
> Sebastian

For the series,

Reviewed-by: Madalin Bucur

[PATCH net-next 0/2] net: axienet: Dynamically enable MDIO interface

2020-11-03 Thread Radhey Shyam Pandey



This patchset dynamically enables the MDIO interface. The background for this
change comes from the Cadence GEM controller (macb), in which MDC is active
only during MDIO read or write operations, while the PHY registers are
read or written. It is implemented as an IP feature.

For axiethernet, as dynamic MDC enable/disable is not supported in hw,
we are implementing it in sw. This change doesn't affect any existing
functionality.

Clayton Rayment (1):
  net: xilinx: axiethernet: Enable dynamic MDIO MDC

Radhey Shyam Pandey (1):
  net: xilinx: axiethernet: Introduce helper functions for MDC
enable/disable

 drivers/net/ethernet/xilinx/xilinx_axienet.h  |  2 +
 drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 21 ++---
 drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c | 56 ++-
 3 files changed, 51 insertions(+), 28 deletions(-)

-- 
2.7.4

[PATCH net-next 1/2] net: xilinx: axiethernet: Introduce helper functions for MDC enable/disable

2020-11-03 Thread Radhey Shyam Pandey

Introduce helper functions to enable/disable the MDIO interface clock. This
change serves as a preparatory patch for the coming feature to dynamically
control the management bus clock.

Signed-off-by: Radhey Shyam Pandey 
---
 drivers/net/ethernet/xilinx/xilinx_axienet.h  |  2 ++
 drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c | 29 +++
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h 
b/drivers/net/ethernet/xilinx/xilinx_axienet.h
index 7326ad4..a03c3ca 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
@@ -378,6 +378,7 @@ struct axidma_bd {
  * @dev:   Pointer to device structure
  * @phy_node:  Pointer to device node structure
  * @mii_bus:   Pointer to MII bus structure
+ * @mii_clk_div: MII bus clock divider value
  * @regs_start: Resource start for axienet device addresses
  * @regs:  Base address for the axienet_local device address space
  * @dma_regs:  Base address for the axidma device address space
@@ -427,6 +428,7 @@ struct axienet_local {
 
/* MDIO bus data */
struct mii_bus *mii_bus;/* MII bus reference */
+   u8 mii_clk_div; /* MII bus clock divider value */
 
/* IO registers, dma functions and IRQs */
resource_size_t regs_start;
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c 
b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
index 435ed30..84d06bf 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
@@ -30,6 +30,23 @@ static int axienet_mdio_wait_until_ready(struct 
axienet_local *lp)
  1, 2);
 }
 
+/* Enable the MDIO MDC. Called prior to a read/write operation */
+static void axienet_mdio_mdc_enable(struct axienet_local *lp)
+{
+   axienet_iow(lp, XAE_MDIO_MC_OFFSET,
+   ((u32)lp->mii_clk_div | XAE_MDIO_MC_MDIOEN_MASK));
+}
+
+/* Disable the MDIO MDC. Called after a read/write operation*/
+static void axienet_mdio_mdc_disable(struct axienet_local *lp)
+{
+   u32 mc_reg;
+
+   mc_reg = axienet_ior(lp, XAE_MDIO_MC_OFFSET);
+   axienet_iow(lp, XAE_MDIO_MC_OFFSET,
+   (mc_reg & ~XAE_MDIO_MC_MDIOEN_MASK));
+}
+
 /**
  * axienet_mdio_read - MDIO interface read function
  * @bus:   Pointer to mii bus structure
@@ -124,7 +141,9 @@ static int axienet_mdio_write(struct mii_bus *bus, int 
phy_id, int reg,
  **/
 int axienet_mdio_enable(struct axienet_local *lp)
 {
-   u32 clk_div, host_clock;
+   u32 host_clock;
+
+   lp->mii_clk_div = 0;
 
if (lp->clk) {
host_clock = clk_get_rate(lp->clk);
@@ -176,19 +195,19 @@ int axienet_mdio_enable(struct axienet_local *lp)
 * "clock-frequency" from the CPU
 */
 
-   clk_div = (host_clock / (MAX_MDIO_FREQ * 2)) - 1;
+   lp->mii_clk_div = (host_clock / (MAX_MDIO_FREQ * 2)) - 1;
/* If there is any remainder from the division of
 * fHOST / (MAX_MDIO_FREQ * 2), then we need to add
 * 1 to the clock divisor or we will surely be above 2.5 MHz
 */
if (host_clock % (MAX_MDIO_FREQ * 2))
-   clk_div++;
+   lp->mii_clk_div++;
 
netdev_dbg(lp->ndev,
   "Setting MDIO clock divisor to %u/%u Hz host clock.\n",
-  clk_div, host_clock);
+  lp->mii_clk_div, host_clock);
 
-   axienet_iow(lp, XAE_MDIO_MC_OFFSET, clk_div | XAE_MDIO_MC_MDIOEN_MASK);
+   axienet_iow(lp, XAE_MDIO_MC_OFFSET, lp->mii_clk_div | XAE_MDIO_MC_MDIOEN_MASK);
 
return axienet_mdio_wait_until_ready(lp);
 }
-- 
2.7.4
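
The divider arithmetic in axienet_mdio_enable() above can be checked in isolation. The following is a userspace sketch that assumes the 2.5 MHz ceiling named in the driver comment and the relation fMDC = fHOST / ((div + 1) * 2). mdio_clk_div() and mdc_freq() are hypothetical names, and the sketch keeps uint32_t throughout rather than modelling the u8 field the patch introduces.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MDIO_FREQ 2500000u  /* 2.5 MHz MDC ceiling */

/* Divider such that host_clock / ((div + 1) * 2) stays at or below MAX_MDIO_FREQ. */
uint32_t mdio_clk_div(uint32_t host_clock)
{
	assert(host_clock >= MAX_MDIO_FREQ * 2);

	uint32_t div = host_clock / (MAX_MDIO_FREQ * 2) - 1;

	/* Any remainder means the truncated divider would overshoot 2.5 MHz. */
	if (host_clock % (MAX_MDIO_FREQ * 2))
		div++;
	return div;
}

/* Resulting MDC frequency for a given divider. */
uint32_t mdc_freq(uint32_t host_clock, uint32_t div)
{
	return host_clock / ((div + 1) * 2);
}
```

For a 100 MHz host clock this gives a divider of 19 and exactly 2.5 MHz. For 156.25 MHz the remainder bumps the divider from 30 to 31, which keeps the MDC just under the limit.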

[PATCH net-next 2/2] net: xilinx: axiethernet: Enable dynamic MDIO MDC

2020-11-03 Thread Radhey Shyam Pandey

From: Clayton Rayment 

The MDIO spec does not require an MDC at all times, only when MDIO
transactions are occurring. This patch allows the xilinx_axienet
driver to disable the MDC when not in use, and re-enable it when
needed. It also simplifies the driver by removing the MDC disable
and enable from the device reset sequence.

Signed-off-by: Clayton Rayment 
Signed-off-by: Radhey Shyam Pandey 
---
 drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 21 --
 drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c | 27 ++-
 2 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c 
b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 529c167..6fea980 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1049,20 +1049,13 @@ static int axienet_open(struct net_device *ndev)
 
dev_dbg(&ndev->dev, "axienet_open()\n");
 
-   /* Disable the MDIO interface till Axi Ethernet Reset is completed.
-* When we do an Axi Ethernet reset, it resets the complete core
-* including the MDIO. MDIO must be disabled before resetting
-* and re-enabled afterwards.
+   /* When we do an Axi Ethernet reset, it resets the complete core
+* including the MDIO. MDIO must be disabled before resetting.
 * Hold MDIO bus lock to avoid MDIO accesses during the reset.
 */
mutex_lock(&lp->mii_bus->mdio_lock);
-   axienet_mdio_disable(lp);
ret = axienet_device_reset(ndev);
-   if (ret == 0)
-   ret = axienet_mdio_enable(lp);
mutex_unlock(&lp->mii_bus->mdio_lock);
-   if (ret < 0)
-   return ret;
 
ret = phylink_of_phy_connect(lp->phylink, lp->dev->of_node, 0);
if (ret) {
@@ -1156,9 +1149,7 @@ static int axienet_stop(struct net_device *ndev)
 
/* Do a reset to ensure DMA is really stopped */
mutex_lock(&lp->mii_bus->mdio_lock);
-   axienet_mdio_disable(lp);
__axienet_device_reset(lp);
-   axienet_mdio_enable(lp);
mutex_unlock(&lp->mii_bus->mdio_lock);
 
cancel_work_sync(&lp->dma_err_task);
@@ -1669,16 +1660,12 @@ static void axienet_dma_err_handler(struct work_struct 
*work)
 
axienet_setoptions(ndev, lp->options &
   ~(XAE_OPTION_TXEN | XAE_OPTION_RXEN));
-   /* Disable the MDIO interface till Axi Ethernet Reset is completed.
-* When we do an Axi Ethernet reset, it resets the complete core
-* including the MDIO. MDIO must be disabled before resetting
-* and re-enabled afterwards.
+   /* When we do an Axi Ethernet reset, it resets the complete core
+* including the MDIO. MDIO must be disabled before resetting.
 * Hold MDIO bus lock to avoid MDIO accesses during the reset.
 */
mutex_lock(&lp->mii_bus->mdio_lock);
-   axienet_mdio_disable(lp);
__axienet_device_reset(lp);
-   axienet_mdio_enable(lp);
mutex_unlock(&lp->mii_bus->mdio_lock);
 
for (i = 0; i < lp->tx_bd_num; i++) {
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c 
b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
index 84d06bf..9c014ce 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c
@@ -65,9 +65,13 @@ static int axienet_mdio_read(struct mii_bus *bus, int 
phy_id, int reg)
int ret;
struct axienet_local *lp = bus->priv;
 
+   axienet_mdio_mdc_enable(lp);
+
ret = axienet_mdio_wait_until_ready(lp);
-   if (ret < 0)
+   if (ret < 0) {
+   axienet_mdio_mdc_disable(lp);
return ret;
+   }
 
axienet_iow(lp, XAE_MDIO_MCR_OFFSET,
(((phy_id << XAE_MDIO_MCR_PHYAD_SHIFT) &
@@ -78,14 +82,17 @@ static int axienet_mdio_read(struct mii_bus *bus, int 
phy_id, int reg)
 XAE_MDIO_MCR_OP_READ_MASK));
 
ret = axienet_mdio_wait_until_ready(lp);
-   if (ret < 0)
+   if (ret < 0) {
+   axienet_mdio_mdc_disable(lp);
return ret;
+   }
 
rc = axienet_ior(lp, XAE_MDIO_MRD_OFFSET) & 0x0000FFFF;
 
dev_dbg(lp->dev, "axienet_mdio_read(phy_id=%i, reg=%x) == %x\n",
phy_id, reg, rc);
 
+   axienet_mdio_mdc_disable(lp);
return rc;
 }
 
@@ -111,9 +118,13 @@ static int axienet_mdio_write(struct mii_bus *bus, int 
phy_id, int reg,
dev_dbg(lp->dev, "axienet_mdio_write(phy_id=%i, reg=%x, val=%x)\n",
phy_id, reg, val);
 
+   axienet_mdio_mdc_enable(lp);
+
ret = axienet_mdio_wait_until_ready(lp);
-   if (ret < 0)
+   if (ret < 0) {
+   axienet_mdio_mdc_disable(lp);
return ret;
+   }
 
axienet_iow(lp, XAE_MDIO_MWD_OFFSET, (u32) val);
axienet_iow(lp, XAE_MDIO_MCR_OFFSET,
@@ -125,8 +136,11 @@ static i

Re: [PATCH v7 0/6] CTU CAN FD open-source IP core SocketCAN driver, PCI, platform integration and documentation

2020-11-03 Thread Marc Kleine-Budde

On 11/3/20 11:00 AM, Pavel Pisa wrote:
> On Saturday 31 of October 2020 12:35:11 Marc Kleine-Budde wrote:
>> On 10/30/20 11:19 PM, Pavel Pisa wrote:
>>> This driver adds support for the CTU CAN FD open-source IP core.
>>
>> Please fix the following checkpatch warnings/errors:
> 
> Yes, I will recheck with the current checkpatch; I have used the 5.4 one
> and may have overlooked something during the last updates.

I used the latest one from linus/master :)

>> -
>> drivers/net/can/ctucanfd/ctucanfd_frame.h
>> -
>> CHECK: Please don't use multiple blank lines
>> #46: FILE: drivers/net/can/ctucanfd/ctucanfd_frame.h:46:
> 
> OK, we will find the reason for this blank line in the header generator.
> 
>> CHECK: Prefer kernel type 'u32' over 'uint32_t'
>> #49: FILE: drivers/net/can/ctucanfd/ctucanfd_frame.h:49:
>> +uint32_t u32;
> 
> In this case, please confirm that even your personal opinion
> is against uint32_t in headers, you request the change.

confirmed :)

> uint32_t is used in many kernel headers and in this case
> allows our tooling to use headers for mutual test of HDL
> design match with HW access in the C.

It's probably historically related :)

> If the reasons to remove uint32_t prevails, we need to
> separate Linux generator from the one used for other
> purposes. When we add Linux mode then we can revamp
> headers even more and in such case we can even invest
> time to switch from structure bitfields to plain bitmask
> defines.

This is another point I wanted to address. Obviously checkpatch doesn't complain
about bitfields, but it's frowned upon.

> It is quite a lot of work and takes some time,
> but if there is consensus I will do it during the next weeks.
> I would like to see what is the preferred way to define
> register bitfields. I personally like the RTEMS approach,
> for which we have prepared a generator from parsed PDFs
> when we added the BSP for TMS570
> 
>   
> https://git.rtems.org/rtems/tree/bsps/arm/tms570/include/bsp/ti_herc/reg_dcan.h#n152

The current Linux way is to define a bitmask with GENMASK() and a single-bit
mask with BIT().

For example the mcp251xfd driver:

First the register offset:
> #define MCP251XFD_REG_CON 0x00

Then a bitmask:
> #define MCP251XFD_REG_CON_TXBWS_MASK GENMASK(31, 28)

And a single bit:
> #define MCP251XFD_REG_CON_ABAT BIT(27)

see:
https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/net/can/spi/mcp251xfd/mcp251xfd.h#L24

The masks are used with FIELD_GET() and FIELD_PREP().

For example:
https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c#L1386
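
To show how the mask-based style fits together, here is a userspace sketch with stand-ins for the kernel macros. BIT32(), GENMASK32(), field_prep32() and field_get32() are hypothetical 32-bit re-implementations written for this example, not the kernel's definitions, and the REG_CON_* names only mirror the two mcp251xfd defines quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel macros; 32-bit only, h >= l assumed. */
#define BIT32(n)        (UINT32_C(1) << (n))
#define GENMASK32(h, l) ((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l)))

/* Lowest set bit of the mask; multiplying or dividing by it replaces the shift. */
#define MASK_LSB(mask)  ((mask) & (~(mask) + 1))

/* Place val into the field described by mask (excess bits are dropped). */
uint32_t field_prep32(uint32_t mask, uint32_t val)
{
	assert(mask != 0);
	return (val * MASK_LSB(mask)) & mask;
}

/* Extract the field described by mask from a register value. */
uint32_t field_get32(uint32_t mask, uint32_t reg)
{
	assert(mask != 0);
	return (reg & mask) / MASK_LSB(mask);
}

/* Register layout mirroring the two defines quoted above. */
#define REG_CON_TXBWS_MASK GENMASK32(31, 28)
#define REG_CON_ABAT       BIT32(27)
```

A register value is then composed as field_prep32(REG_CON_TXBWS_MASK, 0xA) | REG_CON_ABAT and decoded with field_get32(), so the shift amount never has to be spelled out separately from the mask.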

> Other solution I like (biased, because I have even designed it)
> is
> 
>   #define __val2mfld(mask,val) (((mask)&~((mask)<<1))*(val)&(mask))
>   #define __mfld2val(mask,val) (((val)&(mask))/((mask)&~((mask)<<1)))
>   
> https://gitlab.com/pikron/sw-base/sysless/-/blob/master/arch/arm/generic/defines/cpu_def.h#L314
> 
> Which allows to use simple masks, i.e.
>   #define SSP_CR0_DSS_m  0x000f  /* Data Size Select (num bits - 1) */
>   #define SSP_CR0_FRF_m  0x0030  /* Frame Format: 0 SPI, 1 TI, 2 Microwire */
>   #define SSP_CR0_CPOL_m 0x0040  /* SPI Clock Polarity. 0 low between frames, 
> 1 high */ #
> 
>   
> https://gitlab.com/pikron/sw-base/sysless/-/blob/master/libs4c/spi/spi_lpcssp.c#L46
> 
> in the sources
> 
>   lpcssp_drv->ssp_regs->CR0 =
> __val2mfld(SSP_CR0_DSS_m, lpcssp_drv->data16_fl? 16 - 1 : 
> 8 - 1) |
> __val2mfld(SSP_CR0_FRF_m, 0) |
> (msg->size_mode & SPI_MODE_CPOL? SSP_CR0_CPOL_m: 0) |
> (msg->size_mode & SPI_MODE_CPHA? SSP_CR0_CPHA_m: 0) |
> __val2mfld(SSP_CR0_SCR_m, rate);
> 
>   
> https://gitlab.com/pikron/sw-base/sysless/-/blob/master/libs4c/spi/spi_lpcssp.c#L217
> 
> If you have some preferred Linux style then please send us pointers.
> In fact, Ondrej Ille has based his structure bitfields style
> on the other driver included in the Linux kernel and it seems
> to be a problem now. So when I invest my time, I want to use a style
> which pleases me and others.
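
For comparison, the two macros quoted above can be exercised in userspace as follows. This is a sketch: the macros are renamed without the leading double underscore (reserved for the implementation in standard C), the masks carry a `u` suffix, and ssp_cr0_compose() is a hypothetical helper that only covers the DSS and FRF fields.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Same arithmetic as the quoted macros: (mask) & ~((mask) << 1) isolates the
 * lowest bit of a contiguous mask, so multiplying or dividing by it shifts
 * the value into or out of the field.
 */
#define val2mfld(mask, val) (((mask) & ~((mask) << 1)) * (val) & (mask))
#define mfld2val(mask, val) (((val) & (mask)) / ((mask) & ~((mask) << 1)))

#define SSP_CR0_DSS_m 0x000fu  /* Data Size Select (num bits - 1) */
#define SSP_CR0_FRF_m 0x0030u  /* Frame Format: 0 SPI, 1 TI, 2 Microwire */

/* Compose the two fields into one register value. */
uint32_t ssp_cr0_compose(uint32_t dss, uint32_t frf)
{
	assert(dss <= 15 && frf <= 3);
	return val2mfld(SSP_CR0_DSS_m, dss) | val2mfld(SSP_CR0_FRF_m, frf);
}
```

With DSS = 15 and FRF = 2 the composed value is 0x2f, and mfld2val() recovers both fields. A value too wide for its field is silently truncated by the final `& (mask)`.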

Hope that helps,
Marc

-- 
Pengutronix e.K. | Marc Kleine-Budde   |
Embedded Linux   | https://www.pengutronix.de  |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917- |




[PATCH -next] dpaa_eth: use false and true for bool variables

2020-11-03 Thread Zou Wei

Fix coccicheck warnings:

./dpaa_eth.c:2549:2-22: WARNING: Assignment of 0/1 to bool variable
./dpaa_eth.c:2562:2-22: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot 
Signed-off-by: Zou Wei 
---
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c 
b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index d9c2859..31407c1 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -2546,7 +2546,7 @@ static void dpaa_eth_napi_enable(struct dpaa_priv *priv)
for_each_online_cpu(i) {
percpu_priv = per_cpu_ptr(priv->percpu_priv, i);
 
-   percpu_priv->np.down = 0;
+   percpu_priv->np.down = false;
napi_enable(&percpu_priv->np.napi);
}
 }
@@ -2559,7 +2559,7 @@ static void dpaa_eth_napi_disable(struct dpaa_priv *priv)
for_each_online_cpu(i) {
percpu_priv = per_cpu_ptr(priv->percpu_priv, i);
 
-   percpu_priv->np.down = 1;
+   percpu_priv->np.down = true;
napi_disable(&percpu_priv->np.napi);
}
 }
-- 
2.6.2

Re: [PATCH ipsec] xfrm: Pass template address family to xfrm_state_look_at

2020-11-03 Thread Herbert Xu

On Mon, Nov 02, 2020 at 06:32:19PM -0800, Anthony DeRossi wrote:
> This fixes a regression where valid selectors are incorrectly skipped
> when xfrm_state_find is called with a non-matching address family (e.g.
> when using IPv6-in-IPv4 ESP in transport mode).
> 
> The state's address family is matched against the template's family
> (encap_family) in xfrm_state_find before checking the selector in
> xfrm_state_look_at.  The template's family should also be used for
> selector matching, otherwise valid selectors may be skipped.
> 
> Fixes: e94ee171349d ("xfrm: Use correct address family in xfrm_state_find")
> Signed-off-by: Anthony DeRossi 
> ---
>  net/xfrm/xfrm_state.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Your patch reintroduces the same bug that my patch was trying to
fix, namely that when you do the comparison on flow you must use
the original family and not some other value.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH ipsec] xfrm: Pass template address family to xfrm_state_look_at

2020-11-03 Thread Herbert Xu

On Mon, Nov 02, 2020 at 06:32:19PM -0800, Anthony DeRossi wrote:
> This fixes a regression where valid selectors are incorrectly skipped
> when xfrm_state_find is called with a non-matching address family (e.g.
> when using IPv6-in-IPv4 ESP in transport mode).

Why are we even allowing v6-over-v4 in transport mode? Isn't that
the whole point of BEET mode?

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

RE: [PATCH -next] dpaa_eth: use false and true for bool variables

2020-11-03 Thread Madalin Bucur

> -----Original Message-----
> From: Zou Wei 
> Sent: 03 November 2020 14:05
> To: Madalin Bucur ; da...@davemloft.net;
> k...@kernel.org
> Cc: netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Zou Wei
> 
> Subject: [PATCH -next] dpaa_eth: use false and true for bool variables
> 
> Fix coccicheck warnings:
> 
> ./dpaa_eth.c:2549:2-22: WARNING: Assignment of 0/1 to bool variable
> ./dpaa_eth.c:2562:2-22: WARNING: Assignment of 0/1 to bool variable
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Zou Wei 
> ---
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> index d9c2859..31407c1 100644
> --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> @@ -2546,7 +2546,7 @@ static void dpaa_eth_napi_enable(struct dpaa_priv
> *priv)
>   for_each_online_cpu(i) {
>   percpu_priv = per_cpu_ptr(priv->percpu_priv, i);
> 
> - percpu_priv->np.down = 0;
> + percpu_priv->np.down = false;
>   napi_enable(&percpu_priv->np.napi);
>   }
>  }
> @@ -2559,7 +2559,7 @@ static void dpaa_eth_napi_disable(struct dpaa_priv
> *priv)
>   for_each_online_cpu(i) {
>   percpu_priv = per_cpu_ptr(priv->percpu_priv, i);
> 
> - percpu_priv->np.down = 1;
> + percpu_priv->np.down = true;
>   napi_disable(&percpu_priv->np.napi);
>   }
>  }
> --
> 2.6.2

Acked-by: Madalin Bucur

RE: [PATCH net v3 0/2] dpaa_eth: buffer layout fixes

2020-11-03 Thread Madalin Bucur (OSS)

> -----Original Message-----
> From: Camelia Groza 
> Sent: 02 November 2020 20:35
> To: willemdebruijn.ker...@gmail.com; Madalin Bucur (OSS)
> ; da...@davemloft.net; k...@kernel.org
> Cc: netdev@vger.kernel.org; Camelia Alexandra Groza 
> Subject: [PATCH net v3 0/2] dpaa_eth: buffer layout fixes
> 
> The patches are related to the software workaround for the A050385 erratum.
> The first patch ensures optimal buffer usage for non-erratum scenarios.
> The
> second patch fixes a currently inconsequential discrepancy between the
> FMan and Ethernet drivers.
> 
> Changes in v3:
> - refactor defines for clarity in 1/2
> - add more details on the user impact in 1/2
> - remove unnecessary inline identifier in 2/2
> 
> Changes in v2:
> - make the returned value for TX ports explicit in 2/2
> - simplify the buf_layout reference in 2/2
> 
> Camelia Groza (2):
>   dpaa_eth: update the buffer layout for non-A050385 erratum scenarios
>   dpaa_eth: fix the RX headroom size alignment
> 
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 28 ++++++++++++++++++----------
>  1 file changed, 18 insertions(+), 10 deletions(-)
> 
> --
> 1.9.1

For the series,

Acked-by: Madalin Bucur

Re: [PATCH bpf-next 0/5] selftests/xsk: xsk selftests

2020-11-03 Thread Weqaar Janjua

On Mon, 2 Nov 2020 at 23:08, Daniel Borkmann  wrote:
>
> On 10/30/20 1:13 PM, Weqaar Janjua wrote:
> > This patch set adds AF_XDP selftests based on veth to selftests/xsk/.
> >
> > # Topology:
> > # ---------
> > #                 -----------
> > #               _ | Process | _
> > #              /  -----------  \
> > #             /        |        \
> > #            /         |         \
> > #      -----------     |     -----------
> > #      | Thread1 |     |     | Thread2 |
> > #      -----------     |     -----------
> > #           |          |          |
> > #      -----------     |     -----------
> > #      |  xskX   |     |     |  xskY   |
> > #      -----------     |     -----------
> > #           |          |          |
> > #      -----------     |     ----------
> > #      |  vethX  | --------- |  vethY |
> > #      -----------   peer    ----------
> > #           |          |          |
> > #      namespaceX      |     namespaceY
> >
> > These selftests test AF_XDP SKB and Native/DRV modes using veth Virtual
> > Ethernet interfaces.
> >
> > The test program contains two threads, each thread is single socket with
> > a unique UMEM. It validates in-order packet delivery and packet content
> > by sending packets to each other.
> >
> > Prerequisites setup by script TEST_PREREQUISITES.sh:
> >
> > Set up veth interfaces as per the topology shown ^^:
> > * setup two veth interfaces and one namespace
> > ** veth in root namespace
> > ** veth in af_xdp namespace
> > ** namespace af_xdp
> > * create a spec file veth.spec that includes this run-time configuration
> >   that is read by test scripts - filenames prefixed with TEST_XSK
> > ***  and  are randomly generated 4 digit numbers used to avoid
> > conflict with any existing interface.
> >
> > The following tests are provided:
> >
> > 1. AF_XDP SKB mode
> > Generic mode XDP is driver independent, used when the driver does
> > not have support for XDP. Works on any netdevice using sockets and
> > generic XDP path. XDP hook from netif_receive_skb().
> > a. nopoll - soft-irq processing
> > b. poll - using poll() syscall
> > c. Socket Teardown
> >Create a Tx and a Rx socket, Tx from one socket, Rx on another.
> >Destroy both sockets, then repeat multiple times. Only nopoll mode
> > is used
> > d. Bi-directional Sockets
> >Configure sockets as bi-directional tx/rx sockets, sets up fill
> > and completion rings on each socket, tx/rx in both directions.
> > Only nopoll mode is used
> >
> > 2. AF_XDP DRV/Native mode
> > Works on any netdevice with XDP_REDIRECT support, driver dependent.
> > Processes packets before SKB allocation. Provides better performance
> > than SKB. Driver hook available just after DMA of buffer descriptor.
> > a. nopoll
> > b. poll
> > c. Socket Teardown
> > d. Bi-directional Sockets
> > * Only copy mode is supported because veth does not currently support
> >   zero-copy mode
> >
> > Total tests: 8.
> >
> > Flow:
> > * Single process spawns two threads: Tx and Rx
> > * Each of these two threads attach to a veth interface within their
> >assigned namespaces
> > * Each thread creates one AF_XDP socket connected to a unique umem
> >for each veth interface
> > * Tx thread transmits 10k packets from veth to veth
> > * Rx thread verifies if all 10k packets were received and delivered
> >in-order, and have the right content
> >
> > Structure of the patch set:
> >
> > Patch 1: This patch adds XSK Selftests framework under
> >   tools/testing/selftests/xsk, and README
> > Patch 2: Adds tests: SKB poll and nopoll mode, mac-ip-udp debug,
> >   and README updates
> > Patch 3: Adds tests: DRV poll and nopoll mode, and README updates
> > Patch 4: Adds tests: SKB and DRV Socket Teardown, and README updates
> > Patch 5: Adds tests: SKB and DRV Bi-directional Sockets, and README
> >   updates
> >
> > Thanks: Weqaar
> >
> > Weqaar Janjua (5):
> >selftests/xsk: xsk selftests framework
> >selftests/xsk: xsk selftests - SKB POLL, NOPOLL
> >selftests/xsk: xsk selftests - DRV POLL, NOPOLL
> >selftests/xsk: xsk selftests - Socket Teardown - SKB, DRV
> >selftests/xsk: xsk selftests - Bi-directional Sockets - SKB, DRV
>
> Thanks a lot for adding the selftests, Weqaar! Given this needs to copy quite
> a bit of BPF selftest base infra e.g. from Makefiles I'd prefer if you could
> place these under selftests/bpf/ instead to avoid duplicating changes into two
> locations. I understand that these tests don't integrate well into test_progs,
> but for example see test_tc_redirect.sh or test_tc_edt.sh for stand-alone tests
> which could be done similarly with the xsk ones. Would be great if you could
> integrate them and spin a v2 with that.
>
> Thanks,
> Daniel

Hi Daniel,

Appreciate the pointers and suggestions which I will re-evaluate
against merg

RE: [PATCH v2 net-next 0/3] fsl/qbman: in_interrupt() cleanup.

2020-11-03 Thread Camelia Alexandra Groza

> -Original Message-
> From: Sebastian Andrzej Siewior 
> Sent: Monday, November 2, 2020 01:23
> To: netdev@vger.kernel.org
> Cc: Horia Geanta ; Aymen Sghaier
> ; Herbert Xu ;
> David S. Miller ; Madalin Bucur
> ; Jakub Kicinski ; Leo Li
> ; Thomas Gleixner ; Sebastian
> Andrzej Siewior 
> Subject: [PATCH v2 net-next 0/3] fsl/qbman: in_interrupt() cleanup.
> 
> This is the in_interrupt() cleanup for the FSL DPAA framework and its two
> users.
> 
> The `napi' parameter has been renamed to `sched_napi'; the other parts
> are the same as in the previous post [0].
> 
> [0] https://lore.kernel.org/linux-arm-kernel/20201027225454.3492351-1-bige...@linutronix.de/
> 
> Sebastian

Tested-by: Camelia Groza

Re: lan78xx: /sys/class/net/eth0/carrier stuck at 1

2020-11-03 Thread Juerg Haefliger

On Fri, 23 Oct 2020 15:05:19 +0200
Andrew Lunn  wrote:

> On Fri, Oct 23, 2020 at 08:29:59AM +0200, Juerg Haefliger wrote:
> > On Wed, 21 Oct 2020 21:35:48 +0200
> > Andrew Lunn  wrote:
> >   
> > > On Wed, Oct 21, 2020 at 05:00:53PM +0200, Juerg Haefliger wrote:  
> > > > Hi,
> > > > 
> > > > If the lan78xx driver is compiled into the kernel and the network cable
> > > > is plugged in at boot, /sys/class/net/eth0/carrier is stuck at 1 and
> > > > doesn't toggle if the cable is unplugged and replugged.
> > > > 
> > > > If the network cable is *not* plugged in at boot, all seems to work
> > > > fine. I.e., post-boot cable plugs and unplugs toggle the carrier flag.
> > > > 
> > > > Also, everything seems to work fine if the driver is compiled as a
> > > > module.
> > > > 
> > > > There's an older ticket for the raspi kernel [1] but I've just tested
> > > > this with a 5.8 kernel on a Pi 3B+ and still see that behavior.
> > > 
> > > Hi Jürg  
> > 
> > Hi Andrew,
> > 
> >   
> > > Could you check if a different PHY driver is being used when it is
> > > built and broken vs module or built in and working.
> > > 
> > > Look at /sys/class/net/eth0/phydev/driver  
> > 
> > There's no such file.  
> 
> I _think_ that means it is using genphy, the generic PHY driver, not a
> specific vendor PHY driver? What does
> 
> /sys/class/net/eth0/phydev/phy_id contain.

There is no directory /sys/class/net/eth0/phydev.

$ ls /sys/class/net/eth0/
addr_assign_type  addr_len  address  broadcast  carrier  carrier_changes
carrier_down_count  carrier_up_count  dev_id  dev_port  device  dormant
duplex  flags  gro_flush_timeout  ifalias  ifindex  iflink  link_mode  mtu
name_assign_type  netdev_group  operstate  phys_port_id  phys_port_name
phys_switch_id  power  proto_down  queues  speed  statistics  subsystem
tx_queue_len  type  uevent


> > Given that all works fine as long as the cable is unplugged at boot points
> > more towards a race at boot or incorrect initialization sequence or something.
> 
> Could be. Could you run
> 
> mii-tool -vv eth0

Hrm. Running that command unlocks the carrier flag and it starts toggling on
cable unplug/plug. First invocation:

$ sudo mii-tool -vv eth0
Using SIOCGMIIPHY=0x8947
eth0: negotiated 1000baseT-FD flow-control, link ok
  registers for MII PHY 1:
    1040 79ed 0007 c132 05e1 cde1 000f 0000
    0000 0200 0800 0000 0000 0000 0000 3000
    0000 0000 0088 0000 0000 0000 3200 0004
    0040 a000 a000 0000 a035 0000 0000 0000
  product info: vendor 00:01:f0, model 19 rev 2
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

Subsequent invocation:

$ sudo mii-tool -vv eth0
Using SIOCGMIIPHY=0x8947
eth0: negotiated 1000baseT-FD flow-control, link ok
  registers for MII PHY 1:
    1040 79ed 0007 c132 05e1 cde1 000d 0000
    0000 0200 0800 0000 0000 0000 0000 3000
    0000 0000 0088 0000 0000 0000 3200 0004
    0040 a000 0000 0000 a035 0000 0000 0000
  product info: vendor 00:01:f0, model 19 rev 2
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control

In the first invocation, register 0x1a shows a pending link-change interrupt
(0xa000) which wasn't serviced (and cleared) for some reason. Dumping the
registers cleared that interrupt bit and things start working correctly
afterwards. Not sure yet why that first interrupt is ignored.
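
For anyone who wants to repeat the comparison of the two dumps without
eyeballing them, here is a minimal user-space sketch in C. It is not part of
mii-tool or the lan78xx driver; the name sketch_mii_changed() and the
32-register layout are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MII_REGS 32	/* assumed size of one MII register dump */

/*
 * Hypothetical helper: compare two register dumps and return a bitmask
 * in which bit i is set when register i differs between them.
 */
static uint32_t sketch_mii_changed(const uint16_t before[SKETCH_MII_REGS],
				   const uint16_t after[SKETCH_MII_REGS])
{
	uint32_t mask = 0;

	assert(before && after);

	for (size_t i = 0; i < SKETCH_MII_REGS; i++) {
		if (before[i] != after[i])
			mask |= UINT32_C(1) << i;
	}

	return mask;
}
```

Fed with the two dumps from this mail, it should flag register 0x06
(0x000f vs. 0x000d) and register 0x1a (0xa000 vs. 0x0000).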

...Juerg

 
> in the good and bad case.
> 
>Andrew
> 




[net-next,v1,0/5] seg6: add support for SRv6 End.DT4 behavior

2020-11-03 Thread Andrea Mayer

This patchset provides support for the SRv6 End.DT4 behavior.

The SRv6 End.DT4 is used to implement multi-tenant IPv4 L3 VPN. It
decapsulates the received packets and performs IPv4 routing lookup in
the routing table of the tenant. The SRv6 End.DT4 Linux implementation
leverages a VRF device. The SRv6 End.DT4 is defined in the SRv6 Network
Programming [1].

- Patch 1/5 is needed to solve a pre-existing issue with tunneled packets
  when a sniffer is attached;

- Patch 2/5 improves the management of the seg6local attributes used by the
  SRv6 behaviors;

- Patch 3/5 introduces two callbacks used for customizing the
  creation/destruction of a SRv6 behavior;

- Patch 4/5 is the core patch that adds support for the SRv6 End.DT4 behavior;

- Patch 5/5 adds the selftest for SRv6 End.DT4 behavior.

I would like to thank David Ahern for his support during the development of
this patch set.

Comments, suggestions and improvements are very welcome!

Thanks,
Andrea Mayer

v1
 improve comments;

 add new patch 2/5 titled: seg6: improve management of behavior attributes

 seg6: add support for the SRv6 End.DT4 behavior 
  - remove the inline keyword in the definition of fib6_config_get_net().

 selftests: add selftest for the SRv6 End.DT4 behavior
  - add check for the vrf sysctl

[1] https://tools.ietf.org/html/draft-ietf-spring-srv6-network-programming

Andrea Mayer (5):
  vrf: add mac header for tunneled packets when sniffer is attached
  seg6: improve management of behavior attributes
  seg6: add callbacks for customizing the creation/destruction of a
behavior
  seg6: add support for the SRv6 End.DT4 behavior
  selftests: add selftest for the SRv6 End.DT4 behavior

 drivers/net/vrf.c |  78 ++-
 net/ipv6/seg6_local.c | 370 -
 .../selftests/net/srv6_end_dt4_l3vpn_test.sh  | 494 ++
 3 files changed, 927 insertions(+), 15 deletions(-)
 create mode 100755 tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh

-- 
2.20.1

[net-next,v1,1/5] vrf: add mac header for tunneled packets when sniffer is attached

2020-11-03 Thread Andrea Mayer

Before this patch, a sniffer attached to a VRF used as the receiving
interface of L3 tunneled packets detects them as malformed packets and
complains about them (e.g., tcpdump shows bogus packets).

The reason is that a tunneled L3 packet does not carry any L2
information and, when the VRF is set as the receiving interface of a
decapsulated L3 packet, no mac header is currently set or valid.
Therefore, this patch adds a MAC header to any packet which is directly
received on the VRF interface ONLY IF:

 i) a sniffer is attached on the VRF and ii) the mac header is not set.

In this case, the mac address of the VRF is copied in both the
destination and the source address of the ethernet header. The protocol
type is set either to IPv4 or IPv6, depending on which L3 packet is
received.
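
The header layout described above can be illustrated with a small user-space
sketch in C. This is only a model of the idea, not the kernel code from the
diff below: struct sketch_ethhdr, the SKETCH_* constants and
sketch_fill_mac_header() are hypothetical stand-ins for struct ethhdr,
ETH_ALEN/ETH_HLEN/ETH_P_* and the skb-based helper.

```c
#include <assert.h>
#include <arpa/inet.h>	/* htons() */
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN		6
#define SKETCH_ETH_HLEN		14
#define SKETCH_ETH_P_IP		0x0800	/* IPv4 EtherType */
#define SKETCH_ETH_P_IPV6	0x86DD	/* IPv6 EtherType */

/* Hypothetical user-space stand-in for the kernel's struct ethhdr. */
struct sketch_ethhdr {
	uint8_t  h_dest[SKETCH_ETH_ALEN];
	uint8_t  h_source[SKETCH_ETH_ALEN];
	uint16_t h_proto;	/* stored in network byte order */
};

/*
 * Fill the Ethernet header as the commit message describes: the device
 * address is used for both destination and source, and the protocol
 * field tells the sniffer whether an IPv4 or IPv6 packet follows.
 */
static void sketch_fill_mac_header(struct sketch_ethhdr *eth,
				   const uint8_t dev_addr[SKETCH_ETH_ALEN],
				   uint16_t proto)
{
	assert(eth && dev_addr);

	memcpy(eth->h_dest, dev_addr, SKETCH_ETH_ALEN);
	memcpy(eth->h_source, dev_addr, SKETCH_ETH_ALEN);
	eth->h_proto = htons(proto);
}
```

In the real patch the header is additionally pushed into the skb headroom
and pulled again afterwards, which this sketch does not model.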

Signed-off-by: Andrea Mayer 
---
 drivers/net/vrf.c | 78 +++
 1 file changed, 72 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 60c1aadece89..26f2ed02a5c1 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -1263,6 +1263,61 @@ static void vrf_ip6_input_dst(struct sk_buff *skb, struct net_device *vrf_dev,
skb_dst_set(skb, &rt6->dst);
 }
 
+static int vrf_prepare_mac_header(struct sk_buff *skb,
+ struct net_device *vrf_dev, u16 proto)
+{
+   struct ethhdr *eth;
+   int err;
+
+   /* in general, we do not know if there is enough space in the head of
+* the packet for hosting the mac header.
+*/
+   err = skb_cow_head(skb, LL_RESERVED_SPACE(vrf_dev));
+   if (unlikely(err))
+   /* no space in the skb head */
+   return -ENOBUFS;
+
+   __skb_push(skb, ETH_HLEN);
+   eth = (struct ethhdr *)skb->data;
+
+   skb_reset_mac_header(skb);
+
+   /* we set the ethernet destination and the source addresses to the
+* address of the VRF device.
+*/
+   ether_addr_copy(eth->h_dest, vrf_dev->dev_addr);
+   ether_addr_copy(eth->h_source, vrf_dev->dev_addr);
+   eth->h_proto = htons(proto);
+
+   /* the destination address of the Ethernet frame corresponds to the
+* address set on the VRF interface; therefore, the packet is intended
+* to be processed locally.
+*/
+   skb->protocol = eth->h_proto;
+   skb->pkt_type = PACKET_HOST;
+
+   skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
+
+   skb_pull_inline(skb, ETH_HLEN);
+
+   return 0;
+}
+
+/* prepare and add the mac header to the packet if it was not set previously.
+ * In this way, packet sniffers such as tcpdump can parse the packet correctly.
+ * If the mac header was already set, the original mac header is left
+ * untouched and the function returns immediately.
+ */
+static int vrf_add_mac_header_if_unset(struct sk_buff *skb,
+  struct net_device *vrf_dev,
+  u16 proto)
+{
+   if (skb_mac_header_was_set(skb))
+   return 0;
+
+   return vrf_prepare_mac_header(skb, vrf_dev, proto);
+}
+
 static struct sk_buff *vrf_ip6_rcv(struct net_device *vrf_dev,
   struct sk_buff *skb)
 {
@@ -1289,9 +1344,15 @@ static struct sk_buff *vrf_ip6_rcv(struct net_device *vrf_dev,
skb->skb_iif = vrf_dev->ifindex;
 
if (!list_empty(&vrf_dev->ptype_all)) {
-   skb_push(skb, skb->mac_len);
-   dev_queue_xmit_nit(skb, vrf_dev);
-   skb_pull(skb, skb->mac_len);
+   int err;
+
+   err = vrf_add_mac_header_if_unset(skb, vrf_dev,
+ ETH_P_IPV6);
+   if (likely(!err)) {
+   skb_push(skb, skb->mac_len);
+   dev_queue_xmit_nit(skb, vrf_dev);
+   skb_pull(skb, skb->mac_len);
+   }
}
 
IP6CB(skb)->flags |= IP6SKB_L3SLAVE;
@@ -1334,9 +1395,14 @@ static struct sk_buff *vrf_ip_rcv(struct net_device *vrf_dev,
vrf_rx_stats(vrf_dev, skb->len);
 
if (!list_empty(&vrf_dev->ptype_all)) {
-   skb_push(skb, skb->mac_len);
-   dev_queue_xmit_nit(skb, vrf_dev);
-   skb_pull(skb, skb->mac_len);
+   int err;
+
+   err = vrf_add_mac_header_if_unset(skb, vrf_dev, ETH_P_IP);
+   if (likely(!err)) {
+   skb_push(skb, skb->mac_len);
+   dev_queue_xmit_nit(skb, vrf_dev);
+   skb_pull(skb, skb->mac_len);
+   }
}
 
skb = vrf_rcv_nfhook(NFPROTO_IPV4, NF_INET_PRE_ROUTING, skb, vrf_dev);
-- 
2.20.1

[net-next,v1,2/5] seg6: improve management of behavior attributes

2020-11-03 Thread Andrea Mayer

Depending on the attribute (e.g., SEG6_LOCAL_SRH, SEG6_LOCAL_TABLE, etc.),
the parse() callback performs some validity checks on the provided input
and updates the tunnel state (slwt) with the result of the parsing
operation. However, an attribute may also need to reserve some additional
resources (e.g., allocating memory or setting up an eBPF program) in the
parse() callback to complete the parsing operation.

The parse() callbacks are invoked by parse_nla_action() for each
attribute belonging to a specific behavior. Given a behavior with N
attributes, if the parsing of the i-th attribute fails,
parse_nla_action() returns immediately with an error. However, the
resources acquired while parsing the first i-1 attributes are not freed
by parse_nla_action().

Attributes which acquire resources must release them *explicitly* in
both seg6_local_{build/destroy}_state(). As a result, adding a new
attribute of this type requires changes to
seg6_local_{build/destroy}_state() to release the resources correctly.

The seg6local infrastructure still lacks a simple and structured way to
release the resources acquired in the parse() operations.

We introduce a new callback in the struct seg6_action_param named
destroy(). This callback releases any resource which may have been acquired
in the parse() counterpart. Each attribute may or may not implement the
destroy() callback depending on whether it needs to free some acquired
resources.

The destroy() callback comes with several advantages:

 1) we can have as many attributes as we want for a given behavior with no
    need to explicitly free the taken resources;

 2) as in the case of seg6_local_build_state(),
    seg6_local_destroy_state() does not need to handle the release of
    resources directly. Indeed, it calls the destroy_attrs() function which
    is in charge of calling the destroy() callback for every set attribute.
    We do not need to patch seg6_local_{build/destroy}_state() anymore as
    we add new attributes;

 3) the code is more readable and better structured. Indeed, all the
    information needed to handle a given attribute is contained in only
    one place;

 4) it facilitates the integration with new features introduced in further
    patches.
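
The pattern described above can be modelled in plain user-space C. The
following is a minimal sketch under stated assumptions, not the code from
the diff below: all sk_* names, the three attributes and the sk_fail_bpf
knob are hypothetical and exist only to show an optional destroy() callback
per attribute plus a rollback of the already-parsed attributes on failure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical attribute indices; the real ones are SEG6_LOCAL_*. */
enum { SK_ATTR_SRH, SK_ATTR_TABLE, SK_ATTR_BPF, SK_ATTR_MAX };

struct sk_state {
	unsigned long parsed_attrs;	/* bit i set: attribute i parsed OK */
	char *srh;			/* resource owned by SK_ATTR_SRH */
	int table;			/* plain value, nothing to release */
	char *bpf_name;			/* resource owned by SK_ATTR_BPF */
};

struct sk_param {
	int (*parse)(struct sk_state *st);
	void (*destroy)(struct sk_state *st);	/* optional, may be NULL */
};

/* Knob used only by this sketch: force the last attribute to fail. */
static int sk_fail_bpf;

static int sk_parse_srh(struct sk_state *st)
{
	st->srh = malloc(16);
	return st->srh ? 0 : -1;
}

static void sk_destroy_srh(struct sk_state *st)
{
	free(st->srh);
	st->srh = NULL;
}

static int sk_parse_table(struct sk_state *st)
{
	st->table = 100;
	return 0;
}

static int sk_parse_bpf(struct sk_state *st)
{
	if (sk_fail_bpf)
		return -1;
	st->bpf_name = malloc(16);
	return st->bpf_name ? 0 : -1;
}

static void sk_destroy_bpf(struct sk_state *st)
{
	free(st->bpf_name);
	st->bpf_name = NULL;
}

static const struct sk_param sk_params[SK_ATTR_MAX] = {
	[SK_ATTR_SRH]   = { sk_parse_srh,   sk_destroy_srh },
	[SK_ATTR_TABLE] = { sk_parse_table, NULL },
	[SK_ATTR_BPF]   = { sk_parse_bpf,   sk_destroy_bpf },
};

/* Call destroy() (if any) for every parsed attribute in [start, end). */
static void sk_destroy_attrs(struct sk_state *st, int start, int end)
{
	assert(start >= 0 && end <= SK_ATTR_MAX);

	for (int i = start; i < end; i++) {
		if (!(st->parsed_attrs & (1UL << i)))
			continue;
		if (sk_params[i].destroy)
			sk_params[i].destroy(st);
		st->parsed_attrs &= ~(1UL << i);
	}
}

/* Parse all attributes; on failure release the ones already parsed. */
static int sk_parse_all(struct sk_state *st)
{
	for (int i = 0; i < SK_ATTR_MAX; i++) {
		int err = sk_params[i].parse(st);

		if (err) {
			sk_destroy_attrs(st, 0, i);
			return err;
		}
		st->parsed_attrs |= 1UL << i;
	}

	return 0;
}
```

Adding a fourth attribute to this model only requires a new table entry;
neither sk_parse_all() nor sk_destroy_attrs() has to change, which is the
property the list above argues for.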

Signed-off-by: Andrea Mayer 
---
 net/ipv6/seg6_local.c | 103 ++
 1 file changed, 93 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index eba23279912d..63a82e2fdea9 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -710,6 +710,12 @@ static int cmp_nla_srh(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
return memcmp(a->srh, b->srh, len);
 }
 
+static void destroy_attr_srh(struct seg6_local_lwt *slwt)
+{
+   kfree(slwt->srh);
+   slwt->srh = NULL;
+}
+
 static int parse_nla_table(struct nlattr **attrs, struct seg6_local_lwt *slwt)
 {
slwt->table = nla_get_u32(attrs[SEG6_LOCAL_TABLE]);
@@ -901,16 +907,33 @@ static int cmp_nla_bpf(struct seg6_local_lwt *a, struct seg6_local_lwt *b)
return strcmp(a->bpf.name, b->bpf.name);
 }
 
+static void destroy_attr_bpf(struct seg6_local_lwt *slwt)
+{
+   kfree(slwt->bpf.name);
+   if (slwt->bpf.prog)
+   bpf_prog_put(slwt->bpf.prog);
+
+   slwt->bpf.name = NULL;
+   slwt->bpf.prog = NULL;
+}
+
 struct seg6_action_param {
int (*parse)(struct nlattr **attrs, struct seg6_local_lwt *slwt);
int (*put)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
int (*cmp)(struct seg6_local_lwt *a, struct seg6_local_lwt *b);
+
+   /* optional destroy() callback useful for releasing resources which
+* have been previously acquired in the corresponding parse()
+* function.
+*/
+   void (*destroy)(struct seg6_local_lwt *slwt);
 };
 
 static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
[SEG6_LOCAL_SRH]= { .parse = parse_nla_srh,
.put = put_nla_srh,
-   .cmp = cmp_nla_srh },
+   .cmp = cmp_nla_srh,
+   .destroy = destroy_attr_srh },
 
[SEG6_LOCAL_TABLE]  = { .parse = parse_nla_table,
.put = put_nla_table,
@@ -934,13 +957,68 @@ static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = {
 
[SEG6_LOCAL_BPF]= { .parse = parse_nla_bpf,
.put = put_nla_bpf,
-   .cmp = cmp_nla_bpf },
+   .cmp = cmp_nla_bpf,
+   .destroy = destroy_attr_bpf },
 
 };
 
+/* call the destroy() callback (if available) for each set attribute in
+ * @parsed_attrs, starting from attribute index @start up to @end excluded.
+ */
+static void __destroy_attrs(unsigned long parsed_attrs, int start, int end,
+

[net-next,v1,5/5] selftests: add selftest for the SRv6 End.DT4 behavior

2020-11-03 Thread Andrea Mayer

This selftest is designed for evaluating the new SRv6 End.DT4 behavior,
used in this example for implementing IPv4 L3 VPN use cases.

Signed-off-by: Andrea Mayer 
---
 .../selftests/net/srv6_end_dt4_l3vpn_test.sh  | 494 ++
 1 file changed, 494 insertions(+)
 create mode 100755 tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh

diff --git a/tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh b/tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh
new file mode 100755
index 000000000000..a5547fed5048
--- /dev/null
+++ b/tools/testing/selftests/net/srv6_end_dt4_l3vpn_test.sh
@@ -0,0 +1,494 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# author: Andrea Mayer 
+
+# This test is designed for evaluating the new SRv6 End.DT4 behavior used for
+# implementing IPv4 L3 VPN use cases.
+#
+# Hereafter a network diagram is shown, where two different tenants (named 100
+# and 200) offer IPv4 L3 VPN services allowing hosts to communicate with each
+# other across an IPv6 network.
+#
+# Only hosts belonging to the same tenant (and to the same VPN) can communicate
+# with each other. Instead, the communication among hosts of different tenants
+# is forbidden.
+# In other words, hosts hs-t100-1 and hs-t100-2 are connected through the IPv4
+# L3 VPN of tenant 100 while hs-t200-3 and hs-t200-4 are connected using the
+# IPv4 L3 VPN of tenant 200. Cross connection between tenant 100 and tenant 200
+# is forbidden and thus, for example, hs-t100-1 cannot reach hs-t200-3 and vice
+# versa.
+#
+# Routers rt-1 and rt-2 implement IPv4 L3 VPN services leveraging the SRv6
+# architecture. The key components for such VPNs are: a) SRv6 Encap behavior,
+# b) SRv6 End.DT4 behavior and c) VRF.
+#
+# To explain how an IPv4 L3 VPN based on SRv6 works, let us briefly consider an
+# example where, within the same domain of tenant 100, the host hs-t100-1 pings
+# the host hs-t100-2.
+#
+# First of all, L2 reachability of the host hs-t100-2 is taken into account by
+# the router rt-1 which acts as an arp proxy.
+#
+# When the host hs-t100-1 sends an IPv4 packet destined to hs-t100-2, the
+# router rt-1 receives the packet on the internal veth-t100 interface. Such
+# interface is enslaved to the VRF vrf-100 whose associated table contains the
+# SRv6 Encap route for encapsulating any IPv4 packet in a IPv6 plus the Segment
+# Routing Header (SRH) packet. This packet is sent through the (IPv6) core
+# network up to the router rt-2 that receives it on veth0 interface.
+#
+# The rt-2 router uses the 'localsid' routing table to process incoming
+# IPv6+SRH packets which belong to the VPN of the tenant 100. For each of these
+# packets, the SRv6 End.DT4 behavior removes the outer IPv6+SRH headers and
+# performs the lookup on the vrf-100 table using the destination address of
+# the decapsulated IPv4 packet. Afterwards, the packet is sent to the host
+# hs-t100-2 through the veth-t100 interface.
+#
+# The ping response follows the same processing but this time the role of rt-1
+# and rt-2 are swapped.
+#
+# Of course, the IPv4 L3 VPN for tenant 200 works exactly as the IPv4 L3 VPN
+# for tenant 100. In this case, only hosts hs-t200-3 and hs-t200-4 are able to
+# connect with each other.
+#
+#
+# [Network diagram: host namespaces hs-t100-1 (veth0, 10.0.0.1/24) and
+# hs-t100-2 (veth0, 10.0.0.2/24), each connected to a router namespace
+# containing veth-t100 (10.0.0.254/24), the vrf-100 device and a localsid
+# table.]

[net-next,v1,4/5] seg6: add support for the SRv6 End.DT4 behavior

2020-11-03 Thread Andrea Mayer

SRv6 End.DT4 is defined in the SRv6 Network Programming [1].

The SRv6 End.DT4 is used to implement IPv4 L3VPN use-cases in
multi-tenant environments. It decapsulates the received packets and
performs an IPv4 routing lookup in the routing table of the tenant.

The SRv6 End.DT4 Linux implementation leverages a VRF device in order to
force the routing lookup into the associated routing table.

To make the End.DT4 work properly, it must be guaranteed that the routing
table used for routing lookup operations is bound to one and only one
VRF during the tunnel creation. This constraint has to be enforced by
enabling the VRF strict_mode sysctl parameter, i.e.:
 $ sysctl -wq net.vrf.strict_mode=1

At JANOG44, LINE corporation presented their multi-tenant DC architecture
using SRv6 [2]. In the slides, they reported that the Linux kernel is
missing the support of SRv6 End.DT4 behavior.

The iproute2 counterpart required for configuring the SRv6 End.DT4
behavior is already implemented along with the other supported SRv6
behaviors [3].

[1] https://tools.ietf.org/html/draft-ietf-spring-srv6-network-programming
[2] https://speakerdeck.com/line_developers/line-data-center-networking-with-srv6
[3] https://patchwork.ozlabs.org/patch/799837/

Signed-off-by: Andrea Mayer 
---
 net/ipv6/seg6_local.c | 205 ++
 1 file changed, 205 insertions(+)

diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 4b0f155d641d..a41074acd43e 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -57,6 +57,14 @@ struct bpf_lwt_prog {
char *name;
 };
 
+struct seg6_end_dt4_info {
+   struct net *net;
+   /* VRF device associated to the routing table used by the SRv6 End.DT4
+* behavior for routing IPv4 packets.
+*/
+   int vrf_ifindex;
+};
+
 struct seg6_local_lwt {
int action;
struct ipv6_sr_hdr *srh;
@@ -66,6 +74,7 @@ struct seg6_local_lwt {
int iif;
int oif;
struct bpf_lwt_prog bpf;
+   struct seg6_end_dt4_info dt4_info;
 
int headroom;
struct seg6_action_desc *desc;
@@ -413,6 +422,194 @@ static int input_action_end_dx4(struct sk_buff *skb,
return -EINVAL;
 }
 
+#ifdef CONFIG_NET_L3_MASTER_DEV
+
+static struct net *fib6_config_get_net(const struct fib6_config *fib6_cfg)
+{
+   const struct nl_info *nli = &fib6_cfg->fc_nlinfo;
+
+   return nli->nl_net;
+}
+
+static int seg6_end_dt4_build(struct seg6_local_lwt *slwt, const void *cfg,
+ struct netlink_ext_ack *extack)
+{
+   struct seg6_end_dt4_info *info = &slwt->dt4_info;
+   int vrf_ifindex;
+   struct net *net;
+
+   net = fib6_config_get_net(cfg);
+
+   vrf_ifindex = l3mdev_ifindex_lookup_by_table_id(L3MDEV_TYPE_VRF, net,
+   slwt->table);
+   if (vrf_ifindex < 0) {
+   if (vrf_ifindex == -EPERM) {
+   NL_SET_ERR_MSG(extack,
+  "Strict mode for VRF is disabled");
+   } else if (vrf_ifindex == -ENODEV) {
+   NL_SET_ERR_MSG(extack, "No such device");
+   } else {
+   NL_SET_ERR_MSG(extack, "Unknown error");
+
+   pr_debug("seg6local: SRv6 End.DT4 creation error=%d\n",
+vrf_ifindex);
+   }
+
+   return vrf_ifindex;
+   }
+
+   info->net = net;
+   info->vrf_ifindex = vrf_ifindex;
+
+   return 0;
+}
+
+/* The SRv6 End.DT4 behavior extracts the inner (IPv4) packet and routes the
+ * IPv4 packet by looking at the configured routing table.
+ *
+ * In the SRv6 End.DT4 use case, we can receive traffic (IPv6+Segment Routing
+ * Header packets) from several interfaces and the IPv6 destination address (DA)
+ * is used for retrieving the specific instance of the End.DT4 behavior that
+ * should process the packets.
+ *
+ * However, the inner IPv4 packet is not really bound to any receiving
+ * interface and thus the End.DT4 sets the VRF (associated with the
+ * corresponding routing table) as the *receiving* interface.
+ * In other words, the End.DT4 processes a packet as if it has been received
+ * directly by the VRF (and not by one of its slave devices, if any).
+ * In this way, the VRF interface is used for routing the IPv4 packet
+ * according to the routing table configured by the End.DT4 instance.
+ *
+ * This design allows you to get some interesting features like:
+ *  1) the statistics on rx packets;
+ *  2) the possibility to install a packet sniffer on the receiving interface
+ * (the VRF one) for looking at the incoming packets;
+ *  3) the possibility to leverage the netfilter prerouting hook for the inner
+ * IPv4 packet.
+ *
+ * This function returns:
+ *  - the sk_buff* when the VRF rcv handler has processed the packet correctly;
+ *  - NULL when the skb is consumed by the VRF rcv

[net-next,v1,3/5] seg6: add callbacks for customizing the creation/destruction of a behavior

2020-11-03 Thread Andrea Mayer

We introduce two callbacks used for customizing the creation/destruction of
a SRv6 behavior. Such callbacks are defined in the new struct
seg6_local_lwtunnel_ops and hereafter we provide a brief description of
them:

 - build_state(...): used for calling the custom constructor of the
   behavior during its initialization phase and after all the attributes
   have been parsed successfully;

 - destroy_state(...): used for calling the custom destructor of the
   behavior before it is completely destroyed.
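
The two callbacks can be pictured with the following user-space sketch in C.
It is a simplified model under stated assumptions rather than the code in
the diff below: the sketch_* and demo_* names are hypothetical, the
netlink_ext_ack argument is left out, and the "built" field exists only so
the effect of the callbacks is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_lwt;

typedef int (*sketch_build_t)(struct sketch_lwt *lwt, const void *cfg);
typedef void (*sketch_destroy_t)(struct sketch_lwt *lwt);

/* Optional per-behavior constructor/destructor; either may be NULL. */
struct sketch_ops {
	sketch_build_t build_state;
	sketch_destroy_t destroy_state;
};

struct sketch_desc {
	int action;
	struct sketch_ops ops;
};

struct sketch_lwt {
	const struct sketch_desc *desc;
	int built;	/* set by the demo constructor below */
};

/* Run the custom constructor, if the behavior provides one. */
static int sketch_build_state(struct sketch_lwt *lwt, const void *cfg)
{
	assert(lwt);

	if (!lwt->desc)
		return -EINVAL;
	if (lwt->desc->ops.build_state)
		return lwt->desc->ops.build_state(lwt, cfg);

	return 0;
}

/* Run the custom destructor, if the behavior provides one. */
static void sketch_destroy_state(struct sketch_lwt *lwt)
{
	assert(lwt);

	if (lwt->desc && lwt->desc->ops.destroy_state)
		lwt->desc->ops.destroy_state(lwt);
}

/* Demo callbacks for a behavior that needs extra setup. */
static int demo_build(struct sketch_lwt *lwt, const void *cfg)
{
	(void)cfg;
	lwt->built = 1;
	return 0;
}

static void demo_destroy(struct sketch_lwt *lwt)
{
	lwt->built = 0;
}
```

A behavior that needs no extra setup simply leaves both pointers NULL, in
which case the wrappers are no-ops that report success.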

Signed-off-by: Andrea Mayer 
---
 net/ipv6/seg6_local.c | 64 +++
 1 file changed, 64 insertions(+)

diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 63a82e2fdea9..4b0f155d641d 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -33,11 +33,23 @@
 
 struct seg6_local_lwt;
 
+typedef int (*slwt_build_state_t)(struct seg6_local_lwt *slwt, const void *cfg,
+ struct netlink_ext_ack *extack);
+typedef void (*slwt_destroy_state_t)(struct seg6_local_lwt *slwt);
+
+/* callbacks used for customizing the creation and destruction of a behavior */
+struct seg6_local_lwtunnel_ops {
+   slwt_build_state_t build_state;
+   slwt_destroy_state_t destroy_state;
+};
+
 struct seg6_action_desc {
int action;
unsigned long attrs;
int (*input)(struct sk_buff *skb, struct seg6_local_lwt *slwt);
int static_headroom;
+
+   struct seg6_local_lwtunnel_ops slwt_ops;
 };
 
 struct bpf_lwt_prog {
@@ -1015,6 +1027,45 @@ static void destroy_attrs(struct seg6_local_lwt *slwt)
__destroy_attrs(attrs, 0, SEG6_LOCAL_MAX + 1, slwt);
 }
 
+/* call the custom constructor of the behavior during its initialization phase
+ * and after that all its attributes have been parsed successfully.
+ */
+static int
+seg6_local_lwtunnel_build_state(struct seg6_local_lwt *slwt, const void *cfg,
+   struct netlink_ext_ack *extack)
+{
+   slwt_build_state_t build_func;
+   struct seg6_action_desc *desc;
+   int err = 0;
+
+   desc = slwt->desc;
+   if (!desc)
+   return -EINVAL;
+
+   build_func = desc->slwt_ops.build_state;
+   if (build_func)
+   err = build_func(slwt, cfg, extack);
+
+   return err;
+}
+
+/* call the custom destructor of the behavior which is invoked before the
+ * tunnel is going to be destroyed.
+ */
+static void seg6_local_lwtunnel_destroy_state(struct seg6_local_lwt *slwt)
+{
+   slwt_destroy_state_t destroy_func;
+   struct seg6_action_desc *desc;
+
+   desc = slwt->desc;
+   if (!desc)
+   return;
+
+   destroy_func = desc->slwt_ops.destroy_state;
+   if (destroy_func)
+   destroy_func(slwt);
+}
+
 static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt)
 {
struct seg6_action_param *param;
@@ -1090,8 +1141,16 @@ static int seg6_local_build_state(struct net *net, struct nlattr *nla,
 
err = parse_nla_action(tb, slwt);
if (err < 0)
+   /* In case of error, the parse_nla_action() takes care of
+* releasing resources which have been acquired during the
+* processing of attributes.
+*/
goto out_free;
 
+   err = seg6_local_lwtunnel_build_state(slwt, cfg, extack);
+   if (err < 0)
+   goto free_attrs;
+
newts->type = LWTUNNEL_ENCAP_SEG6_LOCAL;
newts->flags = LWTUNNEL_STATE_INPUT_REDIRECT;
newts->headroom = slwt->headroom;
@@ -1100,6 +1159,9 @@ static int seg6_local_build_state(struct net *net, struct 
nlattr *nla,
 
return 0;
 
+free_attrs:
+   destroy_attrs(slwt);
+
 out_free:
kfree(newts);
return err;
@@ -1109,6 +1171,8 @@ static void seg6_local_destroy_state(struct 
lwtunnel_state *lwt)
 {
struct seg6_local_lwt *slwt = seg6_local_lwtunnel(lwt);
 
+   seg6_local_lwtunnel_destroy_state(slwt);
+
destroy_attrs(slwt);
 
return;
-- 
2.20.1
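The optional constructor/destructor hooks added in the patch above follow a common pattern: a descriptor carries callbacks that may be NULL, and thin wrappers treat a missing callback as a no-op. A minimal userspace sketch of that pattern, assuming simplified stand-in types (struct behavior, struct behavior_ops and the demo callbacks are hypothetical, not the kernel's seg6 definitions):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct behavior;

/* Hypothetical per-behavior hooks; either pointer may be NULL. */
struct behavior_ops {
	int (*build_state)(struct behavior *b, const void *cfg);
	void (*destroy_state)(struct behavior *b);
};

struct behavior_desc {
	int action;
	struct behavior_ops ops;
};

struct behavior {
	const struct behavior_desc *desc;
	int built;	/* set by the demo constructor below */
};

/* Run the optional constructor; a missing callback counts as success. */
static int behavior_build_state(struct behavior *b, const void *cfg)
{
	assert(b != NULL);
	if (!b->desc)
		return -EINVAL;
	if (!b->desc->ops.build_state)
		return 0;
	return b->desc->ops.build_state(b, cfg);
}

/* Run the optional destructor, if there is one. */
static void behavior_destroy_state(struct behavior *b)
{
	assert(b != NULL);
	if (b->desc && b->desc->ops.destroy_state)
		b->desc->ops.destroy_state(b);
}

/* Demo callbacks that only flip a flag. */
static int demo_build(struct behavior *b, const void *cfg)
{
	(void)cfg;
	b->built = 1;
	return 0;
}

static void demo_destroy(struct behavior *b)
{
	b->built = 0;
}

static const struct behavior_desc desc_with_ops = {
	.action = 1,
	.ops = { .build_state = demo_build, .destroy_state = demo_destroy },
};

static const struct behavior_desc desc_without_ops = { .action = 2 };
```

A behavior that needs no extra state simply leaves both pointers unset, as desc_without_ops does, and the wrappers do nothing for it.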

Re: [PATCH] cfg80211: make wifi driver probe

2020-11-03 Thread Kalle Valo

Kelvin Cheung  writes:

> We are preparing the Wi-Fi driver for Unisoc WCN chips. Please ignore
> this draft version. There will be a formal version soon.

Ok, I'll drop this. But please don't use HTML in mails, more info in the
wiki page below. I recommend reading it all very carefully.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

Re: KASAN: use-after-free Read in decode_session6

2020-11-03 Thread Xin Long

On Sun, Nov 1, 2020 at 1:40 PM syzbot
 wrote:
>
> syzbot has bisected this issue to:
>
> commit bcd623d8e9fa5f82bbd8cd464dc418d24139157b
> Author: Xin Long 
> Date:   Thu Oct 29 07:05:05 2020 +
>
> sctp: call sk_setup_caps in sctp_packet_transmit instead
>
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14df9cb850
> start commit:   68bb4665 Merge branch 'l2-multicast-forwarding-for-ocelot-..
> git tree:   net-next
> final oops: https://syzkaller.appspot.com/x/report.txt?x=16df9cb850
> console output: https://syzkaller.appspot.com/x/log.txt?x=12df9cb850
> kernel config:  https://syzkaller.appspot.com/x/.config?x=eac680ae76558a0e
> dashboard link: https://syzkaller.appspot.com/bug?extid=5be8aebb1b7dfa90ef31
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1128639850
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11bbf39850
>
> Reported-by: syzbot+5be8aebb1b7dfa90e...@syzkaller.appspotmail.com
> Fixes: bcd623d8e9fa ("sctp: call sk_setup_caps in sctp_packet_transmit 
> instead")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
I'm looking into this, Thanks.

[PATCH net-next] net: emaclite: Add error handling for of_address_ and phy read functions

2020-11-03 Thread Radhey Shyam Pandey

From: Shravya Kumbham 

Add ret variable, conditions to check the return value and its error
path for of_address_to_resource() and phy_read() functions.

Addresses-Coverity: Event check_return value.
Signed-off-by: Shravya Kumbham 
Signed-off-by: Radhey Shyam Pandey 
---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c 
b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 0c26f5b..fc5ccd1 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -820,7 +820,7 @@ static int xemaclite_mdio_write(struct mii_bus *bus, int 
phy_id, int reg,
 static int xemaclite_mdio_setup(struct net_local *lp, struct device *dev)
 {
struct mii_bus *bus;
-   int rc;
+   int rc, ret;
struct resource res;
struct device_node *np = of_get_parent(lp->phy_node);
struct device_node *npp;
@@ -834,7 +834,13 @@ static int xemaclite_mdio_setup(struct net_local *lp, 
struct device *dev)
}
npp = of_get_parent(np);
 
-   of_address_to_resource(npp, 0, &res);
+   ret = of_address_to_resource(npp, 0, &res);
+   if (ret) {
+   dev_err(dev, "%s resource error!\n",
+   dev->of_node->full_name);
+   of_node_put(lp->phy_node);
+   return ret;
+   }
if (lp->ndev->mem_start != res.start) {
struct phy_device *phydev;
phydev = of_phy_find_device(lp->phy_node);
@@ -923,7 +929,7 @@ static int xemaclite_open(struct net_device *dev)
xemaclite_disable_interrupts(lp);
 
if (lp->phy_node) {
-   u32 bmcr;
+   int bmcr;
 
lp->phy_dev = of_phy_connect(lp->ndev, lp->phy_node,
 xemaclite_adjust_link, 0,
@@ -945,6 +951,13 @@ static int xemaclite_open(struct net_device *dev)
 
/* Restart auto negotiation */
bmcr = phy_read(lp->phy_dev, MII_BMCR);
+   if (bmcr < 0) {
+   dev_err(&lp->ndev->dev, "phy_read failed\n");
+   phy_disconnect(lp->phy_dev);
+   lp->phy_dev = NULL;
+
+   return bmcr;
+   }
bmcr |= (BMCR_ANENABLE | BMCR_ANRESTART);
phy_write(lp->phy_dev, MII_BMCR, bmcr);
 
-- 
2.7.4
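One detail of the patch above is the change of bmcr from u32 to int: phy_read() returns a negative errno on failure, and that can only be tested in a signed variable. A small userspace sketch of the difference, assuming a hypothetical fake_phy_read() in place of the real PHY access:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define BMCR_ANRESTART 0x0200
#define BMCR_ANENABLE  0x1000

/* Hypothetical register read: a 16-bit value, or a negative errno. */
static int fake_phy_read(int fail)
{
	return fail ? -EIO : 0x1140;
}

/* Unsigned storage: the error value is silently treated as register data. */
static int restart_aneg_unsigned(int fail, uint32_t *out)
{
	uint32_t bmcr = (uint32_t)fake_phy_read(fail);

	assert(out != NULL);
	/* "bmcr < 0" would always be false here, so no check is possible. */
	*out = bmcr | BMCR_ANENABLE | BMCR_ANRESTART;
	return 0;
}

/* Signed storage: the error is detected and propagated to the caller. */
static int restart_aneg_signed(int fail, uint32_t *out)
{
	int bmcr = fake_phy_read(fail);

	assert(out != NULL);
	if (bmcr < 0)
		return bmcr;
	*out = (uint32_t)bmcr | BMCR_ANENABLE | BMCR_ANRESTART;
	return 0;
}
```

With the unsigned variant, a failed read ends up as a bogus value that would then be written back to the register.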

[PATCH v6] lib: optimize cpumask_local_spread()

2020-11-03 Thread Shaokun Zhang

From: Yuqi Jin 

In multi-processor and NUMA systems, an I/O driver looks for the cpu cores
to which its IRQs shall be bound.  When the cpu cores in the local numa node
have been used up, it is better to pick the node closest to the local numa
node for performance, instead of choosing any online cpu immediately.

Currently, Intel DDIO affects only local sockets, so its performance
improvement is due to the relative difference in performance between the
local socket I/O and remote socket I/O. To ensure that Intel DDIO’s
benefits are available to applications where they are most useful, the
irq can be pinned to particular sockets using Intel DDIO.
This arrangement is called socket affinity. So this patch can help
Intel DDIO work. The same holds for the I/O stash function of most
processors.
On a Huawei Kunpeng 920 server, there are 4 NUMA nodes (0 - 3) in the 2-cpu
system (0 - 1). The topology of this server is as follows:
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
node 0 size: 63379 MB
node 0 free: 61899 MB
node 1 cpus: 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
node 1 size: 64509 MB
node 1 free: 63942 MB
node 2 cpus: 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
node 2 size: 64509 MB
node 2 free: 63056 MB
node 3 cpus: 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
node 3 size: 63997 MB
node 3 free: 63420 MB
node distances:
node   0   1   2   3
  0:  10  16  32  33
  1:  16  10  25  32
  2:  32  25  10  16
  3:  33  32  16  10

We ran a PS (parameter server) business test in which the client sends a
request through the network card and the server responds to the request
after a calculation.  When two PS processes run on node2 and node3
separately and the network card is located on 'node2', which is in cpu1,
the performance of node2 (26W QPS) and node3 (22W QPS) differs.

It is better that the NIC queues are bound to the cpu1 cores in turn, so
that XPS is also properly initialized, whereas cpumask_local_spread only
considers the local node.  When the number of NIC queues exceeds the
number of cores in the local node, it falls back to any online core directly.
So when PS runs on node3 and sends a calculated request, the performance is
not as good as on node2.

With this patch, IRQs 369-392 are bound to NUMA node3 instead of NUMA
node0. Before the patch:

Euler:/sys/bus/pci # cat /proc/irq/369/smp_affinity_list
0
Euler:/sys/bus/pci # cat /proc/irq/370/smp_affinity_list
1
...
Euler:/sys/bus/pci # cat /proc/irq/391/smp_affinity_list
22
Euler:/sys/bus/pci # cat /proc/irq/392/smp_affinity_list
23
After the patch:
Euler:/sys/bus/pci # cat /proc/irq/369/smp_affinity_list
72
Euler:/sys/bus/pci # cat /proc/irq/370/smp_affinity_list
73
...
Euler:/sys/bus/pci # cat /proc/irq/391/smp_affinity_list
94
Euler:/sys/bus/pci # cat /proc/irq/392/smp_affinity_list
95

So with the patch, node3 reaches the same performance as node2 (26W QPS)
while the network card is still in 'node2'.

When the NIC and other I/O devices initialize their interrupt binding and
the cores of the local node are used up, it is reasonable to return cores
from the node closest to it.  Let's optimize it and find the nearest node
through NUMA distance for the non-local NUMA nodes.
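
The selection described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the patch's code: the 4x4 table is the node distance matrix listed earlier in this message, while NR_NODES, nearest_unused_node() and spread_order() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES 4

/* Distance table copied from the topology listing in this message. */
static const int node_distance_tbl[NR_NODES][NR_NODES] = {
	{ 10, 16, 32, 33 },
	{ 16, 10, 25, 32 },
	{ 32, 25, 10, 16 },
	{ 33, 32, 16, 10 },
};

/* Return the unused node closest to @node, or -1 if every node is used. */
static int nearest_unused_node(int node, const bool used[NR_NODES])
{
	int best = -1;
	int i;

	assert(node >= 0 && node < NR_NODES);
	for (i = 0; i < NR_NODES; i++) {
		if (used[i])
			continue;
		if (best < 0 ||
		    node_distance_tbl[node][i] < node_distance_tbl[node][best])
			best = i;
	}
	return best;
}

/* Fill @order with all nodes, sorted by increasing distance from @node. */
static void spread_order(int node, int order[NR_NODES])
{
	bool used[NR_NODES] = { false };
	int i;

	for (i = 0; i < NR_NODES; i++) {
		order[i] = nearest_unused_node(node, used);
		used[order[i]] = true;
	}
}
```

For a NIC on node2, spread_order(2, order) gives the order 2, 3, 1, 0, which is consistent with the IRQs landing on the node3 cores (72-95) once the node2 cores are used up.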

Cc: Rusty Russell 
Cc: Andrew Morton 
Cc: Juergen Gross 
Cc: Paul Burton 
Cc: Michal Hocko 
Cc: Michael Ellerman 
Cc: Mike Rapoport 
Cc: Anshuman Khandual 
Signed-off-by: Yuqi Jin 
Signed-off-by: Shaokun Zhang 
---
Hi Andrew,

I rebased this patch later following this thread [1]

ChangeLog from v5:
1. Rebase to 5.10-rc2

ChangeLog from v4:
1. Rebase to 5.6-rc3 

ChangeLog from v3:
1. Make spread_lock local to cpumask_local_spread();
2. Add more descriptions on the affinities change in log;

ChangeLog from v2:
1. Change the variables as static and use spinlock to protect;
2. Give more explanation on test and performance;

[1]https://lkml.org/lkml/2020/6/30/1300

 lib/cpumask.c | 66 +--
 1 file changed, 55 insertions(+), 11 deletions(-)

diff --git a/lib/cpumask.c b/lib/cpumask.c
index 85da6ab4fbb5..baecaf271770 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -193,6 +193,38 @@ void __init free_bootmem_cpumask_var(cpumask_var_t mask)
 }
 #endif
 
+static void calc_node_distance(int *node_dist, int node)
+{
+   int i;
+
+   for (i = 0; i < nr_node_ids; i++)
+   node_dist[i] = node_distance(node, i);
+}
+
+static int find_nearest_node(int *node_dist, bool *used)
+{
+   int i, min_dist = node_dist[0], node_id = -1;
+
+   /* Choose the first unused node to compare */
+   for (i = 0; i < nr_node_ids; i++) {
+   if (used[i] == 0) {
+   min_dist = node_dist[i];
+   node_id = i;
+   break;
+   }
+   }
+
+   /* Compare and return the neares

Re: [net-next PATCH 2/3] octeontx2-af: Add devlink health reporters for NPA

2020-11-03 Thread Willem de Bruijn

> > >  static int rvu_devlink_info_get(struct devlink *devlink, struct
> > devlink_info_req *req,
> > > struct netlink_ext_ack *extack)  { @@
> > > -53,7 +483,8 @@ int rvu_register_dl(struct rvu *rvu)
> > > rvu_dl->dl = dl;
> > > rvu_dl->rvu = rvu;
> > > rvu->rvu_dl = rvu_dl;
> > > -   return 0;
> > > +
> > > +   return rvu_health_reporters_create(rvu);
> >
> > when would this be called with rvu->rvu_dl == NULL?
>
> During initialization.

This is the only caller, and it is only reached if rvu_dl is non-zero.

Re: KASAN: use-after-free Read in decode_session6

2020-11-03 Thread Xin Long

On Tue, Nov 3, 2020 at 9:14 PM Xin Long  wrote:
>
> On Sun, Nov 1, 2020 at 1:40 PM syzbot
>  wrote:
> >
> > syzbot has bisected this issue to:
> >
> > commit bcd623d8e9fa5f82bbd8cd464dc418d24139157b
> > Author: Xin Long 
> > Date:   Thu Oct 29 07:05:05 2020 +
> >
> > sctp: call sk_setup_caps in sctp_packet_transmit instead
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=14df9cb850
> > start commit:   68bb4665 Merge branch 'l2-multicast-forwarding-for-ocelot-..
> > git tree:   net-next
> > final oops: https://syzkaller.appspot.com/x/report.txt?x=16df9cb850
> > console output: https://syzkaller.appspot.com/x/log.txt?x=12df9cb850
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=eac680ae76558a0e
> > dashboard link: https://syzkaller.appspot.com/bug?extid=5be8aebb1b7dfa90ef31
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1128639850
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11bbf39850
> >
> > Reported-by: syzbot+5be8aebb1b7dfa90e...@syzkaller.appspotmail.com
> > Fixes: bcd623d8e9fa ("sctp: call sk_setup_caps in sctp_packet_transmit 
> > instead")
> >
> > For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> I'm looking into this, Thanks.
This was actually caused by:

commit a1dd2cf2f1aedabc2ca9bb4f90231a521c52d8eb
Author: Xin Long 
Date:   Thu Oct 29 15:05:03 2020 +0800

sctp: allow changing transport encap_port by peer packets

where the IP6CB was overwritten by SCTP_INPUT_CB.

I will fix it by bringing inet6_skb_parm back to sctp_input_cb:

 struct sctp_input_cb {
+   union {
+   struct inet_skb_parm    h4;
+#if IS_ENABLED(CONFIG_IPV6)
+   struct inet6_skb_parm   h6;
+#endif
+   } header;
+   __be16 encap_port;
struct sctp_chunk *chunk;
struct sctp_af *af;
-   __be16 encap_port;
 };

Will post it soon, Thanks.
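
Why the layout matters: the same per-packet scratch area is read by the IP layer and written by the upper protocol, so any field the upper protocol places at the start of the area overwrites the IP layer's data. A simplified userspace sketch of this, assuming a 48-byte scratch buffer and hypothetical structs (struct ip_parm, struct input_cb_bad and struct input_cb_good are illustrative and not the kernel definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CB_SIZE 48	/* size of the shared scratch area in this sketch */

/* Hypothetical stand-in for the IP layer's per-packet data. */
struct ip_parm {
	int32_t iif;
	uint16_t flags;
};

/* Layout that starts at offset 0 and therefore clobbers the IP data. */
struct input_cb_bad {
	void *chunk;
	void *af;
	uint16_t encap_port;
};

/* Layout that reserves room for the IP data before its own fields. */
struct input_cb_good {
	union {
		struct ip_parm h4;
		struct ip_parm h6;
	} header;
	uint16_t encap_port;
	void *chunk;
	void *af;
};

static_assert(sizeof(struct input_cb_good) <= CB_SIZE, "must fit in the cb");

/* Return the iif the IP layer sees after the upper layer stored its port. */
static int32_t iif_after_write(int use_good_layout)
{
	unsigned char cb[CB_SIZE] = { 0 };
	struct ip_parm parm = { .iif = 7, .flags = 1 };
	struct ip_parm seen;

	/* The IP layer fills in its data first. */
	memcpy(cb, &parm, sizeof(parm));

	if (use_good_layout) {
		uint16_t port = 9899;

		/* The port lands after the reserved header area. */
		memcpy(cb + offsetof(struct input_cb_good, encap_port),
		       &port, sizeof(port));
	} else {
		struct input_cb_bad bad = { .encap_port = 9899 };

		/* The whole struct lands on top of the IP data. */
		memcpy(cb, &bad, sizeof(bad));
	}

	memcpy(&seen, cb, sizeof(seen));
	return seen.iif;
}
```

With the reserved header the IP data survives the write; without it the first fields of the upper-layer struct replace it.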

[PATCH net-next] enetc: Remove Tx checksumming offload code

2020-11-03 Thread Claudiu Manoil

Tx checksumming has been defeatured and completely removed
from the h/w reference manual. Made a little cleanup for the
TSE case as this is complementary code.

Signed-off-by: Claudiu Manoil 
---
 drivers/net/ethernet/freescale/enetc/enetc.c  | 51 ++-
 drivers/net/ethernet/freescale/enetc/enetc.h  |  5 +-
 .../net/ethernet/freescale/enetc/enetc_hw.h   | 47 -
 .../net/ethernet/freescale/enetc/enetc_pf.c   | 10 +---
 .../net/ethernet/freescale/enetc/enetc_vf.c   | 10 +---
 5 files changed, 32 insertions(+), 91 deletions(-)

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c 
b/drivers/net/ethernet/freescale/enetc/enetc.c
index 52be6e315752..01089c30b462 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -47,40 +47,6 @@ netdev_tx_t enetc_xmit(struct sk_buff *skb, struct 
net_device *ndev)
return NETDEV_TX_OK;
 }
 
-static bool enetc_tx_csum(struct sk_buff *skb, union enetc_tx_bd *txbd)
-{
-   int l3_start, l3_hsize;
-   u16 l3_flags, l4_flags;
-
-   if (skb->ip_summed != CHECKSUM_PARTIAL)
-   return false;
-
-   switch (skb->csum_offset) {
-   case offsetof(struct tcphdr, check):
-   l4_flags = ENETC_TXBD_L4_TCP;
-   break;
-   case offsetof(struct udphdr, check):
-   l4_flags = ENETC_TXBD_L4_UDP;
-   break;
-   default:
-   skb_checksum_help(skb);
-   return false;
-   }
-
-   l3_start = skb_network_offset(skb);
-   l3_hsize = skb_network_header_len(skb);
-
-   l3_flags = 0;
-   if (skb->protocol == htons(ETH_P_IPV6))
-   l3_flags = ENETC_TXBD_L3_IPV6;
-
-   /* write BD fields */
-   txbd->l3_csoff = enetc_txbd_l3_csoff(l3_start, l3_hsize, l3_flags);
-   txbd->l4_csoff = l4_flags;
-
-   return true;
-}
-
 static void enetc_unmap_tx_buff(struct enetc_bdr *tx_ring,
struct enetc_tx_swbd *tx_swbd)
 {
@@ -146,22 +112,16 @@ static int enetc_map_tx_buffs(struct enetc_bdr *tx_ring, 
struct sk_buff *skb,
if (do_vlan || do_tstamp)
flags |= ENETC_TXBD_FLAGS_EX;
 
-   if (enetc_tx_csum(skb, &temp_bd))
-   flags |= ENETC_TXBD_FLAGS_CSUM | ENETC_TXBD_FLAGS_L4CS;
-   else if (tx_ring->tsd_enable)
+   if (tx_ring->tsd_enable)
flags |= ENETC_TXBD_FLAGS_TSE | ENETC_TXBD_FLAGS_TXSTART;
 
/* first BD needs frm_len and offload flags set */
temp_bd.frm_len = cpu_to_le16(skb->len);
temp_bd.flags = flags;
 
-   if (flags & ENETC_TXBD_FLAGS_TSE) {
-   u32 temp;
-
-   temp = (skb->skb_mstamp_ns >> 5 & ENETC_TXBD_TXSTART_MASK)
-   | (flags << ENETC_TXBD_FLAGS_OFFSET);
-   temp_bd.txstart = cpu_to_le32(temp);
-   }
+   if (flags & ENETC_TXBD_FLAGS_TSE)
+   temp_bd.txstart = enetc_txbd_set_tx_start(skb->skb_mstamp_ns,
+ flags);
 
if (flags & ENETC_TXBD_FLAGS_EX) {
u8 e_flags = 0;
@@ -1897,8 +1857,7 @@ static void enetc_kfree_si(struct enetc_si *si)
 static void enetc_detect_errata(struct enetc_si *si)
 {
if (si->pdev->revision == ENETC_REV1)
-   si->errata = ENETC_ERR_TXCSUM | ENETC_ERR_VLAN_ISOL |
-ENETC_ERR_UCMCSWP;
+   si->errata = ENETC_ERR_VLAN_ISOL | ENETC_ERR_UCMCSWP;
 }
 
 int enetc_pci_probe(struct pci_dev *pdev, const char *name, int sizeof_priv)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h 
b/drivers/net/ethernet/freescale/enetc/enetc.h
index dd0fb0c066d7..8532d23b54f5 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -147,9 +147,8 @@ struct enetc_msg_swbd {
 
 #define ENETC_REV1 0x1
 enum enetc_errata {
-   ENETC_ERR_TXCSUM= BIT(0),
-   ENETC_ERR_VLAN_ISOL = BIT(1),
-   ENETC_ERR_UCMCSWP   = BIT(2),
+   ENETC_ERR_VLAN_ISOL = BIT(0),
+   ENETC_ERR_UCMCSWP   = BIT(1),
 };
 
 #define ENETC_SI_F_QBV BIT(0)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h 
b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
index 17cf7c94fdb5..68ef4f959982 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
@@ -374,8 +374,7 @@ union enetc_tx_bd {
__le16 frm_len;
union {
struct {
-   __le16 l3_csoff;
-   u8 l4_csoff;
+   u8 reserved[3];
u8 flags;
}; /* default layout */
__le32 txstart;
@@ -398,41 +397,37 @@ union enetc_tx_bd {
} wb; /* writeback descriptor */
 };
 
-#define ENETC_TXBD_FLAGS_L4CS  BIT(0)
-#define ENETC_TXBD_FLAGS_TSE

Re: [RFC PATCH ethtool] ethtool: Improve compatibility between netlink and ioctl interfaces

2020-11-03 Thread Ido Schimmel

On Mon, Nov 02, 2020 at 11:58:03PM +0100, Michal Kubecek wrote:
> On Mon, Nov 02, 2020 at 08:40:36PM +0200, Ido Schimmel wrote:
> > +static int linkmodes_reply_adver_all_cb(const struct nlmsghdr *nlhdr,
> 
>   ^ advert?
> 
> > +   void *data)
> > +{
> > +   const struct nlattr *bitset_tb[ETHTOOL_A_BITSET_MAX + 1] = {};
> > +   const struct nlattr *tb[ETHTOOL_A_LINKMODES_MAX + 1] = {};
> > +   DECLARE_ATTR_TB_INFO(bitset_tb);
> > +   struct nl_context *nlctx = data;
> > +   struct nl_msg_buff *msgbuff;
> > +   DECLARE_ATTR_TB_INFO(tb);
> > +   struct nl_socket *nlsk;
> > +   struct nlattr *nest;
> > +   int ret;
> > +
> > +   ret = mnl_attr_parse(nlhdr, GENL_HDRLEN, attr_cb, &tb_info);
> > +   if (ret < 0)
> > +   return ret;
> > +   if (!tb[ETHTOOL_A_LINKMODES_OURS])
> > +   return -EINVAL;
> > +
> > +   ret = mnl_attr_parse_nested(tb[ETHTOOL_A_LINKMODES_OURS], attr_cb,
> > +   &bitset_tb_info);
> > +   if (ret < 0)
> > +   return ret;
> > +   if (!bitset_tb[ETHTOOL_A_BITSET_SIZE] ||
> > +   !bitset_tb[ETHTOOL_A_BITSET_VALUE] ||
> > +   !bitset_tb[ETHTOOL_A_BITSET_MASK])
> > +   return -EINVAL;
> > +
> > +   ret = netlink_init_ethnl2_socket(nlctx);
> > +   if (ret < 0)
> > +   return ret;
> > +
> > +   nlsk = nlctx->ethnl2_socket;
> > +   msgbuff = &nlsk->msgbuff;
> > +
> > +   ret = msg_init(nlctx, msgbuff, ETHTOOL_MSG_LINKMODES_SET,
> > +  NLM_F_REQUEST | NLM_F_ACK);
> > +   if (ret < 0)
> > +   return ret;
> > +   if (ethnla_fill_header(msgbuff, ETHTOOL_A_LINKMODES_HEADER,
> > +  nlctx->devname, 0))
> > +   return -EMSGSIZE;
> > +
> > +   if (ethnla_put_u8(msgbuff, ETHTOOL_A_LINKMODES_AUTONEG, AUTONEG_ENABLE))
> > +   return -EMSGSIZE;
> > +
> > +   /* Use the size and mask from the reply and set the value to the mask,
> > +* so that all supported link modes will be advertised.
> > +*/
> > +   ret = -EMSGSIZE;
> > +   nest = ethnla_nest_start(msgbuff, ETHTOOL_A_LINKMODES_OURS);
> > +   if (!nest)
> > +   return -EMSGSIZE;
> > +
> > +   if (ethnla_put_u32(msgbuff, ETHTOOL_A_BITSET_SIZE,
> > +  mnl_attr_get_u32(bitset_tb[ETHTOOL_A_BITSET_SIZE])))
> > +   goto err;
> > +
> > +   if (ethnla_put(msgbuff, ETHTOOL_A_BITSET_VALUE,
> > +  
> > mnl_attr_get_payload_len(bitset_tb[ETHTOOL_A_BITSET_MASK]),
> > +  mnl_attr_get_payload(bitset_tb[ETHTOOL_A_BITSET_MASK])))
> > +   goto err;
> > +
> > +   if (ethnla_put(msgbuff, ETHTOOL_A_BITSET_MASK,
> > +  
> > mnl_attr_get_payload_len(bitset_tb[ETHTOOL_A_BITSET_MASK]),
> > +  mnl_attr_get_payload(bitset_tb[ETHTOOL_A_BITSET_MASK])))
> > +   goto err;
> > +
> > +   ethnla_nest_end(msgbuff, nest);
> 
> To fully replicate ioctl code behaviour, we should only set the bits
> corresponding to "real" link modes, not "special" ones (e.g.
> ETHTOOL_LINK_MODE_TP_BIT).

Michal,

I have the changes you requested here:
https://github.com/idosch/ethtool/commit/b34d15839f2662808c566c04eda726113e20ee59

Do you want to integrate it with your nl_parse() rework or should I?

Thanks
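
One reading of the review comment quoted above is that the value bitmap should be built from the "real" link modes only, leaving the special bits of the current advertisement untouched. A minimal sketch of that idea, assuming made-up bit positions (the LM_* names and values are illustrative, not those of the ethtool UAPI header):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only. */
enum link_mode_bit {
	LM_100_FULL,	/* "real" mode: a speed/duplex combination */
	LM_1000_FULL,	/* "real" mode */
	LM_AUTONEG,	/* special bit */
	LM_TP,		/* special bit: port type */
	LM_PAUSE,	/* special bit */
	LM_NBITS
};

static_assert(LM_NBITS <= 32, "all bits must fit in one 32-bit word");

#define LM_BIT(n)	(UINT32_C(1) << (n))
#define LM_REAL_MODES	(LM_BIT(LM_100_FULL) | LM_BIT(LM_1000_FULL))

/*
 * Advertise every supported real mode, and keep the special bits of the
 * current advertisement exactly as they are.
 */
static uint32_t advertise_all_supported(uint32_t supported,
					uint32_t advertising)
{
	return (advertising & ~LM_REAL_MODES) | (supported & LM_REAL_MODES);
}
```

In this sketch a supported but not advertised special bit such as LM_TP stays cleared, while both speed bits get set.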

Re: [PATCH net-next 0/5] net: add and use dev_get_tstats64

2020-11-03 Thread Heiner Kallweit

On 02.11.2020 23:36, Saeed Mahameed wrote:
> On Sun, 2020-11-01 at 13:33 +0100, Heiner Kallweit wrote:
>> It's a frequent pattern to use netdev->stats for the less frequently
>> accessed counters and per-cpu counters for the frequently accessed
>> counters (rx/tx bytes/packets). Add a default ndo_get_stats64()
>> implementation for this use case. Subsequently switch more drivers
>> to use this pattern.
>>
>> Heiner Kallweit (5):
>>   net: core: add dev_get_tstats64 as a ndo_get_stats64 implementation
>>   net: make ip_tunnel_get_stats64 an alias for dev_get_tstats64
>>   ip6_tunnel: use ip_tunnel_get_stats64 as ndo_get_stats64 callback
>>   net: dsa: use net core stats64 handling
>>   tun: switch to net core provided statistics counters
>>
> 
> not many left,
> 
> $ git grep dev_fetch_sw_netstats drivers/
> 
> drivers/infiniband/hw/hfi1/ipoib_main.c:dev_fetch_sw_netstats(s
> torage, priv->netstats);
> drivers/net/macsec.c:   dev_fetch_sw_netstats(s, dev->tstats);
> drivers/net/usb/qmi_wwan.c: dev_fetch_sw_netstats(stats, priv-
>> stats64);
> drivers/net/usb/usbnet.c:   dev_fetch_sw_netstats(stats, dev-
>> stats64);
> drivers/net/wireless/quantenna/qtnfmac/core.c:  dev_fetch_sw_netstats(s
> tats, vif->stats64);
> 
> Why not convert them as well ?
> macsec has a different implementation, but all others can be converted.
> 
OK, I can do this. Then the series becomes somewhat bigger.
@Jakub: Would it be ok to apply the current series and I provide the
additionally requested migrations as follow-up series?
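
The pattern described in the cover letter, rarely updated counters in one shared struct plus per-CPU counters for the hot path, can be sketched in userspace as follows. All names here (struct demo_dev, demo_get_stats64(), NR_CPUS_DEMO) are hypothetical, and the sketch leaves out the synchronization a real per-CPU reader would need:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_DEMO 4

/* Hot-path counters, one instance per CPU. */
struct pcpu_counters {
	uint64_t rx_packets, rx_bytes;
	uint64_t tx_packets, tx_bytes;
};

/* Full statistics as reported to the caller. */
struct dev_stats {
	uint64_t rx_packets, rx_bytes;
	uint64_t tx_packets, tx_bytes;
	uint64_t rx_errors, tx_dropped;
};

struct demo_dev {
	struct dev_stats slow;				/* rarely updated */
	struct pcpu_counters fast[NR_CPUS_DEMO];	/* frequently updated */
};

/* Start from the shared counters, then add every CPU's hot-path counters. */
static void demo_get_stats64(const struct demo_dev *dev, struct dev_stats *out)
{
	int cpu;

	assert(dev != NULL && out != NULL);
	*out = dev->slow;
	for (cpu = 0; cpu < NR_CPUS_DEMO; cpu++) {
		out->rx_packets += dev->fast[cpu].rx_packets;
		out->rx_bytes   += dev->fast[cpu].rx_bytes;
		out->tx_packets += dev->fast[cpu].tx_packets;
		out->tx_bytes   += dev->fast[cpu].tx_bytes;
	}
}
```

A driver following this split only touches its own CPU's counters on the data path and leaves the summing to the reader.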

Re: [PATCH ipsec] xfrm: Pass template address family to xfrm_state_look_at

2020-11-03 Thread Anthony DeRossi

On Tue, Nov 3, 2020 at 4:05 AM Herbert Xu  wrote:
>
> On Mon, Nov 02, 2020 at 06:32:19PM -0800, Anthony DeRossi wrote:
> > This fixes a regression where valid selectors are incorrectly skipped
> > when xfrm_state_find is called with a non-matching address family (e.g.
> > when using IPv6-in-IPv4 ESP in transport mode).
> >
> > The state's address family is matched against the template's family
> > (encap_family) in xfrm_state_find before checking the selector in
> > xfrm_state_look_at.  The template's family should also be used for
> > selector matching, otherwise valid selectors may be skipped.
> >
> > Fixes: e94ee171349d ("xfrm: Use correct address family in xfrm_state_find")
> > Signed-off-by: Anthony DeRossi 
> > ---
> >  net/xfrm/xfrm_state.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Your patch reintroduces the same bug that my patch was trying to
> fix, namely that when you do the comparison on flow you must use
> the original family and not some other value.

My mistake, I misunderstood the original bug.

Anthony

[PATCH v5 5/5] ARM: defconfig: Enable ax88796c driver

2020-11-03 Thread Łukasz Stelmach

Enable ax88796c driver for the ethernet chip on Exynos3250-based
ARTIK5 boards.

Signed-off-by: Łukasz Stelmach 
---
 arch/arm/configs/exynos_defconfig   | 2 ++
 arch/arm/configs/multi_v7_defconfig | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/arm/configs/exynos_defconfig 
b/arch/arm/configs/exynos_defconfig
index cf82c9d23a08..1ee902d01eef 100644
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -107,6 +107,8 @@ CONFIG_MD=y
 CONFIG_BLK_DEV_DM=y
 CONFIG_DM_CRYPT=m
 CONFIG_NETDEVICES=y
+CONFIG_NET_VENDOR_ASIX=y
+CONFIG_SPI_AX88796C=y
 CONFIG_SMSC911X=y
 CONFIG_USB_RTL8150=m
 CONFIG_USB_RTL8152=y
diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index e731cdf7c88c..dad53846f58f 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -243,6 +243,8 @@ CONFIG_SATA_HIGHBANK=y
 CONFIG_SATA_MV=y
 CONFIG_SATA_RCAR=y
 CONFIG_NETDEVICES=y
+CONFIG_NET_VENDOR_ASIX=y
+CONFIG_SPI_AX88796C=m
 CONFIG_VIRTIO_NET=y
 CONFIG_B53_SPI_DRIVER=m
 CONFIG_B53_MDIO_DRIVER=m
-- 
2.26.2

[PATCH v5 0/5] AX88796C SPI Ethernet Adapter

2020-11-03 Thread Łukasz Stelmach

This is a driver for AX88796C Ethernet Adapter connected in SPI mode as
found on ARTIK5 evaluation board. The driver has been ported from a
v3.10.9 vendor kernel for ARTIK5 board.

Changes in v5:
  - coding style (local variable declarations)
  - added spi0 node in the DT binding example and removed
interrupt-parent
  - removed comp module parameter
  - added CONFIG_SPI_AX88796C_COMPRESSION option to set the initial
state of SPI compression
  - introduced new ethtool tunable "spi-compression" to control SPI
transfer compression
  - removed unused fields in struct ax88796c_device
  - switched from using buffers allocated on stack for SPI transfers
to DMA safe ones embedded in struct ax_spi and allocated with
kmalloc()

Changes in v4:
  - fixed compilation problems in asix,ax88796c.yaml and in
  ax88796c_main.c introduced in v3

Changes in v3:
  - modify vendor-prefixes.yaml in a separate patch
  - fix several problems in the dt binding
- removed unnecessary descriptions and properties
- changed the order of entries
- fixed problems with missing defines in the example
  - change (1 << N) to BIT(N), left a few (0 << N)
  - replace ax88796c_get_link(), ax88796c_get_link_ksettings(),
ax88796c_set_link_ksettings(), ax88796c_nway_reset(),
ax88796c_set_mac_address() with appropriate kernel functions.
  - disable PHY auto-polling in MAC and use PHYLIB to track the state
of PHY and configure MAC
  - propagate return values instead of returning constants in several
places
  - add WARN_ON() for unlocked mutex
  - remove local work queue and use the system_wq
  - replace phy_connect_direct() with phy_connect() and move
devm_register_netdev() to the end of ax88796c_probe()
(Unlike phy_connect_direct() phy_connect() does not crash if the
network device isn't registered yet.)
  - remove error messages on ENOMEM
  - move free_irq() to the end of ax88796c_close() to avoid race
condition
  - implement flow-control

Changes in v2:
  - use phylib
  - added DT bindings
  - moved #includes to *.c files
  - used mutex instead of a semaphore for locking
  - renamed some constants
  - added error propagation for several functions
  - used ethtool for dumping registers
  - added control over checksum offloading
  - remove vendor specific PM
  - removed macaddr module parameter and added support for reading a MAC
address from platform data (e.g. DT)
  - removed dependency on SPI from NET_VENDOR_ASIX
  - added an entry in the MAINTAINERS file
  - simplified logging with appropriate netif_* and netdev_* helpers
  - lots of style fixes

Łukasz Stelmach (5):
  dt-bindings: vendor-prefixes: Add asix prefix
  dt-bindings: net: Add bindings for AX88796C SPI Ethernet Adapter
  net: ax88796c: ASIX AX88796C SPI Ethernet Adapter Driver
  ARM: dts: exynos: Add Ethernet to Artik 5 board
  ARM: defconfig: Enable ax88796c driver

 .../bindings/net/asix,ax88796c.yaml   |   73 ++
 .../devicetree/bindings/vendor-prefixes.yaml  |2 +
 MAINTAINERS   |6 +
 arch/arm/boot/dts/exynos3250-artik5-eval.dts  |   29 +
 arch/arm/configs/exynos_defconfig |2 +
 arch/arm/configs/multi_v7_defconfig   |2 +
 drivers/net/ethernet/Kconfig  |1 +
 drivers/net/ethernet/Makefile |1 +
 drivers/net/ethernet/asix/Kconfig |   35 +
 drivers/net/ethernet/asix/Makefile|6 +
 drivers/net/ethernet/asix/ax88796c_ioctl.c|  235 
 drivers/net/ethernet/asix/ax88796c_ioctl.h|   26 +
 drivers/net/ethernet/asix/ax88796c_main.c | 1132 +
 drivers/net/ethernet/asix/ax88796c_main.h |  561 
 drivers/net/ethernet/asix/ax88796c_spi.c  |  109 ++
 drivers/net/ethernet/asix/ax88796c_spi.h  |   70 +
 include/uapi/linux/ethtool.h  |1 +
 net/ethtool/common.c  |1 +
 18 files changed, 2292 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/asix,ax88796c.yaml
 create mode 100644 drivers/net/ethernet/asix/Kconfig
 create mode 100644 drivers/net/ethernet/asix/Makefile
 create mode 100644 drivers/net/ethernet/asix/ax88796c_ioctl.c
 create mode 100644 drivers/net/ethernet/asix/ax88796c_ioctl.h
 create mode 100644 drivers/net/ethernet/asix/ax88796c_main.c
 create mode 100644 drivers/net/ethernet/asix/ax88796c_main.h
 create mode 100644 drivers/net/ethernet/asix/ax88796c_spi.c
 create mode 100644 drivers/net/ethernet/asix/ax88796c_spi.h

-- 
2.26.2

[PATCH v5 1/5] dt-bindings: vendor-prefixes: Add asix prefix

2020-11-03 Thread Łukasz Stelmach

Add the prefix for ASIX Electronics Corporation.

Signed-off-by: Łukasz Stelmach 
Reviewed-by: Krzysztof Kozlowski 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml 
b/Documentation/devicetree/bindings/vendor-prefixes.yaml
index 2735be1a8470..ce3b3f6c9728 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
+++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
@@ -117,6 +117,8 @@ patternProperties:
 description: Asahi Kasei Corp.
   "^asc,.*":
 description: All Sensors Corporation
+  "^asix,.*":
+description: ASIX Electronics Corporation
   "^aspeed,.*":
 description: ASPEED Technology Inc.
   "^asus,.*":
-- 
2.26.2

[PATCH v5 4/5] ARM: dts: exynos: Add Ethernet to Artik 5 board

2020-11-03 Thread Łukasz Stelmach

Add node for ax88796c ethernet chip.

Signed-off-by: Łukasz Stelmach 
---
 arch/arm/boot/dts/exynos3250-artik5-eval.dts | 29 
 1 file changed, 29 insertions(+)

diff --git a/arch/arm/boot/dts/exynos3250-artik5-eval.dts 
b/arch/arm/boot/dts/exynos3250-artik5-eval.dts
index 20446a846a98..a91e09a7d3fa 100644
--- a/arch/arm/boot/dts/exynos3250-artik5-eval.dts
+++ b/arch/arm/boot/dts/exynos3250-artik5-eval.dts
@@ -37,3 +37,32 @@ &mshc_2 {
 &serial_2 {
status = "okay";
 };
+
+&spi_0 {
+   status = "okay";
+   cs-gpios = <&gpx3 4 GPIO_ACTIVE_LOW>, <0>;
+
+   assigned-clocks = <&cmu CLK_MOUT_MPLL>, <&cmu CLK_DIV_MPLL_PRE>,
+   <&cmu CLK_MOUT_SPI0>, <&cmu CLK_DIV_SPI0>,
+   <&cmu CLK_DIV_SPI0_PRE>, <&cmu CLK_SCLK_SPI0>;
+   assigned-clock-parents =
+   <&cmu CLK_FOUT_MPLL>,/* for: CLK_MOUT_MPLL */
+   <&cmu CLK_MOUT_MPLL>,/* for: CLK_DIV_MPLL_PRE */
+   <&cmu CLK_DIV_MPLL_PRE>, /* for: CLK_MOUT_SPI0 */
+   <&cmu CLK_MOUT_SPI0>,/* for: CLK_DIV_SPI0 */
+   <&cmu CLK_DIV_SPI0>, /* for: CLK_DIV_SPI0_PRE */
+   <&cmu CLK_DIV_SPI0_PRE>; /* for: CLK_SCLK_SPI0 */
+
+   ethernet@0 {
+   compatible = "asix,ax88796c";
+   reg = <0x0>;
+   local-mac-address = [00 00 00 00 00 00]; /* Filled in by a 
boot-loader */
+   interrupt-parent = <&gpx2>;
+   interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
+   spi-max-frequency = <4000>;
+   reset-gpios = <&gpe0 2 GPIO_ACTIVE_LOW>;
+   controller-data {
+   samsung,spi-feedback-delay = <2>;
+   };
+   };
+};
-- 
2.26.2

Re: [PATCH ipsec] xfrm: Pass template address family to xfrm_state_look_at

2020-11-03 Thread Anthony DeRossi

On Tue, Nov 3, 2020 at 4:08 AM Herbert Xu  wrote:
>
> On Mon, Nov 02, 2020 at 06:32:19PM -0800, Anthony DeRossi wrote:
> > This fixes a regression where valid selectors are incorrectly skipped
> > when xfrm_state_find is called with a non-matching address family (e.g.
> > when using IPv6-in-IPv4 ESP in transport mode).
>
> Why are we even allowing v6-over-v4 in transport mode? Isn't that
> the whole point of BEET mode?

I'm not sure. This is the outgoing policy that strongSwan creates for
an IPv6-in-IPv4 tunnel when compression is enabled:

src fd02::/16 dst fd02::2/128
dir out priority 326271 ptype main
tmpl src 10.0.0.8 dst 192.168.1.231
proto comp spi 0xd00e reqid 1 mode tunnel
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp spi 0xc543e950 reqid 1 mode transport

After your patch, outgoing IPv6 packets fail to match the associated state:

src 10.0.0.8 dst 192.168.1.231
proto esp spi 0xc543e950 reqid 1 mode transport
replay-window 0
auth-trunc hmac(sha256)
0x143b570f59b23eaa560905f19a922451c6dfa5694ba2e45e1b065bb1863421aa 128
enc cbc(aes) 0x526ed144ca087125ce30e36c8f20d972
encap type espinudp sport 4501 dport 4500 addr 0.0.0.0
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x
sel src 0.0.0.0/0 dst 0.0.0.0/0

Is this an invalid configuration?

Anthony

[PATCH v5 2/5] dt-bindings: net: Add bindings for AX88796C SPI Ethernet Adapter

2020-11-03 Thread Łukasz Stelmach

Add bindings for AX88796C SPI Ethernet Adapter.

Signed-off-by: Łukasz Stelmach 
---
 .../bindings/net/asix,ax88796c.yaml   | 73 +++
 1 file changed, 73 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/asix,ax88796c.yaml

diff --git a/Documentation/devicetree/bindings/net/asix,ax88796c.yaml 
b/Documentation/devicetree/bindings/net/asix,ax88796c.yaml
new file mode 100644
index ..699ebf452479
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/asix,ax88796c.yaml
@@ -0,0 +1,73 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/asix,ax88796c.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: ASIX AX88796C SPI Ethernet Adapter
+
+maintainers:
+  - Łukasz Stelmach 
+
+description: |
+  ASIX AX88796C is an Ethernet controller with a built in PHY. This
+  describes SPI mode of the chip.
+
+  The node for this driver must be a child node of an SPI controller,
+  hence all mandatory properties described in
+  ../spi/spi-controller.yaml must be specified.
+
+allOf:
+  - $ref: ethernet-controller.yaml#
+
+properties:
+  compatible:
+    const: asix,ax88796c
+
+  reg:
+    maxItems: 1
+
+  spi-max-frequency:
+    maximum: 4000
+
+  interrupts:
+    maxItems: 1
+
+  reset-gpios:
+    description:
+      A GPIO line handling reset of the chip. As the line is active low,
+      it should be marked GPIO_ACTIVE_LOW.
+    maxItems: 1
+
+  local-mac-address: true
+
+  mac-address: true
+
+required:
+  - compatible
+  - reg
+  - spi-max-frequency
+  - interrupts
+  - reset-gpios
+
+additionalProperties: false
+
+examples:
+  # Artik5 eval board
+  - |
+    #include 
+    #include 
+    spi0 {
+        #address-cells = <1>;
+        #size-cells = <0>;
+
+        ethernet@0 {
+            compatible = "asix,ax88796c";
+            reg = <0x0>;
+            local-mac-address = [00 00 00 00 00 00]; /* Filled in by a bootloader */
+            interrupt-parent = <&gpx2>;
+            interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
+            spi-max-frequency = <4000>;
+            reset-gpios = <&gpe0 2 GPIO_ACTIVE_LOW>;
+        };
+    };
-- 
2.26.2

[PATCH v5 3/5] net: ax88796c: ASIX AX88796C SPI Ethernet Adapter Driver

2020-11-03 Thread Łukasz Stelmach

ASIX AX88796[1] is a versatile Ethernet adapter chip that can be
connected to a CPU with an 8/16-bit bus or with SPI. This driver
supports the SPI connection.

The driver has been ported from the vendor kernel for ARTIK5[2]
boards. Several changes were made to adapt it to the current kernel,
including:

+ updated DT configuration,
+ clock configuration moved to DT,
+ new timer, ethtool and gpio APIs,
+ dev_* instead of pr_* and custom printk() wrappers,
+ removed awkward vendor power management,
+ introduced ethtool tunable to control SPI compression.

[1] https://www.asix.com.tw/products.php?op=pItemdetail&PItemID=104;65;86&PLine=65
[2] https://git.tizen.org/cgit/profile/common/platform/kernel/linux-3.10-artik/

The other ax88796 driver is for the NE2000-compatible AX88796L chip. The
two chips are not compatible with each other, hence two separate drivers
are required.

Signed-off-by: Łukasz Stelmach 
---
 MAINTAINERS|6 +
 drivers/net/ethernet/Kconfig   |1 +
 drivers/net/ethernet/Makefile  |1 +
 drivers/net/ethernet/asix/Kconfig  |   35 +
 drivers/net/ethernet/asix/Makefile |6 +
 drivers/net/ethernet/asix/ax88796c_ioctl.c |  235 
 drivers/net/ethernet/asix/ax88796c_ioctl.h |   26 +
 drivers/net/ethernet/asix/ax88796c_main.c  | 1132 
 drivers/net/ethernet/asix/ax88796c_main.h  |  561 ++
 drivers/net/ethernet/asix/ax88796c_spi.c   |  109 ++
 drivers/net/ethernet/asix/ax88796c_spi.h   |   70 ++
 include/uapi/linux/ethtool.h   |1 +
 net/ethtool/common.c   |1 +
 13 files changed, 2184 insertions(+)
 create mode 100644 drivers/net/ethernet/asix/Kconfig
 create mode 100644 drivers/net/ethernet/asix/Makefile
 create mode 100644 drivers/net/ethernet/asix/ax88796c_ioctl.c
 create mode 100644 drivers/net/ethernet/asix/ax88796c_ioctl.h
 create mode 100644 drivers/net/ethernet/asix/ax88796c_main.c
 create mode 100644 drivers/net/ethernet/asix/ax88796c_main.h
 create mode 100644 drivers/net/ethernet/asix/ax88796c_spi.c
 create mode 100644 drivers/net/ethernet/asix/ax88796c_spi.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 14b8ec0bb58b..930dc859d4f7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2812,6 +2812,12 @@ S:   Maintained
 F: Documentation/hwmon/asc7621.rst
 F: drivers/hwmon/asc7621.c
 
+ASIX AX88796C SPI ETHERNET ADAPTER
+M: Łukasz Stelmach 
+S: Maintained
+F: Documentation/devicetree/bindings/net/asix,ax88796c.yaml
+F: drivers/net/ethernet/asix/ax88796c_*
+
 ASPEED PINCTRL DRIVERS
 M: Andrew Jeffery 
 L: linux-asp...@lists.ozlabs.org (moderated for non-subscribers)
diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index de50e8b9e656..f3b218e45ea5 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -32,6 +32,7 @@ source "drivers/net/ethernet/apm/Kconfig"
 source "drivers/net/ethernet/apple/Kconfig"
 source "drivers/net/ethernet/aquantia/Kconfig"
 source "drivers/net/ethernet/arc/Kconfig"
+source "drivers/net/ethernet/asix/Kconfig"
 source "drivers/net/ethernet/atheros/Kconfig"
 source "drivers/net/ethernet/aurora/Kconfig"
 source "drivers/net/ethernet/broadcom/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index f8f38dcb5f8a..9eb368d93607 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -18,6 +18,7 @@ obj-$(CONFIG_NET_XGENE) += apm/
 obj-$(CONFIG_NET_VENDOR_APPLE) += apple/
 obj-$(CONFIG_NET_VENDOR_AQUANTIA) += aquantia/
 obj-$(CONFIG_NET_VENDOR_ARC) += arc/
+obj-$(CONFIG_NET_VENDOR_ASIX) += asix/
 obj-$(CONFIG_NET_VENDOR_ATHEROS) += atheros/
 obj-$(CONFIG_NET_VENDOR_AURORA) += aurora/
 obj-$(CONFIG_NET_VENDOR_CADENCE) += cadence/
diff --git a/drivers/net/ethernet/asix/Kconfig b/drivers/net/ethernet/asix/Kconfig
new file mode 100644
index ..6211814b0446
--- /dev/null
+++ b/drivers/net/ethernet/asix/Kconfig
@@ -0,0 +1,35 @@
+#
+# Asix network device configuration
+#
+
+config NET_VENDOR_ASIX
+   bool "Asix devices"
+   default y
+   help
+ If you have a network (Ethernet, non-USB, not NE2000 compatible)
+ interface based on a chip from ASIX, say Y.
+
+if NET_VENDOR_ASIX
+
+config SPI_AX88796C
+   tristate "Asix AX88796C-SPI support"
+   select PHYLIB
+   depends on SPI
+   depends on GPIOLIB
+   help
+ Say Y here if you intend to use ASIX AX88796C attached in SPI mode.
+
+config SPI_AX88796C_COMPRESSION
+   bool "SPI transfer compression"
+   default n
+   depends on SPI_AX88796C
+   help
+ Say Y here to enable SPI transfer compression. It saves up
+ to 24 dummy cycles during each transfer which may noticeably
+ speed up short transfers. This sets the default value that is
+ inherited by network interfaces during probe. It can be
+ changed at run time via the spi-compression ethtool tunable.
+
+

Re: [PATCH net-next v2 0/3] net: introduce rps_default_mask

2020-11-03 Thread Paolo Abeni

On Mon, 2020-11-02 at 14:54 -0800, Jakub Kicinski wrote:
> On Fri, 30 Oct 2020 12:16:00 +0100 Paolo Abeni wrote:
> > Real-time setups try hard to ensure proper isolation between time
> > critical applications and e.g. network processing performed by the
> > network stack in softirq and RPS is used to move the softirq 
> > activity away from the isolated core.
> > 
> > If the network configuration is dynamic, with netns and devices
> > routinely created at run-time, enforcing the correct RPS setting
> > on each newly created device, without allowing transient bad
> > configurations, becomes complex.
> > 
> > This series tries to address the above, introducing a new
> > sysctl knob: rps_default_mask. The new sysctl entry allows
> > configuring a systemwide RPS mask, to be enforced from receive
> > queue creation time without any further per-device configuration
> > required.
> > 
> > Additionally, a simple self-test is introduced to check the 
> > rps_default_mask behavior.
> 
> RPS is disabled by default, the processing is going to happen wherever
> the IRQ is mapped, and one would hope that the IRQ is not mapped to the
> core where the critical processing runs.
> 
> Would you mind elaborating further on the use case?

On Mon, 2020-11-02 at 15:27 -0800, Saeed Mahameed wrote:
> The whole thing can be replaced with a user daemon script that
> monitors all newly created devices and assigns to them whatever rps mask
> (call it default).
> 
> So why do we need this special logic in kernel ? 
> 
> I am not sure about this, but if rps queues sysfs are available before
> the netdev is up, then you can also use udevd to assign the rps masks
> before such devices are even brought up, so you would avoid the race
> conditions that you described, which are not really clear to me to be
> honest.

Thank you for the feedback.

Please allow me to answer you both here, as your questions are related.

The relevant use case is a host running containers (with the related
orchestration tools) in an RT environment. Virtual devices (veths, ovs
ports, etc.) are created by the orchestration tools at run-time.
Critical processes are allowed to send packets/generate outgoing
network traffic - but any interrupt is moved away from the related
cores, so that usual incoming network traffic processing does not
happen there.

Still, an xmit operation on a virtual device may be transmitted via ovs
or veth, with the relevant forwarding operation happening in a softirq
on the same CPU originating the packet.

RPS is configured (even) on such virtual devices to move the
forwarding away from the relevant CPUs.

As Saeed noted, such configuration could possibly be performed via some
user-space daemon monitoring network device and network namespace
creation. That will be anyway prone to some race: the orchestration tool
may create and enable the netns and virtual devices before the daemon
has properly set the RPS mask.

In the latter scenario some packet forwarding could still slip in on the
relevant CPU, causing measurable latency. In all non-RT scenarios the
above will likely be irrelevant, but in the RT context that is not
acceptable - e.g. in real environments it causes latency above the
defined limits, while the proposed patches avoid the issue.

Do you see any other simple way to avoid the above race?

Please let me know if the above answers your doubts,

Paolo
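
For readers following the thread, the mask discussed above can be sketched
in userspace C. This is only an illustration under stated assumptions: at
most 64 CPUs, and helper names (rps_mask_excluding, rps_mask_format) that
are hypothetical and not part of the proposed patches. It builds a CPU
bitmap that leaves out the isolated cores and formats it as the hex text
that RPS mask files such as rps_cpus accept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a bitmap with bits 0..nr_cpus-1 set, then
 * clear the CPUs reserved for latency-critical work.
 */
static uint64_t rps_mask_excluding(unsigned int nr_cpus, uint64_t isolated)
{
	uint64_t all;

	assert(nr_cpus > 0 && nr_cpus <= 64);
	all = nr_cpus == 64 ? UINT64_MAX : ((UINT64_C(1) << nr_cpus) - 1);

	return all & ~isolated;
}

/* Hypothetical helper: render the bitmap as lowercase hex, no prefix. */
static int rps_mask_format(char *buf, size_t len, uint64_t mask)
{
	return snprintf(buf, len, "%llx", (unsigned long long)mask);
}
```

With 8 CPUs and CPUs 2 and 3 isolated (bitmap 0x0c), the helper leaves
CPUs 0-1 and 4-7 available for forwarding. The point made in the thread is
about when such a value gets applied, not how it is computed: a daemon
applies it after device creation, the proposed sysctl at queue creation.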

Re: [PATCH 30/33] docs: ABI: cleanup several ABI documents

2020-11-03 Thread Bjorn Andersson

On Wed 28 Oct 09:23 CDT 2020, Mauro Carvalho Chehab wrote:
[..]
>  .../ABI/testing/sysfs-class-remoteproc|  14 +-

for this:

Acked-by: Bjorn Andersson 

Thanks,
Bjorn

Re: lan78xx: /sys/class/net/eth0/carrier stuck at 1

2020-11-03 Thread Andrew Lunn

On Tue, Nov 03, 2020 at 01:47:12PM +0100, Juerg Haefliger wrote:
> On Fri, 23 Oct 2020 15:05:19 +0200
> Andrew Lunn  wrote:
> 
> > On Fri, Oct 23, 2020 at 08:29:59AM +0200, Juerg Haefliger wrote:
> > > On Wed, 21 Oct 2020 21:35:48 +0200
> > > Andrew Lunn  wrote:
> > >   
> > > > On Wed, Oct 21, 2020 at 05:00:53PM +0200, Juerg Haefliger wrote:  
> > > > > Hi,
> > > > > 
> > > > > If the lan78xx driver is compiled into the kernel and the network 
> > > > > cable is
> > > > > plugged in at boot, /sys/class/net/eth0/carrier is stuck at 1 and 
> > > > > doesn't
> > > > > toggle if the cable is unplugged and replugged.
> > > > > 
> > > > > If the network cable is *not* plugged in at boot, all seems to work 
> > > > > fine.
> > > > > I.e., post-boot cable plugs and unplugs toggle the carrier flag.
> > > > > 
> > > > > Also, everything seems to work fine if the driver is compiled as a 
> > > > > module.
> > > > > 
> > > > > There's an older ticket for the raspi kernel [1] but I've just tested 
> > > > > this
> > > > > with a 5.8 kernel on a Pi 3B+ and still see that behavior.
> > > > 
> > > > Hi Jürg  
> > > 
> > > Hi Andrew,
> > > 
> > >   
> > > > Could you check if a different PHY driver is being used when it is
> > > > built and broken vs module or built in and working.
> > > > 
> > > > Look at /sys/class/net/eth0/phydev/driver  
> > > 
> > > There's no such file.  
> > 
> > I _think_ that means it is using genphy, the generic PHY driver, not a
> > specific vendor PHY driver? What does
> > 
> > /sys/class/net/eth0/phydev/phy_id contain.
> 
> There is no directory /sys/class/net/eth0/phydev.

[Goes and looks at the code]

The symbolic link is only created if the MAC has already been
registered with the core when the PHY is connected to it. lan78xx does
it the other way around:

	ret = lan78xx_phy_init(dev);
	if (ret < 0)
		goto out4;

	ret = register_netdev(netdev);
	if (ret != 0) {
		netif_err(dev, probe, netdev, "couldn't register the device\n");
		goto out5;
	}

The register dump you show below indicates an ID of 0007c132, which
fits the driver drivers/net/phy/microchip.c: "Microchip
LAN88xx". Is there any mention of that in dmesg, and do you see the
module loaded?

> 
> > > Given that all works fine as long as the cable is unplugged at boot points
> > > more towards a race at boot or incorrect initialization sequence or 
> > > something.  
> > 
> > Could be. Could you run
> > 
> > mii-tool -vv eth0
> 
> Hrm. Running that command unlocks the carrier flag and it starts toggling on
> cable unplug/plug. First invocation:
> 
> $ sudo mii-tool -vv eth0
> Using SIOCGMIIPHY=0x8947
> eth0: negotiated 1000baseT-FD flow-control, link ok
>   registers for MII PHY 1: 
> 1040 79ed 0007 c132 05e1 cde1 000f 
>  0200 0800     3000
>   0088    3200 0004
> 0040 a000 a000  a035   
>   product info: vendor 00:01:f0, model 19 rev 2
>   basic mode:   autonegotiation enabled
>   basic status: autonegotiation complete, link ok
>   capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
>   advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD 
> flow-control
>   link partner: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD 
> flow-control
> 
> Subsequent invocation:
> 
> $ sudo mii-tool -vv eth0
> Using SIOCGMIIPHY=0x8947
> eth0: negotiated 1000baseT-FD flow-control, link ok
>   registers for MII PHY 1: 
> 1040 79ed 0007 c132 05e1 cde1 000d 
>  0200 0800     3000
>   0088    3200 0004
> 0040 a000   a035   
>   product info: vendor 00:01:f0, model 19 rev 2
>   basic mode:   autonegotiation enabled
>   basic status: autonegotiation complete, link ok
>   capabilities: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
>   advertising:  1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD 
> flow-control
>   link partner: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD 
> flow-control
> 
> In the first invocation, register 0x1a shows a pending link-change interrupt
> (0xa000) which wasn't serviced (and cleared) for some reason. Dumping the
> registers cleared that interrupt bit and things start working correctly
> afterwards. Not sure yet why that first interrupt is ignored.

So, 0x1a is the interrupt status, and 0x19 is the interrupt mask.

This should really be interpreted as a level interrupt. But it appears
the hardware the interrupt is connected to is actually edge triggered. And
the edge has been missed, and so the interrupt is never serviced.

I think the call sequence goes something like this, if I'm reading the
code correctly:

lan78xx_probe() calls lan78xx_bind:

lan78xx_bind() registers an interrupt domain. This allows USB
status messages indicating an interrupt to be dispatched using the
normal interrupt mechanism
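
The level-versus-edge point above can be illustrated with a toy model.
This is not lan78xx code: struct irq_line and the helpers below are
hypothetical names, and the model only assumes that an edge-triggered
consumer reacts to an inactive-to-active transition while a
level-triggered one reacts to the line being active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an interrupt line and its consumer (hypothetical). */
struct irq_line {
	bool asserted;   /* current level of the line */
	bool prev;       /* level seen at the previous sample */
	bool listening;  /* consumer has enabled the interrupt */
};

/* The consumer starts paying attention to the line. */
static void irq_enable(struct irq_line *l)
{
	assert(l != NULL);
	l->listening = true;
}

/*
 * Edge-triggered: fires only when an inactive->active transition is
 * sampled while the consumer is listening. A transition sampled earlier
 * is consumed and lost.
 */
static bool edge_fires(struct irq_line *l)
{
	bool rising = l->asserted && !l->prev;

	l->prev = l->asserted;
	return l->listening && rising;
}

/* Level-triggered: fires for as long as the line is active. */
static bool level_fires(const struct irq_line *l)
{
	return l->listening && l->asserted;
}
```

If the line goes active before irq_enable() is called, edge_fires() never
reports it, while level_fires() does as soon as the consumer listens. Only
after the line is cleared and raised again does the edge consumer see an
event, which matches the observation that dumping the registers (and so
clearing the pending bit) made later link changes work.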

Re: [bpf-next PATCH v2 5/5] selftest/bpf: Use global variables instead of maps for test_tcpbpf_kern

2020-11-03 Thread Alexander Duyck

On Mon, Nov 2, 2020 at 5:26 PM Martin KaFai Lau  wrote:
>
> On Sat, Oct 31, 2020 at 11:52:37AM -0700, Alexander Duyck wrote:
> [ ... ]
>
> > +struct tcpbpf_globals global = { 0 };
> >  int _version SEC("version") = 1;
> >
> >  SEC("sockops")
> > @@ -105,29 +72,15 @@ int bpf_testcb(struct bpf_sock_ops *skops)
> >
> >   op = (int) skops->op;
> >
> > - update_event_map(op);
> > + global.event_map |= (1 << op);
> >
> >   switch (op) {
> >   case BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB:
> >   /* Test failure to set largest cb flag (assumes not defined) 
> > */
> > - bad_call_rv = bpf_sock_ops_cb_flags_set(skops, 0x80);
> > + global.bad_cb_test_rv = bpf_sock_ops_cb_flags_set(skops, 
> > 0x80);
> >   /* Set callback */
> > - good_call_rv = bpf_sock_ops_cb_flags_set(skops,
> > + global.good_cb_test_rv = bpf_sock_ops_cb_flags_set(skops,
> >BPF_SOCK_OPS_STATE_CB_FLAG);
> > - /* Update results */
> > - {
> > - __u32 key = 0;
> > - struct tcpbpf_globals g, *gp;
> > -
> > - gp = bpf_map_lookup_elem(&global_map, &key);
> > - if (!gp)
> > - break;
> > - g = *gp;
> > - g.bad_cb_test_rv = bad_call_rv;
> > - g.good_cb_test_rv = good_call_rv;
> > - bpf_map_update_elem(&global_map, &key, &g,
> > - BPF_ANY);
> > - }
> >   break;
> >   case BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB:
> >   skops->sk_txhash = 0x12345f;
> > @@ -143,10 +96,8 @@ int bpf_testcb(struct bpf_sock_ops *skops)
> >
> >   thdr = (struct tcphdr *)(header + offset);
> >   v = thdr->syn;
> > - __u32 key = 1;
> >
> > - bpf_map_update_elem(&sockopt_results, &key, 
> > &v,
> > - BPF_ANY);
> > + global.tcp_saved_syn = v;
> >   }
> >   }
> >   break;
> > @@ -156,25 +107,16 @@ int bpf_testcb(struct bpf_sock_ops *skops)
> >   break;
> >   case BPF_SOCK_OPS_STATE_CB:
> >   if (skops->args[1] == BPF_TCP_CLOSE) {
> > - __u32 key = 0;
> > - struct tcpbpf_globals g, *gp;
> > -
> > - gp = bpf_map_lookup_elem(&global_map, &key);
> > - if (!gp)
> > - break;
> > - g = *gp;
> >   if (skops->args[0] == BPF_TCP_LISTEN) {
> > - g.num_listen++;
> > + global.num_listen++;
> >   } else {
> > - g.total_retrans = skops->total_retrans;
> > - g.data_segs_in = skops->data_segs_in;
> > - g.data_segs_out = skops->data_segs_out;
> > - g.bytes_received = skops->bytes_received;
> > - g.bytes_acked = skops->bytes_acked;
> > + global.total_retrans = skops->total_retrans;
> > + global.data_segs_in = skops->data_segs_in;
> > + global.data_segs_out = skops->data_segs_out;
> > + global.bytes_received = skops->bytes_received;
> > + global.bytes_acked = skops->bytes_acked;
> >   }
> > - g.num_close_events++;
> > - bpf_map_update_elem(&global_map, &key, &g,
> > - BPF_ANY);
> It is interesting that there is no race in the original "g.num_close_events++"
> followed by the bpf_map_update_elem().  It seems quite fragile though.

How would it race with the current code though? At this point we are
controlling the sockets in a single thread. As such, the close events
should already be serialized, shouldn't they? This may have been a
problem with the old code, but even then it was only two sockets, so I
don't think there was much risk of them racing against each other
since the two sockets were linked anyway.

> > + global.num_close_events++;
> There is __sync_fetch_and_add().
>
> not sure about the global.event_map though, may be use an individual
> variable for each _CB.  Thoughts?

I think this may be overkill for what we actually need. Since we are
closing the sockets in a single-threaded application, there isn't much
risk of the sockets all racing against each other in the close, is
there?
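
For reference, the difference between the plain increment and the
__sync_fetch_and_add() suggestion can be sketched in ordinary userspace C
with the GCC/Clang builtins. The struct and function names below are
hypothetical stand-ins for the test's globals, and the sketch says nothing
about which builtins the BPF backend accepts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the counters kept in the test's globals. */
struct globals_sketch {
	uint32_t num_close_events;
	uint32_t event_map;
};

static struct globals_sketch global_sketch;

/* Plain increment: separate load, add and store, so updates can be lost
 * if two contexts run it concurrently. */
static void count_close_plain(void)
{
	global_sketch.num_close_events++;
}

/* Atomic increment; returns the value held before the add. */
static uint32_t count_close_atomic(void)
{
	return __sync_fetch_and_add(&global_sketch.num_close_events, 1);
}

/* Atomic bit set, as an alternative to one variable per callback. */
static void record_event_atomic(unsigned int op)
{
	assert(op < 32);
	__sync_fetch_and_or(&global_sketch.event_map, UINT32_C(1) << op);
}
```

When every update comes from one thread, as argued above, both forms give
the same result; the atomic form only matters once updates can overlap.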

Re: [bpf-next PATCH v2 4/5] selftests/bpf: Migrate tcpbpf_user.c to use BPF skeleton

2020-11-03 Thread Alexander Duyck

On Mon, Nov 2, 2020 at 4:55 PM Martin KaFai Lau  wrote:
>
> On Sat, Oct 31, 2020 at 11:52:31AM -0700, Alexander Duyck wrote:
> > From: Alexander Duyck 
> >
> > Update tcpbpf_user.c to make use of the BPF skeleton. Doing this we can
> > simplify test_tcpbpf_user and reduce the overhead involved in setting up
> > the test.
> >
> > In addition we can clean up the remaining bits such as the one remaining
> > CHECK_FAIL at the end of test_tcpbpf_user so that the function only makes
> > use of CHECK as needed.
> >
> > Acked-by: Andrii Nakryiko 
> > Signed-off-by: Alexander Duyck 
> Acked-by: Martin KaFai Lau 
>
> > ---
> >  .../testing/selftests/bpf/prog_tests/tcpbpf_user.c |   48 
> > 
> >  1 file changed, 18 insertions(+), 30 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/tcpbpf_user.c 
> > b/tools/testing/selftests/bpf/prog_tests/tcpbpf_user.c
> > index d96f4084d2f5..c7a61b0d616a 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/tcpbpf_user.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/tcpbpf_user.c
> > @@ -4,6 +4,7 @@
> >  #include 
> >
> >  #include "test_tcpbpf.h"
> > +#include "test_tcpbpf_kern.skel.h"
> >
> >  #define LO_ADDR6 "::1"
> >  #define CG_NAME "/tcpbpf-user-test"
> > @@ -133,44 +134,31 @@ static void run_test(int map_fd, int sock_map_fd)
> >
> >  void test_tcpbpf_user(void)
> >  {
> > - const char *file = "test_tcpbpf_kern.o";
> > - int prog_fd, map_fd, sock_map_fd;
> > - int error = EXIT_FAILURE;
> > - struct bpf_object *obj;
> > + struct test_tcpbpf_kern *skel;
> > + int map_fd, sock_map_fd;
> >   int cg_fd = -1;
> > - int rv;
> > -
> > - cg_fd = test__join_cgroup(CG_NAME);
> > - if (cg_fd < 0)
> > - goto err;
> >
> > - if (bpf_prog_load(file, BPF_PROG_TYPE_SOCK_OPS, &obj, &prog_fd)) {
> > - fprintf(stderr, "FAILED: load_bpf_file failed for: %s\n", 
> > file);
> > - goto err;
> > - }
> > + skel = test_tcpbpf_kern__open_and_load();
> > + if (CHECK(!skel, "open and load skel", "failed"))
> > + return;
> >
> > - rv = bpf_prog_attach(prog_fd, cg_fd, BPF_CGROUP_SOCK_OPS, 0);
> > - if (rv) {
> > - fprintf(stderr, "FAILED: bpf_prog_attach: %d (%s)\n",
> > -errno, strerror(errno));
> > - goto err;
> > - }
> > + cg_fd = test__join_cgroup(CG_NAME);
> > + if (CHECK(cg_fd < 0, "test__join_cgroup(" CG_NAME ")",
> > +   "cg_fd:%d errno:%d", cg_fd, errno))
> > + goto cleanup_skel;
> >
> > - map_fd = bpf_find_map(__func__, obj, "global_map");
> > - if (map_fd < 0)
> > - goto err;
> > + map_fd = bpf_map__fd(skel->maps.global_map);
> > + sock_map_fd = bpf_map__fd(skel->maps.sockopt_results);
> >
> > - sock_map_fd = bpf_find_map(__func__, obj, "sockopt_results");
> > - if (sock_map_fd < 0)
> > - goto err;
> > + skel->links.bpf_testcb = 
> > bpf_program__attach_cgroup(skel->progs.bpf_testcb, cg_fd);
> > + if (ASSERT_OK_PTR(skel->links.bpf_testcb, 
> > "attach_cgroup(bpf_testcb)"))
> > + goto cleanup_namespace;
> >
> >   run_test(map_fd, sock_map_fd);
> >
> > - error = 0;
> > -err:
> > - bpf_prog_detach(cg_fd, BPF_CGROUP_SOCK_OPS);
> > +cleanup_namespace:
> nit.
>
> may be "cleanup_cgroup" instead?
>
> or only have one jump label to handle failure since "cg_fd != -1" has been
> tested already.

Good point. I can go through and just drop the second label and
simplify this. Will fix for v3.

Re: [PATCH mlx5-next v1 06/11] vdpa/mlx5: Connect mlx5_vdpa to auxiliary bus

2020-11-03 Thread Jason Gunthorpe

On Sun, Nov 01, 2020 at 10:15:37PM +0200, Leon Romanovsky wrote:
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 6c218b47b9f1..5316e51e72d4 100644
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1,18 +1,27 @@
>  // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
>  /* Copyright (c) 2020 Mellanox Technologies Ltd. */
> 
> +#include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
> -#include 
>  #include 
> -#include "mlx5_vnet.h"
>  #include "mlx5_vdpa.h"
> 
> +MODULE_AUTHOR("Eli Cohen ");
> +MODULE_DESCRIPTION("Mellanox VDPA driver");
> +MODULE_LICENSE("Dual BSD/GPL");
> +
> +#define to_mlx5_vdpa_ndev(__mvdev) container_of(__mvdev, struct 
> mlx5_vdpa_net, mvdev)
>  #define to_mvdev(__vdev) container_of((__vdev), struct mlx5_vdpa_dev, vdev)
> 
>  #define VALID_FEATURES_MASK  
>   \
> @@ -159,6 +168,11 @@ static bool mlx5_vdpa_debug;
>   mlx5_vdpa_info(mvdev, "%s\n", #_status);
>\
>   } while (0)
> 
> +static inline u32 mlx5_vdpa_max_qps(int max_vqs)
> +{
> + return max_vqs / 2;
> +}
> +
>  static void print_status(struct mlx5_vdpa_dev *mvdev, u8 status, bool set)
>  {
>   if (status & ~VALID_STATUS_MASK)
> @@ -1928,8 +1942,11 @@ static void init_mvqs(struct mlx5_vdpa_net *ndev)
>   }
>  }
> 
> -void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
> +static int mlx5v_probe(struct auxiliary_device *adev,
> +const struct auxiliary_device_id *id)
>  {
> + struct mlx5_adev *madev = container_of(adev, struct mlx5_adev, adev);
> + struct mlx5_core_dev *mdev = madev->mdev;
>   struct virtio_net_config *config;
>   struct mlx5_vdpa_dev *mvdev;
>   struct mlx5_vdpa_net *ndev;
> @@ -1943,7 +1960,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
>   ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, 
> mdev->device, &mlx5_vdpa_ops,
>2 * mlx5_vdpa_max_qps(max_vqs));
>   if (IS_ERR(ndev))
> - return ndev;
> + return PTR_ERR(ndev);
> 
>   ndev->mvdev.max_vqs = max_vqs;
>   mvdev = &ndev->mvdev;
> @@ -1972,7 +1989,8 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
>   if (err)
>   goto err_reg;
> 
> - return ndev;
> + dev_set_drvdata(&adev->dev, ndev);
> + return 0;
> 
>  err_reg:
>   free_resources(ndev);
> @@ -1981,10 +1999,29 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
>  err_mtu:
>   mutex_destroy(&ndev->reslock);
>   put_device(&mvdev->vdev.dev);
> - return ERR_PTR(err);
> + return err;
>  }
> 
> -void mlx5_vdpa_remove_dev(struct mlx5_vdpa_dev *mvdev)
> +static int mlx5v_remove(struct auxiliary_device *adev)
>  {
> + struct mlx5_vdpa_dev *mvdev = dev_get_drvdata(&adev->dev);
> +
>   vdpa_unregister_device(&mvdev->vdev);
> + return 0;
>  }
> +
> +static const struct auxiliary_device_id mlx5v_id_table[] = {
> + { .name = MLX5_ADEV_NAME ".vnet", },
> + {},
> +};
> +
> +MODULE_DEVICE_TABLE(auxiliary, mlx5v_id_table);
> +
> +static struct auxiliary_driver mlx5v_driver = {
> + .name = "vnet",
> + .probe = mlx5v_probe,
> + .remove = mlx5v_remove,
> + .id_table = mlx5v_id_table,
> +};

It is hard to see from the diff, but when this patch is applied the
vdpa module looks like I imagined things would look with the auxiliary
bus. It is very similar in structure to a PCI driver with the probe()
function cleanly registering with its subsystem. This is what I'd like
to see from the new Intel RDMA driver.

Greg, I think this patch is the best clean usage example.

I've looked over this series and it has the right idea and
parts. There is definitely more that can be done to improve mlx5 in
this area, but this series is well scoped and cleans a good part of
it.

Jason
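
The probe()/remove() shape praised above can be sketched outside the
kernel. Everything below is a hypothetical userspace stand-in (the
*_sketch names are not kernel APIs): a simplified container_of() without
the kernel's type checking, a device embedded in a wrapper struct, and a
drvdata pointer carrying per-driver state from probe to remove.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version of container_of(), no type checking. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for a bus device with a drvdata slot. */
struct aux_dev_sketch {
	void *drvdata;
};

/* Hypothetical wrapper that embeds the bus device, like mlx5_adev. */
struct parent_sketch {
	int core_id;
	struct aux_dev_sketch adev;
};

/* Hypothetical per-driver state created at probe time. */
struct net_sketch {
	int max_vqs;
};

static struct net_sketch net_instance;

/* probe: recover the wrapper from the embedded device, stash state. */
static int probe_sketch(struct aux_dev_sketch *adev)
{
	struct parent_sketch *p =
		container_of_sketch(adev, struct parent_sketch, adev);

	assert(&p->adev == adev);
	net_instance.max_vqs = 2 * p->core_id;
	adev->drvdata = &net_instance;
	return 0;
}

/* remove: fetch the state stored by probe and drop the reference. */
static int remove_sketch(struct aux_dev_sketch *adev)
{
	struct net_sketch *n = adev->drvdata;

	assert(n != NULL);
	n->max_vqs = 0;
	adev->drvdata = NULL;
	return 0;
}
```

The bus only ever hands the driver the embedded device; the wrapper is
recovered by pointer arithmetic and the driver's own state travels through
drvdata, which is the structure the patch gives mlx5v_probe() and
mlx5v_remove().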
