[PATCH 1/2] Documentation: hyperv: Update spelling and fix typo

2024-05-07 Thread mhkelley58
From: Michael Kelley 

Update spelling from "VMbus" to "VMBus" to match Hyper-V product
documentation. Also correct typo: "SNP-SEV" should be "SEV-SNP".

Signed-off-by: Michael Kelley 
---
 Documentation/virt/hyperv/overview.rst | 22 +++
 Documentation/virt/hyperv/vmbus.rst    | 82 +-
 2 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/Documentation/virt/hyperv/overview.rst b/Documentation/virt/hyperv/overview.rst
index cd493332c88a..77408a89d1a4 100644
--- a/Documentation/virt/hyperv/overview.rst
+++ b/Documentation/virt/hyperv/overview.rst
@@ -40,7 +40,7 @@ Linux guests communicate with Hyper-V in four different ways:
   arm64, these synthetic registers must be accessed using explicit
   hypercalls.
 
-* VMbus: VMbus is a higher-level software construct that is built on
+* VMBus: VMBus is a higher-level software construct that is built on
   the other 3 mechanisms.  It is a message passing interface between
   the Hyper-V host and the Linux guest.  It uses memory that is shared
   between Hyper-V and the guest, along with various signaling
@@ -54,8 +54,8 @@ x86/x64 architecture only.
 
 .. _Hyper-V Top Level Functional Spec (TLFS): https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
 
-VMbus is not documented.  This documentation provides a high-level
-overview of VMbus and how it works, but the details can be discerned
+VMBus is not documented.  This documentation provides a high-level
+overview of VMBus and how it works, but the details can be discerned
 only from the code.
 
 Sharing Memory
@@ -74,7 +74,7 @@ follows:
   physical address space.  How Hyper-V is told about the GPA or list
   of GPAs varies.  In some cases, a single GPA is written to a
   synthetic register.  In other cases, a GPA or list of GPAs is sent
-  in a VMbus message.
+  in a VMBus message.
 
 * Hyper-V translates the GPAs into "real" physical memory addresses,
   and creates a virtual mapping that it can use to access the memory.
@@ -133,9 +133,9 @@ only the CPUs actually present in the VM, so Linux does not report
 any hot-add CPUs.
 
 A Linux guest CPU may be taken offline using the normal Linux
-mechanisms, provided no VMbus channel interrupts are assigned to
-the CPU.  See the section on VMbus Interrupts for more details
-on how VMbus channel interrupts can be re-assigned to permit
+mechanisms, provided no VMBus channel interrupts are assigned to
+the CPU.  See the section on VMBus Interrupts for more details
+on how VMBus channel interrupts can be re-assigned to permit
 taking a CPU offline.
 
 32-bit and 64-bit
@@ -169,14 +169,14 @@ and functionality. Hyper-V indicates feature/function availability
 via flags in synthetic MSRs that Hyper-V provides to the guest,
 and the guest code tests these flags.
 
-VMbus has its own protocol version that is negotiated during the
-initial VMbus connection from the guest to Hyper-V. This version
+VMBus has its own protocol version that is negotiated during the
+initial VMBus connection from the guest to Hyper-V. This version
 number is also output to dmesg during boot.  This version number
 is checked in a few places in the code to determine if specific
 functionality is present.
 
-Furthermore, each synthetic device on VMbus also has a protocol
-version that is separate from the VMbus protocol version. Device
+Furthermore, each synthetic device on VMBus also has a protocol
+version that is separate from the VMBus protocol version. Device
 drivers for these synthetic devices typically negotiate the device
 protocol version, and may test that protocol version to determine
 if specific device functionality is present.
diff --git a/Documentation/virt/hyperv/vmbus.rst b/Documentation/virt/hyperv/vmbus.rst
index d2012d9022c5..f0d83ebda626 100644
--- a/Documentation/virt/hyperv/vmbus.rst
+++ b/Documentation/virt/hyperv/vmbus.rst
@@ -1,8 +1,8 @@
 .. SPDX-License-Identifier: GPL-2.0
 
-VMbus
+VMBus
 =====
-VMbus is a software construct provided by Hyper-V to guest VMs.  It
+VMBus is a software construct provided by Hyper-V to guest VMs.  It
 consists of a control path and common facilities used by synthetic
 devices that Hyper-V presents to guest VMs.   The control path is
 used to offer synthetic devices to the guest VM and, in some cases,
@@ -12,9 +12,9 @@ and the synthetic device implementation that is part of Hyper-V, and
 signaling primitives to allow Hyper-V and the guest to interrupt
 each other.
 
-VMbus is modeled in Linux as a bus, with the expected /sys/bus/vmbus
-entry in a running Linux guest.  The VMbus driver (drivers/hv/vmbus_drv.c)
-establishes the VMbus control path with the Hyper-V host, then
+VMBus is modeled in Linux as a bus, with the expected /sys/bus/vmbus
+entry in a running Linux guest.  The VMBus driver (drivers/hv/vmbus_drv.c)
+establishes the VMBus control path with the Hyper-V host, then
 registers itself as a Linux bus driver.  It implements the standard
 bus functions for adding and removing

[PATCH 2/2] Documentation: hyperv: Improve synic and interrupt handling description

2024-05-07 Thread mhkelley58
From: Michael Kelley 

Current documentation does not describe how Linux handles the synthetic
interrupt controller (synic) that Hyper-V provides to guest VMs, nor how
VMBus or timer interrupts are handled. Add text describing the synic and
reorganize existing text to make this more clear.

Signed-off-by: Michael Kelley 
---
 Documentation/virt/hyperv/clocks.rst | 21 +---
 Documentation/virt/hyperv/vmbus.rst  | 79 ++--
 2 files changed, 66 insertions(+), 34 deletions(-)

diff --git a/Documentation/virt/hyperv/clocks.rst b/Documentation/virt/hyperv/clocks.rst
index a56f4837d443..919bb92d6d9d 100644
--- a/Documentation/virt/hyperv/clocks.rst
+++ b/Documentation/virt/hyperv/clocks.rst
@@ -62,12 +62,21 @@ shared page with scale and offset values into user space.  User
 space code performs the same algorithm of reading the TSC and
 applying the scale and offset to get the constant 10 MHz clock.
 
-Linux clockevents are based on Hyper-V synthetic timer 0. While
-Hyper-V offers 4 synthetic timers for each CPU, Linux only uses
-timer 0. Interrupts from stimer0 are recorded on the "HVS" line in
-/proc/interrupts.  Clockevents based on the virtualized PIT and
-local APIC timer also work, but the Hyper-V synthetic timer is
-preferred.
+Linux clockevents are based on Hyper-V synthetic timer 0 (stimer0).
+While Hyper-V offers 4 synthetic timers for each CPU, Linux only uses
+timer 0. In older versions of Hyper-V, an interrupt from stimer0
+results in a VMBus control message that is demultiplexed by
+vmbus_isr() as described in the VMBus documentation. In newer versions
+of Hyper-V, stimer0 interrupts can be mapped to an architectural
+interrupt, which is referred to as "Direct Mode". Linux prefers
+to use Direct Mode when available. Since x86/x64 doesn't support
+per-CPU interrupts, Direct Mode statically allocates an x86 interrupt
+vector (HYPERV_STIMER0_VECTOR) across all CPUs and explicitly codes it
+to call the stimer0 interrupt handler. Hence interrupts from stimer0
+are recorded on the "HVS" line in /proc/interrupts rather than being
+associated with a Linux IRQ. Clockevents based on the virtualized
+PIT and local APIC timer also work, but Hyper-V stimer0
+is preferred.
 
 The driver for the Hyper-V synthetic system clock and timers is
 drivers/clocksource/hyperv_timer.c.
diff --git a/Documentation/virt/hyperv/vmbus.rst b/Documentation/virt/hyperv/vmbus.rst
index f0d83ebda626..1dcef6a7fda3 100644
--- a/Documentation/virt/hyperv/vmbus.rst
+++ b/Documentation/virt/hyperv/vmbus.rst
@@ -102,10 +102,10 @@ resources.  For Windows Server 2019 and later, this limit is
 approximately 1280 Mbytes.  For versions prior to Windows Server
 2019, the limit is approximately 384 Mbytes.
 
-VMBus messages
---------------
-All VMBus messages have a standard header that includes the message
-length, the offset of the message payload, some flags, and a
+VMBus channel messages
+----------------------
+All messages sent in a VMBus channel have a standard header that includes
+the message length, the offset of the message payload, some flags, and a
 transactionID.  The portion of the message after the header is
 unique to each VSP/VSC pair.
 
@@ -137,7 +137,7 @@ control message contains a list of GPAs that describe the data
 buffer.  For example, the storvsc driver uses this approach to
 specify the data buffers to/from which disk I/O is done.
 
-Three functions exist to send VMBus messages:
+Three functions exist to send VMBus channel messages:
 
 1. vmbus_sendpacket():  Control-only messages and messages with
embedded data -- no GPAs
@@ -165,6 +165,37 @@ performed in this temporary buffer without the risk of Hyper-V
 maliciously modifying the message after it is validated but before
 it is used.
 
+Synthetic Interrupt Controller (synic)
+--------------------------------------
+Hyper-V provides each guest CPU with a synthetic interrupt controller
+that is used by VMBus for host-guest communication. While each synic
+defines 16 synthetic interrupts (SINT), Linux uses only one of the 16
+(VMBUS_MESSAGE_SINT). All interrupts related to communication between
+the Hyper-V host and a guest CPU use that SINT.
+
+The SINT is mapped to a single per-CPU architectural interrupt (i.e,
+an 8-bit x86/x64 interrupt vector, or an arm64 PPI INTID). Because
+each CPU in the guest has a synic and may receive VMBus interrupts,
+they are best modeled in Linux as per-CPU interrupts. This model works
+well on arm64 where a single per-CPU Linux IRQ is allocated for
+VMBUS_MESSAGE_SINT. This IRQ appears in /proc/interrupts as an IRQ labelled
+"Hyper-V VMbus". Since x86/x64 lacks support for per-CPU IRQs, an x86
+interrupt vector is statically allocated (HYPERVISOR_CALLBACK_VECTOR)
+across all CPUs and explicitly coded to call vmbus_isr(). In this case,
+there's no Linux IRQ, and the interrupts are visible in aggregate in
+/proc/interrupts on the "HYP" line.
+
+The synic provides the means to demultiplex the architectural i
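
[Illustration added for readers of this thread, not part of the patch.]
The clocks.rst hunk above mentions reading the TSC and applying the scale
and offset to obtain the constant 10 MHz clock. A minimal user-space sketch
of that arithmetic follows, assuming the scale is a 64.64 fixed-point
multiplier and the offset is added after scaling. The struct and function
names are hypothetical, and the sketch deliberately omits the sequence-number
check that a real reader of the shared page would need. It relies on the
GCC/Clang `unsigned __int128` extension for the 64x64 -> 128-bit multiply.

```c
#include <stdint.h>

/* Hypothetical snapshot of the values published in the shared page. */
struct tsc_page_sample {
	uint64_t scale;   /* 64.64 fixed-point multiplier (assumed layout) */
	int64_t  offset;  /* added after scaling, in 100 ns units */
};

/*
 * Convert a raw TSC reading to reference time in 100 ns units (10 MHz):
 * take the upper 64 bits of the 128-bit product tsc * scale, then add
 * the offset.
 */
static uint64_t tsc_to_ref_time(uint64_t tsc, const struct tsc_page_sample *p)
{
	unsigned __int128 product = (unsigned __int128)tsc * p->scale;

	return (uint64_t)(product >> 64) + (uint64_t)p->offset;
}
```

For example, with a scale of 2^63 (a factor of 0.5) and an offset of 10, a
TSC reading of 2000 maps to 1010 reference ticks.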

Re: [PATCH] Documentation: tracing: Fix spelling mistakes

2024-05-07 Thread Jonathan Corbet
Saurav Shah  writes:

> Fix spelling mistakes in the documentation.
>
> Signed-off-by: Saurav Shah 
> ---
>  Documentation/trace/fprobetrace.rst | 4 ++--
>  Documentation/trace/ftrace.rst  | 2 +-
>  Documentation/trace/kprobetrace.rst | 2 +-
>  3 files changed, 4 insertions(+), 4 deletions(-)

Applied, thanks.

jon



Re: [PATCH net-next v12 0/4] ethtool: provide the dim profile fine-tuning channel

2024-05-07 Thread Heng Qi
On Sat,  4 May 2024 14:44:43 +0800, Heng Qi  wrote:
> The NetDIM library provides excellent acceleration for many modern
> network cards. However, the default profiles of DIM limit its maximum
> capabilities for different NICs, so providing a way in which the NIC can
> be custom configured is necessary.
> 
> Currently, the way is based on the commonly used "ethtool -C".
> 
> Please review, thank you very much!

Hi,

I would like to confirm whether there are any further comments on the current
version. The current series conflicts with the just-merged "Remove RTNL lock
protection of CVQ" on one line of code in the fourth patch, so it seems better
to collect any other comments or ack/review tags before releasing a new version.

Thank you very much!

> 
> Changelog
> =========
> v11->v12:
>   - Remove the use of IS_ENABLED(DIMLIB).
>   - Update Simon's htmldoc hint.
> 
> v10->v11:
>   - Fix and clean up some issues from Kuba, thanks.
>   - Rebase net-next/main
> 
> v9->v10:
>   - Collect dim related flags/mode/work into one place.
>   - Use rx_profile + tx_profile instead of four profiles.
>   - Add several helps.
>   - Update commit logs.
> 
> v8->v9:
>   - Fix the compilation error of conflicting names of rx_profile in
> dim.h and ice driver: in dim.h, rx_profile is replaced with
> dim_rx_profile. So does tx_profile.
> 
> v7->v8:
>   - Use kmemdup() instead of kzalloc()/memcpy() in dev_dim_profile_init().
> 
> v6->v7:
>   - A new wrapper struct pointer is used in struct net_device.
>   - Add IS_ENABLED(CONFIG_DIMLIB) to avoid compiler warnings.
>   - Profile fields changed from u16 to u32.
> 
> v5->v6:
>   - Place the profile in netdevice to bypass the driver.
> The interaction code of ethtool <-> kernel has not changed at all,
> only the interaction part of kernel <-> driver has changed.
> 
> v4->v5:
>   - Update some snippets from Kuba.
> 
> v3->v4:
>   - Some tiny updates and patch 1 only add a new comment.
> 
> v2->v3:
>   - Break up the attributes to avoid the use of raw c structs.
>   - Use per-device profile instead of global profile in the driver.
> 
> v1->v2:
>   - Use ethtool tool instead of net-sysfs.
> 
> Heng Qi (4):
>   linux/dim: move useful macros to .h file
>   ethtool: provide customized dim profile management
>   dim: add new interfaces for initialization and getting results
>   virtio-net: support dim profile fine-tuning
> 
>  Documentation/netlink/specs/ethtool.yaml |  31 +++
>  Documentation/networking/ethtool-netlink.rst |   4 +
>  drivers/net/virtio_net.c |  47 +++-
>  include/linux/dim.h  | 114 
>  include/linux/ethtool.h  |   4 +-
>  include/linux/netdevice.h|   3 +
>  include/uapi/linux/ethtool_netlink.h |  22 ++
>  lib/dim/net_dim.c| 145 ++-
>  net/ethtool/coalesce.c   | 259 ++-
>  9 files changed, 613 insertions(+), 16 deletions(-)
> 
> -- 
> 2.32.0.3.g01195cf9f
> 
> 



Re: [PATCH net-next v12 0/4] ethtool: provide the dim profile fine-tuning channel

2024-05-07 Thread Jakub Kicinski
On Wed, 8 May 2024 10:12:35 +0800 Heng Qi wrote:
> I would like to confirm whether there are any further comments on the current
> version. The current series conflicts with the just-merged "Remove RTNL lock
> protection of CVQ" on one line of code in the fourth patch, so it seems better
> to collect any other comments or ack/review tags before releasing a new version.

Looking now!

Please note that I merged a patch today which makes DIMLIB a tri-state
config, meaning it can be a module now. So please double check that
didn't break things, especially referring to dim symbols from the core
code.



Re: [PATCH net-next v12 2/4] ethtool: provide customized dim profile management

2024-05-07 Thread Jakub Kicinski
On Sat,  4 May 2024 14:44:45 +0800 Heng Qi wrote:
> @@ -1325,6 +1354,8 @@ operations:
>  - tx-aggr-max-bytes
>  - tx-aggr-max-frames
>  - tx-aggr-time-usecs
> +- rx-profile
> +- tx-profile
>dump: *coalesce-get-op
>  -
>name: coalesce-set

set probably needs to get the new attributes, too?

>  Request is rejected if it attributes declared as unsupported by driver (i.e.
> diff --git a/include/linux/dim.h b/include/linux/dim.h
> index 43398f5eade2..d848b790ca50 100644
> --- a/include/linux/dim.h
> +++ b/include/linux/dim.h
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

looks unnecessary, you just need a forward declaration of 
struct net_device, no?

> diff --git a/lib/dim/net_dim.c b/lib/dim/net_dim.c
> index 67d5beb34dc3..b3e01619f929 100644
> --- a/lib/dim/net_dim.c
> +++ b/lib/dim/net_dim.c
> @@ -4,6 +4,7 @@
>   */
>  
>  #include 
> +#include 
>  
>  /*
>   * Net DIM profiles:
> @@ -95,6 +96,76 @@ net_dim_get_def_tx_moderation(u8 cq_period_mode)
>  }
>  EXPORT_SYMBOL(net_dim_get_def_tx_moderation);
>  
> +int net_dim_init_irq_moder(struct net_device *dev, u8 profile_flags,
> +u8 coal_flags, u8 rx_mode, u8 tx_mode,
> +void (*rx_dim_work)(struct work_struct *work),
> +void (*tx_dim_work)(struct work_struct *work))
> +{
> + struct dim_cq_moder *rxp = NULL, *txp;
> + struct dim_irq_moder *moder;
> + int len;
> +
> + dev->irq_moder = kzalloc(sizeof(*dev->irq_moder), GFP_KERNEL);
> + if (!dev->irq_moder)
> + goto err_moder;

return the error directly here, no need to goto

> + moder = dev->irq_moder;
> + len = NET_DIM_PARAMS_NUM_PROFILES * sizeof(*moder->rx_profile);
> +
> + moder->coal_flags = coal_flags;
> + moder->profile_flags = profile_flags;
> +
> + if (profile_flags & DIM_PROFILE_RX) {
> + moder->rx_dim_work = rx_dim_work;
> + WRITE_ONCE(moder->dim_rx_mode, rx_mode);

why WRITE_ONCE()? The structure can't be used, yet

> + rxp = kmemdup(rx_profile[rx_mode], len, GFP_KERNEL);
> + if (!rxp)
> + goto err_rx_profile;

name the labels after the target, please, not the source

> + rcu_assign_pointer(moder->rx_profile, rxp);
> + }

> +static int ethnl_update_profile(struct net_device *dev,
> + struct dim_cq_moder __rcu **dst,
> + const struct nlattr *nests,
> + struct netlink_ext_ack *extack)

> + rcu_assign_pointer(*dst, new_profile);
> + kfree_rcu(old_profile, rcu);
> +
> + return 0;

Don't we need to inform DIM somehow that profile has switched
and it should restart itself?
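
[Illustration added for readers of this thread, not part of the review.]
Two of the comments above concern error unwinding: return directly when
there is nothing to undo yet, and name goto labels after what the target
does rather than after the failing step. A minimal user-space sketch of that
shape follows. All names here (NUM_PROFILES, demo_alloc(), irq_moder_init(),
the struct layouts) are hypothetical stand-ins, calloc()/free() replace the
kernel allocators, and the fail_at hook exists only so the failure paths can
be exercised.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define NUM_PROFILES 5	/* hypothetical profile count */

struct cq_moder { unsigned int usec, pkts; };

struct irq_moder {
	struct cq_moder *rx_profile;
	struct cq_moder *tx_profile;
};

/* Test hook: make the fail_at-th allocation fail (0 = never fail). */
static int fail_at;
static int alloc_count;

static void *demo_alloc(size_t len)
{
	alloc_count++;
	if (alloc_count == fail_at)
		return NULL;
	return calloc(1, len);
}

static int irq_moder_init(struct irq_moder **out,
			  const struct cq_moder *rx_def,
			  const struct cq_moder *tx_def)
{
	size_t len = NUM_PROFILES * sizeof(struct cq_moder);
	struct irq_moder *moder;

	moder = demo_alloc(sizeof(*moder));
	if (!moder)
		return -ENOMEM;		/* nothing to undo: return directly */

	moder->rx_profile = demo_alloc(len);
	if (!moder->rx_profile)
		goto err_free_moder;	/* label says what the target frees */
	memcpy(moder->rx_profile, rx_def, len);

	moder->tx_profile = demo_alloc(len);
	if (!moder->tx_profile)
		goto err_free_rx;
	memcpy(moder->tx_profile, tx_def, len);

	*out = moder;
	return 0;

err_free_rx:
	free(moder->rx_profile);
err_free_moder:
	free(moder);
	return -ENOMEM;
}
```

Because each label describes the cleanup it performs, the unwind order can
be read off the bottom of the function without tracing back to the failing
call site.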



Re: [PATCH net-next v12 0/4] ethtool: provide the dim profile fine-tuning channel

2024-05-07 Thread Jakub Kicinski
On Sat,  4 May 2024 14:44:43 +0800 Heng Qi wrote:
> The NetDIM library provides excellent acceleration for many modern
> network cards. However, the default profiles of DIM limit its maximum
> capabilities for different NICs, so providing a way in which the NIC can
> be custom configured is necessary.
> 
> Currently, the way is based on the commonly used "ethtool -C".
> 
> Please review, thank you very much!

Good progress! Please make sure to also update
Documentation/networking/net_dim.rst in the next version.