Re: [RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

2024-08-09 Thread Pierre-Louis Bossart


 Then there's the issue of parameters, we chose to only add parameters
 for standard encoders/decoders. Post-processing is highly specific and
 the parameter definitions vary from one implementation to another -
 and usually parameters are handled in an opaque way with binary
 controls. This is best handled with a UUID that needs to be known only
 to applications and low-level firmware/hardware, the kernel code should
 not have to be modified for each and every processing step to add new
 parameters. It just does not scale and it's unmaintainable.

 At the very least if you really want to use this compress API,
 extend it
 to use a nondescript "UUID-defined" type and an opaque set of
 parameters with this UUID passed in a header.
>>>
>>> We don't need to use a UUID-defined scheme for a simple (A)SRC
>>> implementation. As I noted, the specific runtime controls may use
>>> existing ALSA control API.
>>
>> "Simple (A)SRC" is an oxymoron. There are multiple ways to define the
>> performance, and how the drift estimator is handled. There's nothing
>> simple if you look under the hood. The SOF implementation has for
>> example those parameters:
>>
>> uint32_t source_rate;       /**< Define fixed source rate or */
>>                             /**< use 0 to indicate need to get */
>>                             /**< the rate from stream */
>> uint32_t sink_rate;         /**< Define fixed sink rate or */
>>                             /**< use 0 to indicate need to get */
>>                             /**< the rate from stream */
>> uint32_t asynchronous_mode; /**< synchronous 0, asynchronous 1 */
>>                             /**< When 1 the ASRC tracks and */
>>                             /**< compensates for drift. */
>> uint32_t operation_mode;    /**< push 0, pull 1. In push mode the */
>>                             /**< ASRC consumes a defined number */
>>                             /**< of frames at input, with varying */
>>                             /**< number of frames at output. */
>>                             /**< In pull mode the ASRC outputs */
>>                             /**< a defined number of frames while */
>>                             /**< number of input frames varies. */
>>
>> They are clearly different from what is suggested above with a 'ratio-
>> mod'.
> 
> I don't think so. The proposed (A)SRC for compress-accel is just one
> case for the above configs where the input is known and output is
> controlled by the requested rate. The I/O mechanism is abstracted enough
> in this case and the driver/hardware/firmware must follow it.

ASRC is usually added when the nominal rates are known but the clock
sources differ and the drift needs to be estimated at run-time and the
coefficients or interpolation modified dynamically.

If the ratio is known exactly and there's no clock drift, then it's a
different problem where the filter coefficients are constant.
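
To make the distinction concrete, here is a rough sketch of one
iteration of the asynchronous case (illustrative only, not SOF code;
estimate_drift() and resample_block() are made-up helpers):

	double estimate_drift(void);	/* made-up: drift from buffer levels
					 * or source/sink timestamps */
	void resample_block(const float *in, float *out, double ratio);

	void asrc_iteration(const float *in, float *out, double nominal_ratio)
	{
		/* what makes it "asynchronous": a run-time correction of
		 * the conversion ratio from the current drift estimate */
		double drift = estimate_drift();

		/* the filter/interpolator has to track the corrected ratio */
		resample_block(in, out, nominal_ratio * (1.0 + drift));
	}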

>> Same if you have a 'simple EQ'. There are dozens of ways to implement
>> the functionality with FIR, IIR or a combination of the two, and
>> multiple bands.
>>
>> The point is that you have to think upfront about a generic way to pass
>> parameters. We didn't have to do it for encoders/decoders because we
>> only catered to well-documented standard solutions. By choosing to
>> support PCM processing, a new can of worms is now open.
>>
>> I repeat: please do not make the mistake of listing all processing with
>> an enum and a new structure for parameters every time someone needs a
>> specific transform in their pipeline. We made that mistake with SOF and
>> had to backtrack rather quickly. The only way to scale is an identifier
>> that is NOT included in the kernel code but is known to higher and
>> lower-levels only.
> 
> There are two ways - black box (UUID - as you suggested) - or well
> defined purpose (abstraction). For your example 'simple EQ', the
> parameters should be the per-band (frequency range) volume values. It's
> abstract and the real filters (resp. implementation) used behind the
> scenes may depend on the hardware/driver capabilities.

Indeed there is a possibility that the parameters are high-level, but
that would require firmware or hardware to be able to generate actual
coefficients from those parameters. That usually requires some advanced
math which isn't necessarily obvious to implement with fixed-point hardware.
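
To give a sense of the math involved: deriving one peaking-EQ band's
biquad coefficients from high-level parameters (center frequency, Q,
gain) follows the well-known audio-EQ-cookbook formulas, roughly as
below - note the pow/sin/cos evaluations, which are exactly the part
that is awkward on fixed-point hardware:

	#include <math.h>

	/* peaking EQ biquad, RBJ audio-EQ-cookbook style (sketch) */
	void peaking_eq_coeffs(double fs, double f0, double q,
			       double gain_db, double b[3], double a[3])
	{
		double A = pow(10.0, gain_db / 40.0);	/* amplitude from dB */
		double w0 = 2.0 * M_PI * f0 / fs;	/* normalized frequency */
		double alpha = sin(w0) / (2.0 * q);

		b[0] = 1.0 + alpha * A;
		b[1] = -2.0 * cos(w0);
		b[2] = 1.0 - alpha * A;
		a[0] = 1.0 + alpha / A;		/* normalize b[], a[] by a[0] */
		a[1] = -2.0 * cos(w0);
		a[2] = 1.0 - alpha / A;
	}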

> From my view, the really special cases may be handled as black box, but
> others like (A)SRC should follow some well-defined abstraction IMHO to
> not force user space to handle all special cases.

I am not against the high-level abstractions, e.g. along the lines of
what Android defined:
https://developer.android.com/reference/android/media/audiofx/AudioEffect

That's not sufficient however; we also need to make sure there's an
ability to provide pre-computed coefficients in an opaque manner for
processing that doesn't fit in the well-defined cases. In practice there
is very little 3rd-party IP that fits in the well-defined cases; everyone
has secret-sauce parameters and options.
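
Something as small as the following would be enough; this is only a
sketch of the idea, not an existing ALSA structure:

	struct snd_codec_opaque_params {	/* hypothetical */
		__u8  uuid[16];	/* identifies the processing; opaque to the
				 * kernel, known only to the application and
				 * the firmware/hardware */
		__u32 size;	/* size of data[] in bytes */
		__u8  data[];	/* pre-computed coefficients, options, ... */
	};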



Re: [PATCH] Document/kexec: Generalize crash hotplug description

2024-08-09 Thread Sourabh Jain

Hello Baoquan,

On 09/08/24 07:18, Baoquan He wrote:

On 08/05/24 at 10:38am, Sourabh Jain wrote:

Commit 79365026f869 ("crash: add a new kexec flag for hotplug support")
generalizes the crash hotplug support to allow architectures to update
multiple kexec segments on CPU/Memory hotplug and not just elfcorehdr.
Therefore, update the relevant kernel documentation to reflect the same.

No functional change.

Cc: Petr Tesarik 
Cc: Hari Bathini 
Cc: ke...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: x...@kernel.org
Signed-off-by: Sourabh Jain 
---

Discussion about the documentation update:
https://lore.kernel.org/all/68d0328d-531a-4a2b-ab26-c97fd8a12...@linux.ibm.com/

---
  .../ABI/testing/sysfs-devices-memory  |  6 ++--
  .../ABI/testing/sysfs-devices-system-cpu  |  6 ++--
  .../admin-guide/mm/memory-hotplug.rst |  5 ++--
  Documentation/core-api/cpu_hotplug.rst| 10 ---
  kernel/crash_core.c   | 29 ---
  5 files changed, 33 insertions(+), 23 deletions(-)

The overall looks good to me, except of concern from Petr. Thanks.


Thanks for the review. I will make the suggested changes in v2.

Additionally, I will generalize the error message
"kexec_trylock() failed, elfcorehdr may be inaccurate" in the
functions crash_handle_hotplug_event() and crash_check_hotplug_support()
to "kexec_trylock() failed, kdump image may be inaccurate".

- Sourabh Jain




diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
index a95e0f17c35a..421acc8e2c6b 100644
--- a/Documentation/ABI/testing/sysfs-devices-memory
+++ b/Documentation/ABI/testing/sysfs-devices-memory
@@ -115,6 +115,6 @@ What:   /sys/devices/system/memory/crash_hotplug
  Date: Aug 2023
  Contact:  Linux kernel mailing list 
  Description:
-   (RO) indicates whether or not the kernel directly supports
-   modifying the crash elfcorehdr for memory hot un/plug and/or
-   on/offline changes.
+   (RO) indicates whether or not the kernel updates kexec
+   segments on memory hot un/plug and/or on/offline events,
+   avoiding the need to reload the kdump kernel.
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 325873385b71..f4ada1cd2f96 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -703,9 +703,9 @@ What:   /sys/devices/system/cpu/crash_hotplug
  Date: Aug 2023
  Contact:  Linux kernel mailing list 
  Description:
-   (RO) indicates whether or not the kernel directly supports
-   modifying the crash elfcorehdr for CPU hot un/plug and/or
-   on/offline changes.
+   (RO) indicates whether or not the kernel updates kexec
+   segments on CPU hot un/plug and/or on/offline events,
+   avoiding the need to reload the kdump kernel.
  
  What:		/sys/devices/system/cpu/enabled
  Date: Nov 2022
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 098f14d83e99..cb2c080f400c 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -294,8 +294,9 @@ The following files are currently defined:
  ``crash_hotplug``  read-only: when changes to the system memory map
   occur due to hot un/plug of memory, this file contains
   '1' if the kernel updates the kdump capture kernel memory
-  map itself (via elfcorehdr), or '0' if userspace must update
-  the kdump capture kernel memory map.
+  map itself (via elfcorehdr and other relevant kexec
+  segments), or '0' if userspace must update the kdump
+  capture kernel memory map.
  
   Availability depends on the CONFIG_MEMORY_HOTPLUG kernel
   configuration option.
diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
index dcb0e379e5e8..a21dbf261be7 100644
--- a/Documentation/core-api/cpu_hotplug.rst
+++ b/Documentation/core-api/cpu_hotplug.rst
@@ -737,8 +737,9 @@ can process the event further.
  
 When changes to the CPUs in the system occur, the sysfs file
 /sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
-updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
-or '0' if userspace must update the kdump capture kernel list of CPUs.
+updates the kdump capture kernel list of CPUs itself (via elfcorehdr and
+other relevant kexec segment), or '0' if userspace must update the kdump
+capture kernel list of CPUs.
  
The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration option.

Re: [PATCH v4 4/7] mm/x86: Make pud_leaf() only care about PSE bit

2024-08-09 Thread Thomas Gleixner
On Thu, Aug 08 2024 at 10:54, Peter Xu wrote:
> On Thu, Aug 08, 2024 at 12:22:38AM +0200, Thomas Gleixner wrote:
>> On Wed, Aug 07 2024 at 15:48, Peter Xu wrote:
>> > An entry should be reported as PUD leaf even if it's PROT_NONE, in which
>> > case PRESENT bit isn't there. I hit bad pud without this when testing dax
>> > 1G on zapping a PROT_NONE PUD.
>> 
>> That does not qualify as a change log. What you hit is irrelevant unless
>> you explain the actual underlying problem. See Documentation/process/
>> including the TIP documentation.
>
> Firstly, thanks a lot for the reviews.
>
> I thought the commit message explained exactly what is the underlying
> problem, no?
>
> The problem is even if PROT_NONE, as long as the PSE bit is set on the PUD
> it should be treated as a PUD leaf.

Sure. But why should it be treated so.

> Currently, the code will return pud_leaf()==false for those PROT_NONE
> PUD entries, and IMHO that is wrong.

Your humble opinion is fine, but hardly a technical argument.

> This patch wants to make it right.  I still think that's mostly what I put
> there in the commit message..
>
> Would you please suggest something so I can try to make it better,
> otherwise?  Or it'll be helpful too if you could point out which part of
> the two documentations I should reference.

https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog

  A good structure is to explain the context, the problem and the
  solution in separate paragraphs and this order

>> > diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
>> > index e39311a89bf4..a2a3bd4c1bda 100644
>> > --- a/arch/x86/include/asm/pgtable.h
>> > +++ b/arch/x86/include/asm/pgtable.h
>> > @@ -1078,8 +1078,7 @@ static inline pmd_t *pud_pgtable(pud_t pud)
>> >  #define pud_leaf pud_leaf
>> >  static inline bool pud_leaf(pud_t pud)
>> >  {
>> > -  return (pud_val(pud) & (_PAGE_PSE | _PAGE_PRESENT)) ==
>> > -  (_PAGE_PSE | _PAGE_PRESENT);
>> > +  return pud_val(pud) & _PAGE_PSE;
>> >  }
>> 
>> And the changelog does not explain why this change is not affecting any
>> existing user of pud_leaf().
>
> That's what I want to do: I want to affect them..

Fine. Just the change log does not tell me what the actual problem is
("I hit something" does not qualify) and "should be reported" is not
helpful either as it does not explain anything.

> And IMHO it's mostly fine before because mprotect() is broken with 1g
> anyway, and I guess nobody managed to populate any pud entry with PROT_NONE
> on dax 1g before, and that's what this whole series is trying to fix.

Again your humble opinion matters, but technical facts and analysis
matter way more.

Thanks,

tglx


Re: [RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

2024-08-09 Thread Pierre-Louis Bossart


> Why I use the metadata ioctl is because the ALSA controls are bound
> to the sound card.  What I want is for the controls to be bound to
> snd_compr_stream, because the ASRC compress sound card can support
> multiple instances (the ASRC can run multiple conversions in
> parallel).  The ALSA controls can't be used for this case; the only
> choice in the current compress API is the metadata ioctl.

I don't know if there is really a technical limitation for this, this is
for Jaroslav to comment. I am not sure why it would be a problem to e.g.
have a volume control prior to an encoder or after a decoder.

> And the metadata ioctl can be called many times, which meets the
> ratio-modifier requirement (the ratio may drift on the fly).

Interesting, that's yet another way of handling the drift with userspace
modifying the ratio dynamically. That's different to what I've seen before.
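
For reference, a rough userspace sketch of that scheme: the existing
SNDRV_COMPRESS_SET_METADATA ioctl takes a key/value pair and can be
issued repeatedly, but the ratio-mod key below is made up for
illustration - nothing like it is defined in the current uapi:

	#include <sys/ioctl.h>
	#include <sound/compress_offload.h>

	#define SRC_RATIO_MOD	0x1000	/* hypothetical metadata key */

	static int update_src_ratio(int fd, unsigned int ratio_ppm)
	{
		struct snd_compr_metadata meta = {
			.key = SRC_RATIO_MOD,
			.value = { ratio_ppm },	/* drift correction, e.g. in ppm */
		};

		return ioctl(fd, SNDRV_COMPRESS_SET_METADATA, &meta);
	}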

> And the compress API uses the codec as the unit for capability query
> and parameter setting, so I think we need to define 'SND_AUDIOCODEC_SRC'
> and 'struct snd_dec_src'.  'snd_dec_src' just defines the output format
> and output rate; the channel definition reuses snd_codec.ch_in.

The capability query is an interesting point as well, it's not clear how
to expose to userspace what this specific implementation can do, while
at the same time *requiring* userspace to update the ratio dynamically.
For something like this to work, userspace needs to have pre-existing
information on how the SRC works.
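
For reference, the proposal quoted above would presumably look roughly
like this (the codec ID value and field names are illustrative, following
the description in the RFC rather than any merged uapi):

	#define SND_AUDIOCODEC_SRC	((__u32) 0x00000020)	/* hypothetical */

	struct snd_dec_src {
		__u32 format_out;	/* output sample format */
		__u32 rate_out;		/* output sample rate */
		/* input channels reuse snd_codec.ch_in */
	};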


Re: PCI: Work around PCIe link training failures

2024-08-09 Thread Maciej W. Rozycki
On Wed, 7 Aug 2024, Matthew W Carlis wrote:

> > For the quirk to trigger, the link has to be down and there has to be the
> > LBMS Link Status bit set from link management events as per the PCIe spec
> > while the link was previously up, and then both of that while rescanning
> > the PCIe device in question, so there's a lot of conditions to meet.
> 
> If there is nothing clearing the bit then why is there any expectation that
> it wouldn't be set? We have 20/30/40 endpoints in a host that can be
> hot-removed, hot-added at any point in time in any combination & it's
> often the case that the system uptime is hundreds of days. Eventually
> the bit will just become set
> as a result of time and scale.

 Well, in principle in a setup with reliable links the LBMS bit may never 
be set, e.g. this system of mine has been in 24/7 operation since the last 
reboot 410 days ago and for the devices that support Link Active reporting 
it shows:

LnkSta:Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-
LnkSta:Speed 5GT/s, Width x8, TrErr- Train- SlotClk- DLActive+ BWMgmt+ ABWMgmt+
LnkSta:Speed 5GT/s, Width x1, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt+
LnkSta:Speed 5GT/s, Width x2, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt+
LnkSta:Speed 5GT/s, Width x1, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt+
LnkSta:Speed 8GT/s, Width x16, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt+
LnkSta:Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
LnkSta:Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
LnkSta:Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
LnkSta:Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
LnkSta:Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt-

so out of 11 devices 6 have the LBMS bit clear.  But then 5 have it set, 
perhaps worryingly, so of course you're right that it will get set in the 
field, though it's not enough by itself for your problem to trigger.

 Then there is manual link retraining, which we do from time to time as 
well, which will set the bit too, which I overlooked.

 To cut the story short, it was an oversight of mine, especially as I 
don't really make any use myself of hot plug scenarios.

> > The reason for this is safety: it's better to have a link run at 2.5GT/s
> > than not at all, and from the nature of the issue there is no guarantee
> > that if you remove the speed clamp, then the link won't go back into the
> > infinite retraining loop that the workaround is supposed to break.
> 
> I guess I don't really understand why it wouldn't be safe for every device
> other than the ASMedia since they aren't the device that had the issue in
> the first place. The main problem in my mind is the system doesn't actually
> have to be retraining at all, it can occur the first time you replace a
> device or power cycle the device or the first time the device goes into DPC
> & comes back. If the quirk simply returned without doing anything when the
> ASMedia is not in the allow-list it would make me more at ease. I can also
> imagine some other implementations that would work well, but it doesn't seem
> correct that we could only give a single opportunity to a device before
> forcing it to live at Gen1. Perhaps it should be aware of the rate or a
> count or something...

 It's a complex matter.  For a starter please read the introduction at 
`pcie_failed_link_retrain'.

 When the problem triggers the link goes into an infinite link training 
loop, with the Link Training (LT) bit flipping on and off.  I've made a 
complementary workaround for U-Boot (since your bootstrap device can be 
downstream such a broken link), where a statistical probe is done for the 
LT bit flipping as discovered by polling the bit in a tight loop.  This is 
fine for a piece of firmware that has nothing better to do, but in an OS 
kernel we just can't do it.
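
 For illustration only, the statistical probe amounts to something like
this (a sketch, not the actual U-Boot code; read_lnksta() stands in for
a config-space read of the Link Status register):

	#define PCI_EXP_LNKSTA_LT	0x0800	/* Link Training bit */

	unsigned int read_lnksta(void);		/* stand-in, device-specific */

	static int link_training_flapping(void)
	{
		int i, transitions = 0;
		unsigned int prev = read_lnksta() & PCI_EXP_LNKSTA_LT;

		for (i = 0; i < 10000; i++) {	/* arbitrary sample count */
			unsigned int cur = read_lnksta() & PCI_EXP_LNKSTA_LT;

			if (cur != prev)
				transitions++;
			prev = cur;
		}
		return transitions > 2;		/* arbitrary threshold */
	}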

 Also U-Boot does not remove the 2.5GT/s clamp because it does not have to 
set up the system in an optimal way, but just sufficiently to successfully 
boot.  An OS kernel can then adjust the configuration if it knows about 
the workaround, or leave it as it is.  In the latter case things will 
continue working across PCIe resets, because the TLS field is sticky.

 For Linux I went for just checking the DLLLA bit as it seems good enough 
for devices that support Link Active reporting.  And I admit that the 
workaround is quite aggressive, but this was a deliberate decision 
following the robustness principle: the end user may not be qualified 
enough to diagnose the problem and given its nature not even think it's 
something that can be made to work.  The downstream device does not show 
up in the PCIe hierarchy and just looks like it's broken.  It takes a lot 
of knowledge and experience to notice the LT bit flipping and even myself, 
quite a seasoned Linux kernel developer, only notice

Re: [RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

2024-08-09 Thread Jaroslav Kysela

On 09. 08. 24 12:14, Shengjiu Wang wrote:

On Fri, Aug 9, 2024 at 3:25 PM Pierre-Louis Bossart
 wrote:

[...]

Appreciate the discussion.

Let me explain the reason for the change:

Why I use the metadata ioctl is because the ALSA controls are bound to
the sound card. What I want is for the controls to be bound to
snd_compr_stream, because the ASRC compress sound card can support
multiple instances (the ASRC can run multiple conversions in parallel).
The ALSA controls can't be used for this case; the only choice in the
current compress API is the metadata ioctl.

Re: [PATCH v4 4/7] mm/x86: Make pud_leaf() only care about PSE bit

2024-08-09 Thread Peter Xu
On Fri, Aug 09, 2024 at 02:08:28PM +0200, Thomas Gleixner wrote:
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog
> 
>   A good structure is to explain the context, the problem and the
>   solution in separate paragraphs and this order

I'll try to follow, thanks.

[...]

> > And IMHO it's mostly fine before because mprotect() is broken with 1g
> > anyway, and I guess nobody managed to populate any pud entry with PROT_NONE
> > on dax 1g before, and that's what this whole series is trying to fix.
> 
> Again your humble opinion matters, but technical facts and analysis
> matter way more.

All the rest of the comments in the reply were about "why it's a PUD
leaf", so let me reply in one shot.

Referring to pXd_leaf() documentation in linux/pgtable.h:

/*
 * pXd_leaf() is the API to check whether a pgtable entry is a huge page
 * mapping.  It should work globally across all archs, without any
 * dependency on CONFIG_* options.  For architectures that do not support
 * huge mappings on specific levels, below fallbacks will be used.
 *
 * A leaf pgtable entry should always imply the following:
 *
 * - It is a "present" entry.  IOW, before using this API, please check it
 *   with pXd_present() first. NOTE: it may not always mean the "present
 *   bit" is set.  For example, PROT_NONE entries are always "present".
 *
 * - It should _never_ be a swap entry of any type.  Above "present" check
 *   should have guarded this, but let's be crystal clear on this.
 *
 * - It should contain a huge PFN, which points to a huge page larger than
 *   PAGE_SIZE of the platform.  The PFN format isn't important here.
 *
 * - It should cover all kinds of huge mappings (e.g., pXd_trans_huge(),
 *   pXd_devmap(), or hugetlb mappings).
 */

It explicitly states that PROT_NONE should be treated as a present entry,
and also a leaf.  The document is for pXd_leaf(), so it should cover puds
too.  In this specific case of the zapping path, it can only be a DAX 1G
thp.  But pud_leaf() should work for hugetlb too, for example, when
PROT_NONE is applied on top of a 1G hugetlb page with PSE set.
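
Concretely, on x86 a PROT_NONE huge pud keeps _PAGE_PSE but has
_PAGE_PRESENT clear, so the two forms in the patch differ exactly there:

	/* before: misses PROT_NONE huge entries (PRESENT clear, PSE set) */
	return (pud_val(pud) & (_PAGE_PSE | _PAGE_PRESENT)) ==
		(_PAGE_PSE | _PAGE_PRESENT);

	/* after: any entry with PSE set is reported as a leaf */
	return pud_val(pud) & _PAGE_PSE;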

Unfortunately, I wrote this document in 64078b3d57.. so that's also another
way of saying "my humble opinion".. it's just nobody disagreed so far, and
please shoot if you see any issue out of it.

IOW, I don't think we must define pXd_leaf() like this - we used to define
pXd_leaf() to cover migration entries at least on x86, for example. But per
my own past mm experience, the current way is the right thing to do to make
everything much easier and less error prone.  Sorry, I can't get rid of
"IMHO" here.

Another example of "we can define pXd_leaf() in other ways" is that I
believe for the PPC 8XX series it's possible to make special use of
pmd_leaf(), by allowing pmd_leaf() to return true even for two contiguous
pte pgtables covering 8MB of memory.  But that will be an extremely
special use of pmd_leaf() even if it comes, maybe worth an update above
when it happens, and it'll only be used by powerpc, not any other arch.
It won't happen if we want to drop 8MB support, though.

So in short, I don't think there's a 100% correct "technical" answer of
saying "how to define pxx_leaf()"; things just keep evolving, and "humble
opinions" keeps coming with some good reasons.  Hope that answers the
question to some extent.

Taking all things into account, I wonder whether below enriched commit
message would get me closer to your ACK on this, trying to follow the rule
you referenced on the order of how context/problem/solution should be
ordered and addressed:

When working on mprotect() on 1G dax entries, I hit a "bad pud" zap
error when zapping a huge pud that has PROT_NONE permission.

Here the problem is x86's pud_leaf() requires both PRESENT and PSE bits
set to report a pud entry as a leaf, but that doesn't look right, as
it's not following the pXd_leaf() definition that we stick with so far,
where PROT_NONE entries should be reported as leaves.

To fix it, change x86's pud_leaf() implementation to only check the
PSE bit to report a leaf, regardless of whether the PRESENT bit is set.

Thanks,

-- 
Peter Xu



Re: [RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

2024-08-09 Thread Jaroslav Kysela

On 09. 08. 24 14:52, Pierre-Louis Bossart wrote:


And metadata
ioctl can be called many times which can meet the ratio modifier
requirement (ratio may be drift on the fly)


Interesting, that's yet another way of handling the drift with userspace
modifying the ratio dynamically. That's different to what I've seen before.


Note that the "timing" is managed by the user space with this scheme.


And compress API uses codec as the unit for capability query and
parameter setting,  So I think need to define "SND_AUDIOCODEC_SRC'
and 'struct snd_dec_src',  for the 'snd_dec_src' just defined output
format and output rate, channels definition just reuse the snd_codec.ch_in.


The capability query is an interesting point as well, it's not clear how
to expose to userspace what this specific implementation can do, while
at the same time *requiring* userspace to update the ratio dynamically.
For something like this to work, userspace needs to have pre-existing
information on how the SRC works.


Yes, it's about abstraction. The user space wants to push data, read data
back converted to the target rate, and eventually modify the drift using a
control managing the clocks its own way. We can eventually assume that if
this control does not exist, the drift cannot be controlled. Also, a nice
thing is that the control has min and max values (a range), so the driver
can specify the drift range, too.


And again, look at the "PCM Rate Shift 100000" control implementation in
sound/drivers/aloop.c. It would be nice to have the base offset for the
shift/drift/pitch value standardized.
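
A minimal driver-side sketch in that direction, modeled on aloop (the
range and handlers below are driver-specific assumptions, not shared
code):

	#include <sound/control.h>

	static int rate_shift_info(struct snd_kcontrol *kcontrol,
				   struct snd_ctl_elem_info *uinfo)
	{
		uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER;
		uinfo->count = 1;
		uinfo->value.integer.min = 80000;	/* -20% drift */
		uinfo->value.integer.max = 120000;	/* +20% drift */
		return 0;
	}

	/* driver-specific read/write of the current shift, not shown */
	static int rate_shift_get(struct snd_kcontrol *k,
				  struct snd_ctl_elem_value *v);
	static int rate_shift_put(struct snd_kcontrol *k,
				  struct snd_ctl_elem_value *v);

	static const struct snd_kcontrol_new rate_shift_ctl = {
		.iface = SNDRV_CTL_ELEM_IFACE_PCM,
		.name  = "PCM Rate Shift 100000",	/* 100000 = nominal rate */
		.info  = rate_shift_info,
		.get   = rate_shift_get,
		.put   = rate_shift_put,
	};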


Jaroslav

--
Jaroslav Kysela 
Linux Sound Maintainer; ALSA Project; Red Hat, Inc.



Re: [PATCHSET 00/10] perf tools: Sync tools and kernel headers for v6.11

2024-08-09 Thread Athira Rajeev



> On 9 Aug 2024, at 12:14 AM, Namhyung Kim  wrote:
> 
> Hello,
> 
> On Thu, Aug 08, 2024 at 12:14:12PM +0530, Athira Rajeev wrote:
>> 
>> 
>>> On 7 Aug 2024, at 11:42 PM, Namhyung Kim  wrote:
>>> 
>>> Hello folks,
>>> 
>>> On Tue, Aug 06, 2024 at 03:50:03PM -0700, Namhyung Kim wrote:
 Hello,
 
 This is the usual sync up in header files we keep in tools directory.
 I put a file to give the reason of this work and not to repeat it in
 every commit message.  The changes will be carried in the perf-tools
 tree.
>>> 
>>> Could you please double check what's in the tmp.perf-tools branch at the
>>> perf-tools tree so I don't break build and perf trace for arm64, powerpc
>>> and s390?  It has this patchset + arm64 unistd header revert (according
>>> to the discussion on patch 6/10) on top of v6.11-rc2.
>>> 
>>> Thanks,
>>> Namhyung
>> Hi Namhyung,
>> 
>> Can you please point to the tree. I checked in 
>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git as well as 
>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git , 
>> but didn't find the changes. Maybe I am missing something. I am trying to 
>> check the build on powerpc.
> 
> Oh, sorry about that.  It's in:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git
> 
> (no -next at the end)

Hi,

I did a compile test on powerpc and the results are good.

Tested-by: Athira Rajeev 

Thanks
Athira
> 
> Thanks,
> Namhyung




[PATCH 1/1] dt-bindings: soc: fsl: cpm_qe: convert network.txt to yaml

2024-08-09 Thread Frank Li
Convert the binding doc network.txt to yaml format.

HDLC part:
- Convert to "fsl,ucc-hdlc.yaml".
- Add missed reg and interrupt property.
- Update example to pass build.

ethernet part:
- Convert to net/fsl,cpm-enet.yaml
- Add 0x prefix in the example, as the values should be hex
- Add ref to ethernet-controller.yaml

mdio part:
- Convert to net/fsl,cpm-mdio.yaml
- Add 0x prefix in the example, as the values should be hex
- Add ref to mdio.yaml

Signed-off-by: Frank Li 
---
This one is quite old. The detailed information is limited.
---
 .../devicetree/bindings/net/fsl,cpm-enet.yaml |  59 
 .../devicetree/bindings/net/fsl,cpm-mdio.yaml |  55 +++
 .../bindings/soc/fsl/cpm_qe/fsl,ucc-hdlc.yaml | 140 ++
 .../bindings/soc/fsl/cpm_qe/network.txt   | 130 
 4 files changed, 254 insertions(+), 130 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/net/fsl,cpm-enet.yaml
 create mode 100644 Documentation/devicetree/bindings/net/fsl,cpm-mdio.yaml
 create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,ucc-hdlc.yaml
 delete mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/network.txt

diff --git a/Documentation/devicetree/bindings/net/fsl,cpm-enet.yaml b/Documentation/devicetree/bindings/net/fsl,cpm-enet.yaml
new file mode 100644
index 0..da836477e8bad
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/fsl,cpm-enet.yaml
@@ -0,0 +1,59 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/fsl,cpm-enet.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Network for cpm enet
+
+maintainers:
+  - Frank Li 
+
+properties:
+  compatible:
+    oneOf:
+      - enum:
+          - fsl,cpm1-scc-enet
+          - fsl,cpm2-scc-enet
+          - fsl,cpm1-fec-enet
+          - fsl,cpm2-fcc-enet
+          - fsl,qe-enet
+      - items:
+          - enum:
+              - fsl,mpc8272-fcc-enet
+          - const: fsl,cpm2-fcc-enet
+
+  reg:
+    minItems: 1
+    maxItems: 3
+
+  interrupts:
+    maxItems: 1
+
+  fsl,cpm-command:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description: cpm command
+
+required:
+  - compatible
+  - reg
+  - interrupts
+
+allOf:
+  - $ref: ethernet-controller.yaml
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    ethernet@11300 {
+        compatible = "fsl,mpc8272-fcc-enet",
+                     "fsl,cpm2-fcc-enet";
+        reg = <0x11300 0x20 0x8400 0x100 0x11390 1>;
+        local-mac-address = [ 00 00 00 00 00 00 ];
+        interrupts = <20 8>;
+        interrupt-parent = <&pic>;
+        phy-handle = <&phy0>;
+        fsl,cpm-command = <0x12000300>;
+    };
+
diff --git a/Documentation/devicetree/bindings/net/fsl,cpm-mdio.yaml b/Documentation/devicetree/bindings/net/fsl,cpm-mdio.yaml
new file mode 100644
index 0..b1791a3c490e2
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/fsl,cpm-mdio.yaml
@@ -0,0 +1,55 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/fsl,cpm-mdio.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale CPM MDIO Device
+
+maintainers:
+  - Frank Li 
+
+properties:
+  compatible:
+    oneOf:
+      - enum:
+          - fsl,pq1-fec-mdio
+          - fsl,cpm2-mdio-bitbang
+      - items:
+          - const: fsl,mpc8272ads-mdio-bitbang
+          - const: fsl,mpc8272-mdio-bitbang
+          - const: fsl,cpm2-mdio-bitbang
+
+  reg:
+    maxItems: 1
+
+  fsl,mdio-pin:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description: pin of port C controlling mdio data
+
+  fsl,mdc-pin:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description: pin of port C controlling mdio clock
+
+required:
+  - compatible
+  - reg
+
+allOf:
+  - $ref: mdio.yaml#
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    mdio@10d40 {
+        compatible = "fsl,mpc8272ads-mdio-bitbang",
+                     "fsl,mpc8272-mdio-bitbang",
+                     "fsl,cpm2-mdio-bitbang";
+        reg = <0x10d40 0x14>;
+        #address-cells = <1>;
+        #size-cells = <0>;
+        fsl,mdio-pin = <12>;
+        fsl,mdc-pin = <13>;
+    };
+
diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,ucc-hdlc.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,ucc-hdlc.yaml
new file mode 100644
index 0..64ffbf75dd9d2
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,ucc-hdlc.yaml
@@ -0,0 +1,140 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/fsl/cpm_qe/fsl,ucc-hdlc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: High-Level Data Link Control (HDLC)
+
+description: HDLC part in Universal communication controllers (UCCs)
+
+maintainers:
+  - Frank Li 
+
+properties:
+  compatible:
+    const: fsl,ucc-hdlc
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1

Re: [RFC PATCH 1/6] ALSA: compress: add Sample Rate Converter codec support

2024-08-09 Thread Pierre-Louis Bossart



>>> And metadata
>>> ioctl can be called many times which can meet the ratio modifier
>>> requirement (ratio may be drift on the fly)
>>
>> Interesting, that's yet another way of handling the drift with userspace
>> modifying the ratio dynamically. That's different to what I've seen
>> before.
> 
> Note that the "timing" is managed by the user space with this scheme.
> 
>>> And compress API uses codec as the unit for capability query and
>>> parameter setting,  So I think need to define "SND_AUDIOCODEC_SRC'
>>> and 'struct snd_dec_src',  for the 'snd_dec_src' just defined output
>>> format and output rate, channels definition just reuse the
>>> snd_codec.ch_in.
>>
>> The capability query is an interesting point as well, it's not clear how
>> to expose to userspace what this specific implementation can do, while
>> at the same time *requiring* userpace to update the ratio dynamically.
>> For something like this to work, userspace needs to have pre-existing
>> information on how the SRC works.
> 
> Yes, it's about abstraction. The user space wants to push data, read
> data back converted to the target rate and eventually modify the drift
> using a control managing clocks using own way. We can eventually assume,
> that if this control does not exist, the drift cannot be controlled.
> Also, nice thing is that the control has min and max values (range), so
> driver can specify the drift range, too.

This mode of operation would be fine, but if you add the SRC as part of
the processing allowed in a compressed stream, it might be used in a
'regular' real-time pipeline and the arguments and implementation could
be very different.

In other words, this SRC support looks like an extension of the compress
API for a very narrow usage restricted to the memory-to-memory case
only. I worry a bit about the next steps... Are there going to be other
types of PCM processing like this, and if yes, how would we know if they
are intended to be used for the 'regular' compress API or for the
memory-to-memory scheme?


Re: [PATCHSET 00/10] perf tools: Sync tools and kernel headers for v6.11

2024-08-09 Thread Namhyung Kim
On Fri, Aug 9, 2024 at 8:39 AM Athira Rajeev
 wrote:
>
>
>
> > On 9 Aug 2024, at 12:14 AM, Namhyung Kim  wrote:
> >
> > Hello,
> >
> > On Thu, Aug 08, 2024 at 12:14:12PM +0530, Athira Rajeev wrote:
> >>
> >>
> >>> On 7 Aug 2024, at 11:42 PM, Namhyung Kim  wrote:
> >>>
> >>> Hello folks,
> >>>
> >>> On Tue, Aug 06, 2024 at 03:50:03PM -0700, Namhyung Kim wrote:
>  Hello,
> 
>  This is the usual sync up in header files we keep in tools directory.
>  I put a file to give the reason of this work and not to repeat it in
>  every commit message.  The changes will be carried in the perf-tools
>  tree.
> >>>
> >>> Could you please double check what's in the tmp.perf-tools branch at the
> >>> perf-tools tree so I don't break build and perf trace for arm64, powerpc
> >>> and s390?  It has this patchset + arm64 unistd header revert (according
> >>> to the discussion on patch 6/10) on top of v6.11-rc2.
> >>>
> >>> Thanks,
> >>> Namhyung
> >> Hi Namhyung,
> >>
> >> Can you please point to the tree. I checked in 
> >> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git as well as 
> >> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git , 
> >> but didn't find the changes. Maybe I am missing something. I am trying to 
> >> check the build on powerpc.
> >
> > Oh, sorry about that.  It's in:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git
> >
> > (no -next at the end)
>
> Hi,
>
> I did a compile test on powerpc and the results are good.
>
> Tested-by: Athira Rajeev 

Thanks for doing this!
Namhyung